Applying several operations in sequence, with the output of one operation passed as input to the next, helps in writing concise code and makes the flow of operations easier to follow. Multiple operations can be combined by method chaining (in object-oriented programming), function composition (in functional programming), or piping (in various domains). For further reading, see Abelson et al. (1996) and, more specifically for R, Wickham (2015) on functional programming, Wickham and Grolemund (2017) on pipes, and Mertz (2004).
Here we work through an example that obtains integrated peak areas from raw spectra by baseline correction followed by peak fitting.
library(APRLspec)
library(APRLssb)
library(APRLmpf)
library(RJSONIO)
library(knitr) # provides kable() for printing tables
specraw <- as.SpecW(ReadSpec(file.path("data", "IMPROVE2011_spec_raw.rds")))
example <- c("OLYMX_20110720", "TRCRX_20110720")
examp2 <- specraw[example,]
Baseline correction. (Details can be found here.)
ssbPath <- ExpandPath(package = "APRLssb")
segments <- fromJSON(ssbPath("$extdata/segmentsKDT.json")) # segment definitions
ssbenv <- new.env() # environment for saving functions
source(ssbPath(segments[["Rfile"]]), ssbenv) # source functions into environment
Oftentimes, a single baseline correction algorithm is not suitable for the entire spectrum. The segments to be fitted and their fit parameters are defined in whichsegment. How the segments are combined into a single spectrum is defined by stitchmethod (arguments to Stitch).
params.ssb <- list(
  whichsegment = list(
    segm1 = list(df=8),
    segm2 = list(df=6)
  ),
  stitchmethod = list(
    method="x"
  )
)
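The idea of fitting each segment with its own parameters and then joining the pieces can be sketched on synthetic data. This is a toy illustration only, not the APRLssb implementation: it assumes that df denotes the effective degrees of freedom of a smoothing spline (here base R's smooth.spline), the segment boundary at 2200 cm^-1 and the variable names are hypothetical, and stitching is reduced to simple concatenation.

```r
set.seed(42)
# Synthetic "spectrum": a smooth curved background plus noise (no absorption peaks)
wavenumber <- seq(1500, 4000, by = 5)
absorbance <- 0.02 + 5e-9 * (wavenumber - 1500)^2 +
  rnorm(length(wavenumber), sd = 5e-4)

# Hypothetical segment labels and per-segment spline degrees of freedom
segment <- ifelse(wavenumber > 2200, "segm1", "segm2")
spline_df <- c(segm1 = 8, segm2 = 6)

# Fit each segment separately and write the fitted baseline into one vector
baseline <- numeric(length(wavenumber))
for (s in names(spline_df)) {
  idx <- segment == s
  fit <- smooth.spline(wavenumber[idx], absorbance[idx], df = spline_df[[s]])
  baseline[idx] <- predict(fit, wavenumber[idx])$y
}

# Baseline corrected spectrum over the full wavenumber range
corrected <- absorbance - baseline
```

In a real spectrum the baseline would be fitted only to regions free of absorption peaks, and the segments may overlap, which is why a dedicated stitching method is needed.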
Peak fitting files. (Details can be found here.)
mpfPath <- ExpandPath(package = "APRLmpf")
maskbounds <- fromJSON(mpfPath("$extdata/maskbounds.json")) # mask bound definitions
initconstr <- fromJSON(mpfPath("$extdata/initconstr.json")) # initial values and constraints
profiles.inp <- ReadSpec(mpfPath("$extdata/profilesTJR.csv"), header=TRUE) # spectral profiles
abscoef <- fromJSON(mpfPath("$extdata/abscoef.json")) # absorption coefficient definitions
colors.fg <- fromJSON(mpfPath("$extdata/fgcolors.json")) # color definitions
The peaks to be fitted are defined in peaksequence. Additional flags can be passed to flags.
params.mpf <- list(
  peaksequence = list(
    "carbonylCO+amineNH" = list(peak = c("carbonylCO", "amineNH"), mask = "carbonylCO+amineNH"),
    "carboxylicCOH" = list(scale = "carboxylicCOH", mask = "carboxylicCOH"),
    "ammoniumNH" = list(scale = "ammoniumNH", mask = "ammoniumNH"),
    "alcoholCOH2" = list(peak = c("alcoholCOH1", "alcoholCOH2"), mask = "alcoholCOH2"),
    "alkaneCH4" = list(peak = c("alkaneCH1", "alkaneCH2", "alkaneCH3", "alkaneCH4"), mask = "alkaneCH4"),
    "alkeneCH+aromaticCH" = list(peak = c("unid1", "unid2", "unid3", "unid4", "alkeneCH", "aromaticCH"), mask = "alkeneCH+aromaticCH")
  ),
  flags = NULL # "limitcCOH"
)
For reference, we first compute step by step the result that we will later reproduce with function composition and piping.
Here we move to an example with two spectra.
bl <- FitBaseline(examp2, params.ssb$whichsegment, segments, ssbenv, params.ssb$stitchmethod)
par(mfrow=c(1, 2))
plot(examp2)
plot(bl)
As before, we can align the fixed profiles for scaling and generate the peak parameter list for all spectra prior to fitting.
profiles <- Align(profiles.inp, Wavenumbers(bl))
peakparam <- FitPeaksPrep(params.mpf$peaksequence, initconstr, maskbounds, profiles)
fp <- FitPeaks(bl, params.mpf$peaksequence, params.mpf$flags, peakparam, abscoef)
kable(fp)
Spectrum | carbonylCO.K1 | carbonylCO.K2 | carbonylCO.K3 | amineNH.K4 | amineNH.K5 | amineNH.K6 | carboxylicCOH.P1 | ammoniumNH.P1 | alcoholCOH1.K1 | alcoholCOH1.K2 | alcoholCOH1.K3 | alcoholCOH2.K4 | alcoholCOH2.K5 | alcoholCOH2.K6 | alkaneCH1.K1 | alkaneCH1.K2 | alkaneCH1.K3 | alkaneCH2.K4 | alkaneCH2.K5 | alkaneCH2.K6 | alkaneCH3.K7 | alkaneCH3.K8 | alkaneCH3.K9 | alkaneCH4.K10 | alkaneCH4.K11 | alkaneCH4.K12 | unid1.K1 | unid1.K2 | unid1.K3 | unid2.K4 | unid2.K5 | unid2.K6 | unid3.K7 | unid3.K8 | unid3.K9 | unid4.K10 | unid4.K11 | unid4.K12 | alkeneCH.K13 | alkeneCH.K14 | alkeneCH.K15 | aromaticCH.K16 | aromaticCH.K17 | aromaticCH.K18 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
OLYMX_20110720 | 0.003328451 | 1720 | 42.426 | 0.0014635558 | 1630 | 17.52125 | 0.0009633942 | 1.4903585 | 0.0023318025 | 3395.446 | 85.23663 | 0.0004806102 | 3200.000 | 106.07000 | 0.0004069318 | 2855 | 60.10400 | 0.0015844832 | 2876 | 53.033 | 0.001547315 | 2932.000 | 35.355 | 0.0007200382 | 2815.000 | 31.82 | 0.0010418436 | 3139.200 | 27.16675 | 0.001176322 | 3073.100 | 27.93100 | 0.0000000000 | 3011.3 | 33.07078 | 0.000000e+00 | 2954.7 | 16.72981 | 3.551609e-05 | 3050 | 4.9497 | 0.0000000000 | 2980 | 4.9497 |
TRCRX_20110720 | 0.003673597 | 1720 | 42.426 | 0.0005467325 | 1630 | 11.40536 | 0.0000000000 | 0.4531038 | 0.0003958191 | 3500.000 | 34.35647 | 0.0016880525 | 3368.049 | 75.68774 | 0.0008378948 | 2855 | 22.63279 | 0.0001111515 | 2886 | 14.142 | 0.002093618 | 2928.573 | 35.355 | 0.0002685722 | 2801.712 | 31.82 | 0.0009209802 | 3139.177 | 27.19679 | 0.000884007 | 3073.129 | 27.96134 | 0.0005520669 | 3008.3 | 25.90420 | 9.843879e-06 | 2957.7 | 17.43700 | 0.000000e+00 | 3050 | 4.9497 | 0.0002795912 | 2980 | 4.9497 |
par(mfrow=c(1, 2))
for(i in rownames(bl)) {
  PlotFitsTJR(bl[i,], fp[i,], profiles, colors.fg, main=i, auto=FALSE)
}
areas <- Integrate(fp, profiles)
kable(areas)
Spectrum | carboxylicCOH | ammoniumNH | alcoholCOH | alkaneCH | alkeneCH | amineNH | aromaticCH | carbonylCO | unid |
---|---|---|---|---|---|---|---|---|---|
OLYMX_20110720 | 0.6256897 | 6.129138 | 0.6259886 | 0.4664962 | 0.0004407 | 0.0642783 | 0.0000000 | 0.3539681 | 0.1533037 |
TRCRX_20110720 | 0.0000000 | 1.863401 | 0.3543465 | 0.2584375 | 0.0000000 | 0.0156305 | 0.0034689 | 0.3906732 | 0.1610213 |
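Some of these areas can be checked by hand. The sketch below assumes that each freely fitted peak is a Gaussian whose three parameters are amplitude, center, and standard deviation, in that order. This interpretation is inferred from the numbers in the tables, not taken from the APRLmpf documentation, and the function name gaussian_area is hypothetical. Under that assumption the area of a peak is amplitude × standard deviation × √(2π).

```r
# Area under a Gaussian peak a * exp(-(x - mu)^2 / (2 * sigma^2)) is a * sigma * sqrt(2 * pi).
# Assumption: K1/K4 = amplitude, K2/K5 = center, K3/K6 = standard deviation.
gaussian_area <- function(amplitude, sigma) amplitude * sigma * sqrt(2 * pi)

# Amplitudes and widths copied from the table of fitted peak parameters
carbonylCO <- c(
  OLYMX_20110720 = gaussian_area(0.003328451, 42.426),
  TRCRX_20110720 = gaussian_area(0.003673597, 42.426)
)
amineNH <- c(
  OLYMX_20110720 = gaussian_area(0.0014635558, 17.52125),
  TRCRX_20110720 = gaussian_area(0.0005467325, 11.40536)
)

round(carbonylCO, 5)
round(amineNH, 5)
```

The values agree with the carbonylCO and amineNH columns of the area table to the printed precision. Groups that are fitted by scaling a fixed profile (carboxylicCOH, ammoniumNH) cannot be checked this way.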
In principle, peak fitting can be applied directly to the output of baseline correction. However, the baseline corrected spectra (or, more precisely, their wavenumbers) are needed for two things: 1) aligning the fixed profiles and 2) fitting the peaks. We therefore define a function that uses the baseline corrected spectra both to align the input profiles to their wavenumbers and to fit the peaks.
FitPeaks2 <- function(bl, peakseq, flag, initconstr, maskbounds, profiles, abscoef=NULL) {
  # align the fixed profiles to the wavenumbers of the baseline corrected spectra
  aligned <- Align(profiles, Wavenumbers(bl))
  # generate the peak parameter list, then fit the peaks
  peakparam <- FitPeaksPrep(peakseq, initconstr, maskbounds, aligned)
  FitPeaks(bl, peakseq, flag, peakparam, abscoef)
}
An example of function composition is shown below. Function composition combines two functions \(f\) and \(g\), applied successively to an argument \(x\), into a single function \(c = f \circ g\) such that \(c(x) = f(g(x))\). In R, purrr::compose generalizes the binary operator \(\circ\) to any number of functions and, by default, applies them from right to left, so the last function listed is applied first. For this to work, each function must accept the output of the previous function as its main argument. This can be arranged by partial function application (closely related to currying), implemented in R by purrr::partial, which fixes the remaining arguments in advance. Note that the .first argument used here is deprecated in newer purrr releases in favor of an empty ... = placeholder argument.
library(purrr)
Combinedfn <- compose(
  partial(Integrate, profiles, .first=FALSE), # profiles: aligned profiles created above with Align()
  partial(FitPeaks2, params.mpf$peaksequence, params.mpf$flags, initconstr, maskbounds, profiles.inp, .first=FALSE),
  partial(FitBaseline, params.ssb$whichsegment, segments, ssbenv, params.ssb$stitchmethod, .first=FALSE)
)
areas2 <- Combinedfn(examp2)
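The mechanics of composition and partial application can be seen without any packages. The following is a minimal base R sketch on toy functions; scale_by, shift_by, and compose_all are hypothetical names introduced for illustration, and compose_all is a simplified stand-in for purrr::compose, not its actual implementation.

```r
# Toy functions standing in for processing steps (hypothetical names)
scale_by <- function(x, factor) x * factor
shift_by <- function(x, offset) x + offset

# Partial application by hand: fix every argument except the first
times_two <- function(x) scale_by(x, factor = 2)
plus_one  <- function(x) shift_by(x, offset = 1)

# Compose any number of one-argument functions, applied right to left
compose_all <- function(...) {
  fs <- rev(list(...))  # reverse so that the last function listed runs first
  function(x) Reduce(function(acc, f) f(acc), fs, x)
}

combined <- compose_all(sum, plus_one, times_two)  # sum(plus_one(times_two(x)))
result_composed <- combined(c(1, 2, 3))
result_nested   <- sum(plus_one(times_two(c(1, 2, 3))))
identical(result_composed, result_nested)
```

As in the spectra example, the functions are listed in the reverse of the order in which they are applied.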
Piping has become commonplace in R, and its syntax is perhaps more intuitive because the operations are written in the order in which they are applied, from left to right. The magrittr package implements pipes so that the return value of the previous function becomes the first argument of the next function, so no separate mechanism for partial application is required.
library(magrittr)
areas3 <- examp2 %>%
  FitBaseline(params.ssb$whichsegment, segments, ssbenv, params.ssb$stitchmethod) %>%
  FitPeaks2(params.mpf$peaksequence, params.mpf$flags, initconstr, maskbounds, profiles.inp) %>%
  Integrate(profiles) # profiles: aligned profiles created above with Align()
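The first-argument convention can be tried on toy functions. The sketch below uses the native pipe |> that is built into base R from version 4.1 onward, which passes the left-hand side as the first argument of the call on the right in the same way as %>% does in this example; scale_by and shift_by are hypothetical names introduced for illustration.

```r
# Toy functions standing in for processing steps (hypothetical names)
scale_by <- function(x, factor) x * factor
shift_by <- function(x, offset) x + offset

# Each step receives the previous result as its first argument
result_piped <- c(1, 2, 3) |>
  scale_by(factor = 2) |>
  shift_by(offset = 1) |>
  sum()

# Equivalent nested call, which has to be read from the inside out
result_nested <- sum(shift_by(scale_by(c(1, 2, 3), factor = 2), offset = 1))
identical(result_piped, result_nested)
```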
The peak areas obtained by function composition and by piping are identical to those computed step by step above.
identical(areas, areas2)
## [1] TRUE
identical(areas, areas3)
## [1] TRUE
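The check with identical succeeds because all three approaches perform the same floating-point operations in the same order. If a pipeline is refactored so that the order of arithmetic changes, results may differ in the last bits, and a comparison with a tolerance is more appropriate. A short base R illustration of the difference:

```r
a <- 0.1 + 0.2
b <- 0.3

exact_match <- identical(a, b)          # FALSE: the two doubles differ in the last bits
close_match <- isTRUE(all.equal(a, b))  # TRUE: equal within the default numerical tolerance
```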
In practice, we usually want to save the baseline corrected spectra as a separate object for visualization, so the piping shown in this example is not particularly advantageous. It nevertheless illustrates how a sequence of operations can be connected to produce interpretable results from raw spectra.
Abelson, H., Sussman, G. and Sussman, J.: Structure and interpretation of computer programs, MIT Press., 1996.
Mertz, D.: The R Programming Language, Part 2: Functional Programming and Data Explorations, [online] Available from: http://gnosis.cx/publish/programming/R2.html, 2004.
Wickham, H.: Advanced R, CRC Press., 2015.
Wickham, H. and Grolemund, G.: R for data science, O’Reilly Media., 2017.