An object-oriented approach to programming provides convenient syntax for accessing common operations used in the analysis of spectra. In addition, functions written with function composition or piping (similar to method chaining) in mind can make the flow of operations clearer.
Objects defined by APRLspec include:
- `Spec` (inherits from the `matrix` class)
- `SpecW` (inherits from the `Spec` and `matrix` classes)
- `SpecDB` (database specification: db file and table names)
- `SpecDF` (inherits from the `data.frame` class)
- `PeakParam` (peak parameter representation; inherits from the `matrix` class)

The main object class is `Spec`, which is a matrix with wavenumber attributes that are preserved and modified accordingly through subsetting and similar operations. This object removes the need to pass a wavenumber vector around together with the spectra matrix and to manage their dimensions explicitly in tandem (floating-point numbers should not be used for column keys/labels). Methods for `Spec` can be defined to operate on a single spectrum (a row vector); examples include spectra-specific functions such as baseline correction and peak fitting. The same functions can be applied row-wise to spectra matrices through `SpecApply`. In addition, `Spec` objects can be used with functions that operate on matrices more generally (PCA, PMF, cluster analysis, etc.). The data frame representation `SpecDF` is provided for use with `dplyr` and `ggplot`. `SpecDB` provides an interface through which spectra residing in a relational database can be used in place of a `Spec` object, though the current implementation is mostly useful for saving to and extracting from an SQL database.
We implement object-oriented programming (OOP) with R’s S3 object classes (Chambers and Hastie, 1992). The main idea of this style of OOP is that the dispatched function is determined by the object class of the first argument. For example, `Wavenumbers` can return a vector of wavenumbers regardless of whether the data is stored as a list, matrix, or relational table by different objects.

Since we follow Google’s R Style Guide conventions, in which user-defined function names are written in Pascal case (the first letter of each joined word is capitalized), functions associated with APRLspec and related packages often have names distinct from typical R functions (which are written in lower case). However, we also add class methods to built-in functions such as `[` (subsetting operator), `plot`, and `summary`, and to functions in other packages such as `reshape2::melt`, to take advantage of common language idioms.
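The dispatch mechanism described above can be sketched in a few lines of base R. The classes `ToyVec` and `ToyRange` and the generic `GetAxis` are hypothetical names invented for this illustration; they are not part of APRLspec and only mimic the idea of one generic serving objects that store their axis differently.

```r
# Toy illustration of S3 dispatch; ToyVec, ToyRange and GetAxis are
# hypothetical names made up for this sketch (not APRLspec objects).

# Generic: dispatches on the class of its first argument.
GetAxis <- function(x, ...) UseMethod("GetAxis")

# Method for objects that store the full axis as an attribute.
GetAxis.ToyVec <- function(x, ...) attr(x, "axis")

# Method for objects that store only the end points of the axis.
GetAxis.ToyRange <- function(x, ...) {
  r <- attr(x, "range")
  seq(r[1], r[2], length.out = ncol(x))
}

# Adding a method to a built-in generic (summary) for the toy class.
summary.ToyVec <- function(object, ...) {
  cat("ToyVec with", nrow(object), "rows and", ncol(object), "columns\n")
  invisible(object)
}

m <- matrix(0, nrow = 2, ncol = 5)
a <- structure(m, axis = c(10, 20, 30, 40, 50), class = c("ToyVec", "matrix"))
b <- structure(m, range = c(10, 50), class = c("ToyRange", "matrix"))

GetAxis(a)   # full vector read from the attribute
GetAxis(b)   # same values, generated from the end points
summary(a)   # dispatches to summary.ToyVec
```

The caller uses the same function name in both cases; which method runs depends only on the class vector of the object passed in.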
Additional resources on OOP in R can be found in, among other places, writings by Wickham (2015) and Mertz (2005).
Conventions:
library(APRLspec)
library(APRLssb)
library(APRLmpf)
library(RJSONIO)
Read raw spectra and select two samples for this example.
specraw <- ReadSpec(file.path("data", "IMPROVE2011_spec_raw.rds"))
example <- c("OLYMX_20110720", "TRCRX_20110720")
The original spectra matrix is of class `Spec`, which inherits from `matrix`.
class(specraw)
## [1] "Spec" "matrix"
We can create a `SpecW` version of the spectra matrix, which inherits from both `Spec` and `matrix`. This means that methods not defined specifically for `SpecW` will use those defined for `Spec`.
specrawW <- as.SpecW(specraw)
class(specrawW)
## [1] "SpecW" "Spec" "matrix"
When we compare object attributes, we see that they differ in the attributes `"wavenum"` and `"wscale"`, which designate how wavenumber information is stored.
names(attributes(specraw))
## [1] "dim" "dimnames" "class" "wavenum"
names(attributes(specrawW))
## [1] "dim" "dimnames" "class" "wscale"
`"wavenum"` is the vector of wavenumbers. `"wscale"` contains only the range of wavenumbers (first and last values); the interval is determined by the number of columns of the `SpecW` spectra matrix, so the range and the column count together define a unique vector of wavenumbers. This convention, inspired by the Wave object and its “x scale” attribute in Igor Pro (Wavemetrics, Inc., 2018), takes advantage of the fact that spectral values are evenly spaced in wavenumber.
head(attributes(specraw)$wavenum)
## W1 W2 W3 W4 W5 W6
## 3998.423 3997.138 3995.852 3994.566 3993.281 3991.995
attributes(specrawW)$wscale
## [1] 3998.423 420.413
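As a minimal sketch of how a two-element range and a column count determine the full vector, the following base R snippet uses made-up values (not the spectra above); the variable names are chosen for illustration only.

```r
# Sketch: rebuild an evenly spaced wavenumber vector from its two end points.
# The end points and the column count are made up for illustration.
wscale <- c(4000, 3000)   # first and last wavenumber (descending order)
ncols  <- 5               # number of columns in a hypothetical spectra matrix

wavenum <- seq(wscale[1], wscale[2], length.out = ncols)
names(wavenum) <- paste0("W", seq_len(ncols))
wavenum

# The constant spacing follows from the range and the column count.
spacing <- (wscale[2] - wscale[1]) / (ncols - 1)
spacing
```

Storing only the end points is compact, but it is valid only as long as the columns remain evenly spaced.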
This is internally handled by the `Wavenumbers` method, which is defined separately for the `Spec` and `SpecW` objects.
getS3method("Wavenumbers", "Spec")
## function (x)
## attr(x, ATTR.WAVENUMBERS)
## <bytecode: 0x7fbcb3d2b708>
## <environment: namespace:APRLspec>
getS3method("Wavenumbers", "SpecW")
## function (x)
## {
## w <- attr(x, ATTR.WSCALE)
## setNames(seq(w[1], w[2], , dim(x)[2]), dimnames(x)[[2]])
## }
## <bytecode: 0x7fbcaf5628e8>
## <environment: namespace:APRLspec>
w <- Wavenumbers(specraw)
dim(specraw[,w > 1500]) # is possible
## [1] 744 1944
dim(specraw[,w > 1300 | w < 1000]) # is possible
## [1] 744 2550
w <- Wavenumbers(specrawW)
dim(specrawW[,w > 1500]) # is possible
## [1] 744 1944
dim(specrawW[,w > 1300 | w < 1000]) # is not possible
## Error in `Wavenumbers<-.SpecW`(`*tmp*`, value = Wavenumbers(x)[dimnames(xnew)[[2]]]): [USER ERROR]: Wavenumbers are not contiguous.
In the latter case, it may be better to mask the undesired region with `NA`s.
specrawW[,!(w > 1300 | w < 1000)] <- NA
Alternatively, convert to `Spec`.
names(attributes(as.Spec(specrawW)))
## [1] "dim" "dimnames" "class" "wavenum"
`SpecW` is more predictable in that it is consistent with the recorded spectra. `Spec` can be more generally useful for chemometric analyses where non-contiguous wavenumbers are selected for analysis.
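The contiguity requirement can be illustrated on plain vectors and matrices. In the sketch below, `IsContiguous` is a hypothetical helper written for this example (it is not an APRLspec function), and the wavenumbers are made up.

```r
# Sketch with made-up values; IsContiguous is a hypothetical helper.
IsContiguous <- function(mask) {
  idx <- which(mask)
  length(idx) > 0 && all(diff(idx) == 1L)
}

w <- seq(2000, 500, by = -100)       # 16 descending wavenumbers
IsContiguous(w > 1500)               # one unbroken block of columns
IsContiguous(w > 1300 | w < 1000)    # two blocks separated by a gap

# Masking keeps every column (and hence the even spacing) but blanks the gap.
x <- matrix(1, nrow = 2, ncol = length(w))
x[, !(w > 1300 | w < 1000)] <- NA
dim(x)
colSums(is.na(x))
```

A selection with a gap cannot be described by two end points and a column count, which is why a range-based representation has to reject it or fall back to masking.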
Both spectra objects can be converted to matrices using `as.matrix`. Attributes associated with wavenumbers will be lost.
names(attributes(as.matrix(specraw)))
## [1] "dim" "dimnames"
names(attributes(as.matrix(specrawW)))
## [1] "dim" "dimnames"
The next part is common to `Spec` and `SpecW`. Selecting a single row (sample) or column (wavenumber) does not return a vector (as it would with a `matrix`) but another `Spec` object with a single row or column.
dim(specraw[example[1],1:6])
## [1] 1 6
In contrast,
dim(as.matrix(specraw)[example[1],1:6])
## NULL
Alternatively, adding the `drop=TRUE` argument to the subsetting operator `[` will return a vector if only one row or column is selected.
out <- specraw[example[1],1:6,drop=TRUE]
class(out)
## [1] "numeric"
dim(out)
## NULL
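The behavior shown above (matrix shape kept by default, wavenumbers following the selected columns) can be sketched with a subsetting method for a toy class. `ToySpec` and its `"wavenum"` attribute are assumptions made for this illustration; the sketch handles only two-index subsetting and is not APRLspec's implementation.

```r
# Sketch of a `[` method for a hypothetical class "ToySpec" that stores a
# named wavenumber vector in the attribute "wavenum" (illustration only).
"[.ToySpec" <- function(x, i, j, ..., drop = FALSE) {
  if (missing(i)) i <- seq_len(nrow(x))
  if (missing(j)) j <- seq_len(ncol(x))
  w <- attr(x, "wavenum")
  out <- unclass(x)[i, j, drop = drop]
  if (is.matrix(out)) {
    # Keep the wavenumbers aligned with the selected columns.
    attr(out, "wavenum") <- w[j]
    class(out) <- class(x)
  }
  out
}

w <- c(W1 = 4000, W2 = 3900, W3 = 3800, W4 = 3700)
x <- structure(
  matrix(1:8, nrow = 2, dimnames = list(c("a", "b"), names(w))),
  wavenum = w,
  class = c("ToySpec", "matrix")
)

dim(x["a", 1:3])                # a one-row matrix, not a vector
attr(x["a", 1:3], "wavenum")    # wavenumbers follow the columns
dim(x["a", 1:3, drop = TRUE])   # dimensions dropped on request
```

Defaulting `drop` to `FALSE` is the design choice that lets single-spectrum results keep their class and wavenumber information.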
Functions to apply to a single spectrum are typically written and tested to accept wavenumbers and absorbances and to return transformed absorbances or another arbitrary output (a vector or list). “X-Y data” is common in spectroscopy and other fields; because such a function accepts two vectors (e.g., `x` and `y`), the user does not need to know the details of the `Spec` or `SpecW` objects to implement new operations.
Detrend <- function(x, y, interval=c(3700, 2200)) {
  ## x = wavenumbers
  ## y = absorbances
  i <- apply(abs(outer(interval, x, `-`)), 1, which.min)
  line <- approx(x[i], y[i], x)$y
  y - line # numeric vector
}
Using the wrapper `SpecFUN`, such functions can be converted to functions that accept a single `Spec` sample. The resulting function can then be applied to multiple spectra using `SpecApply`. The output in this case is a list, which can be combined (using `rbind`) to form another `Spec` object. Here is an example:
specmod <- Spec(do.call(rbind, SpecApply(specraw[example,], SpecFUN(Detrend))), Wavenumbers(specraw))
par(mfrow=c(2, 1))
plot(specraw[example,], xlim=eval(formals(Detrend)$interval))
plot(specmod, xlim=eval(formals(Detrend)$interval))
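The wrap-then-apply pattern can also be sketched on plain matrices with synthetic data. `ApplyXY` below is a hypothetical helper written for this illustration; it only mimics the general idea of mapping an x-y function over rows and makes no claim about how `SpecFUN` or `SpecApply` are implemented.

```r
# Sketch with plain matrices and synthetic data; ApplyXY is hypothetical.
Detrend <- function(x, y, interval = c(3700, 2200)) {
  i <- apply(abs(outer(interval, x, `-`)), 1, which.min)
  line <- approx(x[i], y[i], x)$y
  y - line
}

ApplyXY <- function(mat, x, fun, ...) {
  # Call fun(x, y, ...) for each row y, then stack the results by row.
  rows <- lapply(seq_len(nrow(mat)), function(k) fun(x, mat[k, ], ...))
  out <- do.call(rbind, rows)
  dimnames(out) <- dimnames(mat)
  out
}

x <- seq(4000, 1500, by = -100)                    # synthetic wavenumbers
mat <- rbind(s1 = 0.001 * x + 1, s2 = 0.002 * x)   # two purely linear "spectra"
res <- ApplyXY(mat, x, Detrend)

# A straight line is removed entirely inside the interval; values outside
# the interval are NA because approx() does not extrapolate by default.
inside <- x <= 3700 & x >= 2200
range(res[, inside])
all(is.na(res[, !inside]))
```

Because `Detrend` only sees two vectors, it can be tested in isolation like this before being used with spectra objects.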
Create a data frame from the spectra matrix.
library(reshape2)
wf <- specraw[example,]
lf <- melt(wf)
knitr::kable(head(lf)) # kable() is provided by the knitr package
| sample | iw | spec | wavenum |
|---|---|---|---|
| OLYMX_20110720 | W1 | 0.5094895 | 3998.423 |
| TRCRX_20110720 | W1 | 0.6247311 | 3998.423 |
| OLYMX_20110720 | W2 | 0.5092130 | 3997.138 |
| TRCRX_20110720 | W2 | 0.6244755 | 3997.138 |
| OLYMX_20110720 | W3 | 0.5089356 | 3995.852 |
| TRCRX_20110720 | W3 | 0.6242157 | 3995.852 |
`ggplot` is useful for plotting when additional attributes are added (e.g., spectra type based on sample label). Here we group by sample label only.
library(ggplot2)
ggplot(lf) +
  geom_line(aes(wavenum, spec, color=sample)) +
  scale_x_reverse() +
  labs(x = expression(Wavenumber~(cm^-1)), y = "Absorbance") +
  theme_bw()
We can convert the long-format data frame back to the original `Spec` matrix.
wfc <- scast(lf)
plot(wfc)
identical(wf, wfc)
## [1] TRUE
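The wide-to-long-to-wide round trip can be sketched in base R on a plain matrix. The data are made up, and the column names mirror the long-format table shown earlier only for readability; this does not reproduce the `melt` or `scast` methods themselves.

```r
# Base-R sketch of the wide -> long -> wide round trip (made-up data).
wavenum <- c(W1 = 4000, W2 = 3900, W3 = 3800)
wide <- matrix(c(0.5, 0.6, 0.4, 0.7, 0.3, 0.8), nrow = 2,
               dimnames = list(c("s1", "s2"), names(wavenum)))

# Wide to long: one row per (sample, wavenumber) pair, in column-major order.
ri <- as.vector(row(wide))
ci <- as.vector(col(wide))
long <- data.frame(
  sample  = rownames(wide)[ri],
  iw      = colnames(wide)[ci],
  spec    = as.vector(wide),
  wavenum = unname(wavenum[colnames(wide)[ci]]),
  stringsAsFactors = FALSE
)

# Long to wide: place each value back by its sample and column labels.
back <- matrix(NA_real_, nrow = nrow(wide), ncol = ncol(wide),
               dimnames = dimnames(wide))
back[cbind(long$sample, long$iw)] <- long$spec

identical(wide, back)
```

Keeping both the column label (`iw`) and the numeric wavenumber in the long table means the values can be placed back by label rather than by matching floating-point numbers.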
Chambers, J. M. and Hastie, T. J., Eds.: Statistical Models in S, Wadsworth & Brooks/Cole, Pacific Grove, CA, 1992.
Mertz, D.: The R programming language, part 3: Reusable and object oriented programming, [online] Available from: http://gnosis.cx/publish/programming/R3.html, 2005.
Wavemetrics, Inc.: Volume II - User’s Guide Part 1, [online] Available from: http://www.wavemetrics.net/doc/igorman/II-05%20Waves.pdf, 2018.
Wickham, H.: Advanced R, CRC Press, 2015.