Introduction

An object-oriented approach to programming provides convenient syntax for common operations used in the analysis of spectra. In addition, functions written with function composition or piping (similar to method chaining) in mind can make the flow of operations clearer.

Objects defined by APRLspec include:

The main object class is Spec, which is a matrix with wavenumber attributes that are preserved and modified accordingly through subsetting and similar operations. This object removes the need to pass a wavenumber vector around together with the spectra matrix and to manage their dimensions explicitly (floating-point numbers should not be used as column keys or labels). Methods for Spec can be defined to operate on a single spectrum (a row vector); examples include spectra-specific functions such as baseline correction and peak fitting. The same functions can be applied row-wise to spectra matrices through SpecApply. In addition, Spec objects can be used with functions that operate on matrices more generally (PCA, PMF, cluster analysis, etc.).

The data frame representation SpecDF is provided for use with dplyr and ggplot.

SpecDB provides an interface through which spectra residing in a relational database can be used in place of a Spec object, though the current implementation is mostly useful for saving to and extracting from an SQL database.

We implement object-oriented programming (OOP) with R’s S3 object system (Chambers and Hastie, 1992). The main idea of this style of OOP is that the function that is dispatched is determined by the class of the first argument.
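As a minimal sketch of S3 dispatch in base R, the example below defines a generic and two methods. The generic `Describe` and the toy object are hypothetical and are not part of APRLspec; they only illustrate how the class of the first argument selects the method.

```r
# Hypothetical generic for illustration; not an APRLspec function.
Describe <- function(x, ...) UseMethod("Describe")

# Method selected when the first argument has class "Spec".
Describe.Spec <- function(x, ...) {
  sprintf("Spec with %d spectra and %d wavenumbers", nrow(x), ncol(x))
}

# Fallback used for any other class.
Describe.default <- function(x, ...) "not a spectra object"

# Toy object: a plain matrix tagged with the class "Spec".
m <- matrix(0, nrow = 2, ncol = 5)
s <- structure(m, class = c("Spec", "matrix"))

Describe(s)  # dispatches to Describe.Spec
Describe(m)  # dispatches to Describe.default
```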

Since we follow Google’s R Style Guide conventions, in which user-defined function names are written in Pascal case (the first letter of each joined word is capitalized), functions in APRLspec and related packages often have names distinct from typical R functions (which are written in lower case). However, we also add class methods to built-in functions such as [ (subsetting operator), plot, and summary, and to functions in other packages such as reshape2::melt, to take advantage of common language idioms.

Additional resources on OOP in R include writings by Wickham (2015) and Mertz (2005).

Conventions:

R examples

library(APRLspec)
library(APRLssb)
library(APRLmpf)
library(RJSONIO)

I/O

Read raw spectra and select two samples for this example.

specraw <- ReadSpec(file.path("data", "IMPROVE2011_spec_raw.rds"))
example <- c("OLYMX_20110720", "TRCRX_20110720")

Spec and SpecW

The original spectra matrix is of class Spec, which inherits from matrix.

class(specraw)
## [1] "Spec"   "matrix"

We can create a SpecW version of the spectra matrix, which inherits from both Spec and matrix. This means that methods not defined specifically for SpecW will use those defined for Spec.

specrawW <- as.SpecW(specraw)
class(specrawW)
## [1] "SpecW"  "Spec"   "matrix"
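The fallback from SpecW to Spec methods follows from the order of the class vector. The sketch below uses hypothetical generics (`Label`, `Units`) and a toy object, not the package's own methods, to show which method is chosen when a SpecW-specific method does or does not exist.

```r
# Hypothetical generics and methods, for illustration only.
Label <- function(x, ...) UseMethod("Label")
Label.Spec <- function(x, ...) "method for Spec"

Units <- function(x, ...) UseMethod("Units")
Units.Spec  <- function(x, ...) "Spec units"
Units.SpecW <- function(x, ...) "SpecW units"

# Toy object carrying the same class vector as in the text.
x <- structure(matrix(0, 2, 3), class = c("SpecW", "Spec", "matrix"))

Label(x)             # no Label.SpecW exists, so Label.Spec is used
Units(x)             # Units.SpecW exists and takes precedence
inherits(x, "Spec")  # class membership test along the inheritance chain
```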

Attributes (wavenumbers)

When we compare object attributes, we see that they differ in the attributes "wavenum" and "wscale", which designate how wavenumber information is stored.

names(attributes(specraw))
## [1] "dim"      "dimnames" "class"    "wavenum"
names(attributes(specrawW))
## [1] "dim"      "dimnames" "class"    "wscale"

"wavenum" is the vector of wavenumbers. "wscale" contains only the range of wavenumbers; the interval is determined by the number of columns of the SpecW spectra matrix, so the range and the column count together define a unique vector of wavenumbers. This convention (inspired by the Wave object and its “x scale” attribute in Igor Pro; Wavemetrics, Inc., 2018) takes advantage of the fact that spectral values are evenly spaced in wavenumber.

head(attributes(specraw)$wavenum)
##       W1       W2       W3       W4       W5       W6 
## 3998.423 3997.138 3995.852 3994.566 3993.281 3991.995
attributes(specrawW)$wscale
## [1] 3998.423  420.413
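To make the "wscale" idea concrete, the following base R sketch regenerates an evenly spaced wavenumber vector from a range and a column count. The numbers are toy values chosen for illustration, not those of the example data set.

```r
# Toy values; in practice the range and column count come from the SpecW object.
wscale <- c(4000, 3990)  # first and last wavenumber
ncols  <- 6              # number of columns in the spectra matrix

# Evenly spaced vector defined by the range and the number of points.
w <- seq(wscale[1], wscale[2], length.out = ncols)
names(w) <- paste0("W", seq_len(ncols))
w
```

Only two numbers need to be stored, and the full vector can be rebuilt whenever it is required.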

This is internally handled by the Wavenumbers method, which is defined separately for the Spec and SpecW objects.

getS3method("Wavenumbers", "Spec")
## function (x) 
## attr(x, ATTR.WAVENUMBERS)
## <bytecode: 0x7fbcb3d2b708>
## <environment: namespace:APRLspec>
getS3method("Wavenumbers", "SpecW")
## function (x) 
## {
##     w <- attr(x, ATTR.WSCALE)
##     setNames(seq(w[1], w[2], , dim(x)[2]), dimnames(x)[[2]])
## }
## <bytecode: 0x7fbcaf5628e8>
## <environment: namespace:APRLspec>
w <- Wavenumbers(specraw)
dim(specraw[,w > 1500])             # is possible
## [1]  744 1944
dim(specraw[,w > 1300 | w < 1000])  # is possible
## [1]  744 2550
w <- Wavenumbers(specrawW)
dim(specrawW[,w > 1500])            # is possible
## [1]  744 1944
dim(specrawW[,w > 1300 | w < 1000]) # is not possible
## Error in `Wavenumbers<-.SpecW`(`*tmp*`, value = Wavenumbers(x)[dimnames(xnew)[[2]]]): [USER ERROR]: Wavenumbers are not contiguous.

In the latter case, it may be better to mask the undesired region with NAs.

specrawW[,!(w > 1300 | w < 1000)] <- NA 

Or convert to Spec.

names(attributes(as.Spec(specrawW)))
## [1] "dim"      "dimnames" "class"    "wavenum"

SpecW is more predictable in that it is consistent with the recorded spectra. Spec can be more generally useful for chemometric analyses where non-contiguous wavenumbers are selected for analysis.
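The contiguity requirement for SpecW can be illustrated with a small check on a logical column mask. The helper `IsContiguous` is hypothetical and is only an assumption about the kind of test involved; it is not the check implemented in APRLspec.

```r
# Hypothetical helper: TRUE if the selected columns form one unbroken run.
IsContiguous <- function(mask) {
  idx <- which(mask)
  length(idx) > 0 && all(diff(idx) == 1)
}

w <- seq(4000, 500, by = -500)  # toy wavenumber axis: 4000, 3500, ..., 500

IsContiguous(w > 1500)             # one block at the high-wavenumber end
IsContiguous(w > 1300 | w < 1000)  # two blocks separated by a gap
```

A selection that passes this kind of check can still be described by a range and a column count; one that fails needs an explicit wavenumber vector, as in Spec.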

Both spectra objects can be converted to matrices using as.matrix. Attributes associated with wavenumbers will be lost.

names(attributes(as.matrix(specraw)))
## [1] "dim"      "dimnames"
names(attributes(as.matrix(specrawW)))
## [1] "dim"      "dimnames"

Subsetting

The next part is common to Spec and SpecW. Selecting a single row (sample) or column (wavenumber) does not return a vector (as it would with a matrix) but another Spec object that keeps both dimensions, for example one with a single row.

dim(specraw[example[1],1:6])
## [1] 1 6

In contrast,

dim(as.matrix(specraw)[example[1],1:6])
## NULL

Alternatively, adding the drop=TRUE argument to the subsetting operator [ will return a vector if only one row or column is selected.

out  <- specraw[example[1],1:6,drop=TRUE]
class(out)
## [1] "numeric"
dim(out)
## NULL
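One way to obtain this behaviour in S3 is a `[` method whose drop argument defaults to FALSE and which subsets the wavenumber attribute along with the columns. The class "MiniSpec" below is a hypothetical stand-in written for this sketch; it is not the APRLspec implementation and handles only the x[i, j] form.

```r
# "MiniSpec" is a hypothetical stand-in class, not the APRLspec implementation.
MiniSpec <- function(x, wavenum) {
  structure(x, wavenum = wavenum, class = c("MiniSpec", "matrix"))
}

`[.MiniSpec` <- function(x, i, j, ..., drop = FALSE) {
  w <- attr(x, "wavenum")
  out <- unclass(x)[i, j, drop = drop]  # ordinary matrix subsetting
  if (is.matrix(out)) {
    # Subset the wavenumbers by the same column index and restore the class.
    out <- MiniSpec(out, if (missing(j)) w else w[j])
  }
  out
}

w <- c(W1 = 4000, W2 = 3998, W3 = 3996)
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), names(w)))
s <- MiniSpec(m, w)

dim(s["a", 1:2])               # stays two-dimensional
attr(s["a", 1:2], "wavenum")   # wavenumbers follow the selected columns
dim(s["a", 1:2, drop = TRUE])  # NULL, as with a plain matrix
```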

Building Spec object from spectrum functions

Functions to apply to a single spectrum are typically written and tested to accept wavenumbers and absorbances and return transformed absorbances or another arbitrary output (vector, or list). “X-Y data” is common in spectroscopy and other fields; writing a function to accept two vectors (e.g., x and y) does not require the user to know the details of the Spec or SpecW objects to implement new operations.

Detrend <- function(x, y, interval=c(3700, 2200)) {
  ## x = wavenumbers
  ## y = absorbances
  i <- apply(abs(outer(interval, x, `-`)), 1, which.min)
  line <- approx(x[i], y[i], x)$y
  y - line # numeric vector
}

Using the wrapper SpecFUN, such functions can be converted to another function that accepts a single Spec sample. The resulting function can then be applied to multiple spectra using SpecApply. The output in this case is a list, which can be combined (using rbind) to form another Spec object. Here is an example:

specmod <- Spec(do.call(rbind, SpecApply(specraw[example,], SpecFUN(Detrend))), Wavenumbers(specraw))
par(mfrow=c(2, 1))
plot(specraw[example,], xlim=eval(formals(Detrend)$interval))
plot(specmod, xlim=eval(formals(Detrend)$interval))
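The same pattern can be sketched in base R without the package, which may help when testing an x-y function on its own. The helper `ApplyXY` and the synthetic spectra are hypothetical; this is not how SpecFUN or SpecApply are implemented, only an illustration of applying an x-y function row-wise and binding the results.

```r
# Same x-y function as in the text above.
Detrend <- function(x, y, interval = c(3700, 2200)) {
  i <- apply(abs(outer(interval, x, `-`)), 1, which.min)
  line <- approx(x[i], y[i], x)$y
  y - line
}

# Hypothetical helper: apply an x-y function to each row of a plain matrix.
ApplyXY <- function(mat, w, f, ...) {
  rows <- lapply(seq_len(nrow(mat)), function(k) f(w, mat[k, ], ...))
  out <- do.call(rbind, rows)
  dimnames(out) <- dimnames(mat)
  out
}

# Synthetic spectra: two straight lines, which Detrend should flatten.
w <- seq(4000, 2000, by = -100)
mat <- rbind(a = 0.001 * w + 1, b = -0.002 * w + 3)
res <- ApplyXY(mat, w, Detrend)
range(res, na.rm = TRUE)  # values outside the interval are NA
```

Because approx does not extrapolate by default, points outside the chosen interval come back as NA.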

SpecDF illustration

Create a data frame from the spectra matrix.

library(reshape2)
library(knitr)  # provides kable()
wf <- specraw[example,]
lf <- melt(wf)
kable(head(lf))
sample          iw  spec       wavenum
OLYMX_20110720  W1  0.5094895  3998.423
TRCRX_20110720  W1  0.6247311  3998.423
OLYMX_20110720  W2  0.5092130  3997.138
TRCRX_20110720  W2  0.6244755  3997.138
OLYMX_20110720  W3  0.5089356  3995.852
TRCRX_20110720  W3  0.6242157  3995.852

ggplot is useful for plotting when additional attributes are added (e.g., spectra type based on sample label). Here we show grouping by the sample label only.

library(ggplot2)
ggplot(lf) +
  geom_line(aes(wavenum, spec, color=sample)) +
  scale_x_reverse() +
  labs(x = expression(Wavenumber~(cm^-1)), y = "Absorbance") +
  theme_bw()

We can convert the data frame back to the original Spec matrix.

wfc <- scast(lf)
plot(wfc)

identical(wf, wfc)
## [1] TRUE
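The wide-to-long round trip can also be sketched in base R on a toy matrix. This is only an illustration of the reshaping idea under assumed column names (sample, iw, spec); it does not use or reproduce the melt and scast methods of the package.

```r
# Toy wide matrix: samples in rows, wavenumber labels in columns.
wide <- matrix(c(0.51, 0.62, 0.50, 0.61), nrow = 2,
               dimnames = list(c("s1", "s2"), c("W1", "W2")))

# Wide to long: one row per (sample, wavenumber label) pair.
long <- data.frame(
  sample = rownames(wide)[row(wide)],
  iw     = colnames(wide)[col(wide)],
  spec   = as.vector(wide),
  stringsAsFactors = FALSE
)

# Long to wide: fill an empty matrix by (row name, column name) pairs.
back <- matrix(NA_real_, nrow = nrow(wide), ncol = ncol(wide),
               dimnames = dimnames(wide))
back[cbind(long$sample, long$iw)] <- long$spec

identical(wide, back)
```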

SpecDB illustration

References

Chambers, J. M. and Hastie, T. J., Eds.: Statistical Models in S, Wadsworth & Brooks/Cole, Pacific Grove, CA, 1992.

Mertz, D.: The R programming language, part 3: Reusable and object oriented programming, [online] Available from: http://gnosis.cx/publish/programming/R3.html, 2005.

Wavemetrics, Inc.: Volume II - User’s Guide Part 1, [online] Available from: http://www.wavemetrics.net/doc/igorman/II-05%20Waves.pdf, 2018.

Wickham, H.: Advanced R, CRC Press, 2015.