\documentclass[9pt]{article} \usepackage{spconf,amsfonts,amsmath,graphicx} \usepackage[english]{babel} \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} %\usepackage{cite} \usepackage{xspace} \usepackage{breqn} \usepackage{url} % definitions % -------------------- \newcommand{\bs}[1]{\boldsymbol{#1}} \newcommand{\parm}{{\theta}} \newcommand{\parms}{{\bs{\parm}}} \newcommand{\Exppdf}[1]{{\mathcal{E}}\left({#1}\right)} \newcommand{\Gammapdf}[1]{\ensuremath{\mathcal{G}}\left(#1\right)\xspace} \newcommand{\Normalpdf}[1]{\ensuremath{\mathcal{N}}\left(#1\right)\xspace} \newcommand{\NormalTpdf}[1]{\ensuremath{\mathcal{N}^T}\left(#1\right)\xspace} \newcommand{\Uniformpdf}[1]{\ensuremath{U}\left(#1\right)\xspace} \newcommand{\figdir}{../figures} \newcommand{\scsum}[2]{{\displaystyle\mathop{\textstyle\sum}_{\makebox[0pt][c] {$\scriptscriptstyle #1$}}^{\makebox[0pt][c]{$\scriptscriptstyle #2$}}}} \title{A pseudo-Voigt component model for high-resolution recovery of constituent spectra in Raman spectroscopy} \name{ Tommy S. Alstrøm$^{\star}$, Mikkel N. Schmidt$^{\star}$, Tomas Rindzevicius$^{\dagger}$, Anja Boisen$^{\dagger}$, Jan Larsen$^{\star}$ \thanks{This research was funded by the IDUN Center of Excellence supported by the Danish National Research Foundation (DNRF122) and the Velux Foundations (Grant No. 9301).}} \address{$^\star$Department of Applied Mathematics and Computer Science, Technical University of Denmark\\ Richard Petersens Plads 324, 2800 Kgs. Lyngby, Denmark \\\\ $^\dagger$Department of Micro and Nanotechnology, Technical University of Denmark \\ Ørsteds Plads 345C, DK-2800, Kgs. Lyngby, Denmark\\ } \begin{document} %\ninept % \maketitle % \begin{abstract} %1. Motivation. Why do we care? \\ Raman spectroscopy is a well-known analytical technique for identifying and analyzing chemical species. Since Raman scattering is a weak effect, surface-enhanced Raman spectroscopy (SERS) is often employed to amplify the signal. % SERS signal surface mapping is a common method for detecting trace amounts of target molecules. Since the method produce large amounts of data and, in the case of very low concentrations, low signal-to-noise (SNR) ratio, ability to extract relevant spectral features is crucial. %2. Problem formulation. What problem will we solve?\\ %3. Approach. How did we solve it?\\ We propose a pseudo-Voigt model as a constrained source separation model, %4. Results. What is the answer?\\ that is able to directly and reliably identify the Raman modes, with overall performance similar to the state of the art non-negative matrix factorization approach. %5. Conclusions. What are the implications?\\ However, the model provides better interpretation and is a step towards enabling the use of SERS in detection of trace amounts of molecules in real-life settings. \end{abstract} % \begin{keywords} Raman Spectroscopy, Non-negative matrix factorization (NMF), Bayesian Modeling, Pseudo-Voigt, Multivariate Curve Resolution (MCR) \end{keywords} % \section{Introduction} \label{sec:intro} An important physical phenomenon that can be utilized for chemical analysis via molecular ``fingerprinting'' is the so-called Raman scattering. A key issue with Raman scattering is that it is a very weak process, i.e., only 0.1\% or less incident photons on a sample are inelastically (or Raman) scattered while the rest remain elastically (or Rayleigh) scattered. One straightforward method to amplify the Raman scattering signal is using corrugated, nanosized noble metal surfaces, and the effect is known as surface-enhanced Raman scattering. SERS was discovered in 1973 \cite{Fleischmann1974} and explained by Jeanmaire \& Van Duyne in 1977 \cite{Jeanmaire1977a}. The ``new'' chapter for SERS begun in the late 1990s when researchers showed that in some cases a strong surface enhancement can even enable single molecule detection \cite{Nie1997}. Today surface-enhanced Raman scattering (SERS) is one of the most relevant and rapidly advancing spectroscopy methods for label-free and sensitive detection of target molecules \cite{Schlucker2014}. A good illustration is a record low attomolar detection of TNT and DNT explosives that few methods if any can match \cite{Hakonen2015}. Another advantage is that Raman instrumentation for SERS-based analysis is becoming better, cheaper, and portable \cite{serstech_com}. A number of challenges in using the SERS technique for quantitative molecular detection in complex bioassays still remain \cite{Prochazka2016}. One path for quantitative SERS is using SERS tags and it usually requires development of a reliable quantification methodology based on an ensemble of SERS spectra, e.g.\ SERS signal mapping \cite{YPBA13}. Mapping of SERS signals is necessary because of generally poor reproducibility and uniformity of SERS substrates, which produces significant Raman signal variations. Typically, underlying reasons for SERS signal variations are fabrication driven, i.e., it is very difficult to control accurately nanoparticle morphology and especially inter-particle separation at <10 nm distances which affect significantly electromagnetic enhancement (the so-called “hot spots”) \cite{Petryayeva2011}. Hence, SERS intensities vary between hot spots with even smallest differences in morphology and junction dimensions. Another issue is that target molecules are deposited or adsorbed at random locations on the SERS substrate surface area. Since the coverage is non uniform, significant SERS intensity variations over the substrate area are expected. At extremely low analyte concentrations (<nM) the probability to find target molecule exactly at the hot spot is extremely low, i.e., most hot spots are unoccupied and only few are expected to produce measurable SERS signals \cite{Palla2015a}. And finally, each SERS substrate type coupled with a target molecule display unique characteristics in terms of surface coverage, electromagnetic enhancement factors, distribution of hot spots, etc. Therefore, the statistical interpretation of SERS signals is a crucial problem that needs to be addressed more rigorously. When analyzing complex Raman spectra it is important to resolve overlapping line shapes \cite{Weatherston2016}. While the true lineshape of a vibrational mode can be approximated by a Lorentzian function, the measured lineshape is often the convolution of a Lorentzian and a Gaussian function \cite{Sundius1973} due to the instrument distortion. Therefore, to accurately determine line widths and peak positions, Raman spectra are fitted using a Voigt function, which is a convolution of a Lorentzian with a Gaussian line shape. Herein, we developed a method to automatically analyze large-area Raman or SERS signal maps for more reliable and reproducible identification of Raman modes (analyte molecules). The model is compared to the current state of the art approaches of extracting spectra, which are multivariate factor models with non-negativity constraints. \section{Relation to prior work} Blind source separation techniques such as NMF has been used for spectral data on numerous occasions \cite{Sajda2003,Alstrom2014,Li2005}. In the field of general spectroscopy the method of Multivariate Curve Resolution (MCR) using non-negativity constraints has also been applied~\cite{Tauler1993}. The NMF and MCR models are identical but the inference algorithms differ. MCR uses principal component analysis (PCA) as initialization and the model is then estimated using alternating least squares. NMF can also be learned in a number of ways, and was demonstrated for extracting facial features by Lee and Seung \cite{LeSu99}. In the case of very low signal to noise ratios, blind source separation algorithms fail to recover meaningful and interpretable spectral components. Solutions involve the application of additional constraints on the search for basis vectors \cite{Li2007}, or to include some prelearned basis vectors in the model \cite{Alstrom2014}. Another approach is to fit a power-law distribution at a certain wavenumber \cite{Palla2015a,YPBA13}. However, this method requires knowledge of the exact location of the Raman mode, and in addition assumes that the data is free from outliers. The multivariate approach of NMF/MCR does not suffer from these downsides, so a combination of the two approaches might be ideal. The proposed pseudo-Voigt model is such a combination. \section{Methods} Let $X$ denote the observed data: An array of positive real numbers, where $X_{w,n}$ is the measured amplitude at wave number $w$ in the observed spectrum $n$. We denote the number of observed wavelengths by $W$ and the number of observed spectra by $N$. We will compare two approaches for modeling such spectra: 1) A non-negative matrix factorization approach in which the spectra are modeled as scaled sums of $K$ basis spectra which are learned from the observed data, and 2) a related approach where the basis spectra are constrained to have a pseudo-Voigt shape. \subsection{Non-negative matrix factorization} The non-negative matrix factorization approach to modeling spectroscopy data assumes that each observed spectrum can be decomposed as a weighted sum of constituent basis spectra \cite{Sajda2003,Alstrom2014,Li2005}. Using a Normal observation noise model \begin{equation} X_{w,n}|S,A,\tau \sim \text{N}([SA]_{w,n},\tau^{-1}), \end{equation} where $\tau$ is the noise precision, the data array is modeled as product of two matrices, $S$ and $A$, where the $K$ columns of $S$ are the basis spectra, and each column of $A$ contain their amplitudes for a given observation spectrum. Following \cite{ScWH09}, we assume a standard conjugate Gamma prior for the noise precision, \begin{align} \tau \sim \text{Gamma}(a_{\tau_0},b_{\tau_0}), \end{align} with shape $a_{\tau_0}$ and rate $b_{\tau_0}$. We use independent exponential priors for both the spectra and their amplitudes, \begin{align} A_{k,n} \stackrel{i.i.d}{\sim}\text{Exp}(A_0), \quad S_{k,n} \stackrel{i.i.d}{\sim}\text{Exp}(S_0), \end{align} with rates $A_0$ and $S_0$ respectively. % These priors reflect the constraints that the spectra and their amplitudes are non-negative and corresponds to a Bayesian sparse NMF model. \subsection{Pseudo-Voigt spectral peak model} As in the NMF approach, we will assume an independent Normal noise model, \begin{equation} X_{w,n}|I_{w,n},\tau \sim \text{N}(I_{w,n},\tau^{-1}), \end{equation} with precision $\tau$ and mean $I_{w,n}$ modeled as a sum of $K$ pseudo-Voigt shaped component plus a background component, \begin{equation} I_{w,n} = \sum_{k=1}^K \alpha_{k,n} V_w(c_{k},\gamma_k,\eta_k) + \beta_n B_w. \end{equation} Here, $V$ is a pseudo-Voigt shaped component centered at wave number $w$ with parameters $c$, $\gamma$, and $\eta$; $\alpha_{k,n}$ is a component and observation specific amplitude; $B_w$ is a common background spectrum; and $\beta_n$ is a observation specific background amplitude. The pseudo-Voigt \cite{Wertheim1974} profile is given by \begin{equation} V_w(c,\gamma,\eta) = \eta L_w(c, \gamma) + (1-\eta)G_w(c, \gamma), \end{equation} where $L_w = \frac{\gamma\pi^{-1}}{(w-c)^2+\gamma^2}$ and $G_w = \frac{1}{\sqrt{2\pi}\gamma}\exp\Big(\frac{-(w-c)^2}{2\gamma^2}\Big)$. As in the NMF approach, we assume a standard conjugate Gamma prior for the noise precision. As priors for the pseudo-Voigt profile parameters we use the following: \begin{align} \eta_k &\stackrel{i.i.d}{\sim}\text{Unif}(0,1), & \alpha_{k,n} &\stackrel{i.i.d}{\sim}\text{Exp}(\alpha_0),\\ c_k &\stackrel{i.i.d}{\sim} \text{N}^+(\mu_{c_0},\tau^{-1}_{c_0}), & \gamma_k &\stackrel{i.i.d}{\sim} \text{N}^+(\mu_{\gamma_0},\tau^{-1}_{\gamma_0}), \end{align} where $\text{N}^+$ denotes the Normal distribution truncated to the positive reals. Here $\eta_k$ is given an vague uniform prior, $\alpha_{k,n}$ is exponential to reflect the anticipation that a few observed spectra might contain ``hot spots'' with high amplitudes, and the $c_k$ and $\gamma_k$ parameters are given fairly flexible truncated Normal priors, reflecting that the parameters must be positive while making it possible to choose their hyperparameters to reflect locations and widths of the spectral peaks, or in lack of prior knowledge to set them as vague. The prior for the background component is chosen as \begin{align} \beta_n &\stackrel{i.i.d}{\sim} \text{N}^+(\mu_{\beta_0},\tau^{-1}_{\beta_0}), & B_w &\stackrel{i.i.d}{\sim} \text{N}^+(\mu_{B_0},\tau^{-1}_{B_0}), \end{align} allowing the hyperparameters to be chosen in accordance with prior knowledge. \subsection{Inference} The sparse NMF model is learned using a Gibbs sampler as specified in~\cite{ScWH09}. Following the same approach, we develop a similar algorithm for learning the pseudo-Voigt model. The conditional distributions for the parameters with conjugate priors ($\bs{\alpha}$, $\bs{\beta}$, $\bs{B}$ and $\tau$) are sampled using Gibbs sampling. First define $V_{w,k}=V(w,c_k,\gamma_k,\eta_k)$. The conditional distribution for the weights of the pseudo-Voigt functions is proportional to the product a normal distribution and an exponential distribution and the resulting distribution is then a truncated Normal distribution limited to positive values with center and scale \begin{dgroup} \begin{dmath} \alpha_{k,n}\lvert\bs{X},\bs{\alpha}_{\setminus{k,n}},\bs{\beta},\bs{B},\tau \sim \text{N}^+(\mu_{\alpha_{k,n}},\tau_{\alpha_{k,n}}^{-1}) \end{dmath} \begin{dmath} \tau_{\alpha_{k,n}}^{-1}=\tau\sum_{w=1}^{W}V_{w,k}^2 \end{dmath} \begin{dmath} \mu_{\alpha_{k,n}}=\frac{\!\!\!\tau\scsum{w=1}{W} V_{w,k}\!\!\left(\!\!X_{w,n}\!\!-\!b_nB_w\!\!-\!\!\scsum{k'\neq k}{K}a_{k',n}V_{w,k'}\right)\!\!-\!\alpha_0\!\!\!\!} {\tau\scsum{w=1}{W}V_{w,k}^2} \end{dmath} \end{dgroup} The conditional distributions of the background signals are on a similar account also truncated Normals, \begin{dgroup} \begin{dmath} \beta_{n}\lvert\bs{X},\bs{\alpha},\bs{\beta}_{\setminus{n}},\bs{B},\tau \sim \text{N}^+(\mu_{\beta_n},\tau_{\beta_n}^{-1}) \end{dmath} \begin{dmath} \tau_{\beta_n}^{-1}=\tau_ {\beta_0}+\tau\sum_{w=1}^{W}B_{w}^2 \end{dmath} \begin{dmath} \mu_{\beta_n}=\frac{\tau\scsum{w=1}{W}B_w\left( X_{w,n}-\scsum{k=1}{K}a_{k,n}V_{w,k}\right)+\tau_{\beta_0}\mu_{\beta_0}} {\tau_{\beta_0}+\tau\scsum{w=1}{W}B_{w}^2} \end{dmath} \end{dgroup} The conditional distribution of $B_w$ is almost identical due to symmetry in the likelihood. For the precision we get the standard result \begin{dgroup} \begin{dmath} \tau\lvert\bs{X},\bs{\alpha},\bs{\beta},\bs{B} \sim \text{Gamma}(a_{\tau},b_{\tau}) \end{dmath} \begin{dmath} a_{\tau}=\frac{WN}{2}+a_{\tau_0} \end{dmath} \begin{dmath} b_{\tau}=b_{\tau_0}+\frac{1}{2}\sum_{w=1}^{W}\sum_{n=1}^{N}(X_{w,n}-I_{w_n})^2 \end{dmath} \end{dgroup} The pseudo-Voigt parameters are sampled using Metropolis Hastings. Expanded derivations and implementation details can be found in the supplementary material \section{Results and discussion} \begin{figure}[tb] \begin{minipage}[b]{1.0\linewidth} \centering \centerline{\includegraphics[width=8.5cm]{\figdir/simulated_data.pdf}} % \vspace{2.0cm} \centerline{(a) Data generated by simulation.}\medskip \end{minipage} \begin{minipage}[b]{1.0\linewidth} \centering \centerline{\includegraphics[width=8.5cm]{\figdir/simulated_spectrum.pdf}} % \vspace{2.0cm} \centerline{(b) Example spectrum with a SNR=2.7.}\medskip \end{minipage} \caption{Data generated by simulation. The blue line is an example spectrum with added noise and the dotted red is the true underlying spectrum.} \label{fig:simulated} \end{figure} The algorithm is tested on simulated data where the true parameters are known. The data is generated using a fixed pseudo-Voigt profile added to a randomly generated a smooth background added to zero mean Gaussian distributed noise with precision equal to one. We generated a map of 625 measured spectra at 300 wavenumbers. We created the weights for the pseudo-Voigt profiles by simulating hot-spot behavior. A single hotspot was randomly placed on a grid and the weights of this hotspot were calculated using a 2-d Gaussian curve. At the maximum peak location the SNR is 2.7. The generated data is shown in Fig.~\ref{fig:simulated}. \begin{figure}[t!] \centering \includegraphics[width=8.5cm]{\figdir/simulated_recovered_spectrum.pdf} \caption{Spectrum as identified by the pseudo-Voigt model (blue) and the NMF model (red) compared to the true underlying signal (dotted orange).} \label{fig:recovered_spectrum} \end{figure} \begin{figure}[t!] \centering \includegraphics[width=8.5cm]{\figdir/simulated_background.pdf} \caption{Background signal recovered by the pseudo-Voigt model (blue) and the NMF model (red) compared to the true underlying background signal (dotted orange).} \label{fig:recovered_background} \end{figure} \begin{figure}[t!] \centering \includegraphics[width=8.5cm]{\figdir/simulated_loadings.pdf} \caption{Intensities as identified by the pseudo-Voigt model (blue) and the NMF model (red) compared to the true intensities (dotted orange).} \label{fig:recovered_loadings} \end{figure} The pseudo-Voigt model is able to recover the exact center of the Raman peak, whereas NMF recovers a noisy spectrum with the highest intensity located very close to the correct Raman peak (Fig.~\ref{fig:recovered_spectrum}). The most important attribute of a Raman peak is the location of the center as it can be directly mapped to a specific molecular binding, and thus it is essential to extract the center of a Raman mode reliably. Recovery of the background signal is displayed in Fig. \ref{fig:recovered_background} where the peak clearly interferes with the background signal estimate. The pseudo-Voigt model overestimates the background signal in the region of the Raman peak whereas the NMF model underestimates the background signal here. This is further reflected in the recovery of the Raman peak intensities which are displayed in Fig. \ref{fig:recovered_loadings}. The presence of a molecule is in essence determined by setting a threshold on the intensity. The threshold should ideally be inferred from data but setting it to one gives similar performance for both methods. In the presence of mixtures of peaks with overlap, the posterior densities of both models become multimodal. Determining the hyperparameters is thus essential to obtain the solution which makes physical sense. This can easily and readily be achieved in the pseudo-Voigt model, as the experimenter can intuitively select potential areas of Raman peaks. The algorithm had similar performance for mixtures ($K=3$, not shown) as for a single Raman peak, and was able to recover the centers of Raman peaks. \section{Conclusion} The pseudo-Voigt approach works well for the simulated data, demonstrating comparable performance to current state of the art approaches such as NMF (and MCR). The advantage of the pseudo-Voigt model is that the center of a Raman peak is directly estimated as a model parameter. In addition, the pseudo-Voigt model offers a degree of confidence that the recovered spectrum has a Voigt shape which means that a true Raman peak has been recovered. Extracting the same information from the spectrum recovered by the NMF model requires additional post-processing. Supplementary material including code and data can be found at http://archive.compute.dtu.dk/p/1. \vfill\clearpage\pagebreak \bibliographystyle{IEEEbib} \bibliography{ICASSP2017} \end{document}