Parallel factor analysis (PARAFAC) of target analytes in GC x GC-TOFMS data: automated selection of a model with an appropriate number of factors

Anal Chem. 2007 Feb 15;79(4):1611-9. doi: 10.1021/ac061710b.

Abstract

PARAFAC (parallel factor analysis) is a powerful chemometric method that has been demonstrated as a useful deconvolution technique for data obtained using comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC x GC-TOFMS). However, selecting a PARAFAC model with an appropriate number of factors can be challenging, especially at low S/N or for analytes in the presence of chromatographically and spectrally overlapping compounds (interferences). Herein, we present a method for the automated selection of a PARAFAC model with an appropriate number of factors in GC x GC-TOFMS data, demonstrated for a target analyte of interest. The methodology is as follows. PARAFAC models are automatically generated with an incrementally higher number of factors until mass spectral matching of the corresponding loadings in the model against a target analyte mass spectrum indicates that overfitting has occurred. The model selected is then simply the one with one fewer factor than the overfit model. Results indicate this model selection approach is viable across the detection range of the instrument, from overloaded analyte signal down to low S/N analyte signal (total ion current signal intensity at analyte peak maximum S/N < 1). While the methodology is generally applicable to comprehensive two-dimensional separations using multichannel spectral detection, we evaluated it with several target analytes using GC x GC-TOFMS. For brevity in this report, only results for bromobenzene as target analyte are presented. Alternatively, instead of using the model with one fewer factor than the overfit model, one can select the model with the highest mass spectral match for the target analyte from among all the models generated (excluding the overfit model). Both model selection approaches gave essentially identical results.
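
The selection loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the exact overfitting criterion or the match metric, so the sketch assumes that overfitting is flagged when two or more factors match the target spectrum above a threshold (that is, the analyte signal has split across factors), and it uses cosine similarity as the match value. The names `fit_model`, `select_model`, `threshold`, and `max_factors` are hypothetical, and the stub `fake_fit` stands in for a real PARAFAC fit with made-up spectral loadings.

```python
import numpy as np


def cosine_match(a, b):
    """Cosine similarity between two spectra (assumed match metric)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def select_model(fit_model, target_spectrum, max_factors=10, threshold=0.9):
    """Fit models with 1, 2, ... factors until overfitting is flagged.

    fit_model(n) is a hypothetical callable returning the spectral-mode
    loadings of an n-factor model as an array of shape (n_channels, n).
    """
    if max_factors < 1:
        raise ValueError("max_factors must be at least 1")
    history = []      # (n_factors, best match value) for non-overfit models
    overfit_at = None
    for n in range(1, max_factors + 1):
        loadings = np.asarray(fit_model(n), dtype=float)
        matches = [cosine_match(loadings[:, k], target_spectrum)
                   for k in range(loadings.shape[1])]
        # Assumed criterion: analyte split across two or more factors.
        if sum(m >= threshold for m in matches) >= 2:
            overfit_at = n
            break
        history.append((n, max(matches)))
    return {
        # Approach 1: one fewer factor than the overfit model
        # (or the largest model tried if no overfit was flagged).
        "selected_n": history[-1][0],
        # Approach 2: highest match among the non-overfit models.
        "best_match_n": max(history, key=lambda t: t[1])[0],
        "overfit_at": overfit_at,
    }


# Illustrative stub with made-up spectra; NOT a real PARAFAC fit.
target = np.array([0.0, 10.0, 2.0, 0.0, 5.0])
interf = np.array([8.0, 0.0, 1.0, 6.0, 0.0])
backgr = np.ones(5)


def fake_fit(n):
    if n == 1:   # analyte and interference merged into one factor
        cols = [target + interf]
    elif n == 2:  # analyte factor still slightly contaminated
        cols = [target + 0.2 * interf, interf]
    elif n == 3:  # clean separation
        cols = [target, interf, backgr]
    else:         # analyte split across two near-identical factors
        cols = [target * [1, 1.1, 0.9, 1, 1],
                target * [1, 0.9, 1.1, 1, 1], interf, backgr]
    return np.column_stack(cols)


result = select_model(fake_fit, target)
print(result)
```

In practice, `fit_model` would wrap a PARAFAC routine applied to the three-way data array (second-dimension retention time by first-dimension retention time by m/z) and return the mass spectral loadings. The threshold value would need tuning for the instrument and library spectrum in use.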