Orthogonality constrained inverse regression to improve model selectivity and analyte predictions from vibrational spectroscopic measurements

Open Access
Authors
Publication date 15-11-2021
Journal Analytica Chimica Acta
Article number 339073
Volume | Issue number 1185
Number of pages 12
Organisations
  • Faculty of Science (FNWI) - Swammerdam Institute for Life Sciences (SILS)
  • Faculty of Science (FNWI) - Institute for Biodiversity and Ecosystem Dynamics (IBED)
Abstract
In analytical chemistry, spectroscopy is attractive for high-throughput quantification, which often relies on inverse regression methods such as partial least squares regression. Because spectroscopic measurements are multivariate, an analyte can be quantified in the presence of interferents. However, if the model is not fully selective against those interferents, analyte predictions may be biased. The degree of model selectivity against an interferent is defined by the inner relation between the regression vector and the pure interfering signal. If the regression vector is orthogonal to the signal, this inner relation equals zero and the model is fully selective. The degree of model selectivity depends largely on the quality of the calibration data. Strong correlations in the calibration data may degrade its quality, resulting in poorly selective models. We show this using a fructose-maltose model system. Furthermore, we modify the NIPALS algorithm to improve model selectivity when the calibration data are degraded in this way. The modification incorporates a projection matrix into the algorithm, which constrains the regression vector estimate to the null space of the known interfering signals. In this way, known interfering signals are handled explicitly, while unknown signals are accounted for by latent variables. We test the modified algorithm and compare it to the conventional NIPALS algorithm using both simulated and industrial process data. The industrial process data consist of mid-infrared measurements obtained on mixtures of beta-lactoglobulin (the analyte of interest) with alpha-lactalbumin and caseinoglycomacropeptide (the interfering species). For a test set, the root mean squared error of the beta-lactoglobulin predictions (% w/w) was 0.92 with the conventional NIPALS algorithm and 0.33 with the modified one. Our modification of the algorithm returns simpler models with improved selectivity and analyte predictions.
This paper shows how known interfering signals may be utilized in a direct fashion, while benefitting from a latent variable approach. The modified algorithm can be viewed as a fusion between ordinary least squares regression and partial least squares regression and may be very useful when knowledge of some (but not all) interfering species is available.
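The null-space constraint described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' published algorithm, only one plausible way to place a projection matrix inside a single-response NIPALS loop: each weight vector is projected with P = I - K (K^T K)^{-1} K^T, where the columns of K are the known pure interfering signals, so the resulting regression vector b satisfies K^T b = 0. The function names (nullspace_projector, constrained_pls1), the single-response setup, and the synthetic Gaussian "spectra" are all assumptions made for illustration.

```python
import numpy as np


def nullspace_projector(K):
    """Projector onto the null space of K^T: P = I - K (K^T K)^{-1} K^T.

    K has shape (n_variables, n_interferents); its columns are the known
    pure interfering signals. pinv(K) equals (K^T K)^{-1} K^T when K has
    full column rank and still works when the columns are collinear.
    """
    return np.eye(K.shape[0]) - K @ np.linalg.pinv(K)


def constrained_pls1(X, y, K, n_components):
    """Hypothetical single-response NIPALS with projected weight vectors."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean           # centered working copies
    P = nullspace_projector(K)
    W, L, q = [], [], []
    for _ in range(n_components):
        w = P @ (E.T @ f)                   # weight, forced orthogonal to K
        w /= np.linalg.norm(w)
        t = E @ w                           # scores
        tt = t @ t
        p = E.T @ t / tt                    # X loadings
        c = f @ t / tt                      # y loading (scalar)
        E = E - np.outer(t, p)              # deflate X
        f = f - c * t                       # deflate y
        W.append(w)
        L.append(p)
        q.append(c)
    W, L, q = np.array(W).T, np.array(L).T, np.array(q)
    # b lies in the span of W, hence in the null space of K^T.
    b = W @ np.linalg.solve(L.T @ W, q)
    return b, x_mean, y_mean


# Synthetic, noise-free illustration (made-up signals, not the paper's data).
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 60)
analyte = np.exp(-((axis - 0.45) / 0.08) ** 2)      # pure analyte signal
interferent = np.exp(-((axis - 0.55) / 0.08) ** 2)  # overlapping known interferent
c_analyte = rng.uniform(1.0, 5.0, size=30)
c_interf = 0.9 * c_analyte + rng.normal(0.0, 0.1, size=30)  # strongly correlated
X = np.outer(c_analyte, analyte) + np.outer(c_interf, interferent)
K = interferent[:, None]

b, x_mean, y_mean = constrained_pls1(X, c_analyte, K, n_components=1)
selectivity_check = K.T @ b                 # inner product with the interferent
```

In this sketch, a new measurement x would be predicted as (x - x_mean) @ b + y_mean. Because every weight vector is projected, the inner product between b and each column of K is zero up to floating-point error, regardless of how strongly the interferent concentration correlates with the analyte in the calibration set.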
Document type Article
Language English
Published at https://doi.org/10.1016/j.aca.2021.339073
Downloads
1-s2.0-S0003267021008990-main (Final published version)