Title:
An improved successive projections algorithm version to variable selection in multiple linear regression
Author(s):
CANOVA, LUCIANA DOS SANTOS; VALLESE, FEDERICO DANILO; PISTONESI, MARCELO FABIAN; DE ARAÚJO GOMES, ADRIANO
Journal:
ANALYTICA CHIMICA ACTA
Publisher:
ELSEVIER SCIENCE BV
References:
Year: 2023, vol. 1274
ISSN:
0003-2670
Abstract:
The successive projections algorithm (SPA) aims to improve the accuracy of multiple linear regression (MLR) models by minimizing the effects of collinearity in the calibration data set. Combining SPA with MLR as a variable selection approach yields the SPA-MLR method, which has been reported in the literature to produce models whose prediction ability is, in some cases, good compared with conventional full-spectrum models obtained by partial least squares (PLS). This paper proposes adding a filter step to the current version of SPA to reduce the number of uninformative variables before the projection phase and to help the algorithm select the best variables in subsequent steps. The proposed fSPA-MLR algorithm is evaluated in two case studies involving the near-infrared spectrometric analysis of pharmaceutical tablets and diesel/biodiesel mixtures. The fSPA-MLR models perform similarly to or better than PLS. Moreover, they outperform the original SPA-MLR in both cross-validation and external prediction. These gains hold regardless of the pre-processing tested: first-derivative Savitzky-Golay (SG), Standard Normal Variate (SNV), or no pre-processing (raw spectra).
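To make the two phases named in the abstract concrete, here is a minimal pure-Python sketch of a filter step followed by an SPA-style projection chain. It is illustrative only and is not the authors' implementation. The abstract does not say which criterion the filter uses, so `filter_by_correlation` (ranking variables by absolute Pearson correlation with the response) is a hypothetical stand-in, and both function names are assumptions introduced here. The projection phase follows the commonly described SPA idea: starting from one variable, repeatedly project the remaining columns onto the subspace orthogonal to the last selected column and pick the one with the largest remaining norm, which tends to avoid collinear variables.

```python
import math


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def filter_by_correlation(columns, y, keep):
    """Hypothetical filter step (an assumption, not the paper's criterion).

    Returns the sorted indices of the `keep` columns with the highest
    absolute Pearson correlation with the response `y`.
    """
    y_mean = sum(y) / len(y)
    y_dev = [v - y_mean for v in y]
    scores = []
    for j, col in enumerate(columns):
        c_mean = sum(col) / len(col)
        c_dev = [v - c_mean for v in col]
        denom = math.sqrt(_dot(c_dev, c_dev)) * math.sqrt(_dot(y_dev, y_dev))
        # Constant columns carry no information; score them as zero.
        corr = _dot(c_dev, y_dev) / denom if denom > 0 else 0.0
        scores.append((abs(corr), j))
    scores.sort(key=lambda s: -s[0])
    return sorted(j for _, j in scores[:keep])


def spa_chain(columns, start, n_select):
    """SPA-style projection phase on a list of column vectors.

    Starting from column `start`, selects `n_select` column indices by
    successive orthogonal projections.
    """
    cols = [list(c) for c in columns]  # working copies, modified in place
    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_sq = _dot(ref, ref)
        best, best_norm = None, -1.0
        for j, col in enumerate(cols):
            if j in selected:
                continue
            if ref_sq > 0:
                # Remove the component of this column along the reference.
                coef = _dot(col, ref) / ref_sq
                cols[j] = [c - coef * r for c, r in zip(col, ref)]
            norm = _dot(cols[j], cols[j])
            if norm > best_norm:
                best, best_norm = j, norm
        if best is None:  # no candidates left
            break
        selected.append(best)
    return selected
```

In a full workflow one would typically apply the filter first, run the projection chain on the surviving columns from several starting variables, and choose the final subset by the validation error of the resulting MLR models. Those later steps are omitted here.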