RESEARCHERS
ALCARAZ Mirta Raquel
Conferences and scientific meetings
Title:
MULTIWAY DATA MODELING FOR ENHANCING CLASSIFICATION PERFORMANCE: FLUORESCENCE DATA AS CASE OF STUDY
Author(s):
AZCARATE, SILVANA M.; ZALDARRIAGA-HEREDIA, JORGELINA; ALCARAZ, MIRTA RAQUEL; CAMIÑA, JOSÉ M.; GOICOECHEA, HÉCTOR CASIMIRO
Meeting:
Conference; 18th Chemometrics in Analytical Chemistry Conference; 2022
Abstract:
In multivariate classification, there is a continuing need to improve methods for identification and characterization. For quantitative purposes, increasing the order of the data has been shown to benefit the performance of the analytical method, for example by improving selectivity and sensitivity [1]. However, the advantages of increasing the order of the data for solving classification problems have not yet been studied in depth [2]. This work explores data acquisition, feature extraction, and analysis methods for multi-way data arrays, with the aim of improving the classification of olive oils according to different criteria (variety, extraction process, and origin) as a proof of concept. For the analysis, 21 olive oil samples, including virgin olive oils (VOO) and extra virgin olive oils (EVOO) of different commercial brands, were evaluated. Third-order data were obtained by acquiring an excitation-emission matrix (EEM) over the excitation range of 300-600 nm and the emission range of 400-700 nm after different periods of infrared (IR) heating (t0 = no IR heating, original sample).
With the acquired data, different data arrays were built and subjected to several chemometric models to evaluate the properties and advantages of each data structure, as well as the performance of the modelling: (1) first-order data analysis using the emission spectrum of each sample registered at an excitation wavelength of 348 nm at t0; (2a) second-order data analysis using the EEMs acquired for each sample at t0; (2b) second-order data analysis using an emission-IR heating matrix (the emission spectra, λexc = 348 nm, acquired at different IR heating times) obtained for each sample; (2c) second-order data analysis using an excitation-IR heating matrix (the excitation spectra, λem = 450 nm, acquired at different IR heating times) obtained for each sample; (3) third-order data analysis of the emission-excitation-IR heating array obtained for each sample. Supervised pattern recognition was used to classify the studied oils, namely partial least squares discriminant analysis (PLS-DA) and its multi-way (NPLS-DA) and unfolding (UPLS-DA) extensions. Moreover, for cases (2) and (3), PARAFAC was implemented as a first decomposition model to extract features and scores that were then used for further classification analysis. Classification results were evaluated through global indices, such as average sensitivity, non-error rate, and average precision. The results revealed a high classification error rate, above 30%, when first-order data were used. Nevertheless, different degrees of improvement were observed when an additional mode was included in the data structure. These results indicate that higher-order data are an attractive approach to explore in the classification field, particularly for samples with very similar spectral profiles, for which no evident classification patterns are observed.
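The data structures (1) to (3) described above can be illustrated with a minimal NumPy sketch. Only the 21 samples, the wavelength ranges, and the 348 nm / 450 nm slices come from the abstract; the 5 nm wavelength step, the number of heating times, and the random values standing in for measured fluorescence are assumptions made purely for illustration, and `nearest_index` and `unfold` are hypothetical helper names.

```python
import numpy as np

# From the abstract: 21 samples, excitation 300-600 nm, emission 400-700 nm.
# Assumed for illustration: 5 nm steps and 4 IR heating times (t0..t3).
n_samples = 21
excitation = np.arange(300, 601, 5)  # 61 excitation wavelengths
emission = np.arange(400, 701, 5)    # 61 emission wavelengths
n_times = 4

# Synthetic stand-in for the measurements, arranged as
# samples x excitation x emission x heating time.
rng = np.random.default_rng(0)
X = rng.random((n_samples, excitation.size, emission.size, n_times))


def nearest_index(axis, value):
    """Index of the grid point closest to the requested wavelength."""
    return int(np.argmin(np.abs(axis - value)))


i_exc = nearest_index(excitation, 348)  # closest grid point to 348 nm
i_em = nearest_index(emission, 450)     # closest grid point to 450 nm

# (1) First-order data: one emission spectrum per sample at t0.
first_order = X[:, i_exc, :, 0]
# (2a) Second-order data: one EEM per sample at t0.
second_order_eem = X[:, :, :, 0]
# (2b) Second-order data: emission x heating time at fixed excitation.
second_order_em_time = X[:, i_exc, :, :]
# (2c) Second-order data: excitation x heating time at fixed emission.
second_order_exc_time = X[:, :, i_em, :]
# (3) Third-order data: excitation x emission x heating time per sample.
third_order = X


def unfold(array):
    """Flatten all modes except the sample mode into one row per sample."""
    return array.reshape(array.shape[0], -1)


# Unfolded matrix of the kind an unfolding-based model would take as input.
unfolded_third = unfold(third_order)
```

Unfolding keeps every measured value but discards the explicit multi-way arrangement, whereas a multi-way model operates on the array without flattening it.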
In addition, it is worth highlighting that third-order data modelling exploits the chemical information of the system to improve the performance of the classification analysis.
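The global indices named in the abstract (average sensitivity, non-error rate, and average precision) can be computed from a confusion matrix. The abstract does not state how the authors defined them, so the sketch below assumes one common convention in which the non-error rate is the mean of the class sensitivities and therefore coincides with average sensitivity. The function names and the example confusion matrix are hypothetical and are not results from the study.

```python
def class_indices(confusion):
    """Per-class sensitivity and precision.

    confusion[i][j] is the number of samples of true class i
    that the model assigned to class j.
    """
    sensitivities, precisions = [], []
    for g in range(len(confusion)):
        correct = confusion[g][g]
        true_total = sum(confusion[g])                     # samples truly in class g
        assigned_total = sum(row[g] for row in confusion)  # samples assigned to class g
        sensitivities.append(correct / true_total if true_total else 0.0)
        precisions.append(correct / assigned_total if assigned_total else 0.0)
    return sensitivities, precisions


def global_indices(confusion):
    """Global indices as unweighted means over classes (assumed convention)."""
    sensitivities, precisions = class_indices(confusion)
    average_sensitivity = sum(sensitivities) / len(sensitivities)
    average_precision = sum(precisions) / len(precisions)
    return {
        "average_sensitivity": average_sensitivity,
        # Assumption: non-error rate taken as the mean of class sensitivities.
        "non_error_rate": average_sensitivity,
        "error_rate": 1.0 - average_sensitivity,
        "average_precision": average_precision,
    }


# Made-up two-class confusion matrix for 21 samples (illustration only).
example = [
    [8, 2],
    [3, 8],
]
indices = global_indices(example)
```

Averaging over classes without weighting gives each class the same influence regardless of its size, which matters when the classes hold different numbers of samples.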