RESEARCHERS
RINALDI Carlos Alberto
conferences and scientific meetings
Title:
Data Mining and Variable Selection of LIBS Measurements to Predict Metal Concentrations
Author(s):
JUAN VOROBIOF; BOGGIO, NORBERTO GABRIEL; CHECOZZI, F. R.; RINALDI, C. A.
Location:
Iguazú
Meeting:
Conference; XIII World Conference on Laser Induced Breakdown Spectroscopy 2024; 2024
Organizing institution:
Comisión Nacional de Energía Atómica
Abstract:
This study aims to predict metal alloy concentrations from extensive Laser-Induced Breakdown Spectroscopy (LIBS) datasets, specifically those from the LIBS 2022 conference [1]. The dataset includes a large training file (1.25 GB, 2101 × 40006 values) containing spectra and the corresponding concentrations of Cr, Mn, Mo, and Ni, and a test file (0.57 GB, 751 × 40003 values) with spectra for concentration prediction. The training set consists of 50 spectra for each of 42 targets (2100 spectra in total), while the test set contains 50 spectra for each of 15 targets (750 spectra in total). The data were collected with a Nd:YAG laser (1064 nm, 95 mJ pulse energy, 10 ns pulse width). The study compares two dimensionality reduction techniques: variable selection and Principal Component Analysis (PCA). Variable selection is highlighted for its advantages in result interpretability and in retaining the original variables, while PCA is noted for reducing dimensionality with minimal information loss [2]. The results obtained through variable selection reveal the most significant features in the data. Strong correlations between different variables are identified, highlighting those with the greatest influence on contamination levels and environmental changes. Reducing the dataset's dimensionality and focusing on the most important variables makes the results easier to interpret and improves the efficiency of the predictive models. Figure 1 shows the most relevant Variable Importance in Projection (VIP) scores as a function of the variables (wavelength) and the predicted concentrations of the test group for the four elements. In conclusion, this study demonstrates the importance of variable selection as an integral part of LIBS analysis of complex multivariate data.
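
The abstract does not say which model produced the VIP scores. A common choice is Partial Least Squares (PLS) regression, and the minimal sketch below assumes that choice. It fits a single-response PLS model with the NIPALS algorithm and computes one VIP score per variable. The function names `pls1_nipals` and `vip_scores` are hypothetical, and the synthetic matrix stands in for LIBS spectra; none of this is taken from the authors' code or data.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Fit single-response PLS with NIPALS.

    Returns the X-weights W (p x A), the scores T (n x A) and the
    y-loadings q (A,). Illustrative only, not the authors' implementation.
    """
    X = X - X.mean(axis=0)          # centre each variable (new array, input untouched)
    y = y - y.mean()                # centre the response
    n, p = X.shape
    W = np.zeros((p, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y                 # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                   # scores of this component
        tt = t @ t
        p_load = X.T @ t / tt       # X-loadings, used for deflation
        q[a] = (y @ t) / tt         # regression of y on the scores
        X = X - np.outer(t, p_load) # deflate X
        y = y - q[a] * t            # deflate y
        W[:, a], T[:, a] = w, t
    return W, T, q

def vip_scores(W, T, q):
    """Variable Importance in Projection for a single-response PLS model."""
    p = W.shape[0]
    ss = q**2 * np.sum(T**2, axis=0)            # y-variance explained per component
    w_norm = W / np.linalg.norm(W, axis=0)      # unit-length weight vectors
    return np.sqrt(p * (w_norm**2 @ ss) / ss.sum())

# Synthetic stand-in for spectra: 200 "spectra", 100 "wavelength channels".
# Only channels 10 and 50 carry information about the "concentration" y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
y = 3.0 * X[:, 10] - 2.0 * X[:, 50] + 0.1 * rng.normal(size=200)

W, T, q = pls1_nipals(X, y, n_components=2)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1.0)   # conventional VIP > 1 cut-off
top = np.argsort(vip)[::-1][:2]        # two most important channels
```

By construction the squared VIP scores average to 1, which is why a threshold of 1 is a conventional (though not universal) cut-off for keeping a variable. With four elements, as in the abstract, one option would be to fit one such model per element and compare the selected wavelengths.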

