RESEARCHERS
GOICOECHEA Hector Casimiro
Articles
Title:
Ant colony optimization for variable selection in discriminant linear analysis
Author(s):
PONTES, ALINE S.; ARAÚJO, ALISSON; MARINHO, WEVERTON; GONÇALVES DIAS DINIZ, PAULO H.; ARAÚJO GOMES, ADRIANO; GOICOECHEA, HECTOR C.; SILVA, EDVAN C.; ARAÚJO, MARIO C.U.
Journal:
JOURNAL OF CHEMOMETRICS
Publisher:
JOHN WILEY & SONS LTD
References:
Year: 2020
ISSN:
0886-9383
Abstract:
A new algorithm using ant colony optimization (ACO) for selection of variables in linear discriminant analysis (LDA) is presented. The role of ACO is explored in the context of LDA classification, in which spectral variable multicollinearity is a known cause of generalization problems. The proposed ACO-LDA presents a metaheuristic that mimics the ants' cooperative behavior, randomly depositing pheromones at vector elements corresponding to the most relevant variables. Such cooperative ant-like behavior, which is absent in the genetic algorithm, increases the probability of discarding noninformative variables, favoring construction of more parsimonious models than genetic algorithm–linear discriminant analysis (GA-LDA). The classification performance of ACO-LDA is assessed in two case studies: (i) classification of edible vegetable oils (with respect to base oil) via ultraviolet–visible (UV-Vis) spectrometry and (ii) simultaneous classification of tea samples with respect to type and geographic origin via near-infrared (NIR) spectrometry. In the first study, ACO-LDA was tested on a data set involving wide absorption bands in the UV region with low resolution and strong spectral overlapping. In the second study, its capacity to manage a data matrix with high dimensionality was evaluated. In both studies, ACO-LDA selected a small subset of variables, which led to correct classifications for almost all of the samples, achieving a performance level similar to the well-established partial least squares–discriminant analysis (PLS-DA), and considerably better than GA-LDA. The use of ACO to select LDA classification variables can minimize generalization problems commonly associated with multicollinearity.
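The pheromone-guided selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' ACO-LDA: the abstract does not give the sampling or pheromone-update rules, so the ones used here are generic ACO-style assumptions. A nearest-class-mean rule stands in for LDA to keep the example dependency-free, and all function names, parameters, and the toy data are hypothetical.

```python
import random


def nearest_mean_errors(train, valid, subset):
    """Count validation errors of a nearest-class-mean rule (a stand-in for LDA)."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(subset))
        for k, j in enumerate(subset):
            acc[k] += x[j]
        counts[y] = counts.get(y, 0) + 1
    means = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    errors = 0
    for x, y in valid:
        # Assign the class whose mean is closest on the selected variables.
        pred = min(
            means,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(subset, means[c])),
        )
        if pred != y:
            errors += 1
    return errors


def aco_select(train, valid, n_vars, subset_size=2, n_ants=10,
               n_iter=20, evaporation=0.3, seed=0):
    """Assumed ACO-style search: ants sample variables in proportion to pheromone."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_vars
    best_subset, best_cost = None, float("inf")
    for _ in range(n_iter):
        trails = []
        for _ in range(n_ants):
            candidates = list(range(n_vars))
            subset = []
            for _ in range(subset_size):
                weights = [pheromone[j] for j in candidates]
                j = rng.choices(candidates, weights=weights, k=1)[0]
                candidates.remove(j)  # sample without replacement
                subset.append(j)
            subset.sort()
            cost = nearest_mean_errors(train, valid, subset)
            trails.append((cost, subset))
            if cost < best_cost:
                best_cost, best_subset = cost, subset
        # Evaporate, then let every ant deposit more pheromone for fewer errors.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        for cost, subset in trails:
            for j in subset:
                pheromone[j] += 1.0 / (1.0 + cost)
    return best_subset, best_cost, pheromone


def make_toy_data(n_per_class, n_vars, informative, rng):
    """Two classes of Gaussian noise; only one variable carries a class shift."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
            x[informative] += 4.0 * label
            data.append((x, label))
    return data


rng = random.Random(42)
train = make_toy_data(20, 6, 2, rng)
valid = make_toy_data(20, 6, 2, rng)
subset, cost, pheromone = aco_select(train, valid, n_vars=6)
print("selected variables:", subset, "validation errors:", cost)
```

In this sketch, variables that appear in low-error subsets accumulate pheromone and are sampled more often in later iterations, while evaporation lets rarely useful variables fade. To move closer to the setting in the abstract, the cost function could be swapped for the validation error of an actual LDA classifier.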