ZUNINO SUAREZ Alejandro Octavio
An empirical comparison of feature selection methods in problem transformation multi-label classification
RODRIGUEZ, J. M.; GODOY, D.; ZUNINO, A.
IEEE LATIN AMERICA TRANSACTIONS
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Year: 2016 vol. 14 p. 3784 - 3791
Multi-label classification (MLC) is a supervised learning problem in which a particular example can be associated with a set of labels instead of a single one, as in traditional classification. Many real-world applications, such as Web page classification or resource tagging on the Social Web, are challenging for existing MLC algorithms because the label space grows exponentially as the instance space increases. Under the problem transformation approach, the most common alternative for MLC, multi-label problems are transformed into several single-label problems, whose outputs are then aggregated into a prediction for the whole classification problem. Feature selection techniques become crucial in large-scale MLC problems to help reduce dimensionality. However, the impact of feature selection in the multi-label setting has not been studied as extensively as in the case of single-label data. In this paper, we present an empirical evaluation of feature selection techniques in the context of the three main problem transformation MLC methods: Binary Relevance, Pair-wise and Label power-set. Experimentation was performed across a number of benchmark datasets for multi-label classification exhibiting varied characteristics, which makes it possible to observe the behavior of the techniques and assess their impact according to multiple metrics.
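To make the three problem transformation methods named in the abstract concrete, the following is a minimal sketch of how each one reshapes the label side of a multi-label dataset into single-label targets. It is not the authors' implementation: the toy data, the label names, and the function names (`binary_relevance`, `label_powerset`, `pairwise`) are all assumptions chosen for illustration, and the pair-wise variant shown (keeping only examples that carry exactly one label of each pair) is one common formulation.

```python
from itertools import combinations

# Hypothetical toy data: each example is annotated with a set of labels.
Y = [{"news", "sports"}, {"news"}, {"tech"}, {"sports", "tech"}]
LABELS = sorted(set().union(*Y))  # ['news', 'sports', 'tech']


def binary_relevance(y_sets, labels):
    """One binary target vector per label: 1 if the example carries it."""
    return {lab: [1 if lab in ys else 0 for ys in y_sets] for lab in labels}


def label_powerset(y_sets):
    """One multi-class target: each distinct label set becomes one class."""
    return [tuple(sorted(ys)) for ys in y_sets]


def pairwise(y_sets, labels):
    """One binary problem per label pair (a, b).

    Only examples carrying exactly one of the two labels are kept;
    each kept row is (example_index, 1 if it has a else 0).
    """
    problems = {}
    for a, b in combinations(labels, 2):
        problems[(a, b)] = [
            (i, 1 if a in ys else 0)
            for i, ys in enumerate(y_sets)
            if (a in ys) != (b in ys)
        ]
    return problems


if __name__ == "__main__":
    print(binary_relevance(Y, LABELS))
    print(label_powerset(Y))
    print(pairwise(Y, LABELS))
```

In a sketch like this, feature selection would be applied to each derived single-label problem (or once, on an aggregate score across them) before training the base classifiers; which of those choices the paper evaluates is not stated in the abstract.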