IIIE   20352
INSTITUTO DE INVESTIGACIONES EN INGENIERIA ELECTRICA "ALFREDO DESAGES"
Executing Unit - UE
Articles
Title:
Interactive Exploration of Parameter Space in Data Mining: Comprehending the Predictive Quality of Large Decision Tree Collections
Author(s):
KRESIMIR MATKOVIC, LUCIANA PADUA, CLAUDIO DELRIEUX
Journal:
COMPUTERS & GRAPHICS
Publisher:
PERGAMON-ELSEVIER SCIENCE LTD
References:
Place: Amsterdam; Year: 2014 vol. 41 p. 99 - 113
ISSN:
0097-8493
Abstract:
Decision trees are an intuitive yet powerful tool for performing predictive data analysis in data mining. In order to generate an adequate predictive model from a data set, a data analyst has to assess the predictive quality of the decision trees derived from several combinations of working parameters. Except in very simple cases, this may be a tedious and error-prone supervised task, since the parameter space is frequently huge. Analysts rely on their intuition and usually test just a few different parameter settings. In this work we present an interactive approach to facilitate the comprehension of the predictive power of large collections of decision trees by exploring large portions of the parameter space. For this, we developed novel views that make it possible to visualize and analyze the predictive quality of hundreds of trees, working together with coordinated multiple views of tree representations (needed to understand the tree shapes and the actual information therein), and aggregates of Receiver Operating Characteristic (ROC) and lift curves for assessing the predictive quality of the models. We developed a worked example using a data set from a telecommunications company, showing how easy and natural it is to gain insight into the behavior of the data within our exploration tool, as compared with the traditional, widespread practice of data analysts.