ICYTE   26279
INSTITUTO DE INVESTIGACIONES CIENTIFICAS Y TECNOLOGICAS EN ELECTRONICA
Executing Unit - UE
Articles
Title:
Discovering knowledge from data clustering using automatically defined interval type-2 fuzzy predicates
Author(s):
COMAS, DIEGO S.; BALLARIN, VIRGINIA L.; MESCHINO, GUSTAVO J.; NOWÉ, ANN
Journal:
EXPERT SYSTEMS WITH APPLICATIONS
Publisher:
PERGAMON-ELSEVIER SCIENCE LTD
Reference:
Place: Amsterdam; Year: 2017; vol. 68; pp. 136-150
ISSN:
0957-4174
Abstract:
In data clustering, fuzzy predicates act as cluster descriptors, providing linguistically expressed knowledge that indicates how features are related to each cluster. Fuzzy predicates obtained directly and automatically from data enable the discovery of knowledge inside clusters, even when there is no prior information about the clustering problem. In this work, a new method for the automatic discovery of interval type-2 fuzzy predicates in data clustering is proposed, called Type-2 Data-based Fuzzy Predicate Clustering (T2-DFPC). In the first stage, the data are analysed by randomly partitioning the original data and running a clustering scheme that automatically determines a suitable number of clusters. From this stage, interval type-2 fuzzy predicates are discovered. Results obtained on very different clustering datasets show that the T2-DFPC method was consistently one of the best in terms of accuracy. The method preserves all known advantages of interval type-2 fuzzy logic (FL) for dealing with problems involving vagueness, quantifying the degree of truth of the fuzzy predicates and modelling the variability of the data inside the clusters. The proposed method is a fast, useful, general, and unsupervised approach for interpretable data clustering, with its knowledge-extraction capabilities being one of the main contributions. Linguistic expressions can be easily adapted to match the terminology used in the field the data are related to. The predicates can generalize the knowledge to new cases (new data), as an intelligent system does. This new approach might be surprisingly useful in contexts where, besides the clustering partition, summary information from data is of interest.
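Illustrative note: the abstract does not give the T2-DFPC algorithm, so the following is only a minimal sketch of the general idea of an interval type-2 fuzzy predicate used as a cluster descriptor. Everything in it is an assumption made for illustration: the Gaussian membership shape with an uncertain spread, the minimum as conjunction, the midpoint ranking of truth intervals, and all names (`it2_membership`, `predicate_truth`, `assign_cluster`) and parameter values. It is not the authors' method.

```python
import math

def gaussian(x, mean, sigma):
    """Ordinary (type-1) Gaussian membership degree in [0, 1]."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def it2_membership(x, mean, sigma_lower, sigma_upper):
    """Interval type-2 membership: a (lower, upper) pair of degrees.

    Assumes sigma_lower <= sigma_upper, so the narrower Gaussian
    bounds the wider one from below at every x.
    """
    return (gaussian(x, mean, sigma_lower), gaussian(x, mean, sigma_upper))

def predicate_truth(point, feature_params):
    """Truth interval of 'feature 1 is A1 AND feature 2 is A2 AND ...'.

    The conjunction is taken as the minimum, applied separately to the
    lower and the upper bounds (an assumed choice of t-norm).
    """
    degrees = [it2_membership(x, *params)
               for x, params in zip(point, feature_params)]
    return (min(d[0] for d in degrees), min(d[1] for d in degrees))

def assign_cluster(point, cluster_predicates):
    """Pick the cluster whose predicate has the highest interval midpoint."""
    truths = {name: predicate_truth(point, params)
              for name, params in cluster_predicates.items()}
    best = max(truths, key=lambda name: sum(truths[name]) / 2.0)
    return best, truths[best]

# Hypothetical descriptors: per feature, (mean, sigma_lower, sigma_upper).
clusters = {
    "low":  [(0.0, 0.5, 1.0), (0.0, 0.5, 1.0)],
    "high": [(5.0, 0.5, 1.0), (5.0, 0.5, 1.0)],
}

label, (lower, upper) = assign_cluster((0.2, -0.1), clusters)
print(label, round(lower, 3), round(upper, 3))
```

The width of the returned interval (upper minus lower) is what distinguishes this from a type-1 predicate: it expresses how uncertain the degree of truth itself is for a given data point, and the same predicates can be evaluated on new data that were not used to define them.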