RESEARCHERS
STEGMAYER Georgina Silvia
Conferences and scientific meetings
Title:
Cluster Ensembles for Big Data Mining Problems.
Author(s):
M. PIVIDORI, G. STEGMAYER, D.H. MILONE
Location:
Rosario
Meeting:
Symposium; AGRANDA: Simposio de Grandes Datos; 2015
Organizing institution:
SADIO
Abstract:
Big data represents a real challenge for data mining algorithms, and new approaches for discovering useful knowledge are being proposed. Mining this type of data involves several problems [5], not only the huge volume of information. As these data generally come from autonomous and decentralized sources, their dimensionality is heterogeneous and diverse, and they generally involve privacy issues.

Each clustering algorithm has particular characteristics that make it useful for different types of problems. For instance, k-means is a popular algorithm that assumes spherical clusters in the data; hierarchical approaches are used when there is interest in finding a hierarchical structure in the data; expectation-maximization, on the other hand, iteratively adjusts the parameters of a statistical model to fit the observed data. However, all these methods work properly only with relatively small data sets. Large-volume data often makes their application unfeasible, even more so when the data come from heterogeneous and autonomous sources and are constantly growing and evolving.