RESEARCHERS
SOTO Axel Juan
Conferences and scientific meetings
Title:
Interactive Clustering of Semi-Structured Documents
Author(s):
AXEL J. SOTO; VLADO KESELJ; EVANGELOS MILIOS
Location:
Toronto
Meeting:
Workshop; IBM CASCON 2012; 2012
Organizing institution:
IBM
Abstract:
Text analytics techniques, such as clustering methods applied to text documents, have been used extensively in recent years. However, the ability to extract useful information from massive amounts of text is still in its early stages. In this paper, we address this problem using two different strategies. The first is the use of low-dimensional embeddings, which represent the documents more compactly and hence enable the efficient application of interactive and visual analysis methods to text corpora. The second strategy builds on the fact that documents usually come with some structure or associated information. This information can be correlated with the patterns or clusters found in the corpus to guide the user towards gaining insight or finding unexpected relationships.