ICIC   25583
INSTITUTO DE CIENCIAS E INGENIERIA DE LA COMPUTACION
Unidad Ejecutora - UE
congresos y reuniones científicas
Título:
A Supervised Term-Weighting Method and its Application to Variable Extraction from Digital Media
Autor/es:
MAISONNAVE, MARIANO; ANA MAGUITMAN; DELBIANCO, FERNANDO; FERNANDO TOHMÉ
Lugar:
Buenos Aires
Reunión:
Congreso; 47 JAIIO; 2018
Institución organizadora:
SADIO
Resumen:
Successful modeling and prediction depend on effective methods for the extraction of domain-relevant variables. This paper proposes a methodology for identifying domain-specific terms. The proposed methodology relies on a collection of documents labeled as relevant or irrelevant to the domain under analysis. Based on the labeled document collection, we propose a supervised technique that weights terms based on their descriptive and discriminating power. Finally, the descriptive and discriminating values are combined into a general measure that, through the use of an adjustable parameter, allows to independently favor different aspects of retrieval such as maximizing precision or recall, or achieving a balance between both of them. The proposed technique is applied to the economic domain and is empirically evaluated through a human-subject experiment involving experts and non-experts in Economy. It is also evaluated as a term-weighting technique for query-term selection showing promising results. We finally illustrate the potential of the proposal as a first step for identifying different types of associations between words.