ICIC   25583
INSTITUTO DE CIENCIAS E INGENIERIA DE LA COMPUTACION
Executing Unit - UE
Conferences and scientific meetings
Title:
MOGP Strategies for Topical Search Using Wikipedia
Author(s):
CECCHINI, ROCÍO LUJÁN; BAGGIO, MARIA CECILIA; MILIOS, EVANGELOS; MAGUITMAN, ANA GABRIELA
Place:
Berlin
Meeting:
Symposium; The 19th ACM Symposium on Document Engineering (DocEng 2019); 2019
Abstract:
Genetic Programming techniques have demonstrated great potential in dealing with the problem of query generation. To assist the user with thematic recommendations, this work explores different Multi-Objective Genetic Programming (MOGP) strategies for evolving a collection of topical Boolean queries. The study compares three approaches to building topical Boolean queries: using terms, incorporating Wikipedia semantics (Wikipedia concepts), and a hybrid approach that combines terms and concepts. In addition, different fitness functions are combined, giving rise to seven multi-objective schemes. In particular, we propose novel fitness functions aimed at attaining high diversity, based on the information-theoretic notion of entropy and on Jaccard similarity. Experiments were completed using 25 topics from a dataset consisting of approximately 350,000 webpages classified into 448 topics. The results reveal no statistically significant improvements in efficiency when terms, concepts, or a combination of both are used. However, the use of terms allows the discovery of artificial queries that are hard for humans to interpret. In contrast, the use of concepts has a positive effect on interpretability and simplicity (considering the number of operands), resulting in better execution times. In addition, several differences are observed when using different combinations of fitness functions.
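
The abstract names entropy and Jaccard similarity as the basis of the diversity-oriented fitness functions but does not give their definitions. The Python sketch below is only an illustration of how such measures could be computed over the document sets retrieved by a population of queries. The function names, the use of mean pairwise similarity, and the choice to take entropy over document retrieval counts are all assumptions made here, not the authors' formulation.

```python
from itertools import combinations
from math import log2


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| between two sets."""
    if not a and not b:
        return 1.0  # assumed convention: two empty result sets count as identical
    return len(a & b) / len(a | b)


def jaccard_diversity(result_sets):
    """Hypothetical diversity score: 1 minus the mean pairwise Jaccard similarity.

    Returns 0.0 when all result sets are identical and 1.0 when they are
    pairwise disjoint. With fewer than two sets there is nothing to compare,
    so 0.0 is returned.
    """
    pairs = list(combinations(result_sets, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def retrieval_entropy(result_sets):
    """Hypothetical entropy score: Shannon entropy (in bits) of the
    distribution of retrievals over documents.

    Each document's probability is the share of all retrievals that hit it,
    so the value grows as the queries spread over more distinct documents.
    """
    counts = {}
    for docs in result_sets:
        for doc in docs:
            counts[doc] = counts.get(doc, 0) + 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Example: document ids retrieved by three hypothetical queries.
results = [{1, 2}, {2, 3}, {4}]
diversity = jaccard_diversity(results)
entropy = retrieval_entropy(results)
```

In a multi-objective setting, scores of this kind would be optimized alongside retrieval-quality objectives rather than on their own, since a population of queries can be highly diverse while retrieving little that is relevant to the topic.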