ICIC   25583
INSTITUTO DE CIENCIAS E INGENIERIA DE LA COMPUTACION
Executing Unit - UE
Conferences and scientific meetings
Title:
On the assessment of Information Quality in Spanish Wikipedia
Author(s):
SEBASTIAN PEREZ CASSEIGNAU; MARCELO ERRECALDE; GUIDO URQUIZA; EDGARDO FERRETTI; MATIAS SORIA; SERGIO ALEJANDRO GÓMEZ
Location:
San Luis
Meeting:
Conference; XXII Congreso Argentino de Ciencias de la Computación; 2016
Organizing institution:
Universidad Nacional de San Luis
Abstract:
Featured Articles (FA) are considered the best articles that Wikipedia has to offer, and in recent years researchers have found it interesting to analyze whether and how they can be distinguished from "ordinary" articles. Likewise, identifying which issues have to be enhanced or fixed in ordinary articles to improve their quality is a recent key research trend. Most of the approaches developed to address these information quality problems have been proposed for the English Wikipedia. However, few efforts have been made for Spanish Wikipedia, despite Spanish being one of the most spoken languages in the world by native speakers. In this respect, we present a first breakdown of Spanish Wikipedia's quality flaw structure. In addition, we carry out a study to automatically assess information quality in Spanish Wikipedia, where FA identification is evaluated as a binary classification task. The results obtained show that FA identification can be performed with an F1 score of 0.81, using a document model consisting of only twenty-six features and AdaBoosted C4.5 decision trees as the classification algorithm.