ICIC   25583
INSTITUTO DE CIENCIAS E INGENIERIA DE LA COMPUTACION
Executing Unit (Unidad Ejecutora, UE)
Articles
Title:
Towards Information Quality Assurance in Spanish Wikipedia
Author(s):
FERRETTI EDGARDO; POHN L.; ERRECALDE M; SORIA M.; URQUIZA G.; PEREZ-CASSEIGNAU S.; SERGIO ALEJANDRO GÓMEZ
Journal:
Journal of Computer Science & Technology
Publisher:
ISTEC (Iberoamerican Science & Technology Education Consortium)
References:
Place: Albuquerque; Year: 2017, vol. 17, pp. 29-36
ISSN:
1666-6046
Abstract:
Featured Articles (FA) are considered to be the best articles that Wikipedia has to offer, and in recent years researchers have found it interesting to analyze whether and how they can be distinguished from "ordinary" articles. Likewise, identifying which issues have to be enhanced or fixed in ordinary articles in order to improve their quality is a recent key research trend. Most of the approaches developed to address these information quality problems have been proposed for the English Wikipedia. However, little effort has been devoted to the Spanish Wikipedia, despite Spanish being one of the most spoken languages in the world by native speakers. In this respect, we present a breakdown of Spanish Wikipedia's quality flaw structure. In addition, we carry out studies with three different corpora to automatically assess information quality in Spanish Wikipedia, where FA identification is evaluated as a binary classification task. Our evaluation in a unified setting allows the performance achieved by our approach on the Spanish version to be compared with that on the English version. The best results obtained show that FA identification in Spanish can be performed with an F1 score of 0.88 using a document model consisting of only twenty-six features and a Support Vector Machine as the classification algorithm.
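As a rough illustration of the evaluation metric named in the abstract, the sketch below computes an F1 score for a binary FA / non-FA labelling. The function name `f1_score_binary` and the toy labels are assumptions made for illustration only; they are not taken from the paper and do not reproduce its reported 0.88 result.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive prediction: F1 is defined here as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 1 = featured article, 0 = ordinary article.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

In this toy case two of the three featured articles are recovered and one ordinary article is wrongly flagged, so precision and recall are both 2/3 and so is the F1 score.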