IEGEBA   24053
INSTITUTO DE ECOLOGIA, GENETICA Y EVOLUCION DE BUENOS AIRES
Executing Unit (UE)
Conferences and scientific meetings
Title:
A Standardized Reference Data Set for Vertebrate Taxon Name Resolution
Author(s):
ZERMOGLIO, PAULA F.; GURALNICK, ROBERT P.; WIECZOREK, JOHN R.
Location:
Nairobi
Meeting:
Conference; Biodiversity Information Standards (TDWG) 2015 Annual Conference; 2015
Organizing institution:
Biodiversity Information Standards (TDWG)
Abstract:
Taxonomic names associated with digitized biocollection labels have flooded into repositories such as GBIF, iDigBio and VertNet. These names often present myriad issues that must be resolved before the associated records can be used reliably in research. To date, no systematic assessment of the scope of the problem, or of the effort needed to solve it, has been performed. In our study, we first identified and characterized the types of issues present in a random sample of 1000 verbatim names published via the data aggregator VertNet, and we provide the first rigorously reviewed, human-vetted reference validation data set for vertebrate names. In particular, we focused on detecting misspelling, synonymy, and incorrect use of the Darwin Core standard. Our results reveal that fewer than 47% of name strings were currently valid, while nearly 97% of name combinations could be resolved to a currently valid name. By associating names back to biocollection records, we then fit logistic models to test how certain predictors, such as geographic region, year collected, higher-level clade, and the volume of digitally accessible data per institution, affect the prevalence of issues in taxon names. We discuss the effects of the predictors, as well as the implications for taxon name cleaning both in databases and on the original labels. We also discuss how our reference validation data set contributes to the development of automated tools, as it constitutes a solid reference against which to test and compare the performance of taxon name resolvers.
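As an illustration of the last point, the following minimal Python sketch shows one way a name resolver could be scored against a human-vetted reference mapping. It is not the authors' tooling: the function names, the toy resolver, and every taxon name below are invented placeholders, and the reference data set's actual format is not described in this record.

```python
def normalize(name):
    """Collapse whitespace and case so trivially different strings compare equal."""
    return " ".join(name.split()).lower()


def score_resolver(resolver, reference):
    """Compare a resolver's output with a vetted reference mapping.

    resolver:  callable taking a verbatim name and returning a resolved
               name, or None when it cannot resolve the input.
    reference: dict mapping verbatim name -> vetted valid name.
    """
    correct = wrong = unresolved = 0
    for verbatim, expected in reference.items():
        got = resolver(verbatim)
        if got is None:
            unresolved += 1
        elif normalize(got) == normalize(expected):
            correct += 1
        else:
            wrong += 1
    total = len(reference)
    return {
        "correct": correct,
        "wrong": wrong,
        "unresolved": unresolved,
        "accuracy": correct / total if total else 0.0,
    }


# Hypothetical reference entries (placeholder names, not real taxa).
reference = {
    "Exemplus albus": "Exemplus albus",    # already valid
    "Exemplus albuss": "Exemplus albus",   # misspelling
    "Oldgenus albus": "Exemplus albus",    # synonym
    "Exemplus sp.": "Exemplus",            # genus-level record
}

# Hypothetical resolver: exact lookup in a small table, one entry wrong on purpose.
_lookup = {
    "exemplus albus": "Exemplus albus",
    "oldgenus albus": "Exemplus niger",
}


def toy_resolver(name):
    return _lookup.get(normalize(name))


result = score_resolver(toy_resolver, reference)
print(result)
```

Keeping "wrong" separate from "unresolved" matters when comparing resolvers, since a tool that returns an incorrect valid name is arguably worse for downstream research than one that declines to answer.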