RESEARCHERS
DI PERSIA Leandro Ezequiel
Conferences and scientific meetings
Title:
Annotation pipeline for inferring gene functions integrating GO annotations and expression data
Author(s):
DI PERSIA, LEANDRO E.; STEGMAYER, GEORGINA; MILONE, DIEGO H
Location:
Mendoza
Meeting:
Conference; 10mo Congreso Argentino de Bioinformática y Biología Computacional (10th Argentine Congress of Bioinformatics and Computational Biology); 2019
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional
Abstract:
This work proposes a novel pipeline for inferring gene annotations based on the automatic reconstruction of the semantic similarity between genes. Semantic similarity is a metric defined over a set of terms, where the distance between them is based on the likeness of their meaning or semantic content. We benchmarked the proposal against state-of-the-art methods on three published data sets (Arabidopsis thaliana, Saccharomyces cerevisiae and Dictyostelium discoideum), using leave-one-out cross-validation. Independent experiments have shown that the ratio of annotated to unannotated genes does not influence the model accuracy. Whereas related state-of-the-art methods reach an average F1 of 15%, our pipeline achieved an average F1 of 30% across all three species. Our proposal showed the most balanced results: it neither misses true GO labels nor assigns a large number of false GO terms to unannotated genes.
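The evaluation protocol named in the abstract (leave-one-out cross-validation scored with F1 over GO term sets) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names, the placeholder term labels, and the `majority_terms` baseline predictor are all hypothetical and serve only to make the loop runnable. The abstract does not state how F1 was averaged, so the per-gene mean used here is an assumption.

```python
from collections import Counter
from typing import Callable, Dict, Set

Annotations = Dict[str, Set[str]]
Predictor = Callable[[str, Annotations], Set[str]]


def f1_score(true_terms: Set[str], predicted_terms: Set[str]) -> float:
    """F1 between the true and predicted GO term sets of a single gene."""
    true_positives = len(true_terms & predicted_terms)
    if true_positives == 0:
        return 0.0  # also covers empty true or predicted sets
    precision = true_positives / len(predicted_terms)
    recall = true_positives / len(true_terms)
    return 2 * precision * recall / (precision + recall)


def leave_one_out_f1(annotations: Annotations, predict: Predictor) -> float:
    """Hold out each gene in turn, predict its terms, and average the F1."""
    scores = []
    for gene, true_terms in annotations.items():
        # The held-out gene's own labels are hidden from the predictor.
        training = {g: t for g, t in annotations.items() if g != gene}
        scores.append(f1_score(true_terms, predict(gene, training)))
    return sum(scores) / len(scores)


def majority_terms(gene: str, training: Annotations) -> Set[str]:
    """Hypothetical baseline: predict terms carried by at least half of the
    training genes. A stand-in for a real inference method."""
    counts = Counter(term for terms in training.values() for term in terms)
    return {term for term, n in counts.items() if 2 * n >= len(training)}


# Toy data with placeholder labels (not real GO identifiers).
toy_annotations = {
    "gene1": {"GO:A", "GO:B"},
    "gene2": {"GO:A"},
    "gene3": {"GO:A", "GO:C"},
}
mean_f1 = leave_one_out_f1(toy_annotations, majority_terms)
print(f"mean leave-one-out F1: {mean_f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a method scores well only when it both recovers true terms and avoids assigning many false ones, which is the balance the abstract emphasizes.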