CIFASIS   20631
CENTRO INTERNACIONAL FRANCO ARGENTINO DE CIENCIAS DE LA INFORMACION Y DE SISTEMAS
Executing Unit - UE
Conferences and scientific meetings
Title:
Reliable Electronic GO Annotations with True Path Rule
Author(s):
FLAVIO E. SPETALE; ELIZABETH TAPIA; JAVIER MURILLO; LAURA ANGELONE; GIORGIO VALENTINI; PILAR BULACIO
Location:
Rosario
Meeting:
Conference; IV Congreso Argentino de Bioinformática y Biología Computacional & IV Conferencia de la Sociedad Iberoamericana de Bioinformática SOIBIO; 2013
Abstract:
Gene annotation is an important problem in bioinformatics research. Possible gene functions and the relationships between them can be described by the Gene Ontology (GO) [1]. GO provides a controlled vocabulary of terms across three branches: Cellular Component (CC), Molecular Function (MF) and Biological Process (BP). Gene annotation aims to associate biological data with GO concepts, here called GO terms. Gene annotation can be performed experimentally, using the EXP GO evidence code (Inferred from Experiment) to tag evidence from biological knowledge. Alternatively, to narrow down candidate gene annotations for further experimental work, gene annotation can be performed electronically using the IEA GO evidence code (Inferred from Electronic Annotation). Current IEA annotations are mostly performed by BLAST similarity searches. But in many cases, e.g., for non-model organisms, BLAST similarity scores may be too weak. To overcome this problem, we consider the design of machine learning methods for reliable IEA gene annotations. Without loss of generality, we focus on the prediction of GO BP classes. For this purpose, IEA gene annotation predictions are modeled as a hierarchical multilabel classification problem. Under this baseline, we consider the True Path Rule (TPR) [2] method for predicting BP class nodes. TPR predictions may suffer from a "starting" problem, i.e., predictions may significantly differ depending on the selection of the starting node at each level of the ontology graph.
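To make the hierarchical setting concrete, the following is a minimal sketch of a TPR-style consensus on a toy tree. It is not the authors' implementation. It assumes the commonly described two-step rule: a bottom-up step in which each node averages its own local score with the consensus scores of its positively predicted children, followed by a top-down step that caps each child's score by its parent's. The node names, local scores and the 0.5 threshold are hypothetical, and the sketch does not address the "starting" problem mentioned in the abstract.

```python
from typing import Dict, List

Hierarchy = Dict[str, List[str]]  # parent -> list of children (toy tree)


def tpr_bottom_up(children: Hierarchy, local: Dict[str, float],
                  threshold: float = 0.5) -> Dict[str, float]:
    """Bottom-up step: average a node's local score with the consensus
    scores of those children whose consensus exceeds the threshold."""
    consensus: Dict[str, float] = {}

    def visit(node: str) -> float:
        if node not in consensus:
            child_scores = [visit(c) for c in children.get(node, [])]
            positive = [s for s in child_scores if s > threshold]
            consensus[node] = (local[node] + sum(positive)) / (1 + len(positive))
        return consensus[node]

    for node in local:
        visit(node)
    return consensus


def tpr_top_down(children: Hierarchy, consensus: Dict[str, float],
                 root: str) -> Dict[str, float]:
    """Top-down step (tree only): a child's score may not exceed its parent's."""
    final = dict(consensus)
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            final[child] = min(final[child], final[node])
            stack.append(child)
    return final


# Hypothetical toy hierarchy and local classifier scores (not real GO terms).
children = {"root": ["a", "b"], "a": ["a1"]}
local = {"root": 0.9, "a": 0.4, "b": 0.2, "a1": 0.8}

consensus = tpr_bottom_up(children, local)
final = tpr_top_down(children, consensus, "root")
```

In this toy case the confident leaf `a1` raises the score of its parent `a` above the threshold, while the top-down step keeps `a1` from scoring higher than `a`, so the final scores respect the hierarchy.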