RESEARCHERS
STEGMAYER Georgina Silvia
Conferences and scientific meetings
Title:
Non-negative matrix factorization for prediction of gene annotations
Author(s):
L. DI PERSIA, G. LEALE, G. STEGMAYER, D.H. MILONE
Meeting:
Conference; 4th International Society for Computational Biology Latin America Bioinformatics Conference (ISCB-LA); 2016
Organizing institution:
International Society for Computational Biology (ISCB)
Abstract:
The accurate prediction of gene annotations is currently an important issue in modern computational biology. A list of putative terms/labels can be provided by the Gene Ontology (GO) and used to design targeted biological experiments in order to generate novel and validated knowledge. However, the manual curation of novel annotations is very time-consuming and costly. Thus, novel computational tools are needed to reliably predict likely annotations and speed up the discovery of new gene functions. The proximity between GO terms (semantic similarity) can be measured with any of the existing semantic measures, in order to build a distance matrix of GO annotations (dGO) for a group of genes of interest. However, for novel or non-annotated genes, this matrix will have many empty positions, so their similarity to annotated genes cannot be calculated and semantically close annotations cannot be inferred. We show how it is possible to fully reconstruct dGO by using other available information sources for the genes (such as expression levels), and afterwards infer their GO labels. We present a novel approach to the prediction of gene annotations based on NMF fusion of the semantic and expression distances among genes, which uses the fusion to complete the unknown part of the semantic distance matrix. The reconstructed semantic matrix can then be used to infer candidate terms for the unknown genes. This approach can yield a sensitivity and precision extremely close to those obtained by using the real semantic distance information, which shows that the NMF fusion approach was successful in capturing the information structure of the dGO matrix.
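The abstract does not give the details of the fusion step, so the following is only a minimal sketch of one way such a completion could work, not the authors' implementation. It assumes that the semantic and expression distance matrices are concatenated side by side, that the unknown dGO entries are masked out, and that a weighted NMF with multiplicative updates is fitted to the observed entries. The function names (`masked_nmf`, `complete_dgo`), the rank, and the toy distance values are all hypothetical and chosen for illustration only.

```python
import random

def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def hadamard(a, b):
    """Element-wise product, used here to apply the observation mask."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def masked_nmf(x, mask, rank, n_iter=300, seed=0, eps=1e-9):
    """Fit x ~ W H on entries where mask == 1 (assumed multiplicative updates)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    mx = hadamard(x, mask)
    for _ in range(n_iter):
        # Update W with H fixed.
        ht = transpose(h)
        num = matmul(mx, ht)
        den = matmul(hadamard(matmul(w, h), mask), ht)
        w = [[w[i][r] * num[i][r] / (den[i][r] + eps) for r in range(rank)]
             for i in range(n)]
        # Update H with the new W fixed.
        wt = transpose(w)
        num = matmul(wt, mx)
        den = matmul(wt, hadamard(matmul(w, h), mask))
        h = [[h[r][j] * num[r][j] / (den[r][j] + eps) for j in range(m)]
             for r in range(rank)]
    return w, h

def complete_dgo(d_go, d_expr, unknown, rank=2):
    """Fill dGO entries between unannotated and annotated genes (hypothetical helper)."""
    n, unk = len(d_go), set(unknown)
    # A dGO entry counts as observed only when both genes are annotated;
    # the expression distances are assumed to be fully observed.
    seen = [[0 if (i in unk or j in unk) else 1 for j in range(n)] for i in range(n)]
    x = [[d_go[i][j] if seen[i][j] else 0.0 for j in range(n)] + list(d_expr[i])
         for i in range(n)]
    mask = [seen[i] + [1] * n for i in range(n)]
    w, h = masked_nmf(x, mask, rank)
    wh = matmul(w, h)
    out = [row[:] for row in d_go]
    for u in unk:
        for j in range(n):
            if j not in unk:
                # Row u of W is learned from expression only; mirror for symmetry.
                out[u][j] = out[j][u] = wh[u][j]
        out[u][u] = 0.0  # a gene is at distance zero from itself
    return out

# Toy data (invented): genes 0-3 are annotated, gene 4 is not.
d_expr = [[0.0, 0.2, 0.9, 0.8, 0.3],
          [0.2, 0.0, 0.8, 0.9, 0.2],
          [0.9, 0.8, 0.0, 0.1, 0.9],
          [0.8, 0.9, 0.1, 0.0, 0.8],
          [0.3, 0.2, 0.9, 0.8, 0.0]]
d_go = [[0.0, 0.1, 1.0, 0.9, None],
        [0.1, 0.0, 0.9, 1.0, None],
        [1.0, 0.9, 0.0, 0.2, None],
        [0.9, 1.0, 0.2, 0.0, None],
        [None, None, None, None, None]]

completed = complete_dgo(d_go, d_expr, unknown=[4], rank=2)
nearest = min(range(4), key=lambda j: completed[4][j])
print("reconstructed dGO row for gene 4:", [round(v, 3) for v in completed[4]])
print("closest annotated gene:", nearest)
```

In this sketch, candidate GO terms for the unannotated gene would be borrowed from the annotated genes that end up closest in the reconstructed row. Distances between two unannotated genes are left unfilled (`None`), because a column that is masked everywhere cannot be learned by this update scheme.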