IIBBA   05544
INSTITUTO DE INVESTIGACIONES BIOQUIMICAS DE BUENOS AIRES
Executing Unit - UE
Conferences and scientific meetings
Title:
Comparing background reference results in proteomics-based Gene Ontology analysis
Author(s):
CRISTOBAL FRESNO; ANDREA LLERA; MARIA ROMINA GIROTTI; MARIA FLORENCIA STRAMINSKY; MARÍA PÍA VALACCO; MONICA BALZARINI; FEDERICO PRADA; ELMER FERNÁNDEZ; OSVALDO PODHAJCER
Place:
Córdoba
Meeting:
Conference; Congreso Argentino de Bioinformática y Biología Computacional; 2011
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional
Abstract:
Background: Genomics and proteomics experiments often need to identify the biological processes or functions that are substantially affected by a stimulus. Set enrichment analysis (SEA) is a bioinformatics approach that addresses this need. SEA evaluates the proportion of differentially expressed genes/proteins against a background reference (BR) in order to identify enriched biological categories/terms [1]. Different BRs are known to produce different results [2], which led us to consider that, depending on the BR used, a researcher might miss biologically relevant terms. Here we contrast the results obtained with two references for a proteomic study in which not all possible proteins could be observed because of biological and/or experimental constraints.

Materials and methods: Experiments were performed with a human melanoma cell line in which the protumoral protein SPARC was knocked down by RNA interference (RNAi). Proteins expressed by control (SPARC-expressing) and treated (RNAi-expressing) cells were compared quantitatively using DIGE. The dataset of differentially expressed proteins was subjected to SEA using DAVID with complete Gene Ontology (GO) annotations [3]. Two different BRs were used for this analysis: DAVID's human genome (the default choice) and an assay-based, user-defined reference. The user-defined reference contained fewer than 17% of the genes in the genome reference. Enriched terms were displayed on separate GO graphs using a color-coded scheme for direct visual contrast of the results.

Results: Of all enriched terms, 78.8% were obtained with both BRs (see Table 1). The discrepant enriched terms included processes or functions already related to SPARC (leukocyte chemotaxis and migration, cytoskeleton organization, protein ubiquitination, cell differentiation and response to stress, nervous system development, etc.) that could be validated by the literature.
Table 1: Enriched terms and total population distribution over Gene Ontology categories.

Gene Ontology category   Enriched terms                                         Total population
                         Consensus     Genome       User list   Total          Genome          User list
Biological Process       178           47           3           228            14116 (100%)    2381 (16.9%)
Molecular Function       39            15           0           54             15143 (100%)    2561 (16.9%)
Cellular Component       55            8            0           63             15908 (100%)    2583 (16.2%)
Total                    272 (78.8%)   70 (20.3%)   3 (0.9%)    345 (100%)

Conclusions: The ontology analysis results obtained with the two references showed major agreement despite the large difference in total members, suggesting reliability in the differential gene list. However, contrasting the BRs also revealed specific, relevant enriched terms that could be missed when using only one reference. Our results suggest that ontology analysis provides more information when the BRs are contrasted simultaneously.

References:
1. Huang, D. W.; Sherman, B. T. & Lempicki, R. A.: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res (2009), 37:1-13.
2. Khatri, P. & Drăghici, S.: Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics (2005), 21:3587-3595.
3. DAVID Bioinformatics Resources 2008 [http://david.abcc.ncifcrf.gov/]
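
Illustrative note: to show why the choice of BR can change which terms appear enriched, the following minimal Python sketch computes a one-sided hypergeometric enrichment p-value for a single term against two background sizes. It is not the procedure used in this work: DAVID reports a modified Fisher exact score (EASE), whereas the sketch uses the plain hypergeometric test. The two background sizes are the Biological Process populations from Table 1; the function name, the list size, the hit count, and the term sizes are hypothetical values chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(hits, list_size, term_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    hits            -- list members annotated with the term
    list_size       -- size of the differentially expressed list
    term_size       -- background members annotated with the term
    background_size -- total members of the background reference (BR)
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical list: 50 differentially expressed proteins, 10 carrying the term.
LIST_SIZE = 50
HITS = 10

# Background sizes from Table 1 (Biological Process); term sizes are assumed.
p_genome = hypergeom_enrichment_p(HITS, LIST_SIZE, term_size=300, background_size=14116)
p_user = hypergeom_enrichment_p(HITS, LIST_SIZE, term_size=120, background_size=2381)

print(f"genome BR:      p = {p_genome:.3g}")
print(f"assay-based BR: p = {p_user:.3g}")
```

With the same list and hit count, the p-value depends on how many background members carry the term, so a term may pass a significance cutoff under one BR and fail it under the other.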