INBA   12521
INSTITUTO DE INVESTIGACIONES EN BIOCIENCIAS AGRICOLAS Y AMBIENTALES
Executing Unit - UE
Conferences and scientific meetings
Title:
Using co-occurrence networks techniques: Meta-analysis of soil metagenomic data
Author(s):
ORLOWSKI J.F.; SORIA M.A.
Location:
Bahía Blanca, Buenos Aires
Meeting:
Conference; VI Argentinian Congress of Bioinformatics and Computational Biology (CAB2C) - VI Argentinian Conference of Bioinformatics and Computational Biology; 2015
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional (A2B2C) and the Instituto de Ciencias e Ingeniería de la Computación of the Universidad Nacional del Sur
Abstract:
Using co-occurrence networks techniques: Meta-analysis of soil metagenomic data
Juan Orlowski, Marcelo Soria
Cátedra de Microbiología Agrícola Ambiental. Facultad de Agronomía UBA. INBA-CONICET. Buenos Aires, Argentina, C1417DSE

Background
Soil harbours a large diversity and quantity of microbial taxa. An alternative to the classic study of alpha and beta diversity is to analyse the co-occurrence of bacterial taxa across different types of soils and to apply network analyses that describe their topological properties [1].

Material and methods
This meta-analysis was performed with metagenomic data from six studies that used the 16S rRNA gene as a marker. The consolidated dataset includes 218 samples from six different soil environments: grasslands, crops, shrublands, and tropical, coniferous and mixed forests. After data filtering and normalization, several clustering techniques revealed a sequencing bias. To correct it, we rarefied the data to 800 reads per sample and obtained a consensus network with the Similarity Fusion Matrix (SFM) [2] method. The final dataset had 96 bacterial families and 202 samples.

Results
The resulting co-occurrence network had a single giant component, with 96 vertices and 8836 edges. To keep only the strongest relationships, we removed edges with a weight less than 0.0115, which reduced the number of edges to 481 (see Figure 1). Several centrality measures were calculated: node degree, intermediation, betweenness and closeness. The node degree distribution resembled that of other biological and social networks: most nodes had few connections and a few nodes had many connections. We clustered taxa with the fast-greedy community algorithm. We detected five groupings, and their significance was confirmed by a permutation test. The family-level data were consolidated by class, and the resulting histograms are shown in Figure 2.

Figure 1
Co-occurrence network. Coloured by community: 1 (cyan), 2 (yellow), 3 (green), 4 (grey), 5 (red).

Figure 2
Class composition by community.

Conclusions
The use of rarefaction combined with consensus network clustering eliminated the bias effect observed with conventional clustering techniques applied to the families-by-sample matrix. The simultaneous analysis of hundreds of samples allowed us to discover and characterize five bacterial communities.

References
1. Barberán A, Bates ST, Casamayor EO, Fierer N: Using network analysis to explore co-occurrence patterns in soil microbial communities. The ISME Journal 2012, 6:343-351.
2. Wang B, Mezlini AM, Demir F, Fiume M, Tu Z, Brudno M, Haibe-Kains B, Goldenberg A: Similarity network fusion for aggregating data types on a genomic scale. Nature Methods 2014, 11:333-337.
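
The rarefaction and edge-filtering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`rarefy`, `threshold_edges`, `node_degrees`), the toy counts and the toy edge weights are all hypothetical, and discarding samples with fewer than 800 reads is an assumption about how rarefaction was applied. Only the depth of 800 reads and the weight cut-off of 0.0115 come from the abstract.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample.

    counts: dict mapping taxon name -> read count (hypothetical layout).
    Returns None when the sample has fewer than `depth` reads
    (assumption: such samples are discarded).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one entry per read, then draw without replacement.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(reads, depth)))


def threshold_edges(weights, min_weight):
    """Remove edges whose weight is less than `min_weight`."""
    return {edge: w for edge, w in weights.items() if w >= min_weight}


def node_degrees(edges):
    """Count how many retained edges touch each node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Toy data, invented for illustration only.
rng = random.Random(0)
samples = {
    "soil_A": {"fam1": 700, "fam2": 400, "fam3": 100},
    "soil_B": {"fam1": 300, "fam2": 200},  # only 500 reads, so it is dropped
}
rarefied = {name: rarefy(c, 800, rng) for name, c in samples.items()}
rarefied = {name: c for name, c in rarefied.items() if c is not None}

weights = {
    ("fam1", "fam2"): 0.0200,
    ("fam1", "fam3"): 0.0090,
    ("fam2", "fam3"): 0.0115,
}
kept = threshold_edges(weights, 0.0115)
degrees = node_degrees(kept)
```

After rarefaction every retained sample has exactly 800 reads, so differences in sequencing depth no longer drive the similarity between samples. The consensus network, the remaining centrality measures, the fast-greedy community detection and the permutation test are outside the scope of this sketch.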