DOPAZO Hernan Javier
Conferences and scientific meetings
Two Universals in Genomics: Information Content and Species Abundance Diversity
HERNAN DOPAZO; VERÓNICA BECHER
Oro Verde, Entre Ríos
Conference; III Congreso Argentino de Bioinformática y Biología Computacional; 2012
Asociación Argentina de Bioinformática y Biología Computacional (A2B2C) and Sociedad Iberoamericana de Bioinformática (SolBio)
Two Universals in Genomics: Information Content and Species Abundance Diversity
Verónica Becher (1) & Hernán Dopazo (2)
(1) Computation Department & (2) Ecology, Genetics & Evolution Department. Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Argentina.
In this talk we analyse two hypotheses: H1- that there is a common combinatorial structure of DNA across all the diversity of life, and H2- that a common rule governing species abundance and diversity (SAD) exists in genomes.
H1- Our first hypothesis is that DNA has a random-like structure across all the diversity of life. To test it, we define a complexity measure based on a classical data compression method that is applicable to arbitrarily large sequences without fragmenting them. The method detects regularities due to repeats of any length, at any distance, as well as other structural correlations. As the main result, we report that the ratio of genome complexity to genome size remains almost maximal and unchanged across six orders of magnitude in genome size, covering all biological diversity. We observe that complexity increases uniformly with genome size for phages, bacteria, unicellular eukaryotes, fungi, plants, and animals. Major deviations from maximal genome complexity correspond to polyploid species. We formulate two general hypotheses: 1- an almost maximal combinatorial structure of the DNA sequence is a common characteristic of genomes throughout biological diversity; 2- increases in the combinatorial complexity of DNA occur only through mechanisms of genome amplification, followed by the accumulation of DNA sequence mutations, transpositions and/or deletions of genetic material. Our hypothesis can be falsified if a single recent polyploid genome with a random-like DNA structure is found, or if a non-polyploid genome shows a non-random DNA structure.
H2- Our second hypothesis is that there is a common rule governing species abundance and diversity (SAD) in genomics.
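As a rough illustration of the kind of compression-based measure described under H1, the sketch below computes a complexity-to-size ratio for a DNA string. It is not the measure used in the talk: it swaps in Python's general-purpose LZMA compressor as a stand-in, and the function names `complexity_ratio` and `random_dna`, the 2-bits-per-base normalisation, and the toy sequences are all assumptions made for illustration.

```python
import lzma
import random


def complexity_ratio(seq: str) -> float:
    """Compressed size in bits divided by the 2 bits per base of raw DNA."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    # LZMA is a stand-in compressor (assumption, not the authors' method).
    # Its default dictionary is large enough to match repeats anywhere
    # in the toy sequences used below.
    compressed_bits = 8 * len(lzma.compress(seq.encode("ascii")))
    return compressed_bits / (2 * len(seq))


def random_dna(n: int, seed: int = 0) -> str:
    """Uniform pseudo-random sequence over A, C, G, T (toy data)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(n))


# A random-like sequence, and the same sequence followed by an exact copy,
# which is a crude stand-in for a very recent whole-genome duplication.
base = random_dna(100_000)
duplicated = base + base

ratio_random = complexity_ratio(base)
ratio_duplicated = complexity_ratio(duplicated)
print(f"random-like: {ratio_random:.2f}, duplicated: {ratio_duplicated:.2f}")
```

In this toy setting the duplicated sequence should score well below the random-like one, because the second copy is almost free to encode. That mirrors, very loosely, the reported deviation of polyploid genomes from maximal complexity.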
To what extent does SAD reflect adaptive or stochastic outcomes? An ideal model for genomics would consider the full diversity of elements populating eukaryote genomes. However, no such model exists. In ecology, the unified neutral theory of biodiversity (UNTB) assumes that interacting, trophically similar species are equivalent on an individual "per capita" basis. UNTB assumes that these individuals, regardless of species, are governed by similar birth, death, dispersal, and speciation rates. The composition of biodiversity therefore emerges randomly in the community. Here, taking advantage of the UNTB and the general framework posed by ecological genomics, we examine the relative SAD of genetic elements in ~500 chromosomes from 30 eukaryote genomes. After maximum-likelihood (ML) fitting of the UNTB parameters and hypothesis testing, we found that most chromosomes follow the relative SAD expected under UNTB. While ecologists have found natural selection to be an irrelevant component in explaining relative SAD in forests, we found that the same simple neutral model fits the SAD of genetic elements in genomes. We suggest that the random-like structure and the observed SAD are universals in genomes across all the diversity of life.
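To make the idea of fitting a neutral model to the abundances of genetic elements concrete, here is a minimal sketch. It assumes the simplest neutral case, the Ewens sampling formula with a single biodiversity parameter theta and no dispersal limitation, which may well differ from the UNTB parameterisation fitted in the talk. The counts in `element_counts` are made up, the function names are hypothetical, and the hypothesis-testing step is not covered.

```python
import math


def expected_richness(theta: float, J: int) -> float:
    """Expected number of element types among J copies (Ewens formula)."""
    return sum(theta / (theta + i) for i in range(J))


def fit_theta(abundances, lo=1e-9, hi=1e6):
    """ML estimate of theta: solves expected_richness(theta, J) = S."""
    S, J = len(abundances), sum(abundances)
    if not 1 < S < J:
        raise ValueError("need 1 < number of types < number of copies")
    if expected_richness(hi, J) < S:
        raise ValueError("upper search bound too small for these data")
    # expected_richness increases with theta, so bisect on a log scale.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if expected_richness(mid, J) < S:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def expected_sad(theta: float, J: int):
    """Expected number of types with exactly n copies, for n = 1..J."""
    sad = []
    for n in range(1, J + 1):
        log_term = (
            math.log(theta) - math.log(n)
            + math.lgamma(J + 1) - math.lgamma(J + 1 - n)
            + math.lgamma(J + theta - n) - math.lgamma(J + theta)
        )
        sad.append(math.exp(log_term))
    return sad


# Hypothetical copy numbers of ten element types on one chromosome.
element_counts = [120, 45, 30, 12, 7, 3, 2, 1, 1, 1]
S, J = len(element_counts), sum(element_counts)

theta_hat = fit_theta(element_counts)
sad = expected_sad(theta_hat, J)
print(f"theta = {theta_hat:.3f}; expected singleton types = {sad[0]:.2f}")
```

Comparing `sad` with the observed number of types in each abundance class is where a goodness-of-fit test would come in. Under this formula the number of types is a sufficient statistic for theta, which is why the fit reduces to a one-dimensional root search.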