IBIOBA - MPSP   22718
INSTITUTO DE INVESTIGACION EN BIOMEDICINA DE BUENOS AIRES - INSTITUTO PARTNER DE LA SOCIEDAD MAX PLANCK
Executing Unit - UE
Conferences and scientific meetings
Title:
The human genome data analysis platform
Author(s):
DANIEL KOILE; MAXIMILIANO DE SOUSA SERRO; DIEGO WALLACE; PATRICIO YANKILEVICH
Meeting:
Conference; V Congreso Argentino de Bioinformática y Biología Computacional; 2014
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional
Abstract:
Background: The health of an individual depends on their DNA as well as on environmental factors. The genome is the blueprint of an individual, and its analysis together with additional biological information, such as the DNA methylome, the transcriptome, the proteome, and the metabolome, can provide a dynamic assessment of the physiology and health state of that individual (1). Personal genome interpretation can be used to identify molecular and genetic variation within the population. This genetic screening information will allow us to elucidate disease pathways and identify new drug targets. In clinical trials, recruiting participants based on their genetic profile can shorten trial duration and reduce risk. Trial results combined with genetic profiles can inform therapeutic development and help identify genetic causes of differences in drug response and side effects. Finally, this human genome analysis platform may help us to better understand the genetic basis of diseases, make more accurate diagnoses, better understand prognosis, and make better treatment decisions.

Materials and methods: The platform we are building consists of a computer cluster, a Next Generation Sequencing (NGS) data analysis pipeline, a set of biological knowledge databases, and a platform website. The software pipeline is the key component of the platform. It is built from state-of-the-art methods for NGS data analysis. More than 15 public open-source algorithms, developed by research groups at leading institutions and conforming to today's best practices, are used in our pipeline. This ensures a transparent and reproducible data analysis. The pipeline is designed as a set of independent modules that sequentially execute the different genome analysis tasks. The Genome Analysis Toolkit (GATK), developed by the Broad Institute (2), is used extensively in our pipeline, complementing other analysis and visualization tools.
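
The modular, sequential design described in the methods can be illustrated with a minimal Python sketch. It is not the platform's actual code: the stage names (`align`, `call`, `annotate`), the `Stage` class, and the string-based placeholders standing in for real files are all hypothetical, and no real NGS tool is invoked. The sketch only shows independent modules sharing a common interface and being executed in order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Shared state passed from one module to the next (hypothetical layout).
Context = Dict[str, str]


@dataclass(frozen=True)
class Stage:
    """One independent pipeline module: a name plus a function on the context."""
    name: str
    run: Callable[[Context], Context]


# Placeholder modules. In a real pipeline each would wrap an external tool;
# here they only derive a new label from the previous stage's output.
def align_reads(ctx: Context) -> Context:
    return {**ctx, "alignments": ctx["reads"] + ".aligned"}


def call_variants(ctx: Context) -> Context:
    return {**ctx, "variants": ctx["alignments"] + ".variants"}


def annotate_variants(ctx: Context) -> Context:
    return {**ctx, "annotations": ctx["variants"] + ".annotated"}


def run_pipeline(stages: List[Stage], ctx: Context) -> Context:
    """Execute the stages strictly in order, recording which ones ran."""
    executed = []
    for stage in stages:
        ctx = stage.run(ctx)
        executed.append(stage.name)
    return {**ctx, "executed": ",".join(executed)}


# Assumed stage order, for illustration only.
STAGES = [
    Stage("align", align_reads),
    Stage("call", call_variants),
    Stage("annotate", annotate_variants),
]

if __name__ == "__main__":
    result = run_pipeline(STAGES, {"reads": "sample1"})
    print(result["annotations"])
    print(result["executed"])
```

Because every module takes and returns the same kind of context, a stage can be replaced or added without touching the others, which is one way to obtain the independence between modules that the methods describe.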
Conclusions: This general human genome analysis pipeline gives us the basis to participate in biomedical projects that involve patient genetic profiles and allows us to start collaborations with experimental research groups working on human diseases. Eventually, this basic framework can be customized to support further important applications such as cancer diagnosis, non-invasive prenatal testing, or newborn screening. In future work we aim to extend the platform to integrate transcriptome and epigenome data into the analysis.

References:
(1) Chen R, Mias GI, Li-Pook-Than J, Jiang L, Lam H, Chen R, Miriami E, Karczewski K, Hariharan M, Dewey F, et al. Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 2012, 148(6):1293-1307.
(2) McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010, 20:1297-1303.