RESEARCHERS
BAILLIET Graciela
Conferences and scientific meetings
Title:
Full genome sequencing of Native American ancestry individuals reveals adaptation to high altitude in the Andes
Author(s):
MUZZIO M; DOBBYN A; MOTTI JMB; YEE M-C; KOHLI S; SLIVINSKY K; SANTOS MR; RAMALLO V; ALFARO GOMEZ EL; DIPIERRI JE; BAILLIET G; BRAVI CM; KENNY EE
Location:
San Diego
Meeting:
Conference; ASHG2018; 2018
Organizing institution:
The American Society of Human Genetics
Abstract:
We carried out whole-genome sequencing to 4X coverage on blood samples from 14 high-altitude Andean individuals and 11 low-altitude Chaqueno individuals with over 90% Native American ancestry, using the Illumina HiSeq2500 platform. We called variants using a pooled analysis in GATK, with Unified Genotyper (UG) as the variant-calling method. We identified samples with high missingness rates and re-ran the GATK pipeline excluding these individuals, leaving 13 highlanders and 9 lowlanders for all subsequent analyses. After filtering, a total of 5,232,922 variants remained. We annotated the variants for genic and functional status with ANNOVAR. FST statistics were calculated using vcftools. The resulting FST statistics were winsorized, and p-values were generated for the resultant exponentially distributed statistics in R. We used DEPICT (Data-driven Expression-Prioritized Integration for Complex Traits) to conduct gene set prioritization and tissue enrichment analyses, using as input variants at multiple FST FDR thresholds (0.01, 0.1, 1 and 5%) together with ANNOVAR annotations. At an FDR threshold of 0.01 percent, we identified four genomic regions that are highly differentiated between the low- and high-altitude populations; a number of genes in these regions could be relevant to the response to high-altitude hypoxia. Expanding the analysis to include windows at an FDR threshold of 0.02 percent, we identified an additional nine genomic regions, one of which has previously been implicated in a study of Chronic Mountain Sickness in an Andean population. Finally, to gain further biological insight from our high-FST variants, we conducted tissue and gene set enrichment analyses using DEPICT. Taking variants at an FST FDR threshold of 1 percent, using all nonsynonymous/stoploss/stopgain variants as input, the tissue enrichment analysis shows the most significant (nominal P-value FDR
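
The FST post-processing described in the abstract (winsorizing, p-values from an exponential distribution, FDR thresholds) might look roughly like the sketch below. The authors did this step in R and give no details, so everything here is an assumption made for illustration: the function names, the winsorizing quantiles, the clamping of negative FST estimates to zero, the exponential fit by the sample mean, and the use of the Benjamini-Hochberg procedure for FDR.

```python
import math


def winsorize(values, lower=0.0, upper=0.99):
    """Clip values to their empirical [lower, upper] quantiles.

    The quantile limits are illustrative; the abstract does not state them.
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(math.floor(lower * (n - 1)))]
    hi = ordered[int(math.floor(upper * (n - 1)))]
    return [min(max(v, lo), hi) for v in values]


def exponential_pvalues(values):
    """Upper-tail p-values under an exponential distribution.

    The rate is fitted by maximum likelihood (1 / sample mean).
    Negative FST estimates are clamped to zero first (an assumption).
    """
    clipped = [max(v, 0.0) for v in values]
    mean = sum(clipped) / len(clipped)
    if mean <= 0.0:
        raise ValueError("cannot fit an exponential to all-zero values")
    return [math.exp(-v / mean) for v in clipped]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy FST values, invented purely for demonstration.
fst = [0.01, 0.02, 0.0, 0.03, 0.5, 0.015, -0.01, 0.9]
q_values = benjamini_hochberg(exponential_pvalues(winsorize(fst, 0.0, 0.9)))
# Indices of variants passing a 5% FDR threshold (one of the listed cutoffs).
selected = [i for i, q in enumerate(q_values) if q <= 0.05]
```

Variants or windows whose adjusted value falls below a chosen cutoff (for example 0.0001 for 0.01 percent) would then be carried forward to annotation and enrichment analysis.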