RESEARCHERS
RAMALLO Virginia
Conferences and scientific meetings
Title:
Full genome sequencing of Native American ancestry individuals reveals adaptation to high altitude in the Andes
Author(s):
MUZZIO M; DOBBYN A; MOTTI JMB; YEE, MUH-CHING; KOHLI, S; SLIVINSKI K.; SANTOS MR; RAMALLO V; ALFARO EL; DIPIERRI JE; BAILLET G; BRAVI C; KENNY, EIMEAR E.
Location:
San Diego
Meeting:
Conference; American Society of Human Genetics 68th Annual Meeting; 2018
Organizing institution:
American Society of Human Genetics
Abstract:
We performed whole-genome sequencing to 4X coverage on the Illumina HiSeq2500 platform, using blood samples from 14 high-altitude Andean and 11 low-altitude Chaqueno individuals with over 90% Native American ancestry. We called variants using a pooled analysis in GATK, with UnifiedGenotyper (UG) as the variant calling method. We identified samples with high missingness rates and re-ran the GATK pipeline without them, leaving 13 highlanders and 9 lowlanders for all subsequent analyses. After filtering, 5,232,922 variants remained. We annotated the variants for genic and functional status with ANNOVAR. FST statistics were calculated using vcftools and then winsorized, and p-values for the resulting exponentially distributed statistics were generated in R. We used DEPICT (Data-driven Expression-Prioritized Integration for Complex Traits) for gene set prioritization and tissue enrichment analyses, with input variants taken at multiple FST FDR thresholds (0.01, 0.1, 1 and 5%) together with the ANNOVAR annotations.
At an FDR threshold of 0.01 percent, we identified four genomic regions that are highly differentiated between the low- and high-altitude populations; a number of genes in these regions could be relevant to the response to high-altitude hypoxia. Expanding the analysis to include windows at an FDR threshold of 0.02 percent, we identified an additional nine genomic regions, one of which has previously been implicated in a study of Chronic Mountain Sickness in an Andean population.
Finally, to gain further biological insight from the high-FST variants, we conducted tissue and gene set enrichment analyses using DEPICT. Taking variants at an FST FDR threshold of 1 percent, with all nonsynonymous/stoploss/stopgain variants as input, the most significantly enriched tissue (nominal P-value FDR < 5%) was fetal blood. Taking variants at an FST FDR threshold of 5 percent, again with all nonsynonymous/stoploss/stopgain variants as input, the three most significantly enriched tissues were fetal blood, esophagus, and blood. The most significantly enriched gene sets included multiple lung- and oxygen-related processes, such as reactive oxygen species metabolic process (GO:0072593), abnormal vascular endothelial cell physiology (MP:0004003), and superoxide metabolic process (GO:0006801).
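The FST outlier step described in the abstract (winsorize the statistics, derive p-values from an exponential distribution, then threshold by FDR) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the toy FST values, the choice to clip negative estimates to zero, the 0.999 upper quantile, the use of the sample mean as the exponential scale, and the Benjamini-Hochberg procedure for FDR are all assumptions made for illustration.

```python
import math


def winsorize(values, lower=0.0, upper_q=0.999):
    """Clip values below `lower` and above the `upper_q` empirical quantile.

    Both cut-offs are illustrative assumptions, not taken from the abstract.
    """
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(upper_q * len(ordered)))
    cap = ordered[idx]
    return [min(max(v, lower), cap) for v in values]


def exponential_pvalues(values):
    """Upper-tail p-values under an exponential whose mean is the sample mean."""
    mean = sum(values) / len(values)
    return [math.exp(-v / mean) for v in values]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical per-window FST values (vcftools can report small negatives).
fst = [-0.02, 0.01, 0.03, 0.02, 0.05, 0.01, 0.04, 0.02, 0.03, 0.60]

fst_w = winsorize(fst)
pvals = exponential_pvalues(fst_w)
qvals = benjamini_hochberg(pvals)

# Windows passing a hypothetical 5% FDR threshold.
outliers = [i for i, q in enumerate(qvals) if q < 0.05]
```

With real data, the input would be the per-window or per-site FST column produced by vcftools, and the FDR cut-off would be varied to match the thresholds reported in the abstract.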