IBS   24490
INSTITUTO DE BIOLOGIA SUBTROPICAL
Executing Unit (UE)
Conferences and scientific meetings
Title:
Full transcriptome assembly of the tree crop yerba mate (Ilex paraguariensis A. St.-Hil., Aquifoliaceae) and systematic characterization of protein coding genes
Author(s):
AGUILERA PM; DEBAT HJ; GRABIELE M
Meeting:
Congress; 1st Congress of Women in Bioinformatics and Data Science LA; 2020
Organizing institution:
1st Congress of Women in Bioinformatics and Data Science LA
Abstract:
Yerba mate is the most important native subtropical tree crop in Argentina, Brazil, and Paraguay. Its leaves and twigs are used to prepare a nutritional and stimulant hot beverage named mate. The aim was to provide an original resource encompassing the full set of assembled transcriptome sequences of yerba mate and to comprehensively annotate its protein coding genes. Total RNA was isolated from leaves of breeding line Pg538, sequenced on an Illumina platform, and assembled with Trinity. Transcripts were submitted to in-house batch BlastX/tBlastn homology searches, using Arabidopsis thaliana L. as reference, to characterize the protein coding genes of yerba mate. BlastX and tBlastn results were merged and curated, and the reciprocal best hit (RBH) strategy was applied to identify putative orthologous genes between the two species. The transcriptome-wide assembly yielded 44,840 transcripts, comprising ca. 31,694 genes and their respective 13,146 isoforms. These data were deposited in TSA GenBank/DDBJ/EMBL under the accession GFHV00000000, as the first reference for Ilex L. BlastX returned 32,480 yerba mate hits (21,370 genes, 11,110 isoforms) targeting 12,435 Arabidopsis gene models; with the tBlastn approach, 30,476 sequences from the A. thaliana proteome database (23,033 gene models) found yerba mate hits (9,885 genes). In sum, our approach resulted in a comprehensive annotation of over 21,387 yerba mate genes and the prediction of 9,874 orthologous genes between the two species. The annotation results for the assembled yerba mate sequences are presented in a user-friendly spreadsheet format and include additional criteria, i.e. e-value, bit-score, % pairwise identity, cumulative total alignment length and RBH, that should be considered when making a final decision on the orthology of pairwise aligned sequences.
These data are applicable to the characterization of agronomically important genes in yerba mate and related Ilex taxa, and allow the scientific community to carry out additional analyses using original approaches.
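The reciprocal best hit (RBH) step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each search has already been reduced to (query, subject, bit-score) records, that the best hit is simply the one with the highest bit-score, and that BlastX supplies the yerba mate to Arabidopsis direction while tBlastn supplies the reverse. The function names and the sequence identifiers in the example are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# One alignment record: (query_id, subject_id, bit_score).
# This tuple layout is an assumption made for illustration.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to the subject of its highest bit-score hit."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        # Keep the first hit seen for a query, then replace it only
        # when a strictly higher bit-score turns up.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    forward: Iterable[Hit], reverse: Iterable[Hit]
) -> Set[Tuple[str, str]]:
    """Return (forward_query, forward_subject) pairs that are mutual best hits."""
    fwd = best_hits(forward)  # e.g. yerba mate transcript -> Arabidopsis protein
    rev = best_hits(reverse)  # e.g. Arabidopsis protein -> yerba mate transcript
    return {(q, s) for q, s in fwd.items() if rev.get(s) == q}


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    blastx = [
        ("mate_t1", "AT_A", 200.0),
        ("mate_t1", "AT_B", 150.0),
        ("mate_t2", "AT_A", 180.0),
    ]
    tblastn = [
        ("AT_A", "mate_t1", 210.0),
        ("AT_A", "mate_t2", 170.0),
        ("AT_B", "mate_t2", 90.0),
    ]
    print(reciprocal_best_hits(blastx, tblastn))
```

Ranking by bit-score alone is a simplification. As the abstract notes, e-value, % pairwise identity and cumulative alignment length should also be weighed before a pair is accepted as orthologous.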