RESEARCHERS
COSTA TARTARA Sabrina Maria
Conferences and scientific meetings
Title:
Evaluation of RNA-Seq assemblies of Matricaria chamomilla for the definition of a workflow in the construction of de Novo transcriptomes
Author(s):
MAGGIO, JULIÁN; GARCIA, LAURA E.; COSTA TÁRTARA, SABRINA M.
Location:
Rosario, Santa Fe
Meeting:
Conference; XIII Argentine Congress of Bioinformatics and Computational Biology, XIII International Conference of the Iberoamerican Society of Bioinformatics and III Annual Meeting of the Ibero-American Artificial Intelligence Network for Big Biodata; 2023
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional
Abstract:
BACKGROUND
An organism's transcriptome represents the portion of its genome expressed at a specific moment under particular conditions and can be reconstructed from RNA sequencing (RNA-Seq) data. This approach helps generate primary gene expression information for a species, such as some medicinal plants. Various tools exist to assemble short RNA reads into consensus sequences and generate de novo transcriptomes (without a reference genome). Although the workflow for this analysis is well defined, the parameters used with each tool are only sometimes reported. This work presents the analysis of different Matricaria chamomilla transcriptome constructions, starting from Roche 454 reads, using four combinations of quality control parameters for pre-assembly processing and three assemblers (NEWBLER, SOAPdenovo-Trans, and Trinity) based on two types of assembly algorithms (de Bruijn graphs and Overlap-Layout-Consensus). We analyzed the quality of the outputs through different metrics using the rnaQUAST, DETONATE, and BUSCO tools.
RESULTS
The results showed the same BUSCO pattern for the transcriptomes of every assembler. Stricter combinations of quality values for trimming raw reads had an impact, possibly causing the loss of key sequences from the library. Complete and single-copy genes and complete and duplicated genes from the Viridiplantae database were the least represented classes compared with fragmented and missing genes. Transcriptomes assembled with Trinity showed higher frequencies for all classes, followed by NEWBLER and, last, SOAPdenovo-Trans. rnaQUAST indicated longer consensus sequences and greater N50 values for Trinity transcriptomes than for those of the other assemblers. RSEM-EVAL values calculated with DETONATE indicated that stricter trimming levels had a positive effect on the degree of read transformation during transcriptome construction. These results also point to better transcriptomes when constructed with Trinity.
CONCLUSIONS
Despite its small size, the library was a useful input for testing different pre-assembly treatments and assemblers. Analyzing the outputs with different tools made it possible to assess sequence parameters, the transformation of the library during assembly, and the integrity of the transcriptome. Trinity proved to be the most effective tool for assembling the library, surpassing even NEWBLER, the assembler recommended by Roche.
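For readers unfamiliar with the N50 metric mentioned in the results, the following is a minimal sketch of the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name `n50` and the example lengths are illustrative assumptions, not data from this study, and the sketch is not meant to reproduce how rnaQUAST computes its report.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the sum of all lengths.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (illustrative only, not from the study).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

With these toy lengths the total is 1500 bases, and the two longest sequences (500 and 400) already cover 900, so the function returns 400. A higher N50 generally reflects longer assembled sequences, although for transcriptomes it is best read alongside completeness measures such as BUSCO.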