IPATEC   26054
INSTITUTO ANDINO PATAGONICO DE TECNOLOGIAS BIOLOGICAS Y GEOAMBIENTALES
Executing Unit - UE
Conferences and scientific meetings
Title:
Cleaning uNwanted regions: A novel approach to getting rid of N content by assemblies combination
Author(s):
MOLINÉ M; NIZOVOY P; LIBKIND D; BELLORA N
Location:
Ciudad Autónoma de Buenos Aires
Meeting:
Symposium; 2° Simposio Argentino de Jóvenes Investigadores en Bioinformática; 2017
Organizing institution:
RSG
Abstract:
Background
Advances in sequencing technology allow genomes to be sequenced at decreased cost, enabling the creation of the bedrock of genome research. However, poor-quality assemblies impair the genomic predictions and inferences based upon them. For this reason, quality control over the assembly process and comparison between different algorithms and available tools cannot be neglected if a solid initial step is to be guaranteed. Moreover, the availability of genomes in GenBank does not imply that their sequences are correct. An ideal assembly pipeline that suits every downstream objective does not exist, as each has its own pros and cons. Combining outputs may be a valid strategy to get the best of each approach.
Objectives
The aim of this project was to complement available genomic information by reassembling genomes and combining the outputs. Special emphasis was put on merging overlapping scaffolds in order to reduce N content, as different assemblers may resolve certain regions differently.
Methods and Results
Draft genome sequences, as well as sequencing reads, of Naganishia vishniacii ANT03-052 and Dioszegia cryoxerica ANT03-071 were publicly available at the JGI Genome Portal (2001). Using a pipeline already proven in our laboratory for a variety of yeasts, SPAdes de novo assemblies of both species were performed. These yielded new sets of scaffolds, which were used to improve the JGI genomes by 1) joining and extending scaffolds and 2) replacing previously undefined regions (N islands). To achieve 2), a Python script was specially designed that integrates a BLAT search of the regions flanking each N island in the new assemblies, extraction of the region defined between those flanks, and incorporation of that fragment over the N island in the JGI version, thus generating assemblies with a lower content of undefined regions. Indeed, 57 and 43 percent of previously undefined regions were resolved by this approach in the Naganishia and Dioszegia assemblies, respectively.
In addition, scaffolds were extended and their number reduced in the Naganishia draft genome, resulting in an assembly with 7 fewer scaffolds than the JGI version.
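The flank-based replacement of N islands described above can be illustrated with a minimal Python sketch. It is not the authors' script. The function names (find_n_islands, fill_n_islands) and the flank parameter are hypothetical. Exact string matching is used here as a simple stand-in for the BLAT search, so reverse-complement hits and mismatches in the flanks are not handled.

```python
import re


def find_n_islands(seq):
    """Return half-open (start, end) coordinates of every run of N in seq."""
    return [m.span() for m in re.finditer(r"[Nn]+", seq)]


def fill_n_islands(old, new, flank=12):
    """Patch N islands in `old` using the sequence between matching flanks in `new`.

    Assumption: flanks are located by exact, forward-strand string matching,
    a simplified stand-in for an alignment search such as BLAT.
    Returns (patched_sequence, number_of_islands_resolved).
    """
    new_upper = new.upper()
    pieces = []
    cursor = 0
    resolved = 0
    for start, end in find_n_islands(old):
        left = old[max(0, start - flank):start].upper()
        right = old[end:end + flank].upper()
        fragment = None
        # Use only full-length, N-free flanks that are found once in `new`
        # (str.count counts non-overlapping occurrences).
        if (len(left) == flank and len(right) == flank
                and "N" not in left and "N" not in right
                and new_upper.count(left) == 1
                and new_upper.count(right) == 1):
            left_end = new_upper.find(left) + flank
            right_start = new_upper.find(right)
            candidate = new[left_end:right_start]
            # The right flank must lie downstream of the left flank, and the
            # fragment itself must not contain undefined bases.
            if right_start >= left_end and "N" not in candidate.upper():
                fragment = candidate
        pieces.append(old[cursor:start])
        if fragment is None:
            pieces.append(old[start:end])  # keep the island unresolved
        else:
            pieces.append(fragment)
            resolved += 1
        cursor = end
    pieces.append(old[cursor:])
    return "".join(pieces), resolved


# Toy scaffolds (invented for illustration, not real data).
old_scaffold = "TTGACCGTAAGCTAGG" + "NNNNNN" + "CCATGGATTCGAACTT"
new_scaffold = "GG" + "TTGACCGTAAGCTAGG" + "ACGTAC" + "CCATGGATTCGAACTT" + "AA"
patched, n_resolved = fill_n_islands(old_scaffold, new_scaffold, flank=8)
total = len(find_n_islands(old_scaffold))
print(patched)
print(f"{n_resolved}/{total} N islands resolved")
```

Requiring each flank to occur only once in the new scaffold is a conservative choice in this sketch: an island is left untouched whenever its placement would be ambiguous, so the patched sequence never gains a fragment from a repeated region.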