RESEARCHERS
BELLORA Nicolas
Conferences and scientific meetings
Title:
Cleaning 'Nwanted regions: A novel approach to getting rid of N content by assemblies combination
Author(s):
PAULA NIZOVOY; DIEGO LIBKIND; MARTIN MOLINE; NICOLAS BELLORA
Location:
Buenos Aires
Meeting:
Symposium; Argentine Symposium of Young Bioinformatics Researchers (2SAJIB); 2017
Organizing institution:
ISCB RSG-Argentina
Abstract:
Advances in sequencing technology allow genomes to be sequenced at reduced cost, enabling the creation of the bedrock of genome research. However, poor-quality assemblies impair genomic predictions and the inferences based upon them, hampering the search for common genes, syntenic regions, etc. Moreover, the large availability of genomes deposited in GenBank does not imply correctness of the genome sequences. For these reasons, quality control over the assembly process and comparison between different algorithms and available tools cannot be neglected if a solid initial step is to be guaranteed. Combining outputs is a valid strategy to get the best of each approach. To do so, we present a Python-based script developed to reduce assemblies' (N)n tracts by merging overlapping scaffolds obtained from different assemblers. Draft genome sequences and sequencing reads of two psychrotolerant basidiomycetous yeasts used as models (Naganishia vishniacii ANT03-052 and Dioszegia cryoxerica ANT03-071) were downloaded from the JGI Genome Portal. De novo assembly of both yeast genomes was performed with SPAdes under an approach tested in our laboratory for a variety of yeast genomes. These new scaffolds were used to improve the downloaded versions by joining and extending overlapping scaffolds and replacing previously undefined regions. To achieve the latter, both sets of scaffolds (older and newly assembled) were used as input for our pipeline.
Briefly, it consisted of a BLAT search in the new scaffolds for the regions flanking the (N)n tracts, extraction of the nucleotide sequences delimited by those flanking regions, and re-incorporation of the resulting fragments over the (N)n tracts in the older versions, generating assemblies with a lower content of undefined zones. By means of this approach, 57% and 43% of previously undefined regions were resolved in the Naganishia and Dioszegia assemblies, respectively. Extension of scaffold lengths and a reduction in their number were achieved in the Naganishia draft genome, resulting in an assembly with 7 scaffolds fewer than the downloaded version (37 vs 31 units). Yeasts isolated from extreme environments are of interest for their ability to utilize a broad range of carbon sources and for their production of considerable amounts of biotechnologically relevant metabolites. The improvements achieved in the draft genomes enable a more confident genomic study of these species, aimed at finding genes involved in those promising pathways. The pipeline presented here may also be of interest for the improvement of larger genomes.
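The gap-filling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' script: the function names (`find_n_tracts`, `fill_n_tracts`) and the `flank_len` parameter are hypothetical, and an exact substring search stands in for the BLAT search, so it only finds flanks that match the new scaffold perfectly and on the same strand. It assumes uppercase sequences held in memory as plain strings.

```python
import re


def find_n_tracts(seq):
    """Return (start, end) spans of every run of N in seq (end exclusive)."""
    return [m.span() for m in re.finditer(r"N+", seq)]


def fill_n_tracts(old_scaffold, new_scaffolds, flank_len=8):
    """Replace (N)n tracts in old_scaffold using sequence from new_scaffolds.

    Assumption: exact string matching replaces the BLAT search of the
    abstract; reverse-complement hits and flank uniqueness are not checked.
    Returns (patched_sequence, tracts_resolved, tracts_total).
    """
    tracts = find_n_tracts(old_scaffold)
    pieces = []
    cursor = 0
    resolved = 0
    for start, end in tracts:
        # Flanking regions immediately left and right of the N tract.
        left = old_scaffold[max(0, start - flank_len):start]
        right = old_scaffold[end:end + flank_len]
        filler = None
        flanks_usable = (
            len(left) == flank_len
            and len(right) == flank_len
            and "N" not in left + right
        )
        if flanks_usable:
            for new in new_scaffolds:
                i = new.find(left)
                if i == -1:
                    continue
                # The right flank must lie downstream of the left flank.
                j = new.find(right, i + flank_len)
                if j == -1:
                    continue
                candidate = new[i + flank_len:j]
                # Only accept a fragment that is itself fully defined.
                if "N" not in candidate:
                    filler = candidate
                    break
        pieces.append(old_scaffold[cursor:start])
        if filler is None:
            pieces.append(old_scaffold[start:end])  # keep the tract as-is
        else:
            pieces.append(filler)
            resolved += 1
        cursor = end
    pieces.append(old_scaffold[cursor:])
    return "".join(pieces), resolved, len(tracts)


if __name__ == "__main__":
    old = "TTGATTACAG" + "NNNNNN" + "CTAGCTAGAA"
    new = "CCGATTACAG" + "ACGTT" + "CTAGCTAGGG"
    patched, resolved, total = fill_n_tracts(old, [new])
    print(patched, f"{resolved}/{total} tracts resolved")
```

In a real pipeline the flanks would be much longer than 8 bases and aligned with a tolerant aligner such as BLAT, so that small sequencing differences between the two assemblies do not prevent a match.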