RESEARCHERS
WILKOWSKY Silvina Elizabeth
Conferences and scientific meetings
Title:
MLST-pipeline: a web-based workflow for the analysis of Multilocus Sequence Typing schemes.
Author(s):
PRINCIPI D; DELFINO S; GUILLEMI E; RUYBAL P; WILKOWSKY S; FARBER M
Place:
Las Termas de Chillán
Meeting:
Conference; Primer Congreso de la Sociedad Iberoamericana de Bioinformática SoIBIO 2010; 2010
Organizing institution:
Sociedad Iberoamericana de Bioinformática (SoIBIO)
Abstract:
Multilocus sequence typing (MLST) is an unambiguous procedure for characterizing isolates of microorganism species using the sequences of internal fragments of seven house-keeping genes. Internal fragments of approximately 450-500 bp of each gene are used, as these can be accurately sequenced on both strands using an automated DNA sequencer. For each house-keeping gene, the different sequences present within a microorganism species are assigned as distinct alleles and, for each isolate, the alleles at each of the seven loci define the allelic profile or sequence type (ST) (1). MLST has been used successfully to study population genetics and reconstruct the micro-evolution of epidemic bacteria, fungi and protozoa. Highly reproducible data, together with the availability of low-cost sequencing services, make MLST a powerful tool. However, the manually intensive steps of processing the raw sequence data files and the downstream analysis have hampered the application of the methodology. To overcome these limitations we present MLST-pipeline, a web-based tool that processes the raw data and returns the corresponding ST. The pipeline accepts raw chromatogram trace files from both strands, named in a standardized way so as to identify the trace files that have to be assembled together (forward and reverse strands). The system offers a first-stage, user-customizable analysis tool comprising the following processes: base calling and cleaning (PHRED) (2), and clustering and assembling (CAP3) (3). In the following stage an alignment step is implemented, taking the sequence start site from a user-defined primer and the sequence end site from a user-defined sequence length. Finally, ST assignment is performed by Perl scripts developed in-house, and isolate genotypes are stored in a MSQL database. In addition, the workflow is flexible enough to allow the user to load assembled sequences manually. The pipeline can be accessed at http://bioinformatica.inta.gov.ar/mlst_pipeline.
Validation using bacterial (Anaplasma marginale) and protozoan (Babesia) datasets revealed complete agreement between the results generated by the manual and automated workflows.
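
The trimming and ST-assignment stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the abstract states that they used in-house Perl scripts and a database), and every name in it (trim_locus, assign_allele, type_isolate, the locus names) is hypothetical. It assumes that the fragment begins immediately after the user-defined primer, that alleles are matched by exact sequence identity, and that new alleles and STs are numbered in order of first appearance.

```python
from typing import Dict, List, Optional, Tuple


def trim_locus(sequence: str, primer: str, length: int) -> Optional[str]:
    """Cut a fixed-length fragment that starts right after the primer.

    Assumption: the start site is the first base after the primer match.
    Returns None when the primer is absent or the read is too short.
    """
    seq = sequence.upper()
    pos = seq.find(primer.upper())
    if pos < 0:
        return None
    start = pos + len(primer)
    fragment = seq[start:start + length]
    return fragment if len(fragment) == length else None


def assign_allele(fragment: str, alleles: Dict[str, int]) -> int:
    """Return the allele number of a fragment, registering unseen sequences."""
    if fragment not in alleles:
        alleles[fragment] = len(alleles) + 1
    return alleles[fragment]


def type_isolate(
    fragments: Dict[str, str],
    loci: List[str],
    alleles_by_locus: Dict[str, Dict[str, int]],
    sequence_types: Dict[Tuple[int, ...], int],
) -> int:
    """Build the allelic profile in locus order and map it to an ST number."""
    profile = tuple(
        assign_allele(fragments[locus], alleles_by_locus[locus]) for locus in loci
    )
    if profile not in sequence_types:
        sequence_types[profile] = len(sequence_types) + 1
    return sequence_types[profile]


if __name__ == "__main__":
    # Two hypothetical loci instead of seven, to keep the example short.
    loci = ["locusA", "locusB"]
    alleles_by_locus = {locus: {} for locus in loci}
    sequence_types = {}
    isolate = {"locusA": "AAAC", "locusB": "GGGT"}
    print(type_isolate(isolate, loci, alleles_by_locus, sequence_types))
```

In this sketch two isolates receive the same ST only when their fragments are identical at every locus, which mirrors the definition of the allelic profile given in the abstract. The dictionaries stand in for the database tables that would hold the allele and ST definitions.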