RESEARCHERS
AGÜERO Fernan Gonzalo
Conferences and scientific meetings
Title:
FastqCleaner: a Shiny web application for cleaning Illumina FASTQ files with R
Author(s):
ROSER L; AGÜERO F; O SÁNCHEZ, DANIEL
Location:
Ciudad Autónoma de Buenos Aires
Meeting:
Conference; 4th ISCB Latin America Conference / 7th Argentinian Congress of Bioinformatics; 2016
Organizing institution:
International Society for Computational Biology (ISCB) / Asociación Argentina de Bioinformática y Biología Computacional (A2B2C)
Abstract:
L Roser (1), F Agüero (1), D Sánchez (1)

(1) Instituto de Investigaciones Biotecnológicas "Dr. Rodolfo Ugalde" (IIB-INTECH), UNSAM-CONICET.

Next-generation sequencing (NGS) technologies can produce large amounts of genetic data, with a significant impact on clinical studies and research. For Illumina sequencing, FASTQ files are the raw starting material for subsequent analyses. A portion of the reads can include adapters or contaminants, the quality of the sequences generally decreases towards the end of the reads, and ambiguous base calls may be present. Correcting these and other artifacts is an important step before downstream analysis. R and Bioconductor are gold standards for NGS analysis. Their widespread use is, however, limited by the learning curve that users face when working with code routines. In recent years, the integration of R with web tools, in particular JavaScript APIs, has dramatically increased R's potential for a more interactive and dynamic data analysis experience. We present a Shiny application that provides a step-by-step pipeline for preprocessing FASTQ files. The interface supports the selection of a series of cleaning operations, uses the ShortReadQ class to pass data between filters, and shows diagnostic plots for the input and output files (such as per-cycle mean base quality and base proportion). This is the first of a series of two R-based web applications devoted to the analysis of NGS data. These user-friendly tools can extend the power of R to a broader audience of users, who can now focus on interpreting the data rather than on the underlying R code.

Supported by: Agencia Nacional de Promoción Científica y Tecnológica
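
The cleaning operations named in the abstract (removing reads with ambiguous base calls, cutting adapters, trimming low-quality read ends, and summarising per-cycle mean base quality) can be illustrated with a minimal sketch. This is not FastqCleaner's code: the application is built in R on Shiny and the ShortReadQ class, whereas the sketch below uses plain Python for illustration only. All function names, the example reads, the adapter string, the Phred+33 encoding, and the thresholds are assumptions made for this example.

```python
from statistics import mean

# Hypothetical example reads (header, sequence, '+', quality), Phred+33 assumed.
EXAMPLE = """@read1
ACGTACGTAGATCGGAAG
+
IIIIIIIIIIIIIIIIII
@read2
ACGTNACGTA
+
IIIIIIIIII
@read3
ACGTACGTAC
+
IIIIII5###"""


def parse_fastq(text):
    """Parse FASTQ text into (header, sequence, quality) tuples, 4 lines per record."""
    lines = text.strip().splitlines()
    records = []
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        records.append((header, seq, qual))
    return records


def drop_ambiguous(records, max_n=0):
    """Keep only reads with at most max_n ambiguous (N) base calls."""
    return [r for r in records if r[1].upper().count("N") <= max_n]


def trim_adapter(records, adapter):
    """Cut each read at the first exact occurrence of the adapter sequence."""
    out = []
    for header, seq, qual in records:
        pos = seq.find(adapter)
        if pos != -1:
            seq, qual = seq[:pos], qual[:pos]
        out.append((header, seq, qual))
    return out


def trim_low_quality_tail(records, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below min_q."""
    out = []
    for header, seq, qual in records:
        end = len(qual)
        while end > 0 and ord(qual[end - 1]) - offset < min_q:
            end -= 1
        out.append((header, seq[:end], qual[:end]))
    return out


def per_cycle_mean_quality(records, offset=33):
    """Mean Phred score at each cycle (read position) over reads that reach it."""
    longest = max((len(r[2]) for r in records), default=0)
    return [
        mean(ord(r[2][cycle]) - offset for r in records if len(r[2]) > cycle)
        for cycle in range(longest)
    ]


def clean(records, adapter, min_q=20):
    """Apply the three filters in sequence, passing the reads from one to the next."""
    records = drop_ambiguous(records)
    records = trim_adapter(records, adapter)
    return trim_low_quality_tail(records, min_q=min_q)


if __name__ == "__main__":
    raw = parse_fastq(EXAMPLE)
    cleaned = clean(raw, adapter="AGATCGGAAG")
    print("before:", per_cycle_mean_quality(raw))
    print("after: ", per_cycle_mean_quality(cleaned))
```

Comparing the per-cycle means before and after cleaning mirrors, in a very reduced form, the input and output diagnostic plots mentioned in the abstract. Exact string matching for adapters is a deliberate simplification; real adapter removal usually tolerates mismatches and partial overlaps.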