INIBIOMA   20415
INSTITUTO DE INVESTIGACIONES EN BIODIVERSIDAD Y MEDIOAMBIENTE
Executing Unit - UE
Articles
Title:
A Quantitative Profiling Tool for Diverse Genomic Data Types Reveals Potential Associations between Chromatin and Pre-mRNA Processing
Author(s):
ISAAC KREMSKY; NICOLAS BELLORA; EDUARDO EYRAS
Journal:
PLOS ONE
Publisher:
PUBLIC LIBRARY SCIENCE
References:
Place: San Francisco; Year: 2015
ISSN:
1932-6203
Abstract:
High-throughput sequencing, and genome-based datasets in general, are often represented as profiles centered at reference points to study the association of protein binding and other signals to particular regulatory mechanisms. Although these profiles often provide compelling evidence of these associations, they do not provide a quantitative assessment of the enrichment, which makes the comparison between signals and conditions difficult. In addition, a number of biases can confound profiles, but are rarely accounted for in the tools currently available. We present a novel computational method, ProfileSeq, for the quantitative assessment of biological profiles to provide an exact, nonparametric test that specific regions of the test profile have higher or lower signal densities than a control set. The method is applicable to high-throughput sequencing data (ChIP-Seq, GRO-Seq, CLIP-Seq, etc.) and to genome-based datasets (motifs, etc.). We validate ProfileSeq by recovering and providing a quantitative assessment of several results reported before in the literature using independent datasets. We show that input signal and mappability have confounding effects on the profile results, but that normalizing the signal by input reads can eliminate these biases while preserving the biological signal. Moreover, we apply ProfileSeq to ChIP-Seq data for transcription factors, as well as for motif and CLIP-Seq data for splicing factors. In all examples considered, the profiles were robust to biases in mappability of sequencing reads. Furthermore, analyses performed with ProfileSeq reveal a number of putative relationships between transcription factor binding to DNA and splicing factor binding to pre-mRNA, adding to the growing body of evidence relating chromatin and pre-mRNA processing. ProfileSeq provides a robust way to quantify genome-wide coordinate-based signal.
Software and documentation are freely available for academic use at https://bitbucket.org/regulatorygenomicsupf/profileseq/.
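The abstract does not specify which exact, nonparametric test ProfileSeq uses, so the following is only an illustrative sketch of the general idea, not the tool's implementation. It assumes a plain ratio normalization of signal reads by input reads and substitutes a textbook exact permutation test on the difference of mean densities between a test set and a control set of windows. The function names, the pseudocount, and the read counts are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def normalized_density(signal_reads, input_reads, pseudocount=1.0):
    """Signal reads per input read in one window.

    Assumption: a simple ratio with a pseudocount to avoid division by zero;
    this is not taken from ProfileSeq.
    """
    return (signal_reads + pseudocount) / (input_reads + pseudocount)


def exact_permutation_pvalue(test, control):
    """One-sided exact permutation test for a higher mean density in `test`.

    Enumerates every way of relabeling the pooled values into a test group of
    the same size, so it is only practical for small sets.
    """
    pooled = list(test) + list(control)
    n_test = len(test)
    n_control = len(control)
    total = sum(pooled)
    observed = mean(test) - mean(control)
    n_extreme = 0
    n_labelings = 0
    for idx in combinations(range(len(pooled)), n_test):
        s = sum(pooled[i] for i in idx)
        diff = s / n_test - (total - s) / n_control
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_labelings += 1
    return n_extreme / n_labelings


# Hypothetical (signal, input) read counts per window.
test_counts = [(30, 10), (28, 9), (35, 11), (40, 10)]
control_counts = [(10, 10), (12, 11), (9, 9), (11, 10)]

test = [normalized_density(s, i) for s, i in test_counts]
control = [normalized_density(s, i) for s, i in control_counts]

p_value = exact_permutation_pvalue(test, control)
print(f"one-sided exact p-value: {p_value:.4f}")
```

With these made-up counts every test window has a higher normalized density than every control window, so only 1 of the 70 possible labelings is as extreme as the observed one and the p-value is 1/70. For realistic set sizes full enumeration is infeasible, and a sampled permutation test or a rank-based test would be the usual substitute.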