RESEARCHERS
MACCHIAROLI Natalia
Articles
Title:
HextractoR: an R package for automatic extraction of hairpins from genome-wide data
Author(s):
YONES, CRISTIAN; MACCHIAROLI, NATALIA; KAMENETZKY, LAURA; STEGMAYER, GEORGINA; MILONE, DIEGO HUMBERTO
Journal:
bioRxiv
Publisher:
bioRxiv
References:
Year: 2020
ISSN:
2692-8205
Abstract:
Extracting stem-loop sequences (hairpins) from genome-wide data is an important preprocessing step for several data mining tasks in bioinformatics, because it strongly influences the later steps and the final results. For example, for novel miRNA prediction, all well-known hairpins must be properly located. Although some existing scripts can be adapted and combined to achieve this task, they are outdated, none of them guarantees finding correspondence to well-known structures in the genome under analysis, and they do not take advantage of the latest advances in secondary structure prediction. We present here HextractoR, an R package for automatic extraction of hairpins from genome-wide data. HextractoR performs an exhaustive and smart analysis of the genome in order to obtain a high-quality set of short sequences for further processing. Moreover, genomes can be processed in parallel and with low memory requirements. The results obtained show that HextractoR effectively outperforms other methods.