RESEARCHERS
BUGNON Leandro Ariel
Conferences and scientific meetings
Title:
End-to-end sequence to structure: Learning to fold non-coding RNA
Author(s):
BUGNON, L.; DI PERSIA, L.; GERARD, M.; RAAD, J.; FENOY, E.; PROCHETTO, S.; EDERA, A.; STEGMAYER, G.; MILONE, D. H.
Place:
Montevideo
Meeting:
Meeting; Khipu 2023; 2023
Organizing institution:
Khipu
Abstract:
Learning sequence representations poses several challenges, especially when labeled data are scarce, classes are highly imbalanced and domains vary. In bioinformatics, the determination of secondary structures from biological sequences (such as RNA) is a very costly process that cannot be scaled up efficiently, which limits our ability to functionally characterize such molecules. Non-coding RNAs are relevant to numerous biological processes with medical and agroindustrial applications. Computational methods are promising for predicting RNA structures, but show limited capacity to model their wide structural diversity. We present new end-to-end approaches for predicting secondary structure from the sequence alone. To harness larger datasets of unlabeled sequences, a self-supervised encoder is proposed. Then, an architecture based on ResNet encodes information about each sequence position and its neighbors, and uses element-pair information to predict a connection matrix. We have compared several recent methods for secondary structure prediction, with special focus on how to measure generalization capabilities, using benchmark datasets and experimentally validated sequences. Using biophysical constraints to guide the learning improves results for different types of RNAs.
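The abstract does not say which biophysical constraints are used or how they enter the training. As a minimal illustration only, the sketch below assumes two common ones (canonical A-U, G-C and G-U pairs, and a minimum hairpin loop of three unpaired bases) and applies them as a mask over a predicted connection matrix. The names `constraint_mask`, `apply_constraints`, `MIN_LOOP` and the 0.5 threshold are hypothetical and not taken from the authors' method.

```python
# Hypothetical sketch: constraints, names and thresholds are assumptions,
# not the method described in the abstract.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def constraint_mask(seq):
    """Return an L x L 0/1 matrix marking the base pairs allowed by the assumed rules."""
    n = len(seq)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        # j starts far enough from i to leave MIN_LOOP unpaired bases in between
        for j in range(i + MIN_LOOP + 1, n):
            if (seq[i], seq[j]) in CANONICAL:
                mask[i][j] = mask[j][i] = 1
    return mask


def apply_constraints(scores, mask, threshold=0.5):
    """Drop forbidden or low-scoring pairs, then greedily keep one partner per base."""
    n = len(scores)
    candidates = sorted(
        (
            (scores[i][j], i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if mask[i][j] and scores[i][j] > threshold
        ),
        reverse=True,  # highest-scoring pairs are considered first
    )
    paired = set()
    contacts = [[0] * n for _ in range(n)]
    for _, i, j in candidates:
        if i not in paired and j not in paired:
            contacts[i][j] = contacts[j][i] = 1
            paired.update((i, j))
    return contacts
```

Here `scores` stands in for the pairwise output of a trained model. The result is a symmetric binary connection matrix in which each position pairs with at most one other position.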