FELLOWSHIPS
FENOY Luis Emilio
Conferences and scientific meetings
Title:
Integration of RNA-Seq Expression Data for Improved Prediction of MHC Class I Antigen Presentation.
Author(s):
EMILIO FENOY; MORTEN NIELSEN
Location:
Mendoza
Meeting:
Conference; 10th Argentinian Meeting on Bioinformatics and Computational Biology; 2019
Organizing institution:
Asociación Argentina de Biotecnología y Biología Molecular
Abstract:
Background: Binding of peptides to Major Histocompatibility Complex class I (MHC-I) molecules is the most selective event in the processing and presentation of epitopes to cytotoxic T lymphocytes (CTLs). Identifying these epitopes is crucial for personalized immunotherapies and vaccine development. Peptide-MHC-I interactions have traditionally been quantified by the strength of the interaction, that is, the binding affinity (BA), but in recent years the use of mass spectrometry (MS) to profile eluted HLA ligands (EL) has provided unbiased, high-throughput measurements of HLA-associated peptides. In addition, gene expression profiles measured by RNA-seq have been reported to significantly improve the performance of epitope prediction tools. However, integrating data sets of such diverse nature to train a single neural network has proven challenging.
Methods: Around 100,000 peptides identified as ligands by MS were used. The data set was enriched with peptide-MHC-I binding affinity measurements, random natural negatives, and RNA-seq information from 68 samples containing the source proteins. The data set was encoded to train a feed-forward neural network with three layers and two output neurons, one for BA and another for EL predictions. In this way, the weights from the input to the hidden layer are shared between both data types, allowing transfer learning, while the connections from the hidden to the output layer are specific to each data type for fine-tuning. The RNA-seq measurement, after normalization, was added as a new neuron in the input layer. Model training was performed as a standard 5-fold cross-validation in which the elements of each partition were selected with a modified version of the Hobohm algorithm to minimize the overlap between partitions.
Results: We trained a new method with improved prediction of MHC class I antigen presentation that incorporates EL and BA data from single-allele and multi-allele sources, along with RNA-seq data. The addition of expression levels allows the model to identify ligands with low binding affinity that are still presented because of their high availability, boosting its sensitivity.
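The architecture outlined in the Methods (a shared input-to-hidden layer, one output neuron per data type, and one extra input neuron for expression) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `TwoHeadNet`, the helper `encode_input`, the sigmoid activations, the squared-error loss, the plain SGD update, and the `log10(1 + TPM)` normalization are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

BA, EL = 0, 1  # indices of the two output neurons (hypothetical naming)


def encode_input(peptide_features, tpm):
    """Append one expression value to the peptide encoding.

    Assumption: expression is normalized as log10(1 + TPM); the abstract
    only states that the RNA-seq value is normalized.
    """
    return np.append(peptide_features, np.log10(1.0 + tpm))


class TwoHeadNet:
    """Input -> shared hidden layer -> two output neurons (BA and EL)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))  # shared weights
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, 2))     # one column per head
        self.b2 = np.zeros(2)

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(self, x):
        h = self._sigmoid(x @ self.W1 + self.b1)
        y = self._sigmoid(h @ self.W2 + self.b2)
        return h, y

    def train_step(self, x, target, head, lr=0.1):
        """One SGD step on squared error for a single example.

        Only the output neuron matching the example's data type (`head`)
        receives an error signal; the shared hidden weights are updated
        by examples of both types.
        """
        h, y = self.forward(x)
        err = y[head] - target
        delta_out = err * y[head] * (1.0 - y[head])
        # Backpropagate through the selected head before updating it.
        delta_hidden = delta_out * self.W2[:, head] * h * (1.0 - h)
        self.W2[:, head] -= lr * delta_out * h
        self.b2[head] -= lr * delta_out
        self.W1 -= lr * np.outer(x, delta_hidden)
        self.b1 -= lr * delta_hidden
        return 0.5 * err ** 2
```

In this sketch a BA example would be passed with `head=BA` and an EL example with `head=EL`, so each output column is tuned by its own data type while `W1` learns from both.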