RESEARCHERS
TALEVI Alan
Conferences and scientific meetings
Title:
Application of Ensemble Learning Approaches to Identify Inhibitors of the Main Protease of SARS-CoV-2
Author(s):
PRADA GORI, D.; ALBERCA, L. N.; ALICE, J. I.; CARAM ROMERO, F. N.; MEDEIROS, A.; FLÓ, M.; ESTEBAN-LÓPEZ, V.; RUATTA, S.; COMINI, M.; BELLERA, C. L.; TALEVI, A.
Meeting:
Congress; XI Congreso Argentino de Bioinformática y Biología Computacional 2021; 2021
Abstract:
Background: The Main Protease (Mpro, also called 3CLpro) of SARS-CoV-2 is a druggable cysteine protease with a crucial role in processing the CoV-encoded polyproteins that mediate the assembly of the replication-transcription machinery, which makes it an excellent target for the development of antiviral compounds. In this work, data from 414 chemically diverse compounds previously tested against the Mpro of SARS-CoV-2 (including 109 with in-house acquired data) were compiled and partitioned into representative training and test sets, employing an in-house recursive clustering method known as iRaPCA. Linear classifiers were generated from conformation-independent molecular descriptors, using in-house Python routines that combine feature bagging and forward stepwise feature selection. The best classifiers were subsequently combined into meta-classifiers and validated in retrospective screening experiments. Finally, a prospective screening campaign was implemented.

Results: The compiled molecules were divided into a balanced training set (80 active and 80 inactive compounds) and a test set (54 active and 200 inactive compounds). The test set was split into two stratified subsets, each complemented with 1450 putative decoys, to evaluate the performance of the models in pilot virtual screening experiments. After generating 1000 individual linear models, the 22 best-performing models were combined using the MIN_SCORE operator, which enhanced their predictivity and robustness. By analyzing Positive Predictive Value (PPV) surfaces, a cutoff value of 0.546 was chosen, associated with a specificity of 0.998 and a PPV of 0.634 for a hypothetical yield of active compounds of 1%. The ensemble of 22 models was applied in the virtual screening of different chemical libraries, including DrugBank, DRH, and NuBBe, as well as in-house libraries. As an additional selection criterion, we evaluated whether the hits also belonged to the applicability domain of the model, thus selecting 49 molecules as potential inhibitors of Mpro.

Conclusions: We generated a computational ligand-based model ensemble associated with excellent enrichment metrics in the retrospective screens; the ensemble is able to identify inhibitors of the SARS-CoV-2 Mpro. After a prospective virtual screen, 49 in silico hits were selected, which are to be confirmed in vitro in the near future.
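
The score-combination and cutoff steps described in the Results can be illustrated with a minimal Python sketch. It is not the authors' in-house code: the function names (`min_score`, `select_hits`, `ppv`) are hypothetical, the rule that a compound is a hit when its combined score is at or above the cutoff is an assumption, and the PPV expression is the standard Bayes-type relation between sensitivity, specificity and the yield of actives, which the abstract does not spell out.

```python
from typing import List, Sequence


def min_score(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-model scores with a minimum operator.

    model_scores[m][i] is the score that model m assigns to compound i.
    The ensemble score of a compound is the lowest score any model gives it,
    so a compound ranks high only if every model scores it high.
    """
    if not model_scores:
        raise ValueError("at least one model is required")
    n_compounds = len(model_scores[0])
    if any(len(scores) != n_compounds for scores in model_scores):
        raise ValueError("all models must score the same compounds")
    return [min(column) for column in zip(*model_scores)]


def select_hits(ensemble_scores: Sequence[float], cutoff: float) -> List[int]:
    """Return indices of compounds whose ensemble score reaches the cutoff.

    Assumption: scores at or above the cutoff count as predicted actives.
    """
    return [i for i, score in enumerate(ensemble_scores) if score >= cutoff]


def ppv(sensitivity: float, specificity: float, yield_actives: float) -> float:
    """Positive predictive value for a given proportion of actives in a library.

    PPV = Se * Ya / (Se * Ya + (1 - Sp) * (1 - Ya))
    """
    true_pos = sensitivity * yield_actives
    false_pos = (1.0 - specificity) * (1.0 - yield_actives)
    if true_pos + false_pos == 0.0:
        raise ValueError("PPV is undefined when no compound is predicted active")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Made-up scores from two hypothetical models for three compounds.
    scores = [[0.90, 0.20, 0.70],
              [0.60, 0.80, 0.65]]
    combined = min_score(scores)
    print(combined)                      # ensemble score per compound
    print(select_hits(combined, 0.546))  # indices passing the reported cutoff
    print(round(ppv(0.5, 0.998, 0.01), 3))  # PPV for an illustrative Se of 0.5
```

Under this relation, a specificity of 0.998 and a 1% yield of actives give a PPV near 0.634 when the sensitivity is roughly 0.34; that sensitivity is back-calculated here for illustration and is not a figure reported in the abstract.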