ICIC   25583
INSTITUTO DE CIENCIAS E INGENIERIA DE LA COMPUTACION
Executing Unit - UE
Articles
Title:
Hybridizing Feature Selection and Feature Learning Approaches in QSAR Modeling for Drug Discovery
Author(s):
GOMEZ ARRAYAS R.; CRAVERO F.; REQUENA C.; CAMPILLO N.E.; PAEZ J.A.; MARTINEZ M. J.; SEBASTIAN V.; ADRIO J.; DIAZ M. F.; ROCA C.; PONZONI I.
Journal:
Nature. Scientific Reports
Publisher:
Nature Pub. Group
References:
Place: London; Year: 2017; pp. 1-19
Abstract:
Quantitative structure-activity relationship modeling using machine learning techniques constitutes a complex computational problem, where the identification of the most informative molecular descriptors for predicting a specific target property plays a critical role. Two main general approaches can be used for this modeling procedure: feature selection and feature learning. In this paper, a performance comparative study of two state-of-art methods related to these two approaches is carried out. In particular, regression and classification models for three different issues are inferred using both methods under different experimental scenarios: two drug-like properties, such as blood-brain-barrier and human intestinal absorption, and enantiomeric excess, as a measurement of purity used for chiral substances. Beyond the contrastive analysis of feature selection and feature learning methods as competitive approaches, the hybridization of these strategies is also evaluated based on previous results obtained in material sciences. From the experimental results, it can be concluded that there is not a clear winner between both approaches because the performance depends on the characteristics of the compound databases used for modeling. Nevertheless, in several cases, it was observed that the accuracy of the models can be improved by combining both approaches when the molecular descriptor sets provided by feature selection and feature learning contain complementary information.