RESEARCHERS
GARRO MARTINEZ Juan Ceferino
Articles
Title:
Impact assessment of the rational selection of training and test sets on the predictive ability of QSAR models
Author(s):
ANDRADA, M. F.; VEGA-HISSI, E. G.; ESTRADA, M. R.; GARRO MARTINEZ, J. C.
Journal:
SAR AND QSAR IN ENVIRONMENTAL RESEARCH
Publisher:
TAYLOR & FRANCIS LTD
References:
Year: 2017, pp. 1-13
ISSN:
1062-936X
Abstract:
This study analysed the influence of the rational selection of training and test sets on the quality and predictivity of quantitative structure–activity relationship (QSAR) models. The study was carried out on three different datasets of influenza neuraminidase (H1N1) inhibitors. The three datasets were divided into training and test sets using three rational selection methods (k-means clustering, the Kennard–Stone algorithm and activity-based selection), and the results were compared with random selection. A total of 31,490 mathematical models were then developed, and those with determination coefficients r²(train) > 0.8, r²(LOO) > 0.7 and r²(test) > 0.5, together with minimum standard deviation (SD) and minimum root-mean-square error (RMS), were selected. The selected models were validated internally by the leave-one-out method, and their predictive capacity was evaluated on the external test set. The results indicate that random selection could lead to erroneous results, whereas rational selection allows more reliable conclusions to be drawn. The QSAR models with the greatest predictive power were found using the k-means algorithm and selection by activity.
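As an illustration of one of the rational selection methods named in the abstract, the following is a minimal sketch of a Kennard–Stone split together with the acceptance thresholds quoted above. It is not the authors' code: the function names, the descriptor matrix `X` and the choice of Euclidean distance are assumptions made for the example, and the sketch expects `n_train` to be at least 2.

```python
import numpy as np

def kennard_stone_split(X, n_train):
    """Return (train_idx, test_idx) chosen by the Kennard-Stone rule.

    X is a hypothetical (n_compounds, n_descriptors) array; n_train >= 2.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all compounds in descriptor space.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Seed the training set with the two most distant compounds.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_train:
        # Distance from each candidate to its nearest already-selected compound.
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that lies farthest from the current training set.
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining

def passes_thresholds(r2_train, r2_loo, r2_test):
    """Acceptance criteria quoted in the abstract (SD and RMS ranking omitted)."""
    return r2_train > 0.8 and r2_loo > 0.7 and r2_test > 0.5

# Toy one-descriptor dataset, purely for demonstration.
train_idx, test_idx = kennard_stone_split([[0.0], [1.0], [2.0], [10.0]], n_train=3)
```

The compounds left in `test_idx` form the external test set; a k-means or activity-based split would replace only the selection step, leaving the threshold check unchanged.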