Title:
Optimal Partition of Datasets of QSPR Studies: A Sampling Problem
Author(s):
ALAN TALEVI; CAROLINA L. BELLERA; EDUARDO A. CASTRO; LUIS E. BRUNO-BLANCH
Journal:
MATCH-COMMUNICATIONS IN MATHEMATICAL AND IN COMPUTER CHEMISTRY
Publisher:
UNIV KRAGUJEVAC
References:
Year: 2010, vol. 63, pp. 585-599
ISSN:
0340-6253
Abstract:
Starting from different partitions of a 160-compound dataset into training and test sets, we developed discriminant functions to classify drugs into different categories of human intestinal absorption rate. For each partition of the dataset, models that included up to ten Dragon descriptors were built. The classification ability of each discriminant function on both the training and test sets of each partition was assessed and explored graphically through divergence diagrams. Results suggest that external validation tends to underestimate the predictive capability of QSAR models and that more reliable results from external validation are obtained with even partitions of small and medium-sized datasets.
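
The partition-and-compare procedure summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (one made-up descriptor and two classes, not Dragon descriptors or absorption categories), the classifier is a simple midpoint-of-class-means threshold rather than the authors' discriminant functions, and the names `make_synthetic`, `split`, `fit_threshold`, `accuracy`, and `compare_partitions` are hypothetical. It only shows how training and test accuracy might be tabulated across several training/test ratios for a 160-item dataset.

```python
import random


def make_synthetic(n=160, seed=0):
    """Hypothetical stand-in data: n items, one descriptor, two balanced classes."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        # Class 1 is centered at +1, class 0 at -1 (arbitrary choice).
        x = rng.gauss(1.0 if label else -1.0, 1.0)
        data.append((x, label))
    return data


def split(data, train_fraction, seed=0):
    """Shuffle a copy of the data and cut it into training and test sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """One-descriptor, two-class rule: threshold at the midpoint of class means."""
    means = []
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        means.append(sum(xs) / len(xs))
    return (means[0] + means[1]) / 2, means[1] > means[0]


def accuracy(model, rows):
    """Fraction of rows whose predicted class matches the true class."""
    threshold, upper_is_one = model
    hits = 0
    for x, y in rows:
        pred = int((x > threshold) == upper_is_one)
        hits += pred == y
    return hits / len(rows)


def compare_partitions(data, fractions=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Return (fraction, n_train, n_test, train_acc, test_acc) per partition."""
    results = []
    for f in fractions:
        train, test = split(data, f)
        model = fit_threshold(train)
        results.append(
            (f, len(train), len(test), accuracy(model, train), accuracy(model, test))
        )
    return results


if __name__ == "__main__":
    for f, n_train, n_test, acc_train, acc_test in compare_partitions(make_synthetic()):
        print(f"{f:.1f}  train={n_train:3d}  test={n_test:3d}  "
              f"acc_train={acc_train:.3f}  acc_test={acc_test:.3f}")
```

With the smaller test sets (for example 16 items at a 0.9 training fraction), each misclassified item shifts the test accuracy by several percentage points, which is one reason estimates from uneven partitions of small datasets can be unstable. This sketch does not reproduce the divergence diagrams or the results reported in the article.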

