RESEARCHERS
CRAVERO Fiorella
Conferences and scientific meetings
Title:
A Confidence Estimation Approach for Applicability Domain Assessment of QSAR Classification Models
Author(s):
MARTÍNEZ, MARÍA JIMENA; CRAVERO, FIORELLA; SCHUSTIK, SANTIAGO; DÍAZ, MÓNICA F.; PONZONI, IGNACIO
Location:
Posadas
Meeting:
Conference; VIII Congreso Argentino de Bioinformática y Biología Computacional; 2017
Organizing institution:
A2B2C
Abstract:
Defining the applicability domain (AD) is a crucial step in QSAR/QSPR modeling, because it lets us estimate the reliability of a model. Once the model has been trained, it should be possible to determine whether the prediction for a new compound will be reliable or not. In other words, what we want to find is the AD of the QSAR/QSPR classification model. Several techniques exist for defining it. One strategy is to analyze, in a first stage, the structural similarity between the new compound (NC) and the compounds of the training set, and then, in a second stage, to evaluate the confidence estimation of the model. The methodology has been evaluated using databases of two properties, namely Ready Biodegradation (RB) and log P liver, which had two and three class labels respectively. For RB, the percentage of compounds in each reliability category was: 46.4% (reliable), 53.3% (not reliable) and 0.3% (not categorized). In the case of log P liver, the results were: 27% (reliable), 36% (not reliable) and 37% (not categorized). The main contribution of this work was the implementation of an AD strategy for QSAR/QSPR classification models, with an average estimation accuracy above 44% and an average decidability rate of around 81%. Furthermore, on average, 80% of the compounds labeled as reliable were classified correctly by the methodology.
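The two-stage strategy described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the confidence estimator, the thresholds, or how the two stages map onto the three labels, so everything below is an assumption made for illustration: Tanimoto similarity over fingerprints stored as sets of on-bits, the largest predicted class probability as the confidence, the hypothetical parameters `sim_threshold`, `conf_high` and `conf_low`, and the labeling rule itself. The `decidability` helper assumes that decidability means the fraction of compounds that received a label other than "not categorized".

```python
from typing import Iterable, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def assess_reliability(
    nc_fp: Set[int],
    training_fps: Iterable[Set[int]],
    class_probs: Sequence[float],
    sim_threshold: float = 0.5,  # hypothetical cut-off, not from the abstract
    conf_high: float = 0.8,      # hypothetical cut-off, not from the abstract
    conf_low: float = 0.6,       # hypothetical cut-off, not from the abstract
) -> str:
    """Label a prediction for a new compound (NC) using an assumed two-stage rule."""
    # Stage 1: structural similarity between the NC and the training set.
    max_sim = max((tanimoto(nc_fp, fp) for fp in training_fps), default=0.0)
    if max_sim < sim_threshold:
        return "not reliable"

    # Stage 2: confidence of the classifier, here the largest class probability.
    confidence = max(class_probs)
    if confidence >= conf_high:
        return "reliable"
    if confidence < conf_low:
        return "not reliable"
    # Intermediate confidence: the sketch leaves the compound undecided.
    return "not categorized"


def decidability(labels: Sequence[str]) -> float:
    """Fraction of compounds that received a definite label (assumed definition)."""
    decided = sum(1 for label in labels if label != "not categorized")
    return decided / len(labels)


if __name__ == "__main__":
    training = [{1, 2, 3, 4}, {2, 3, 5}]
    labels = [
        assess_reliability({1, 2, 3, 4}, training, [0.9, 0.1]),
        assess_reliability({10, 11}, training, [0.9, 0.1]),
        assess_reliability({1, 2, 3}, training, [0.7, 0.3]),
    ]
    print(labels, decidability(labels))
```

Under that assumed definition of decidability, the reported shares of categorized compounds would be 99.7% for RB and 63% for log P liver, whose mean is about 81%, which is consistent with the rate quoted in the abstract.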