CONICET | Buscador de Institutos y Recursos Humanos

Molecular signatures are sets of genes that can be used to diagnose or classify the disease status of subjects. Because a large number of samples is needed and/or the classes overlap in feature space, building a successful diagnostic tool remains an open goal [1]. Artificial Neural Networks (ANNs) have not been used extensively for classifying gene expression signatures, mainly because of the "curse of dimensionality" problem, in which the number of variables (genes) is greater than the number of samples (subjects). ANNs usually solve multiclass problems by building one large structure with at most as many output neurons as there are classes in the domain. This implies adjusting a large number of weights, which in essence requires many samples for the algorithm to converge [2]. A "divide and conquer" strategy splits a complex problem into several "easier" problems. One such strategy is One vs One classification with binary classifiers: for K>2 classes, it requires solving K(K-1)/2 binary classification problems. Here we present preliminary results on multiclass gene expression signature classification using K(K-1)/2 binary ANNs with a voting scheme for class prediction, called OVONN. The proposed methodology was tested on 3 gene expression databases preprocessed as in [6]. For each database, the genes showing a standard deviation greater than 95 percent were selected as predictor variables. Table 1 shows the performance of OVONN compared with the traditional ANN approach, which uses as many output nodes as classes. The models were cross-validated with a Leave One Out by Class strategy, and the number of hidden units was optimized in each case.
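The One vs One scheme with voting can be illustrated with a minimal Python sketch. It is not the OVONN implementation: the function names (`fit_ovo`, `predict_ovo`, `fit_centroid_binary`) and the toy data are hypothetical, and a nearest-centroid rule stands in for each binary ANN so that the example stays self-contained. Any binary learner with the same interface, such as a small neural network, could presumably be passed in instead.

```python
from collections import Counter
from itertools import combinations


def fit_centroid_binary(X, y):
    """Stand-in binary learner (NOT an ANN): nearest class centroid."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Column-wise mean of the samples belonging to this class.
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

    return predict


def fit_ovo(X, y, fit_binary=fit_centroid_binary):
    """Train one binary classifier per unordered pair of classes."""
    classes = sorted(set(y))
    models = {}
    for a, b in combinations(classes, 2):  # K(K-1)/2 pairs
        idx = [i for i, lab in enumerate(y) if lab in (a, b)]
        models[(a, b)] = fit_binary([X[i] for i in idx], [y[i] for i in idx])
    return models


def predict_ovo(models, x):
    """Each pairwise model casts one vote; the most voted class wins.

    Ties are broken arbitrarily here (first class to reach the top count).
    """
    votes = Counter(model(x) for model in models.values())
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): 3 classes, 2 features, 2 samples per class.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["A", "A", "B", "B", "C", "C"]

models = fit_ovo(X, y)
print(len(models))                      # number of pairwise classifiers
print(predict_ovo(models, [5.2, 5.4]))  # class with the most votes
```

With K = 3 classes the sketch trains 3 pairwise models, each of which sees only the samples of its two classes. This is the property the abstract relies on: each binary problem involves fewer classes and, for a neural network, fewer weights to adjust than a single multiclass model.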
