IMBECU   20882
INSTITUTO DE MEDICINA Y BIOLOGIA EXPERIMENTAL DE CUYO
Executing Unit (UE)
Conferences and scientific meetings
Title:
Genetic algorithm for the search of cancer subtypes with clinical significance according to their gene expression patterns
Author(s):
GUERRERO GIMENEZ, MARTIN EDUARDO; CIOCCA, DANIEL RAMON; CATANIA, CARLOS; ZOPPINO, FELIPE CARLOS MARTIN; FERNANDEZ-MUÑOZ, JUAN MANUEL
Location:
Chicago
Meeting:
Conference; 26th Annual Meeting of ISCB; 2018
Organizing institution:
International Society for Computational Biology (ISCB)
Abstract:
Motivation: Clustering analysis has long been used to find underlying structure in omics data such as gene expression profiles. Such data are typically high-dimensional, and clustering has been used successfully to find co-expressed genes in samples that share similar molecular and clinical characteristics. Nevertheless, clustering results depend strongly on the features used and the number of clusters considered, and the resulting partition is not guaranteed to be clinically relevant.
Methods: We propose a multi-objective optimization algorithm for disease subtype discovery based on a non-dominated sorting genetic algorithm. The framework combines the strengths of clustering algorithms for grouping heterogeneous omics data with the search capabilities of genetic algorithms for selecting features and determining the optimal number of clusters. It searches for features that maximize the survival difference between subtypes while keeping cluster consistency high.
Results: Two breast cancer datasets were each divided into a training set and a test set to evaluate the model. In both cases, the method identified clinically relevant subgroups in the training sets (log-rank test = 0 and 0.0004). The selected features were used to build nearest-centroid classifiers, which were evaluated on the test sets and showed significant survival differences between groups (log-rank test = 1.22E-15 and 0.028).
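
The abstract names two techniques without detail: non-dominated sorting over competing objectives, and nearest-centroid classification. The following is a minimal, standard-library Python sketch of both ideas, not the authors' implementation. The function names, the two objective scores (survival separation and cluster consistency, both treated as "higher is better"), and all numeric values are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dominates(a, b):
    """True if score tuple a is no worse than b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated_sort(scores):
    """Group candidate indices into successive Pareto fronts.
    Front 0 holds the candidates that no other candidate dominates."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def nearest_centroid(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: dist(centroids[label], sample))


# Hypothetical candidates, each scored as
# (survival separation, cluster consistency).
candidate_scores = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]
fronts = non_dominated_sort(candidate_scores)

# Hypothetical subtype centroids over two selected features.
centroids = {"subtype_A": (0.0, 0.0), "subtype_B": (1.0, 1.0)}
predicted = nearest_centroid(centroids, (0.9, 0.8))
```

In this toy run, the first front contains the candidates that trade off the two objectives without being beaten on both, which is the set a non-dominated sorting genetic algorithm would favor when selecting parents. The nearest-centroid step shows how features chosen on a training set could be used to assign a new sample to a subtype.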