RESEARCHERS
CHERNOMORETZ Ariel
conferences and scientific meetings
Title:
Gene Ontology guided clustering of gene expression profiles
Author(s):
ARIEL BERENSTEIN; ARIEL CHERNOMORETZ
Location:
Montevideo, Uruguay
Meeting:
Conference; ISCB Latin America 2010; 2010
Organizing institution:
International Society for Computational Biology
Abstract:
Background
DNA microarrays are powerful devices for simultaneously monitoring the expression of thousands of genes under different conditions. A typical DNA microarray experiment produces a huge amount of information, and clustering techniques are usually employed to reveal common patterns of gene expression across different samples. The rationale behind this approach is a 'guilt-by-association' scenario, where genes with similar activity profiles are assumed to have related functions or to be regulated by common mechanisms.

Material and Methods
In this communication we introduce a variant of a clustering procedure rooted in statistical mechanics paradigms: Super Paramagnetic Clustering (SPC) [Domany]. The SPC algorithm maps the clustering problem onto the statistical physics of granular ferromagnets. Each gene is associated with a spin variable that can take one of q possible states. The algorithm then samples the set of all possible spin states, taking into account that nearest-neighbor spins, representing genes with similar transcriptional profiles, are linked and interact. This local interaction favors the tendency of directly connected spins to be in the same spin state. The cluster structure of the original problem is then probed by analyzing spin correlations (i.e., how often a given pair of spins appears in the same state).

We implement a generalized SPC algorithm. The proposed algorithm seeks biologically meaningful gene clusters by integrating microarray transcriptional data with biological information about gene functions, as provided by the Gene Ontology database. In this way, gene similarity is inferred from a combined metric that takes into account not only transcriptional similarity but also functional distances between gene pairs. These biological distances are inferred from semantic similarities [Couto] of Gene Ontology annotations and quantified using a vector space model.

Results
The new algorithm was applied to several gene expression data sets. We found that it greatly improves the biological coherence of cluster partitions compared to the naive SPC implementation and other traditional clustering methods.

Conclusions
We present a new clustering procedure based on the SPC algorithm. Using a combined metric, the method can simultaneously mine for meaningful structure in both expression and functional spaces. The algorithm shows robust behavior and succeeds in recognizing sensible gene clusters that might be used to identify relevant biological processes for the phenotypes of interest.
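
The two ingredients described in the abstract, a combined expression/functional metric and the same-state spin statistic, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the linear mixing weight `alpha`, the use of Pearson correlation for expression profiles, the use of plain cosine distance over binary Gene Ontology term vectors (the abstract relies on semantic similarities [Couto], which are not reproduced here), and all function names.

```python
import math


def pearson_distance(x, y):
    """Expression distance in [0, 1], defined here as (1 - Pearson r) / 2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # Constant profile: correlation is undefined, treat as uncorrelated.
        return 0.5
    return (1.0 - cov / (sx * sy)) / 2.0


def cosine_distance(u, v):
    """Functional distance between two GO annotation vectors (vector space model)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        # Unannotated gene: assume maximal functional distance.
        return 1.0
    return 1.0 - dot / (nu * nv)


def combined_distance(expr_i, expr_j, go_i, go_j, alpha=0.5):
    """Hypothetical linear mix: alpha = 0 uses expression only, alpha = 1 GO only."""
    d_expr = pearson_distance(expr_i, expr_j)
    d_func = cosine_distance(go_i, go_j)
    return (1.0 - alpha) * d_expr + alpha * d_func


def same_state_fraction(samples, i, j):
    """Fraction of sampled spin configurations in which spins i and j share a state."""
    return sum(1 for s in samples if s[i] == s[j]) / len(samples)


# Toy usage: two co-expressed genes sharing a GO term, one anti-correlated gene.
d_close = combined_distance([1, 2, 3], [2, 4, 6], [1, 0], [1, 0])
d_far = combined_distance([1, 2, 3], [3, 2, 1], [1, 0], [0, 1])
spin_samples = [[0, 0, 1], [2, 2, 2], [1, 1, 0], [0, 0, 0]]
corr_01 = same_state_fraction(spin_samples, 0, 1)
```

In an SPC-style pipeline, a distance of this kind would define which genes count as nearest neighbors and how strongly their spins interact, while the same-state fraction would be estimated from Monte Carlo samples of the spin system; neither step is implemented in this sketch.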