ININFA   02677
INSTITUTO DE INVESTIGACIONES FARMACOLOGICAS
Executing Unit - UE
Conferences and scientific meetings
Title:
Facing Affymetrix chips analysis through the Data Mining Framework: An application in Parkinson's gene expression experiment
Authors:
GERMÁN GONZÁLEZ; CELIA LARRAMENDY; OSCAR S. GERSHANIK; ELMER FERNÁNDEZ
Venue:
Universidad Nacional de Quilmes
Meeting:
Conference; 1er Congreso Argentino de Bioinformática y Biología computacional; 2010
Organizing institution:
Universidad de Quilmes
Abstract:
Background
The Knowledge Discovery in Databases
process provides a suitable framework for data analysis in biology. Nowadays,
high-throughput genomic technologies, such as microarrays, generate massive
amounts of data with a complex structure, which requires appropriate
methods and tools to extract relevant biological knowledge. Here we show
the use of the Unified Analytical Process (UAP) for Data Mining (DM) to analyze
a microarray experiment conducted to compare the gene expression effects of two well-known
antiparkinsonian drugs, L-dopa and pramipexole (PRA). Both
are widely used in the clinical management of Parkinson's disease (PD). Initially, L-dopa
controls the symptoms of PD adequately, but after some years of treatment
dyskinesias appear. A lower risk of dyskinesias is observed when D2 dopamine
receptor agonists are used. PRA has been successfully used to treat the
symptoms of PD in its early stages, but its therapeutic benefit tends to be smaller
than that of L-dopa. Although several studies have been published on
the gene expression profile induced by nigrostriatal
denervation and by L-dopa therapy in patients and in animal models, only a few
have dealt with the gene expression profile induced by dopamine agonists.
Methods
Rats with severe motor deficiencies were randomly
assigned to receive water (LA), L-dopa (LL), or PRA (LP) for 3 weeks. Normal
rats were treated with water (NA, control group). Messenger RNA was isolated
from the lesioned striata and analyzed on Affymetrix Rat Gene 1.0 ST arrays.
The UAP-DM is a framework based on CRISP-DM [1], modified to fit the requirements of
biological studies. Here we present the workflow of this methodology, its
steps and their results.
Results
UAP-DM organizes the data mining process into five
phases:
Business Understanding: The first step is to understand the biological problem that drives the
research and then define the data mining problem. The analysis tool
must also be chosen. We chose the open-source R programming
environment [2] in conjunction with the Bioconductor software.
Data Understanding: The data understanding
phase focuses on exploring the microarray data and assessing the quality of the
experiment by means of different visualization techniques (boxplots, histograms,
M vs. A plots, etc.). This process yields the final dataset. Here we describe
the available techniques appropriate for the microarray technology used
in these experiments.
Modeling: In this phase, the
modeling techniques are chosen. Differentially expressed genes between
conditions and controls were determined by means of a moderated t-test [3]. Parameters such as the p-value and M (log-ratio) thresholds are
calibrated to optimal values.
Microarray profiling identified several genes that were
differentially expressed between the LL and the LP groups. Sets of co-regulated
genes are visualized in a heatmap. Among these genes we have identified some
dopamine-related genes and genes related to mitochondrial, ribosomal and
proteasome function.
Evaluation: At this stage it is
important to evaluate the model more thoroughly and determine whether it achieves
all the objectives.
Deployment: The analysis results are
organized and presented in a report that the biologist can easily read and
understand.
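As an illustration of the quantities named in the Data Understanding and Modeling phases, the following is a minimal sketch in Python, not the R/Bioconductor code used in the study. It computes M and A values for a pair of arrays and a moderated t-statistic in the sense of [3] for a two-group comparison. The function names, the thresholds, and the prior parameters d0 and s0_sq are assumptions made for this example; in practice the prior is estimated from all genes on the array, and a p-value would be obtained from a t distribution with d0 + d degrees of freedom.

```python
import math
from statistics import mean, variance


def ma_values(x, y):
    """Return (M, A) lists for two vectors of positive intensities.

    M is the log2 ratio and A the mean log2 intensity of each probe.
    """
    m = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a_vals = [(math.log2(a) + math.log2(b)) / 2 for a, b in zip(x, y)]
    return m, a_vals


def moderated_t(g1, g2, d0, s0_sq):
    """Return (M, moderated t) for one gene.

    g1, g2: log2 expression values of the gene in each group.
    d0, s0_sq: prior degrees of freedom and prior variance
               (assumed given here; hypothetical inputs).
    """
    n1, n2 = len(g1), len(g2)
    d = n1 + n2 - 2  # residual degrees of freedom
    pooled = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / d
    # Shrink the gene-wise variance toward the prior variance.
    s_tilde_sq = (d0 * s0_sq + d * pooled) / (d0 + d)
    m = mean(g1) - mean(g2)
    t = m / math.sqrt(s_tilde_sq * (1 / n1 + 1 / n2))
    return m, t


def select_genes(data, d0, s0_sq, m_min, t_min):
    """Return names of genes with |M| >= m_min and |t| >= t_min.

    data maps a gene name to a pair (g1, g2) of log2 value lists.
    m_min and t_min are hypothetical cut-offs for this sketch.
    """
    selected = []
    for name, (g1, g2) in data.items():
        m, t = moderated_t(g1, g2, d0, s0_sq)
        if abs(m) >= m_min and abs(t) >= t_min:
            selected.append(name)
    return selected
```

With d0 = 0 the statistic reduces to the ordinary pooled-variance t-statistic; larger values of d0 pull each gene's variance toward the prior, which stabilizes the test when each group has only a few arrays.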
Conclusion
We find the UAP-DM to be a very useful framework for approaching microarray experiments,
yielding an ordered workflow of steps that allows a comprehensive analysis of
all the aspects of the microarray experiment, such as design, quality
assurance, technical feedback, differential expression analysis and knowledge
deployment. In particular, it was possible to find relevant and useful gene sets
that shed light on new hypotheses about the way those drugs affect Parkinson's
disease.
References
1. CRISP-DM [http://www.crisp-dm.org/]
2. R project page [http://www.r-project.org/]
3. Smyth GK: Linear models and empirical Bayes
methods for assessing differential expression in microarray experiments. Statistical
Applications in Genetics and Molecular Biology 3, No. 1, Article 3.