ININFA   02677
INSTITUTO DE INVESTIGACIONES FARMACOLOGICAS
Executing Unit - UE
Conferences and scientific meetings
Title:
Facing Affymetrix chips analysis through the Data Mining Framework: An application in Parkinson's gene expression experiment
Author(s):
GERMÁN GONZÁLEZ; CELIA LARRAMENDY; OSCAR S. GERSHANIK; ELMER FERNÁNDEZ
Venue:
Universidad Nacional de Quilmes
Meeting:
Conference; 1er Congreso Argentino de Bioinformática y Biología Computacional (1st Argentine Congress of Bioinformatics and Computational Biology); 2010
Organizing institution:
Universidad de Quilmes
Abstract:
Background
The Knowledge Discovery in Databases process provides a suitable framework for
data analysis in biology. High-throughput genomic technologies, such as microarrays, generate massive amounts of data with complex structure, which requires appropriate methods and tools to extract relevant biological knowledge. Here we show the use of the Unified Analytical Process (UAP) for Data Mining (DM) to analyze a microarray experiment conducted to compare the effects on gene expression of two known antiparkinsonian drugs, L-dopa and pramipexole (PRA). Both are widely used in the clinical treatment of Parkinson's disease (PD). Initially, L-dopa works well against the symptoms of PD, but after some years of treatment dyskinesias appear. A lower risk of dyskinesias is observed when D2 dopamine receptor agonists are used. PRA has been used successfully to treat the symptoms of PD in its early stages, but its therapeutic benefit tends to be smaller than that of L-dopa. Although several studies have been published on the gene expression profile induced by nigrostriatal denervation and by L-dopa therapy in patients and in animal models, only a few have dealt with the gene expression profile induced by dopamine agonists.

Methods
Rats with severe motor deficiencies were randomly assigned to receive water (LA), L-dopa (LL), or PRA (LP) for 3 weeks. Normal rats were treated with water (NA, control group). Messenger RNA was isolated from the lesioned striata to interrogate Affymetrix Rat Gene 1.0 ST arrays. The UAP-DM is a framework based on CRISP-DM [1], modified to fit the requirements of biological studies. Here we present the workflow of this methodology, showing its different steps and their results.

Results
UAP-DM organizes the data mining process into five phases:

Business Understanding: The first step is to understand the biological problem that drives the research and then to define the data mining problem. The tool used to analyze the data must also be defined. We chose the open-source R programming environment [2] in conjunction with the Bioconductor software.
Data Understanding: This phase focuses on exploring the microarray data and assessing the quality of the experiment by means of different visualization techniques (boxplots, histograms, M vs. A plots, etc.). This process yields the final dataset. Here we describe the available techniques appropriate for the microarray technology used in these experiments.

Modeling: In this phase, the modeling techniques are chosen. Differentially expressed genes between conditions and controls were determined by means of a moderated t-test [3]. Parameters such as the p-value and M (log-ratio) thresholds are calibrated to optimal values. Microarray profiling identified several genes that were differentially expressed between the LL and LP groups. Sets of co-regulated genes are visualized in a heatmap. Among these genes we identified some dopamine-related genes and genes related to mitochondrial, ribosomal and proteasome function.

Evaluation: At this stage it is important to evaluate the model more thoroughly and determine whether it achieves all the objectives.

Deployment: The analysis results are organized and presented in a report that the biologist can easily read and understand.

Conclusion
We find the UAP-DM to be a very useful framework for approaching microarray experiments. It yields an ordered workflow of steps that allows a comprehensive analysis of all aspects of the microarray experiment, such as design, quality assurance, technical feedback, differential expression analysis and knowledge deployment. In particular, it was possible to find relevant and useful gene sets that shed light on new hypotheses about the way these drugs affect Parkinson's disease.

References
1. CRISP-DM [http://www.crisp-dm.org/]
2. R project page [http://www.r-project.org/]
3. Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, No. 1, Article 3.
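
Illustrative sketch
The quantities mentioned in the Data Understanding and Modeling phases (M vs. A values, and gene selection by p-value and M thresholds) can be illustrated with a minimal Python sketch. This is not the authors' R/Bioconductor code. The function names, the cutoff values (0.05 and 1.0), the gene labels and the toy numbers are all hypothetical and chosen only for illustration. The p-values are taken as given inputs; the moderated t-test itself [3] is not implemented here.

```python
import math


def ma_values(intensity_x, intensity_y):
    """Return (M, A) for one probe set measured on two arrays.

    M is the log2 ratio of the two intensities and A is their
    mean log2 intensity, the two axes of an M vs. A plot.
    """
    log_x = math.log2(intensity_x)
    log_y = math.log2(intensity_y)
    return log_x - log_y, (log_x + log_y) / 2.0


def select_genes(results, p_cutoff=0.05, m_cutoff=1.0):
    """Keep genes whose p-value is below p_cutoff and whose |M| is
    at least m_cutoff. Both cutoffs are hypothetical defaults."""
    return [
        name
        for name, (m, p_value) in results.items()
        if p_value < p_cutoff and abs(m) >= m_cutoff
    ]


# Made-up (M, p-value) pairs for hypothetical genes; not data from the study.
toy_results = {
    "gene_a": (1.8, 0.001),   # large change, small p-value: kept
    "gene_b": (0.2, 0.0004),  # small p-value but negligible change: dropped
    "gene_c": (-1.3, 0.03),   # down-regulated beyond the M cutoff: kept
    "gene_d": (2.1, 0.40),    # large change but not significant: dropped
}

selected = select_genes(toy_results)
```

Requiring both conditions is a common design choice: the p-value cutoff guards against noisy changes, while the M cutoff discards statistically detectable but very small changes.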