RESEARCHERS
AGÜERO Fernan Gonzalo
conferences and scientific meetings
Title:
Designing and implementing chemoinformatic approaches in the TDR Targets Database: linking genes to chemical compounds in tropical disease causing pathogens
Author(s):
MAGARIÑOS MP, OVERINGTON JP, CARMONA SJ, SHANMUGAM D, DOYLE M, RALPH SA, CROWTHER GJ, HERTZ-FOWLER C, NWAKA S, BERRIMAN M, ROOS D, VAN VOORHIS W, AGÜERO F
Location:
Montevideo, Uruguay
Meeting:
Conference; ISCB Latin America; 2010
Organizing institution:
International Society for Computational Biology (ISCB)
Abstract:
Development of a cheminformatics component into the TDR Targets Database

María Paula Magariños (1), John Overington (2), Santiago Carmona (1), Dhanasekaran Shanmugam (3), Maria Doyle (4), Stuart Ralph (4), Greg Crowther (5), Christiane Hertz-Fowler (6), Solomon Nwaka (7), Matt Berriman (6), David Roos (3), Wes Van Voorhis (5), Fernán Agüero (1)

(1) Instituto de Investigaciones Biotecnológicas, Universidad de San Martín, San Martín, Argentina
(2) European Bioinformatics Institute, EMBL Outstation, Hinxton, Cambridge, UK
(3) University of Pennsylvania, Philadelphia PA (USA)
(4) University of Melbourne, Victoria (Australia)
(5) University of Washington, Seattle WA (USA)
(6) Wellcome Trust Sanger Institute, Hinxton, Cambridge (UK)
(7) WHO/TDR, Geneva (Switzerland)

Background
TDR Targets (tdrtargets.org) is a database that associates gene information from human pathogens with genomic and functional information from various sources [1]. Users of this resource can numerically weight the evidence available for each target to obtain ranked lists of prioritized candidates. Currently, information about chemical compounds and their activity against molecular targets is available in the literature or in specialized databases, but no site integrates this type of information for neglected diseases. The objective of this work is to integrate chemical compound data sets from different sources into TDR Targets, and to develop several ways for users to query the chemical data.

Materials and methods
Information about molecules and compounds was obtained from three databases: DrugBank, PubChem and Starlite (ChEMBL). We developed a cheminformatics pipeline that calculates a number of properties and descriptors for each molecule, in order to facilitate searches and cross-referencing with other databases. Descriptors included: InChI (IUPAC's standard, open chemical identifier); SMILES; molecular formula; number of flexible bonds; polar surface area; molecular weight; H-bond donors; H-bond acceptors; and predicted octanol/water partition coefficient. We also stored a number of binary fingerprints and molecular statistics based on these descriptors to accelerate searches.

Results
We integrated into the TDR Targets database information on 504,020 compounds, enriched in drugs and drug-like molecules. In the Starlite database, 438,791 of these compounds are associated with 3,512 druggable targets, 2,224 of which could be linked to 3,043 pathogen targets based on similarity. Three search types have been developed in the TDR Targets web application (Figure 1): a textual search on molecular descriptors or chemical properties; a substructure search that finds molecules containing the query molecule; and a similarity search that finds similar molecules (using the Tanimoto distance).

Conclusions
Information about 504,020 compounds was integrated into the TDR Targets Database. These data can be queried on the TDR Targets website through different search types, each returning a list of compounds with information on chemical properties, descriptors, synonyms, and links to associated genes.

Acknowledgements
This work was funded by the “Special Programme for Research and Training in Tropical Diseases (UNICEF/UNDP/World Bank/WHO)”. María Paula Magariños is supported by an NIH-Fogarty Training Programme in Infectious Diseases.

References
[1] Agüero, F. et al. Genomic-scale prioritization of drug targets: the TDR Targets database. Nat Rev Drug Discov 7, 900–907 (2008). URL http://dx.doi.org/10.1038/nrd2684
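
Illustrative note on the similarity search
The similarity search described in the Results compares stored binary fingerprints using a Tanimoto measure. The following is a minimal sketch of that idea, not the TDR Targets implementation: it assumes each fingerprint is held as a Python integer whose set bits mark the features present, and the function names, the toy library and the 0.5 threshold are all hypothetical. It computes the Tanimoto coefficient (a similarity); the corresponding distance would be one minus this value.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two binary fingerprints stored as integers."""
    shared = bin(fp_a & fp_b).count("1")   # bits set in both fingerprints
    either = bin(fp_a | fp_b).count("1")   # bits set in at least one
    return shared / either if either else 0.0


def similarity_search(query_fp: int, library: dict, threshold: float = 0.7) -> list:
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 4-bit fingerprints, for illustration only.
toy_library = {"cmpd_a": 0b1011, "cmpd_b": 0b0100, "cmpd_c": 0b0011}
results = similarity_search(0b0011, toy_library, threshold=0.5)
```

Here `cmpd_c` is identical to the query, `cmpd_a` shares two of its three set bits, and `cmpd_b` shares none, so only the first two pass the threshold. Precomputing and storing fingerprints, as the abstract describes, means a search only needs these cheap bitwise comparisons rather than re-deriving descriptors for every molecule.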