RESEARCHERS
CAVASOTTO Claudio Norberto
Conferences and scientific meetings
Title:
Quantum-mechanical simulation of biological macromolecules and its application in structure-based drug design
Author(s):
ANISIMOV, VICTOR M.; BUGAENKO, V.L.; CAVASOTTO, CLAUDIO N.
Location:
Washington, DC
Meeting:
Conference; 238th National Meeting of the American Chemical Society; 2009
Abstract:
Computer simulation of biological systems is an invaluable tool that aids the interpretation of experimental data and extends the limits of instrumental techniques by means of virtual or thought experiments. Classical molecular mechanics methods, the workhorse of such simulations, have proved remarkably accurate in addressing many chemical and biological problems at atomic resolution. Seeking a further level of detail from a computer simulation leads naturally to quantum-mechanical (QM) methods. The application of QM methods to biological systems has made significant progress thanks to the development of linear-scaling methods; however, the remaining high computational cost of linear-scaling QM methods is still a serious bottleneck. This effectively places a barrier on the application of ab initio methods to the study of biological macromolecules. Approximate semiempirical QM methods therefore become an attractive alternative. These methods are a thousand-fold faster than the chemically sound density functional methods and, owing to their parametric nature, offer wide possibilities for accuracy improvement. Following the notion that speed is the paramount factor in computer simulation of biological macromolecules, we formulated the semiempirical variational finite localized molecular orbital (VFL) approximation and, on its foundation, the linear-scaling semiempirical method LocalSCF. Since our first publication, which presented a QM calculation of the 100,000-atom GroEL-GroES chaperonin complex, we have expanded the size limit of biological systems treated on a desktop computer to million-atom protein multimers placed in explicit solvent, with the entire system reaching several tens of nanometers in size. Owing to the quantum-mechanical treatment of electrostatic interactions, we obtain a more realistic protein-solvent interaction profile than is possible with fixed-charge classical models.
Besides yielding new insights into the electronic structure of nano-size biological systems, the ability to treat ultra-large biomolecules quantum-mechanically provides the speed advantage needed to consider molecular dynamics (MD) of regular-size proteins at the QM level. These simulations reach a length of a hundred picoseconds on a modest-size cluster within a few months, using a 1 fs integration time step. The system sizes and simulation lengths thus achieved bring QM methods close to their molecular mechanics (MM) counterparts, while extending the modeling technique with the consistent treatment of charge polarization that is inherent to the QM framework. Based on these advantages, we compute protein-ligand binding free energy from first principles using QM MD. Considering the importance of polarization effects in the realistic treatment of protein-ligand interactions, we developed a two-layer QM/QM method for entirely quantum-mechanical high-throughput docking of million-compound libraries of commercially available compounds to a known receptor. By restricting the variable portion of the density matrix to the region enclosing the active site and ligand, and treating the distant part of the protein as carrying a frozen density matrix, we obtained up to a 10-fold speed-up over the regular one-layer QM calculation. Based on the tests performed, we found it acceptable to include in the variable density matrix zone only a limited part of the protein, defined by a 6 Angstrom spherical cut-off distance from the ligand atoms. For high-precision calculations the spherical cut-off distance may be increased to 10 Angstroms while still retaining a 4-fold speed-up. Using the developed method, we performed large-scale QM docking studies on the SH2 domain of p56 LCK tyrosine kinase, consisting of 1700 atoms, treating the entire system at the semiempirical AM1 level.
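The partition into a variable zone and a frozen zone described above can be illustrated with a minimal sketch. It only shows the geometric selection step, assuming the zone is chosen atom by atom; the abstract does not say whether whole residues are included, and the function name `variable_zone` and the toy coordinates are hypothetical, not part of LocalSCF.

```python
import math

def variable_zone(protein_xyz, ligand_xyz, cutoff=6.0):
    """Return indices of protein atoms within `cutoff` Angstroms of any ligand atom."""
    return [
        i for i, p in enumerate(protein_xyz)
        if any(math.dist(p, l) <= cutoff for l in ligand_xyz)
    ]

# Toy coordinates in Angstroms, invented for illustration only.
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0), (0.0, 9.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]

# Atoms in the variable zone would have their density matrix elements optimized;
# the remaining atoms would carry a frozen density matrix.
variable = variable_zone(protein, ligand, cutoff=6.0)
variable_set = set(variable)
frozen = [i for i in range(len(protein)) if i not in variable_set]

# The larger cut-off quoted for high-precision runs enlarges the variable zone.
variable_hp = variable_zone(protein, ligand, cutoff=10.0)
```

The brute-force double loop is adequate for a 1700-atom receptor; a spatial grid or k-d tree would be the usual choice for much larger systems.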
The lymphoid T-cell tyrosine kinase plays a critical role in the T-cell-mediated immune response and is an important target in the treatment of blood cancer and rheumatoid arthritis and for immunosuppression. This highly ionized protein, which includes 13 negatively charged and 17 positively charged amino acids, is capable of inducing strong polarization in a bound ligand and of sensing in turn the polarization coming from the ligand. QM scoring of 200,000 empirically docked structures in their protein-ligand complexes required 1 day of computation on a 32-CPU cluster. Exploiting this performance, we performed flexible-ligand QM docking of the 10,000 top-scoring structures by relaxing the ligand geometry while holding the receptor rigid. The docking study, including 100 steps of ligand geometry optimization, required half a day of computation on a 32-CPU cluster. Because these calculations treat the protein-ligand complex at a single QM level of theory, they eliminate the theoretical uncertainties associated with the treatment of the boundary region and with interfacing methods that use incompatible physical representations. The LocalSCF method also offers valuable resources for ligand database preparation. Million-compound libraries of commercially available drug-like compounds are a typical component of routine high-throughput virtual screening computations. The immense size of such libraries makes it difficult to assure the quality of the 3D structures of the stored compounds. Validation is facilitated by the reconstruction of molecular topology that LocalSCF performs from the atom types and Cartesian coordinates of the input structure, which is necessary to generate chemically intuitive localized molecular orbitals. The use of topology information proved to be a highly efficient factor in making LocalSCF fast.
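The throughput figures quoted above can be converted into a rough cost per structure. The sketch below is back-of-the-envelope arithmetic only: it assumes ideal parallel scaling across the 32 CPUs and treats "1 day" and "half a day" as exact wall-clock times, neither of which the abstract states. The helper name `cpu_seconds_per_item` is hypothetical.

```python
SECONDS_PER_DAY = 86_400

def cpu_seconds_per_item(wall_days, n_cpus, n_items):
    """Average CPU-seconds spent per item, assuming perfect parallel efficiency."""
    return wall_days * SECONDS_PER_DAY * n_cpus / n_items

# Single-point QM scoring: 200,000 complexes, 1 day, 32 CPUs.
scoring = cpu_seconds_per_item(1.0, 32, 200_000)

# Flexible-ligand docking: 10,000 complexes, half a day, 32 CPUs.
docking = cpu_seconds_per_item(0.5, 32, 10_000)

# Each docked ligand underwent 100 geometry optimization steps.
per_step = docking / 100

print(f"scoring:  {scoring:.1f} CPU-s per complex")
print(f"docking:  {docking:.1f} CPU-s per complex")
print(f"per step: {per_step:.2f} CPU-s per optimization step")
```

Under these assumptions, scoring costs roughly 14 CPU-seconds per complex and each optimization step roughly 1.4 CPU-seconds; the order of magnitude matters more than the exact values.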
Moreover, failure to construct a chemically sensible topology and to assign electrons to bonds and lone pairs helps to identify a structural problem which, if left undetected, may ruin the course of the calculation. Using this capability, we performed structure validation of a 10,000,000-compound library using 64 CPUs at the Texas Advanced Computing Center. This calculation at the AM1 level took one day. Despite the use of a well-prepared database, 0.1% of the processed compounds showed structural problems of various degrees. Common types included missing hydrogen atoms, short interatomic distances (collisions), stretched bonds, and structural distortions. Atomic charges computed for such compounds would be essentially unphysical, corresponding to unexpected ionization states or, worse, radicals. Identification of the broken compounds is a necessary step toward improving 2D->3D conversion protocols. In conclusion, the application of semiempirical QM methods to a wide spectrum of biological problems heralds an era of entirely quantum-mechanical simulation of biological macromolecules and structure-based drug design.
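Two of the problem types listed above, collisions and atoms left without a bonded neighbor (for example after a bond has been stretched beyond a plausible length), can be flagged with a purely geometric check. The sketch below is a simplified stand-in, not the LocalSCF procedure, which reconstructs the topology and assigns electrons to bonds and lone pairs. The covalent radii are approximate literature values, and the thresholds `clash_factor` and `bond_factor` are assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Approximate covalent radii in Angstroms for a few common elements.
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def check_geometry(atoms, clash_factor=0.7, bond_factor=1.3):
    """Flag suspicious geometry in a list of (element, (x, y, z)) tuples.

    Returns (clashes, isolated): pairs of atoms closer than clash_factor times
    the sum of their covalent radii, and atoms with no neighbor within
    bond_factor times that sum.
    """
    clashes = []
    bonded = set()
    for (i, (ei, xi)), (j, (ej, xj)) in combinations(enumerate(atoms), 2):
        d = math.dist(xi, xj)
        ref = COVALENT_RADIUS[ei] + COVALENT_RADIUS[ej]
        if d < clash_factor * ref:
            clashes.append((i, j))
        if d <= bond_factor * ref:
            bonded.update((i, j))
    isolated = [i for i in range(len(atoms)) if i not in bonded]
    return clashes, isolated

# A reasonable water-like geometry and a deliberately broken one (toy data).
good = [("O", (0.0, 0.0, 0.0)), ("H", (0.96, 0.0, 0.0)), ("H", (-0.24, 0.93, 0.0))]
bad = [("O", (0.0, 0.0, 0.0)), ("H", (0.40, 0.0, 0.0)), ("H", (3.00, 0.0, 0.0))]

good_report = check_geometry(good)
bad_report = check_geometry(bad)
```

A check of this kind cannot detect missing hydrogens or wrong ionization states, which require valence and electron counting of the sort the topology reconstruction provides.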