The University of Southampton
University of Southampton Institutional Repository

Using linear-scaling DFT for biomolecular simulations

Pittock, Chris
Skylaris, Chris-Kriton

Pittock, Chris (2014) Using linear-scaling DFT for biomolecular simulations. University of Southampton, Chemistry, Doctoral Thesis, 263pp.

Record type: Thesis (Doctoral)

Abstract

In the drug discovery process, multiple factors make a candidate successful beyond whether it antagonises a chosen active site or performs allosteric regulation. Each test candidate is profiled by its absorption into the bloodstream, distribution throughout the organism, products of metabolism, method of excretion, and overall toxicity, summarised as ADMET. Methods currently exist to calculate and predict such properties, but the majority involve rule-based, empirical approaches that risk losing accuracy as one's search of chemical space ventures into more novel regions. The lack of experimental data on organometallic systems also means that some of these methods refuse outright to predict their properties, losing the opportunity to exploit this relatively untapped area, which holds promise for new antibacterial and antineoplastic pharmaceutical compounds. Using the more transferable and definitive quantum mechanical (QM) approach to drug discovery is desirable, but the computational cost of conventional Hartree-Fock (HF) and Density Functional Theory (DFT) calculations is too high. Using the linear-scaling DFT program ONETEP, we aim to exploit the benefits of DFT in calculations on much larger fragments of, and in some cases entire, biomolecules, in order to demonstrate calculations that could ultimately be used in developing more accurate methods of profiling drug candidates, with a computational cost that, albeit still high, is now feasible on modern supercomputers. In this thesis, we first use linear-scaling DFT methods to address the lack of electron polarisation and charge transfer effects in energy calculations using a molecular mechanics forcefield.
Multiple DFT calculations are performed on molecular dynamics (MD) snapshots of small molecules in a waterbox, with the aim of computing an MM→QM correction term, which can be applied to a forcefield binding free energy approach (such as thermodynamic integration) that processes a far greater number of MD snapshots. As a result, one obtains the precision that comes from processing very large numbers of MD snapshots of biomolecular systems, together with the accuracy of QM. To improve the efficiency of the QM phase of the overall method, we use electrostatic embedding to model the regions of the waterbox that are far from the solute, yet are still important to include. As this is a relatively new module in ONETEP, we present validation data prior to its use in the main work. Secondly, we validate different methods of calculating the pKa of a wide variety of molecules, from small organic compounds to the organometallic cisplatin. The ultimate goal of such calculations is to address questions such as, assuming oral intake, where in the gastrointestinal tract a drug molecule will be absorbed into the bloodstream, and how much of the original dose will be absorbed. These calculations are then scaled up significantly to examine the potential of using linear-scaling DFT to calculate the pKa of specific residues in proteins. This is performed with a 305-atom tryptophan cage, the 814-atom Ovomucoid Silver Pheasant Third Domain (OMSVP3) and a 2346-atom section of the T99A/M102Q T4-lysozyme mutant. We also highlight the challenges in calculating protein pKa. Finally, we study the hydrogen-abstraction reaction between cyclohexene and cytochrome P450cam, through ONETEP single point energy calculations on a 10-snapshot adiabatic reaction profile generated by the Mulholland Group (University of Bristol).
Following this, the LST and QST methods of determining the transition state (available through ONETEP) are used, with the aim of determining the importance of the protein surrounding the active site with regard to the activation energy and structural geometry of the calculated transition state. The LST and QST methods are also validated by modelling the SN2 reaction between fluoride and chloromethane. The aim of this part of our work is to eventually assist in developing a metabolism (and toxicity) model of the different isoforms of cytochrome P450. Overall, this thesis aims to highlight not only the capability of linear-scaling DFT to become an important part of biomolecular simulation, but also the challenges that one will face upon scaling up calculations that were previously simple to perform because of the small size of the system being modelled.
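
As an illustration of how a computed pKa connects to the absorption question raised in the abstract, the following is a minimal sketch and not the method used in the thesis. It assumes the textbook relation pKa = ΔG / (RT ln 10) for an aqueous deprotonation free energy and the Henderson-Hasselbalch equation for a monoprotic acid. The function names, the example free energy, and the pH values assigned to each compartment are hypothetical and chosen only for demonstration.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def pka_from_free_energy(delta_g_kcal, temperature=298.15):
    """Convert an aqueous deprotonation free energy (kcal/mol) for
    HA(aq) -> A-(aq) + H+(aq) into a pKa via pKa = dG / (RT ln 10)."""
    return delta_g_kcal / (R_KCAL * temperature * math.log(10))


def fraction_neutral_acid(pka, ph):
    """Henderson-Hasselbalch: fraction of a monoprotic acid HA that
    remains protonated (neutral) at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Hypothetical deprotonation free energy, for illustration only.
    pka = pka_from_free_energy(6.5)
    # Illustrative pH values (assumptions, not data from the thesis).
    for site, ph in [("stomach", 2.0), ("small intestine", 6.5)]:
        frac = fraction_neutral_acid(pka, ph)
        print(f"{site}: pH {ph:.1f}, neutral fraction {frac:.3f}")
```

The neutral fraction is used here only as a rough proxy for passive absorption. A quantitative model would also need solubility, permeability and transport data, none of which this sketch attempts to include.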

More information

Published date: 28 February 2014
Organisations: University of Southampton, Chemistry

Identifiers

Local EPrints ID: 362968
URI: http://eprints.soton.ac.uk/id/eprint/362968
PURE UUID: db7e3282-95ca-4d9b-97d2-c09d3ac178c8
ORCID for Chris-Kriton Skylaris: orcid.org/0000-0003-0258-3433

Catalogue record

Date deposited: 18 Mar 2014 11:47
Last modified: 17 Jul 2018 00:32
