Using linear-scaling DFT for biomolecular simulations
 
  In the drug discovery process, a successful candidate must do more than antagonise a chosen active site or perform allosteric regulation. Each test candidate is also profiled by its absorption into the bloodstream, its distribution throughout the organism, the products of its metabolism, its method of excretion and its overall toxicity, summarised as ADMET. Methods exist to calculate and predict such properties, but most of them rely on rule-based, empirical approaches that risk losing accuracy as the search of chemical space ventures into more novel regions. The lack of experimental data on organometallic systems also means that some of these methods refuse outright to predict their properties, losing the opportunity to exploit a relatively untapped area that holds promise for new antibacterial and antineoplastic pharmaceutical compounds. A more transferable and definitive quantum mechanical (QM) approach to drug discovery is desirable, but the computational cost of conventional Hartree-Fock (HF) and Density Functional Theory (DFT) calculations is too high. Using the linear-scaling DFT program ONETEP, we aim to exploit the benefits of DFT in calculations on much larger fragments of biomolecules and, in some cases, on entire biomolecules. The intention is to demonstrate calculations that could ultimately be used to develop more accurate methods of profiling drug candidates, at a computational cost that, although still high, is now feasible on modern supercomputers.
In this thesis, we first use linear-scaling DFT methods to address the lack of electron polarisation and charge transfer effects in energy calculations that use a molecular mechanics (MM) forcefield. Multiple DFT calculations are performed on molecular dynamics (MD) snapshots of small molecules in a waterbox, with the aim of computing an MM→QM correction term. This term can be applied to a forcefield binding free energy approach (such as thermodynamic integration), which processes a far greater number of MD snapshots. As a result, one obtains the precision that comes from processing very large numbers of MD snapshots of biomolecular systems together with the accuracy of QM. To improve the efficiency of the QM phase of the overall method, we use electrostatic embedding to model the regions of the waterbox that are far from the solute yet still important to include. As this is a relatively new module in ONETEP, we present validation data prior to its use in the main work.
Secondly, we validate different methods of calculating the pKa of a wide variety of molecules, from small organic compounds to the organometallic cisplatin. The ultimate goal of such calculations is to address questions such as, assuming oral intake, where in the gastrointestinal tract a drug molecule will be absorbed into the bloodstream, and how much of the original dose will be absorbed. These calculations are then scaled up significantly to examine the potential of using linear-scaling DFT to calculate the pKa of specific residues in proteins. This is performed with a 305-atom tryptophan cage, the 814-atom Ovomucoid Silver Pheasant Third Domain (OMSVP3) and a 2346-atom section of the T99A/M102Q T4-lysozyme mutant. We also highlight the challenges in calculating protein pKa.
Finally, we study the hydrogen-abstraction reaction between cyclohexene and cytochrome P450cam, through ONETEP single point energy calculations on a 10-snapshot adiabatic reaction profile generated by the Mulholland Group (University of Bristol). Following this, the linear synchronous transit (LST) and quadratic synchronous transit (QST) methods of determining the transition state, both available through ONETEP, are used to determine how important the protein surrounding the active site is to the activation energy and structural geometry of the calculated transition state. The LST and QST methods are also validated by modelling the SN2 reaction between fluoride and chloromethane. The aim of this part of our work is to eventually assist in developing a metabolism (and toxicity) model of the different isoforms of cytochrome P450.
Overall, this thesis aims to highlight not only the capability of linear-scaling DFT to become an important part of biomolecular simulation, but also the challenges one faces when scaling up calculations that were previously simple to perform because the system being modelled was small.
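The abstract does not say how the MM→QM correction term is evaluated. As a rough illustration only, the sketch below assumes a single-step free energy perturbation (Zwanzig exponential averaging) over the MD snapshots, which is one common way to obtain such a term. The function name, the argument names and the choice of kcal/mol units are hypothetical and are not taken from the thesis or from ONETEP.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def mm_to_qm_correction(e_mm, e_qm, temperature=300.0):
    """Estimate a free energy correction from MM to QM by exponential averaging.

    e_mm and e_qm hold the MM and QM energies of the same MD snapshots,
    sampled with the MM forcefield, in kcal/mol (hypothetical inputs).
    """
    if len(e_mm) != len(e_qm) or len(e_mm) == 0:
        raise ValueError("need equal, non-empty lists of MM and QM energies")

    beta = 1.0 / (K_B * temperature)
    diffs = [qm - mm for mm, qm in zip(e_mm, e_qm)]

    # Shift by the smallest difference so every exponent is <= 0,
    # which avoids overflow when the energy differences are large.
    d_min = min(diffs)
    mean_exp = sum(math.exp(-beta * (d - d_min)) for d in diffs) / len(diffs)

    # dG = -kT * ln < exp(-beta * (E_QM - E_MM)) >_MM
    return d_min - math.log(mean_exp) / beta
```

In a scheme of this kind the correction would be computed separately for each end state (for example the solvated and unsolvated molecule), and the difference between the two corrections added to the forcefield free energy, so that any constant offset between the MM and QM energy scales cancels.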
  
    
      Pittock, Chris
      
        0732d958-6ae6-48d8-81c8-3d17e32a0039
      
     
  
  
   
  
  
    
      28 February 2014
    
    
  
  
    
      
     
  
    
      Skylaris, Chris-Kriton
      
        8f593d13-3ace-4558-ba08-04e48211af61
      
     
  
       
    
 
  
    
      
  
 
  
  
  
    Pittock, Chris (2014) Using linear-scaling DFT for biomolecular simulations. University of Southampton, Chemistry, Doctoral Thesis, 263pp.
  
   
  
    
      Record type: Thesis (Doctoral)
    
   
    
    
      
        
         
      
      
        
          
            
  
   
  
  
 
          
            
          
            
           
            
           
        
        
       
    
   
  
  
  More information
  
    
      Published date: 28 February 2014
 
    
  
  
    
  
    
  
    
  
    
  
    
  
    
  
    
     
        Organisations:
        University of Southampton, Chemistry
      
    
  
    
  
  
        Identifiers
        Local EPrints ID: 362968
        URI: http://eprints.soton.ac.uk/id/eprint/362968
        
        
        
        
          PURE UUID: db7e3282-95ca-4d9b-97d2-c09d3ac178c8
        
  
    
        
          
        
    
        
          
            
              
            
          
        
    
  
  Catalogue record
  Date deposited: 18 Mar 2014 11:47
  Last modified: 22 Aug 2025 01:56
  
  
 
 
  
    
    
      Contributors
      
          
          Author: Chris Pittock
            
          
        
      
        
      
      
      
    
  
   
  