Energy decomposition analysis for large-scale first principles quantum mechanical simulations of biomolecules
Kohn-Sham density functional theory (DFT) is an extraordinarily powerful and versatile tool for calculating the properties of materials. In its conventional form, this approach scales cubically with the size of the system under study, which becomes prohibitive for larger arrangements such as biomolecules and nanostructures. More recently, linear-scaling approaches have been developed that overcome this limitation, allowing calculations to be performed on systems many thousands of atoms in size. An example of such an approach is the ONETEP code, which uses a plane wave-like basis set and is based upon spherically localised orbitals. A simple yet common calculation performed using ab initio codes is the total (ground state) energy calculation. By comparing the energy of the isolated parts of a system to the energy of the combined system, we obtain the interaction energy. This quantity is useful because it provides a relative measure of the enthalpic stability of an interaction that can be compared across systems. However, it gives little indication of the driving forces that lead to the interaction energy we observe. A number of approaches have been developed that aim to identify these driving forces. Energy decomposition analysis (EDA) refers to the set of methods that decompose the interaction energy into physically relevant components that sum to the full interaction energy. Few studies have applied EDA approaches to larger systems in the thousand-atom regime, with the vast majority of investigations focussing on small systems (fewer than 100 atoms in size). These methods have shown varying degrees of success. In this work, we have evaluated the suitability of a selection of popular EDA methods in decomposing the interaction energies of small biomolecule-like systems.
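The interaction energy described above can be illustrated with a minimal sketch. The function name and the energy values are hypothetical and are not taken from the thesis or from ONETEP. The sketch assumes only the usual supermolecular definition, in which the total energies of the isolated fragments are subtracted from the total energy of the combined system.

```python
def interaction_energy(e_complex, e_fragments):
    """Return the energy of the combined system minus the summed
    energies of its isolated fragments (a negative value means the
    combined system is lower in energy)."""
    return e_complex - sum(e_fragments)


# Hypothetical total energies in hartree (illustrative values only,
# not results from any real calculation).
e_complex = -152.50
e_fragments = [-76.24, -76.25]

delta_e = interaction_energy(e_complex, e_fragments)
print(f"Interaction energy: {delta_e:.4f} Eh")
```

A single number of this kind says how strongly the fragments interact but not why, which is the gap that EDA methods aim to fill.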
Based on the results of this review, we developed a linear-scaling EDA approach in the ONETEP code that separates the intermolecular interaction energy into chemically distinct components (electrostatic, exchange, correlation, Pauli repulsion, polarisation, and charge transfer). The intermediate state used to calculate polarisation, also known as the absolutely localised molecular orbital (ALMO) state, has the key advantage of being fully antisymmetric and variationally optimised. The linear-scaling capability of the scheme is based on the use of an adaptive purification approach and sparse matrix equations. We demonstrate the accuracy of this approach in reproducing the energy component values of its Gaussian basis counterparts, and present a remedy for the basis set dependence of the polarisation and charge transfer components that is based on the strict localisation of the ONETEP orbitals. Additionally, we show that the method depends only mildly on the exchange-correlation functional and on the atomic coordinates. We demonstrate the value of our method by applying it to the thrombin protein interacting with a number of small binders. Here, we used our scheme in combination with electron density difference (EDD) plots to identify the key protein and ligand regions that contribute to polarisation and charge transfer. In our studies, we assessed the convergence of the EDA components with protein truncation up to a total system size of 4975 atoms. Additionally, we applied our EDA to binders that had been partitioned into smaller fragments, which allowed us to accurately quantify the bonding contributions of key ligand moieties with particular regions of the protein cavity. We assessed how accurately the ligand binding components are reproduced by the fragment contributions using an additivity measure. Using this measure, we showed that the fragment binding components add up to the full ligand binding component with minimal overall additivity error.
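The two bookkeeping ideas in this paragraph, components that sum to the interaction energy and an additivity check over ligand fragments, can be sketched as follows. The abstract does not define the additivity measure, so the definition used here (summed fragment component minus full-ligand component) is an assumption, and all names and numbers are hypothetical.

```python
# Component names follow the list given in the abstract.
COMPONENTS = (
    "electrostatic", "exchange", "correlation",
    "pauli_repulsion", "polarisation", "charge_transfer",
)


def total_interaction(components):
    """Sum the EDA components to recover the full interaction energy."""
    return sum(components[name] for name in COMPONENTS)


def additivity_error(full_ligand, fragments):
    """Assumed measure: for each component, the sum over fragment
    values minus the value obtained for the full ligand."""
    return {
        name: sum(frag[name] for frag in fragments) - full_ligand[name]
        for name in COMPONENTS
    }


# Hypothetical component values in kcal/mol (illustrative only).
full_ligand = dict(zip(COMPONENTS, (-30.0, -12.0, -8.0, 35.0, -6.0, -4.0)))
fragment_a = dict(zip(COMPONENTS, (-18.0, -7.0, -5.0, 20.0, -3.5, -2.5)))
fragment_b = dict(zip(COMPONENTS, (-11.5, -5.0, -3.0, 15.5, -2.0, -1.5)))

errors = additivity_error(full_ligand, [fragment_a, fragment_b])
print("Full interaction energy:", total_interaction(full_ligand))
for name, err in errors.items():
    print(f"{name:16s} additivity error: {err:+.2f}")
```

Reporting the error per component, rather than only for the total, keeps compensating errors between components visible.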
We also investigated the energy components of a series of small thrombin S1 pocket binders, all fewer than 30 atoms in size. In this study, we demonstrate the use of EDA and EDD plots as tools for understanding the relative importance of different binder structural features and positions within the pocket. Overall, we show our EDA method to be a stable and powerful approach for the analysis of interaction energies in large systems. The application of this method is not limited to biomolecular studies, and we expect that it can be readily applied to analyses in other fields, for example materials, catalysts, and nanostructures.
University of Southampton
Phipps, Maximillian, Joshua Sebastian
290febb8-7f0a-4bda-9944-96594d6343d2
January 2017
Skylaris, Chris-Kriton
8f593d13-3ace-4558-ba08-04e48211af61
Phipps, Maximillian, Joshua Sebastian (2017) Energy decomposition analysis for large-scale first principles quantum mechanical simulations of biomolecules. University of Southampton, Doctoral Thesis, 298pp.
Record type: Thesis (Doctoral)
Text: MaxPhipps_Thesis_compressed - Version of Record
More information
Published date: January 2017
Organisations:
University of Southampton, Chemistry
Identifiers
Local EPrints ID: 410305
URI: http://eprints.soton.ac.uk/id/eprint/410305
PURE UUID: d62cc143-76f3-4c7d-a5fe-ef7a3a58ef14
Catalogue record
Date deposited: 07 Jun 2017 04:01
Last modified: 16 Mar 2024 05:24
Contributors
Author: Maximillian Joshua Sebastian Phipps