The University of Southampton
University of Southampton Institutional Repository

Methods for accurate and efficient simulation of the conformational landscape of ligands

University of Southampton
Dos Santos Morado, Joao Pedro
f83f0c26-bbe3-420c-9999-e22ab439c9c6
Skylaris, Chris-Kriton
8f593d13-3ace-4558-ba08-04e48211af61

Dos Santos Morado, Joao Pedro (2022) Methods for accurate and efficient simulation of the conformational landscape of ligands. University of Southampton, Doctoral Thesis, 323pp.

Record type: Thesis (Doctoral)

Abstract

Ligand modelling is an essential element of drug discovery. To accurately simulate chemical and physical phenomena, it is necessary to employ molecular models that provide reliable results in a timely fashion. The gold standard method in ligand modelling remains quantum mechanics (QM). Owing to the high computational cost of QM methods, their use in ab initio simulations is limited to only the simplest systems. Molecular mechanics force fields (MM FFs) have also been around for decades. They stand as the cheapest alternative to QM methods, despite their widely known accuracy limitations. A promising new alternative to FFs is the class of machine-learning (ML) potentials. ML potentials are molecular models based on artificial intelligence, seemingly more flexible and accurate than FFs, although more computationally costly.

For a given FF functional form, the quality of the parameterisation is crucial and determines how accurately observable properties can be computed from simulations. Whilst accurate FF parameterisations are available for biomolecules, the parameterisation of novel drug candidates is particularly challenging, as these may involve functional groups and interactions for which accurate parameters are not available. To address the problem of FF accuracy, we developed ParaMol, software that can reparameterise class I FFs, with a special focus on druglike molecules. We demonstrate that, within the constraints of an FF functional form, ParaMol can derive near-ideal FF parameters. Additionally, we illustrate the best practices to follow when employing specific parameterisation routes; the sensitivity of different fitting data sets, such as relaxed dihedral scans and configurational ensembles, to the parameterisation procedure; and the features of the various methods available to weight configurations.

Monte Carlo (MC) and molecular dynamics (MD) simulations can be performed using FFs, ML potentials, or QM methods. The higher the level of theory used in MD or MC simulations, the more reliable the structural information extracted from them will be, albeit at a higher computational cost. To combine the accuracy of ab initio simulations with the efficiency of classical ones, we present a multilevel MC method that allows quantum configurational ensembles to be generated while keeping the computational cost to a minimum. We show that FF reparameterisation is an efficient route to generate FFs that reproduce QM results more closely, which in turn can be used as low-cost models to approach the gold standard QM accuracy. We demonstrate that the MC acceptance rate is strongly correlated with various phase space overlap measurements, constituting a robust metric to evaluate the similarity between any two levels of theory. As more advanced applications, we apply the nMC-MC algorithm to generate the QM/MM distribution of a ligand in aqueous solution and present a self-parameterising version of the method.

Recently, ML potentials have emerged as an alternative to FFs. However, because they are so new, many questions concerning their applicability remain unanswered. To this end, we present a comparative study that evaluates the performance of an ML potential, a traditional FF, and an optimally tuned FF in the modelling of a set of 10 γ-fluorohydrins that exhibit a complex interplay between intra- and intermolecular interactions in determining conformer stability. For this set of molecules, we benchmark the performance of each molecular model, evaluating their energetic, geometric, and sampling accuracy relative to quantum mechanical data, both in the gas phase and in chloroform solution. We also assess the performance of these molecular models in estimating J-coupling constants by comparing their predictions to experimental data available in chloroform. We then discuss and highlight the strengths and weaknesses of each model, providing guidelines for future development of FFs and ML potentials.

The complexity and scope of the problems addressed in this thesis preclude complete or definitive solutions. Even so, we believe the outcomes of this work may have implications in different areas of chemistry and biology, especially for those interested in modelling the conformational landscape of small organic molecules. The overall conclusions of this thesis are: FFs can be reliably parameterised in an automated fashion using ParaMol; optimally tuned FFs can work as gateways to generate QM ensembles, at least for small molecules in the gas phase; and, despite the ability of ML potentials to reproduce their training data, the transferability of ML potentials to other domains is limited, and conventional FFs still play an important role in molecular simulations.

Text
Southampton_PhD_Thesis_JoaoMorado - Version of Record
Available under License University of Southampton Thesis Licence.
Download (19MB)
Text
PTD_Thesis_Morado-SIGNED
Restricted to Repository staff only

More information

Published date: August 2022

Identifiers

Local EPrints ID: 482719
URI: http://eprints.soton.ac.uk/id/eprint/482719
PURE UUID: 31a69607-4850-4827-9d73-6cd3db0f2dab
ORCID for Chris-Kriton Skylaris: orcid.org/0000-0003-0258-3433

Catalogue record

Date deposited: 11 Oct 2023 17:14
Last modified: 17 Mar 2024 07:38

Contributors

Author: Joao Pedro Dos Santos Morado
Thesis advisor: Chris-Kriton Skylaris

