Topological Data Analysis and its Application to Chemical Systems
Topological data analysis techniques are applied to distinct problems in chemistry, to determine their efficacy and gain new understanding of chemical systems. The mapper algorithm is used to understand the underlying descriptor space of a solubility prediction data set. Insight from the resulting topological summaries is used to build more consistent solubility models. Persistent homology is then used to create a series of metric spaces for molecular shape. It is shown that these metric spaces correlate with other molecular descriptors, and also allow molecular flexibility to be accounted for.
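As an illustration of the mapper construction mentioned above, the following is a minimal sketch on toy data, not the thesis's solubility pipeline. The filter function (the x coordinate), the cover of overlapping intervals, the single-linkage clustering threshold `eps`, and the function names `single_linkage_clusters` and `mapper_graph` are all assumptions made for this example.

```python
import math
from itertools import combinations


def single_linkage_clusters(points, members, eps):
    """Split `members` into connected components of the graph that links
    any two points at Euclidean distance <= eps."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            a = stack.pop()
            near = {b for b in unvisited if math.dist(points[a], points[b]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_values, n_intervals=4, overlap=0.3, eps=0.3):
    """Return (nodes, edges): nodes are clusters of point indices, and two
    nodes are joined when their clusters share at least one point."""
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    pad = overlap * length  # each interval is widened by `pad` on both sides
    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage_clusters(points, members, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Toy data (hypothetical): 40 points on the unit circle, filtered by x.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
nodes, edges = mapper_graph(circle, [x for x, _ in circle])
print(len(nodes), "nodes,", len(edges), "edges")
```

For a circle filtered this way, the resulting graph should close into a single loop, which is the kind of coarse shape summary mapper is meant to expose. On real descriptor data the choice of filter, cover and clustering method would strongly affect the result.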
This molecular flexibility is further explored with persistent homology. By constructing a point cloud of individual conformers, a technique to characterise the conformational spaces of various molecules is developed. Alanine dipeptide is shown to have a toroidal conformational space, and persistence is then used to locate extrema on its torsional free energy surface. Pentane is then studied, and is also shown to have a toroidal conformational space, or a Möbius band when symmetry is taken into account. The conformational space of cyclooctane is shown to be non-manifold, and its manifold components are separated. These are found to be a sphere and a Klein bottle; the single-point energy landscape of the sphere is then analysed and its extrema located.
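The idea of using persistence to locate extrema can be sketched in a simplified setting. The example below assumes a one-dimensional periodic energy profile (a single torsion angle) rather than the two-dimensional torsional surface studied in the thesis, and computes 0-dimensional sublevel-set persistence with a union-find structure. The function name `sublevel_persistence_circle` and the `energies` values are hypothetical.

```python
def sublevel_persistence_circle(values):
    """0-dimensional sublevel-set persistence for samples on a circle.

    Returns (essential, pairs): `essential` is the index of the global
    minimum, and each pair (i_min, i_max) matches a local minimum with the
    sample at which its basin merges into a deeper one.
    """
    n = len(values)
    parent = [None] * n  # None means the sample has not been added yet
    birth = [None] * n   # for a root: index of the minimum of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = i
        for j in ((i - 1) % n, (i + 1) % n):  # periodic neighbours
            if parent[j] is None:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            # Elder rule: the component with the lower minimum survives.
            if values[birth[ri]] < values[birth[rj]]:
                old, young = ri, rj
            else:
                old, young = rj, ri
            if birth[young] != i:  # skip the trivial pair created by i itself
                pairs.append((birth[young], i))
            parent[young] = old

    essential = min(range(n), key=lambda k: values[k])
    pairs.sort(key=lambda p: values[p[1]] - values[p[0]], reverse=True)
    return essential, pairs


# Hypothetical energies sampled at six equally spaced torsion angles.
energies = [0.0, 2.0, 1.0, 3.0, 0.5, 2.5]
essential, pairs = sublevel_persistence_circle(energies)
for i_min, i_max in pairs:
    print(f"minimum at {i_min}, merges at {i_max}, "
          f"persistence {energies[i_max] - energies[i_min]:.2f}")
```

Pairs with large persistence correspond to deep basins separated by high barriers, while low-persistence pairs are candidates for noise. Extending this to a surface would require a 2D neighbourhood structure and higher-dimensional homology for the maxima.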
Finally, simulated water networks are analysed through persistent homology. The general use of persistence to analyse simulations is studied, and persistence is shown to be a well-behaved descriptor. A size-agnostic persistence descriptor is generated, and used with a support vector machine to understand the differences in simulated water networks. Atomistic and coarse-grained water potentials are compared, and similarities between potentials are related to topological features.
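The abstract does not state how the size-agnostic descriptor is built. As one possible illustration only, the sketch below turns a persistence diagram with any number of points into a fixed-length, normalised histogram of lifetimes, so that systems of different size produce comparable vectors that could then be passed to a classifier. The function name `lifetime_histogram`, the bin count and the `max_lifetime` cut-off are assumptions, not the thesis's method.

```python
def lifetime_histogram(diagram, n_bins=5, max_lifetime=1.0):
    """Map a persistence diagram (list of (birth, death) pairs) to a
    fixed-length vector: the fraction of features in each lifetime bin."""
    counts = [0] * n_bins
    for birth, death in diagram:
        lifetime = death - birth
        k = int(n_bins * lifetime / max_lifetime)
        k = max(0, min(k, n_bins - 1))  # clamp out-of-range lifetimes
        counts[k] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


# Hypothetical diagrams: `big` repeats `small`, mimicking a larger system
# with the same distribution of topological features.
small = [(0.0, 0.1), (0.0, 0.5), (0.2, 0.9)]
big = small * 2
print(lifetime_histogram(small))
print(lifetime_histogram(big))
```

Because the histogram is normalised by the number of features, doubling the system while keeping the same feature distribution leaves the vector unchanged. Other fixed-length summaries of persistence diagrams exist and may be closer to what the thesis uses.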
University of Southampton
Steinberg, Lee
October 2019
Frey, Jeremy G.
Steinberg, Lee (2019) Topological Data Analysis and its Application to Chemical Systems. University of Southampton, Doctoral Thesis, 221pp.
Record type: Thesis (Doctoral)
More information
Published date: October 2019
Identifiers
Local EPrints ID: 438904
URI: http://eprints.soton.ac.uk/id/eprint/438904
PURE UUID: 4ea373f4-c928-43c1-9cf5-3d53f98f21a8
Catalogue record
Date deposited: 26 Mar 2020 17:31
Last modified: 17 Mar 2024 05:21
Contributors
Author: Lee Steinberg