The University of Southampton
University of Southampton Institutional Repository

Topological Data Analysis and its Application to Chemical Systems

Steinberg, Lee (2019) Topological Data Analysis and its Application to Chemical Systems. University of Southampton, Doctoral Thesis, 221pp.

Record type: Thesis (Doctoral)

Abstract

Topological data analysis techniques are applied to distinct problems in chemistry, both to assess their efficacy and to gain new understanding of chemical systems. The mapper algorithm is used to explore the underlying descriptor space of a solubility prediction data set, and insight from the resulting topological summaries is used to build more consistent solubility models. Persistent homology is then used to construct a series of metric spaces for molecular shape. These metric spaces are shown to correlate with other molecular descriptors, and also allow molecular flexibility to be accounted for.

This molecular flexibility is explored further with persistent homology. By constructing a point cloud of individual conformers, a technique for characterising the conformational spaces of various molecules is developed. Alanine dipeptide is shown to have a toroidal conformational space, and persistence is then used to locate extrema on its torsional free energy surface. Pentane is shown to have a toroidal conformational space as well, or a Möbius band when symmetry is taken into account. The conformational space of cyclooctane is shown to be non-manifold, and its manifold components are separated and found to be a sphere and a Klein bottle. The single-point energy landscape of the sphere is then analysed and its extrema located.

Finally, simulated water networks are analysed through persistent homology. The general use of persistence to analyse simulations is studied, and persistence is shown to be a well-behaved descriptor. A size-agnostic persistence descriptor is generated, and used with a support vector machine to understand the differences in simulated water networks. Atomistic and coarse-grained water potentials are compared, and similarities between potentials are related to topological features.
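
As a rough illustration of what persistence on a point cloud of conformers can look like, the sketch below computes only 0-dimensional persistence (connected components) for a handful of torsion-angle pairs, using a periodic distance so that angles near +180 and -180 degrees count as close. It is a minimal sketch and not the method of the thesis: the function names `torus_distance` and `h0_persistence`, the choice of a Vietoris-Rips filtration, and the sample angles are all assumptions made for illustration.

```python
import math
from itertools import combinations


def torus_distance(a, b):
    """Euclidean-style distance between tuples of angles in degrees,
    with every coordinate treated as periodic (period 360)."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % 360.0
        d = min(d, 360.0 - d)  # shortest way around the circle
        total += d * d
    return math.sqrt(total)


def h0_persistence(points, dist):
    """Death scales of 0-dimensional classes in a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies at the length of the
    first edge that merges it into another component. One component never
    dies, so n points give n - 1 finite death values.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge joins two components, so one of them dies
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Hypothetical (phi, psi) torsion pairs in degrees, invented for this sketch:
# two tight clusters, the first of which straddles the +/-180 degree wrap.
conformers = [
    (-175.0, 60.0), (178.0, 62.0), (-179.0, 58.0),
    (-60.0, -45.0), (-63.0, -42.0), (-58.0, -47.0),
]
deaths = h0_persistence(conformers, torus_distance)
long_lived = [d for d in deaths if d > 50.0]
```

Here a single long-lived component separates the two clusters, while the remaining components die at small scales. Detecting loops or voids, as needed to recognise a torus or a sphere, requires higher-dimensional persistence, which this sketch does not attempt.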

Text
Topological Data Analysis and its Application to Chemical Systems - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: October 2019

Identifiers

Local EPrints ID: 438904
URI: http://eprints.soton.ac.uk/id/eprint/438904
PURE UUID: 4ea373f4-c928-43c1-9cf5-3d53f98f21a8
ORCID for Jeremy G. Frey: orcid.org/0000-0003-0842-4302

Catalogue record

Date deposited: 26 Mar 2020 17:31
Last modified: 17 Mar 2024 05:21

Contributors

Author: Lee Steinberg
Thesis advisor: Jeremy G. Frey
