The University of Southampton
University of Southampton Institutional Repository

Machine learned potentials by active learning from organic crystal structure prediction landscapes


Butler, Patrick Walter Villers, Hafizi, Roohollah and Day, Graeme M. (2024) Machine learned potentials by active learning from organic crystal structure prediction landscapes. Journal of Physical Chemistry A. (In Press)

Record type: Article

Abstract

A primary challenge in organic molecular crystal structure prediction (CSP) is accurately ranking the energies of potential structures. While high-level solid-state density functional theory (DFT) methods allow for mostly reliable discrimination of the low-energy structures, their high computational cost is problematic because tens to hundreds of thousands of trial crystal structures must be evaluated to fully explore typical crystal energy landscapes. Consequently, lower-cost but less accurate empirical force fields are often used, sometimes as the first stage of a hierarchical scheme involving multiple stages of increasingly accurate energy calculations. Machine learned interatomic potentials (MLIPs), trained to reproduce the results of ab initio methods at a computational cost close to that of force fields, can improve the efficiency of CSP by reducing or eliminating the need for costly DFT calculations. Here, we investigate active learning methods for training MLIPs with CSP datasets. Combining active learning with the well-developed sampling methods from CSP yields, in a highly automated workflow, potentials that are relevant over a wide range of the crystal packing space. To demonstrate these potentials, we show that large, diverse crystal structure landscapes from force field-based CSP can be efficiently re-ranked to near-DFT accuracy, improving the reliability of the final energy ranking. Furthermore, we demonstrate how these potentials can be extended to more accurately model structures far from lattice energy minima through additional on-the-fly training within Monte Carlo simulations.
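
The general idea of active learning mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' workflow or code: it assumes a one-dimensional stand-in for a crystal structure descriptor, a cheap analytic function in place of the DFT reference, and a small committee of ridge-regression surrogates in place of an MLIP. All names (`reference_energy`, `fit_member`, `run_active_learning`) are hypothetical. At each round, the candidate on which the committee members disagree most is sent to the "expensive" reference and added to the training set; the final committee mean is then used to re-rank the whole pool.

```python
import numpy as np


def reference_energy(x):
    """Toy stand-in for an expensive reference calculation (assumption, not DFT)."""
    return np.sin(3.0 * x) + 0.5 * x**2


def fit_member(x, y, rng, n_features=30, ridge=1e-3):
    """Fit one surrogate: ridge regression on random cosine features."""
    w = rng.normal(scale=3.0, size=n_features)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    phi = np.cos(np.outer(x, w) + b)
    gram = phi.T @ phi + ridge * np.eye(n_features)
    coef = np.linalg.solve(gram, phi.T @ y)
    return w, b, coef


def predict(member, x):
    """Evaluate one surrogate on an array of descriptors."""
    w, b, coef = member
    return np.cos(np.outer(x, w) + b) @ coef


def run_active_learning(pool, n_initial=5, n_rounds=10, n_members=8, seed=0):
    """Grow a training set by labelling the most uncertain pool candidate each round."""
    rng = np.random.default_rng(seed)
    labelled = [int(i) for i in rng.choice(len(pool), size=n_initial, replace=False)]
    energies = [reference_energy(pool[i]) for i in labelled]
    while True:
        x_train, y_train = pool[labelled], np.array(energies)
        committee = [fit_member(x_train, y_train, rng) for _ in range(n_members)]
        preds = np.stack([predict(m, pool) for m in committee])
        if len(labelled) >= n_initial + n_rounds:
            break
        spread = preds.std(axis=0)       # committee disagreement per candidate
        spread[labelled] = -np.inf       # never re-select a labelled candidate
        pick = int(np.argmax(spread))
        labelled.append(pick)
        energies.append(reference_energy(pool[pick]))  # one new reference call
    return labelled, preds.mean(axis=0), preds.std(axis=0)


if __name__ == "__main__":
    pool = np.linspace(-2.0, 2.0, 200)   # hypothetical candidate "structures"
    labelled, mean_pred, spread = run_active_learning(pool)
    ranking = np.argsort(mean_pred)      # re-rank the full pool with the surrogate
    print("reference calls:", len(labelled), "lowest-ranked candidate:", ranking[0])
```

In this sketch the number of reference evaluations equals `n_initial + n_rounds`, regardless of the pool size, which is the cost saving that motivates training a surrogate before re-ranking a large landscape.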

Text
CSP_AL_JPhysChem_revised - Accepted Manuscript
Available under License Creative Commons Attribution.
Text
CSP_Active_Learning_ESI
Restricted to Repository staff only
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 11 January 2024

Identifiers

Local EPrints ID: 486188
URI: http://eprints.soton.ac.uk/id/eprint/486188
ISSN: 1089-5639
PURE UUID: 1642ff8b-be61-42e8-a630-779c6d0b3edf
ORCID for Roohollah Hafizi: orcid.org/0000-0001-6513-4446
ORCID for Graeme M. Day: orcid.org/0000-0001-8396-2771

Catalogue record

Date deposited: 12 Jan 2024 17:39
Last modified: 18 Mar 2024 05:02

Contributors

Author: Patrick Walter Villers Butler
Author: Roohollah Hafizi
Author: Graeme M. Day
