The University of Southampton
University of Southampton Institutional Repository

AI3SD Video: Machine Learning with Causality: Solubility Prediction in Organic Solvents and Water


Nguyen, Bao (2021) AI3SD Video: Machine Learning with Causality: Solubility Prediction in Organic Solvents and Water. Kanza, Samantha, Frey, Jeremy G., Niranjan, Mahesan and Hooper, Victoria (eds.) AI3SD Winter Seminar Series, Online. 18 Nov 2020 - 21 Apr 2021. (doi:10.5258/SOTON/P0072).

Record type: Conference or Workshop Item (Other)

Abstract

Solubility prediction remains a critical challenge in drug development, synthetic route and chemical process design, extraction and crystallisation. Here we report a successful approach to solubility prediction in organic solvents and water using a combination of machine learning (ANN, SVM, RF, ExtraTrees, Bagging and GP) and computational chemistry. Rational interpretation of the dissolution process as a numerical problem led to a small set of selected descriptors and to predictions that are independent of the machine learning method applied. These models gave significantly more accurate predictions than the benchmarked open-access and commercial tools, achieving accuracy close to the expected level of noise in the training data (LogS ± 0.7). Finally, they reproduced the physicochemical relationships between solubility and molecular properties in different solvents, which led to rational approaches to improving the accuracy of each model.
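The abstract compares prediction accuracy with the expected noise in the training data (LogS ± 0.7). As a minimal sketch of how such a comparison could be scored, the Python below computes the root-mean-square error and the share of predictions falling within a tolerance of 0.7 log units. The function names and the example values are hypothetical and chosen for illustration only; this record does not state which metrics or data the talk used.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted LogS values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(observed, predicted, tolerance=0.7):
    """Share of predictions whose absolute error is at most `tolerance` log units."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


# Hypothetical LogS values, for illustration only (not data from the talk).
observed_logs = [-1.0, -2.5, -0.25, -3.0]
predicted_logs = [-1.5, -2.0, -1.25, -3.0]

print(f"RMSE: {rmse(observed_logs, predicted_logs):.3f}")
print(f"Within 0.7 log units: {fraction_within(observed_logs, predicted_logs):.2f}")
```

The same two functions could be applied to the outputs of several models in turn, which gives one simple way to check whether accuracy depends on the learning method.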

Video
AI3SD-Winter-Seminar-Series-PropertyPrediction-BaoNguyen - Version of Record
Available under License Creative Commons Attribution.
Download (535MB)

More information

Published date: 17 March 2021
Additional Information: Dr Bao Nguyen is a Lecturer in Physical Organic Chemistry at the University of Leeds, where he has been since September 2012. He actively collaborates with colleagues from both the School of Chemistry and the School of Chemical and Process Engineering to address current challenges in process chemistry. He is a core member of the Institute of Process Research and Development (iPRD), a flagship institute set up by the Leeds Transformation Fund. Dr Nguyen did his PhD in Organic Chemistry at the University of Oxford, under the supervision of Dr John M. Brown FRS. He then moved to Dr Michael C. Willis' group, where he developed the first Pd-catalysed coupling reaction employing sulfur dioxide by suppressing catalyst deactivation. Afterward, he joined Imperial College London, working in Dr King Kuok Hii's group to delineate the nature of the palladium species in different catalytic reactions and to develop separation methods for these species. He was awarded his first independent position as a Ramsay Memorial Fellow at the Department of Chemistry, Imperial College London.
Venue - Dates: AI3SD Winter Seminar Series, Online, 2020-11-18 - 2021-04-21
Keywords: AI, AI3SD Event, Artificial Intelligence, Chemistry, Machine Learning, Machine Intelligence, ML, Property Prediction

Identifiers

Local EPrints ID: 448771
URI: http://eprints.soton.ac.uk/id/eprint/448771
PURE UUID: c3da39da-fe21-4bce-8026-d8ad69c771f7
ORCID for Samantha Kanza: orcid.org/0000-0002-4831-9489
ORCID for Jeremy G. Frey: orcid.org/0000-0003-0842-4302
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 05 May 2021 16:32
Last modified: 17 Mar 2024 03:51

Contributors

Author: Bao Nguyen
Editor: Samantha Kanza
Editor: Jeremy G. Frey
Editor: Mahesan Niranjan
Editor: Victoria Hooper
