The University of Southampton
University of Southampton Institutional Repository

AI3SD Video: Statistics are a girl’s best friend: Expanding the mechanistic study toolbox with data science


Milo, Anat (2021) AI3SD Video: Statistics are a girl’s best friend: Expanding the mechanistic study toolbox with data science. Frey, Jeremy G., Kanza, Samantha and Niranjan, Mahesan (eds.) AI3SD Autumn Seminar Series 2021. 13 Oct - 15 Dec 2021. (doi:10.5258/SOTON/AI3SD0158).

Record type: Conference or Workshop Item (Other)

Abstract

The value of amassing and standardizing chemical data for improving the efficiency of chemical discovery is becoming increasingly clear. Machine learning analyses of these data are focused on finding correlations, trends and patterns to uncover needles of knowledge in the haystack of chemical reactions. However, in many cases, especially in academic settings, we do not have the means to produce large data sets, so by necessity we remain in the Small Data regime. In this talk, I will present our work in the field of organocatalysis focused on applying machine learning strategies to small data sets as a means to uncover underlying mechanisms.
We aim to show that whereas Big Data serves to identify hidden correlations, Small Data encourages the discovery of causation. In this sense, Small Data is not just a necessity, but is key to bridging the gap between human intuition and machine learning.

Video: AI3SDAutumnSeminar-271021-AnatMilo (Version of Record, 386MB). Available under License Creative Commons Attribution.
Text: 27102021-AI3SDQA-AM (56kB). Available under License Creative Commons Attribution.

More information

Published date: 27 October 2021
Additional Information: Anat Milo received her BSc/BA in Chemistry and Humanities from the Hebrew University of Jerusalem in 2001, her MSc from UPMC Paris in 2004 with Bernold Hasenknopf, and her PhD from the Weizmann Institute of Science in 2011 with Ronny Neumann. Her postdoctoral studies at the University of Utah with Matthew Sigman focused on developing physical organic descriptors and data analysis approaches for chemical reactions. At the end of 2015 she returned to Israel to join the Department of Chemistry at Ben-Gurion University of the Negev, where her research group develops experimental, statistical, and computational strategies for identifying molecular design principles in catalysis, with a particular focus on stabilizing and intercepting reactive intermediates through second-sphere interactions.
Venue - Dates: AI3SD Autumn Seminar Series 2021, 2021-10-13 - 2021-12-15
Keywords: AI3SD Event, Chemistry, Data Decision Making, Data Quality, Data Science, Data Sharing, Datasets

Identifiers

Local EPrints ID: 452734
URI: http://eprints.soton.ac.uk/id/eprint/452734
PURE UUID: 1e00c439-a682-4c71-b7dd-3a4860d98ca1
ORCID for Jeremy G. Frey: orcid.org/0000-0003-0842-4302
ORCID for Samantha Kanza: orcid.org/0000-0002-4831-9489
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 17 Dec 2021 17:37
Last modified: 17 Mar 2024 03:51

Contributors

Author: Anat Milo
Editor: Jeremy G. Frey
Editor: Samantha Kanza
Editor: Mahesan Niranjan
