AI3SD Video: Statistics are a girl’s best friend: Expanding the mechanistic study toolbox with data science
Milo, Anat
74a63778-0643-4f23-b4ab-1f7a7bb93e43
Frey, Jeremy G.
ba60c559-c4af-44f1-87e6-ce69819bf23f
Kanza, Samantha
b73bcf34-3ff8-4691-bd09-aa657dcff420
Niranjan, Mahesan
5cbaeea8-7288-4b55-a89c-c43d212ddd4f
27 October 2021
Milo, Anat (2021) AI3SD Video: Statistics are a girl’s best friend: Expanding the mechanistic study toolbox with data science. Frey, Jeremy G., Kanza, Samantha and Niranjan, Mahesan (eds.) AI3SD Autumn Seminar Series 2021. 13 Oct - 15 Dec 2021. (doi:10.5258/SOTON/AI3SD0158).
Record type: Conference or Workshop Item (Other)
Abstract
The value of amassing and standardizing chemical data for improving the efficiency of chemical discovery is becoming increasingly clear. Machine learning analyses of these data are focused on finding correlations, trends and patterns to uncover needles of knowledge in the haystack of chemical reactions. However, in many cases, especially in academic settings, we do not have the means to produce large data sets, so by necessity we remain in the Small Data regime. In this talk, I will present our work in the field of organocatalysis focused on applying machine learning strategies to small data sets as a means to uncover underlying mechanisms.
We aim to show that whereas Big Data serves to identify hidden correlations, Small Data encourages the discovery of causation. In this sense, Small Data is not just a necessity, but is key to bridging the gap between human intuition and machine learning.
Video
AI3SDAutumnSeminar-271021-AnatMilo - Version of Record
More information
Published date: 27 October 2021
Additional Information:
Anat Milo received her BSc/BA in Chemistry and Humanities from the Hebrew University of Jerusalem in 2001, her MSc from UPMC Paris in 2004 with Bernold Hasenknopf, and her PhD from the Weizmann Institute of Science in 2011 with Ronny Neumann. Her postdoctoral studies at the University of Utah with Matthew Sigman focused on developing physical organic descriptors and data analysis approaches for chemical reactions. At the end of 2015 she returned to Israel to join the Department of Chemistry at Ben-Gurion University of the Negev, where her research group develops experimental, statistical, and computational strategies for identifying molecular design principles in catalysis, with a particular focus on stabilizing and intercepting reactive intermediates through second-sphere interactions.
Venue - Dates:
AI3SD Autumn Seminar Series 2021, 2021-10-13 - 2021-12-15
Keywords:
AI3SD Event, Chemistry, Data Decision Making, Data Quality, Data Science, Data Sharing, Datasets
Identifiers
Local EPrints ID: 452734
URI: http://eprints.soton.ac.uk/id/eprint/452734
PURE UUID: 1e00c439-a682-4c71-b7dd-3a4860d98ca1
Catalogue record
Date deposited: 17 Dec 2021 17:37
Last modified: 17 Mar 2024 03:51
Contributors
Author: Anat Milo
Editor: Mahesan Niranjan