The University of Southampton
University of Southampton Institutional Repository

AI3SD Video: The Application of Machine Learning in Molecular Spectroscopy Study


Jiang, Jun (2021) AI3SD Video: The Application of Machine Learning in Molecular Spectroscopy Study. Frey, Jeremy G., Kanza, Samantha and Niranjan, Mahesan (eds.) AI 4 Proteins Seminar Series 2021. 14 Apr - 17 Jun 2021. (doi:10.5258/SOTON/P0094).

Record type: Conference or Workshop Item (Other)

Abstract

Optical spectroscopy provides a powerful toolkit for deciphering molecular structures and their configurational evolution. However, analyzing spectroscopic signals theoretically and connecting them with structural detail is challenging, and the intrinsic complexity of the spectroscopic signals of molecular systems makes it difficult to correlate spectral characteristics with the underlying molecular structure and dynamics. Herein, we have developed data-driven machine learning (ML) protocols that can predict infrared (IR), ultraviolet/visible (UV/Vis), and Raman spectra of molecular systems at a computational cost 3 to 5 orders of magnitude lower than that of direct quantum chemistry calculations. A convolutional neural network (CNN) model was trained and tested on a dataset consisting of 87,993 spectra computed from protein peptide segments with α-helical, β-sheet, and other typical secondary structures. The secondary structure classification accuracy reached nearly 100% on spectra sets of new segments extracted from the same proteins and over 98.7% on those from homologous proteins. Importantly, we demonstrate that the ML protocol establishes cost-effective relations between spectra, structure, and chemical properties, i.e., predicting spectra from structural information and recognizing configurations or chemical properties from spectroscopic signals.

1. S. Ye, K. Zhong, J.X. Zhang, W. Hu, J. Hirst, G.Z. Zhang, S. Mukamel, J. Jiang*, A Machine Learning Protocol for Predicting Protein Infrared Spectra, J. Am. Chem. Soc. 142 (2020) 19071-19077.
2. X.J. Wang, S. Ye, W. Hu, E. Sharman, R. Liu, Y. Liu, Y. Luo, J. Jiang*, Electric Dipole Descriptor for Machine Learning Prediction of Catalyst Surface-Molecular Adsorbate Interactions, J. Am. Chem. Soc. 142 (2020) 7737-7743.
3. S. Ye, W. Hu, X. Li, J.X. Zhang, K. Zhong, G.Z. Zhang, Y. Luo, S. Mukamel*, J. Jiang*, A Neural Network Protocol for Electronic Excitations of N-Methylacetamide, Proc. Natl. Acad. Sci. USA 116 (2019) 11612-11617.
4. W. Hu, S. Ye, Y.J. Zhang, T.D. Li, G.Z. Zhang, Y. Luo, S. Mukamel, J. Jiang*, Machine Learning Protocol for Surface-Enhanced Raman Spectroscopy, J. Phys. Chem. Lett. 10 (2019) 6026-6031.

Video
AI4Proteins-Seminar-Series-JunJiang-050521 - Version of Record
Available under License Creative Commons Attribution.
Download (650MB)

More information

Published date: 12 May 2021
Additional Information: Jun Jiang is a professor at the Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China (USTC). He received a B.S. degree in Theoretical Physics in 2000 from Wuhan University, China; a Ph.D. degree in Theoretical Chemistry in 2007 from the Royal Institute of Technology, Sweden, under the tutelage of Prof. Yi Luo; and a Ph.D. degree in Solid State Physics in 2008 from the Shanghai Institute of Technical Physics, Chinese Academy of Sciences, under the tutelage of Prof. Wei Lu. From 2008 to 2011, he worked as a postdoctoral researcher at the Royal Institute of Technology, Sweden, and at the University of California, Irvine, under the tutelage of Prof. Shaul Mukamel. He joined the University of Science and Technology of China in December 2011 as a Professor of Physical Chemistry. Dr. Jiang’s research focuses on developing and applying multi-scale modeling methods and machine learning techniques to simulate charge kinetics in complex systems. His work targets a wide range of physics and chemistry applications, including photocatalysis, biochemistry, photochemistry, molecular electronics, and photonics. He has published more than 50 papers in prestigious journals such as Nature Energy, J. Am. Chem. Soc., and Angew. Chem. Int. Ed. Dr. Jiang is a recipient of the “National Science Fund for Distinguished Young Scholars in China” and has won the “Young Theoretical Chemistry Investigator Award of Chinese Chemistry Society” and the “Distinguished Lectureship Award of the Chemical Society of Japan 2020”.
Venue - Dates: AI 4 Proteins Seminar Series 2021, 2021-04-14 - 2021-06-17
Keywords: AI, AI3SD Event, Artificial Intelligence, Machine Intelligence, Machine Learning, ML, Proteins

Identifiers

Local EPrints ID: 450086
URI: http://eprints.soton.ac.uk/id/eprint/450086
PURE UUID: 2666e6a6-4635-4ce0-a5e0-754d6e47b670
ORCID for Jeremy G. Frey: orcid.org/0000-0003-0842-4302
ORCID for Samantha Kanza: orcid.org/0000-0002-4831-9489
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 09 Jul 2021 16:33
Last modified: 17 Mar 2024 03:51

Contributors

Author: Jun Jiang
Editor: Jeremy G. Frey
Editor: Samantha Kanza
Editor: Mahesan Niranjan
