The University of Southampton
University of Southampton Institutional Repository

Protein NMR assignment by isotope pattern recognition

Rasulov, Uluk, Wang, Harrison K., Viennet, Thibault, Droemer, Maxim A., Matosin, Srđan, Schindler, Sebastian, Sun, Zhen-Yu J., Mureddu, Luca, Vuister, Geerten W., Robson, Scott A., Arthanari, Haribabu and Kuprov, Ilya (2024) Protein NMR assignment by isotope pattern recognition. Science Advances, 10 (36), [eado0403]. (doi:10.1126/sciadv.ado0403).

Record type: Article

Abstract

The current standard method for amino acid signal identification in protein NMR spectra is sequential assignment using triple-resonance experiments. Good software and elaborate heuristics exist, but the process remains laboriously manual. Machine learning does help, but its training databases need millions of samples that cover all relevant physics and every kind of instrumental artifact. In this communication, we offer a solution to this problem. We propose polyadic decompositions to store millions of simulated three-dimensional NMR spectra, on-the-fly generation of artifacts during training, a probabilistic way to incorporate prior and posterior information, and integration with the industry standard CcpNmr software framework. The resulting neural nets take [¹H, ¹³C] slices of mixed pyruvate–labeled HNCA spectra (different CA signal shapes for different residue types) and return an amino acid probability table. In combination with primary sequence information, backbones of common proteins (GB1, MBP, and INMT) are rapidly assigned from just the HNCA spectrum.
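
The abstract does not say which form of polyadic decomposition is used, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a canonical polyadic format in which each peak of a three-dimensional spectrum is an outer product of three one-dimensional lineshapes, so that only the three factor matrices need to be stored. The grid sizes, peak positions, linewidths, and the function names `lorentzian` and `cp_reconstruct` are hypothetical.

```python
import numpy as np

def lorentzian(axis, centre, width):
    """Absorption-mode Lorentzian line sampled on a 1D frequency axis."""
    return width**2 / ((axis - centre) ** 2 + width**2)

def cp_reconstruct(a, b, c):
    """Rebuild a dense 3D array from factors of shapes (n1, R), (n2, R), (n3, R)."""
    return np.einsum("ir,jr,kr->ijk", a, b, c)

# Hypothetical axes (ppm) and peak list; none of these values come from the paper.
h_axis = np.linspace(6.0, 10.0, 64)      # 1H dimension
n_axis = np.linspace(100.0, 135.0, 48)   # 15N dimension
c_axis = np.linspace(40.0, 70.0, 96)     # 13C dimension
peaks = [(8.2, 120.5, 56.1), (7.6, 115.0, 61.3), (9.1, 128.4, 45.2)]

# One column per peak in each factor matrix.
a = np.stack([lorentzian(h_axis, h, 0.03) for h, _, _ in peaks], axis=1)
b = np.stack([lorentzian(n_axis, n, 0.4) for _, n, _ in peaks], axis=1)
c = np.stack([lorentzian(c_axis, ca, 0.3) for _, _, ca in peaks], axis=1)

# The dense cube is only materialised when it is actually needed.
spectrum = cp_reconstruct(a, b, c)

dense_count = spectrum.size                 # numbers held by the dense cube
factor_count = a.size + b.size + c.size     # numbers held by the factors
```

With these assumed sizes the dense cube holds 294,912 numbers while the three factor matrices hold 624, which illustrates why a factored format makes storing very large simulated libraries practical.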

Text
manuscript_sci_adv_format - Accepted Manuscript
Restricted to Repository staff only
Text
sciadv.ado0403 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 29 July 2024
e-pub ahead of print date: 4 September 2024
Additional Information: The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work.

Identifiers

Local EPrints ID: 494988
URI: http://eprints.soton.ac.uk/id/eprint/494988
ISSN: 2375-2548
PURE UUID: 38d59061-502b-4d56-a178-9d56dbfe0de8
ORCID for Ilya Kuprov: orcid.org/0000-0003-0430-2682

Catalogue record

Date deposited: 24 Oct 2024 16:50
Last modified: 25 Oct 2024 01:44

Contributors

Author: Uluk Rasulov
Author: Harrison K. Wang
Author: Thibault Viennet
Author: Maxim A. Droemer
Author: Srđan Matosin
Author: Sebastian Schindler
Author: Zhen-Yu J. Sun
Author: Luca Mureddu
Author: Geerten W. Vuister
Author: Scott A. Robson
Author: Haribabu Arthanari
Author: Ilya Kuprov
