Predicting glycan structure from tandem mass spectrometry via deep learning
Glycans constitute the most complicated post-translational modification, modulating protein activity in health and disease. However, structural annotation from tandem mass spectrometry (MS/MS) data is a bottleneck in glycomics, preventing high-throughput endeavors and relegating glycomics to a few experts. Trained on a newly curated set of 500,000 annotated MS/MS spectra, here we present CandyCrunch, a dilated residual neural network predicting glycan structure from raw liquid chromatography–MS/MS data in seconds (top-1 accuracy: 90.3%). We developed an open-access Python-based workflow of raw data conversion and prediction, followed by automated curation and fragment annotation, with predictions recapitulating and extending expert annotation. We demonstrate that this can be used for de novo annotation, diagnostic fragment identification and high-throughput glycomics. For maximum impact, this entire pipeline is tightly interlaced with our glycowork platform and can be easily tested at https://colab.research.google.com/github/BojarLab/CandyCrunch/blob/main/CandyCrunch.ipynb. We envision CandyCrunch to democratize structural glycomics and the elucidation of biological roles of glycans.
1206-1215
Urban, James
72e83b2c-12d5-42d7-a32a-2ff5d83b6116
Jin, Chunsheng
3294edf0-69cb-408e-a4f6-709a5a45b7b4
Thomsson, Kristina A.
b63999db-05d1-4f71-b3c2-bc620e638619
Karlsson, Niclas G.
1036f4d0-3080-4337-ad17-43b312657406
Ives, Callum M.
b8c798a7-ddf0-40ac-8194-c757032b85e2
Fadda, Elisa
11ba1755-9585-44aa-a38e-a8bcfd766abb
Bojar, Daniel
9c301895-2b74-4d82-8ac1-69d4b90b3958
1 July 2024
Urban, James, Jin, Chunsheng, Thomsson, Kristina A., Karlsson, Niclas G., Ives, Callum M., Fadda, Elisa and Bojar, Daniel (2024) Predicting glycan structure from tandem mass spectrometry via deep learning. Nature Methods, 21 (7), 1206-1215. (doi:10.1038/s41592-024-02314-6).
Text: s41592-024-02314-6 - Version of Record
More information
Accepted/In Press date: 17 May 2024
Published date: 1 July 2024
Additional Information:
Publisher Copyright:
© The Author(s) 2024.
Identifiers
Local EPrints ID: 500250
URI: http://eprints.soton.ac.uk/id/eprint/500250
ISSN: 1548-7091
PURE UUID: 75561460-5642-4bd5-bde4-30cd887c7467
Catalogue record
Date deposited: 23 Apr 2025 16:43
Last modified: 22 Aug 2025 02:42
Contributors
Author: James Urban
Author: Chunsheng Jin
Author: Kristina A. Thomsson
Author: Niclas G. Karlsson
Author: Callum M. Ives
Author: Elisa Fadda
Author: Daniel Bojar