A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth
Pages: 18869-18879
Culley, Christopher, Vijayakumar, Supreeta, Zampieri, Guido and Angione, Claudio (2020) A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth. Proceedings of the National Academy of Sciences, 117 (31), 18869-18879. (doi:10.1073/pnas.2002959117).
Abstract
Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype–phenotype–environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning–based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.
Text: PNAS postprint - Accepted Manuscript
Text: pnas202002959_fd4068v_2 (restricted to repository staff only)
More information
Accepted/In Press date: 12 June 2020
e-pub ahead of print date: 16 July 2020
Published date: 4 August 2020
Keywords:
Flux balance analysis, Machine learning, Metabolic modeling, Multimodal learning, Systems biology
Identifiers
Local EPrints ID: 444226
URI: http://eprints.soton.ac.uk/id/eprint/444226
ISSN: 0027-8424
PURE UUID: 081427ea-b064-4b40-a797-335fe5cf842b
Catalogue record
Date deposited: 01 Oct 2020 16:34
Last modified: 17 Mar 2024 05:46
Contributors
Author: Christopher Culley
Author: Supreeta Vijayakumar
Author: Guido Zampieri
Author: Claudio Angione