University of Southampton Institutional Repository

Enhancing the biological relevance of machine learning classifiers for reverse vaccinology

Heinson, Ashley
Gunawardana, Yawwani P
Moesker, Bastiaan
Denman Hume, Carmen C.
Vataga, Elena
Hall, Yper
Stylianou, Elena
McShane, Helen
Williams, Ann
Niranjan, Mahesan
Woelk, Christopher H.

Heinson, Ashley, Gunawardana, Yawwani P, Moesker, Bastiaan, Denman Hume, Carmen C., Vataga, Elena, Hall, Yper, Stylianou, Elena, McShane, Helen, Williams, Ann, Niranjan, Mahesan and Woelk, Christopher H. (2017) Enhancing the biological relevance of machine learning classifiers for reverse vaccinology. International Journal of Molecular Sciences, 18 (2). (doi:10.3390/ijms18020312).

Record type: Article

Abstract

Reverse vaccinology (RV) is a bioinformatics approach that can predict antigens with protective potential from the protein-coding genomes of bacterial pathogens for subunit vaccine design. RV has become firmly established following the development of the BEXSERO® vaccine against Neisseria meningitidis serogroup B. RV studies have begun to incorporate machine learning (ML) techniques to distinguish bacterial protective antigens (BPAs) from non-BPAs. This research contributes significantly to the RV field by using permutation analysis to demonstrate that a signal for protective antigens can be curated from published data. Furthermore, the effects of the following on an ML approach to RV were also assessed: nested cross-validation, balancing selection of non-BPAs for subcellular localization, increasing the training data, and incorporating greater numbers of protein annotation tools for feature generation. These enhancements yielded a support vector machine (SVM) classifier that could discriminate BPAs (n = 200) from non-BPAs (n = 200) with an area under the curve (AUC) of 0.787. In addition, hierarchical clustering of BPAs revealed that intracellular BPAs clustered separately from extracellular BPAs. However, no immediate benefit was derived when training SVM classifiers on data sets exclusively containing intra- or extracellular BPAs. In conclusion, this work demonstrates that ML classifiers have great utility in RV approaches and will lead to new subunit vaccines in the future.
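
The abstract's use of permutation analysis to test for a signal, scored by AUC, can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes classifier scores are already available and shuffles labels against those fixed scores, whereas a full analysis would typically retrain the classifier on each permuted label set. The function names, the synthetic Gaussian scores, and the permutation count are all hypothetical choices made for illustration.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_permutations=500, seed=0):
    """Return the observed AUC and the share of label shuffles that match or beat it."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between scores and labels
        if auc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Synthetic stand-in data (assumption): 50 "BPA" and 50 "non-BPA" scores.
data_rng = random.Random(42)
labels = [1] * 50 + [0] * 50
scores = ([data_rng.gauss(1.0, 1.0) for _ in range(50)]
          + [data_rng.gauss(0.0, 1.0) for _ in range(50)])

observed_auc, p_value = permutation_p_value(scores, labels)
print(f"AUC = {observed_auc:.3f}, permutation p = {p_value:.4f}")
```

A small p-value here would indicate that the observed AUC is unlikely under random labelling, which is the sense in which a permutation analysis can show that a data set carries a real signal.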

Text
ijms-18-00312-v3 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 17 January 2017
e-pub ahead of print date: 1 February 2017

Identifiers

Local EPrints ID: 446082
URI: http://eprints.soton.ac.uk/id/eprint/446082
ISSN: 1422-0067
PURE UUID: 3bbfa81c-87d1-40c0-b337-bc24a23f762a

Catalogue record

Date deposited: 20 Jan 2021 17:31
Last modified: 20 Jan 2021 17:31
