The University of Southampton
University of Southampton Institutional Repository

Improving reverse vaccinology with a machine learning approach


Bowman, Brett N., McAdam, Paul R., Vivona, Sandro, Zhang, Jin X., Luong, Tiffany, Belew, Richard K., Sahota, Harpal, Guiney, Donald, Valafar, Faramarz, Fierer, Joshua and Woelk, C.H. (2011) Improving reverse vaccinology with a machine learning approach. Vaccine, 29 (45), 8156-8164. (doi:10.1016/j.vaccine.2011.07.142). (PMID:21864619)

Record type: Article

Abstract

Reverse vaccinology aims to accelerate subunit vaccine design by rapidly predicting which proteins in a pathogenic bacterial proteome are putative protective antigens. Support vector machine classification is a machine learning approach that has been applied to solve numerous classification problems in biological sciences but has not previously been incorporated into a reverse vaccinology approach. A training data set of 136 bacterial protective antigens paired with 136 non-antigens was constructed and bioinformatic tools were used to annotate this data for predicted protein features, many of which are associated with antigenicity (i.e. extracellular localization, signal peptides and B-cell epitopes). Annotation was used to train support vector machine classifiers that exhibited a maximum accuracy of 92% for discriminating protective antigens from non-antigens as assessed by a leave-tenth-out cross-validation approach. These accuracies were superior to those achieved when annotating training data with auto and cross covariance transformations of z-descriptors for hydrophobicity, molecular size and polarity, or when classification was performed using regression methods. To further validate support vector machine classifiers, they were used to rank all the proteins in six bacterial proteomes for their antigenicity. Protective antigens from the training data were significantly recalled (enriched) in the top 75 ranked proteins for all six proteomes as assessed by a Fisher’s exact test (p < 0.05). This paper describes a superior workflow for performing reverse vaccinology studies and provides a benchmark training data set that can be used to evaluate future methodological improvements.
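
The enrichment check mentioned in the abstract (known protective antigens recalled in the top 75 ranked proteins, assessed by a Fisher's exact test) can be illustrated with a minimal sketch. This is not the authors' code: the function name `fisher_enrichment_p` and all counts below are hypothetical, and the one-sided test is computed here as the upper tail of the hypergeometric distribution using only the Python standard library.

```python
from math import comb


def fisher_enrichment_p(hits_in_top, top_n, total_known, proteome_size):
    """One-sided Fisher's exact test for enrichment.

    Returns the probability of seeing at least `hits_in_top` known
    antigens among the `top_n` ranked proteins if the `total_known`
    antigens were scattered at random through a proteome of
    `proteome_size` proteins (hypergeometric upper tail).
    """
    max_hits = min(top_n, total_known)
    all_selections = comb(proteome_size, top_n)
    tail = sum(
        comb(total_known, k) * comb(proteome_size - total_known, top_n - k)
        for k in range(hits_in_top, max_hits + 1)
    )
    return tail / all_selections


# Hypothetical counts, NOT taken from the paper: a 2000-protein proteome
# containing 5 known protective antigens, 3 of which rank in the top 75.
p_value = fisher_enrichment_p(
    hits_in_top=3, top_n=75, total_known=5, proteome_size=2000
)
print(f"one-sided p = {p_value:.2e}")
```

With these illustrative counts, roughly 0.19 known antigens would be expected in the top 75 by chance, so recalling 3 gives a p-value well below the 0.05 threshold quoted in the abstract.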

This record has no associated files available for download.

More information

e-pub ahead of print date: 22 August 2011
Published date: October 2011
Keywords: reverse vaccinology, bacterial pathogens, protective antigen, support vector machines
Organisations: Clinical & Experimental Sciences

Identifiers

Local EPrints ID: 350568
URI: http://eprints.soton.ac.uk/id/eprint/350568
PURE UUID: a26f0b62-9d07-462b-b780-e8fcfdc158ec

Catalogue record

Date deposited: 05 Apr 2013 14:52
Last modified: 14 Mar 2024 13:27

Contributors

Author: Brett N. Bowman
Author: Paul R. McAdam
Author: Sandro Vivona
Author: Jin X. Zhang
Author: Tiffany Luong
Author: Richard K. Belew
Author: Harpal Sahota
Author: Donald Guiney
Author: Faramarz Valafar
Author: Joshua Fierer
Author: C.H. Woelk
