Primer C-VAE: An interpretable deep learning primer design method to detect emerging virus variants
Motivation: PCR is more economical and quicker than Next Generation Sequencing for detecting target organisms, with primer design being a critical step. In epidemiology with rapidly mutating viruses, designing effective primers is challenging. Traditional methods require substantial manual intervention and struggle to ensure effective primer design across different strains. For organisms with large, similar genomes like Escherichia coli and Shigella flexneri, differentiating between species is also difficult but crucial.
Results: We developed Primer C-VAE, a model based on a Variational Auto-Encoder framework with Convolutional Neural Networks to identify variants and generate specific primers. Using SARS-CoV-2, our model classified variants (alpha, beta, gamma, delta, omicron) with 98% accuracy and generated variant-specific primers. These primers appeared with >95% frequency in target variants and
q-bio.GN, cs.LG
Wang, Hanyu
21caf449-b5ce-4f73-8a82-60f6cf5ceb21
Tsinda, Emmanuel K.
982bfdfd-fe65-4ee5-8d40-473db7b3480f
Dunn, Anthony J.
161d9c8e-6813-4909-95ea-6c11bbbca287
Chikweto, Francis
ea6e5d9d-24f0-48b3-ab47-9d9117857c57
Zemkoho, Alain B.
30c79e30-9879-48bd-8d0b-e2fbbc01269e
Text: 2503.01459v2 (Version of Record)
More information
Accepted/In Press date: 3 March 2025
Identifiers
Local EPrints ID: 508843
URI: http://eprints.soton.ac.uk/id/eprint/508843
ISSN: 2331-8422
PURE UUID: f6a2e3dd-802c-4b2d-bacf-4bae180836ee
Catalogue record
Date deposited: 04 Feb 2026 17:54
Last modified: 05 Feb 2026 02:47
Contributors
Author: Hanyu Wang
Author: Emmanuel K. Tsinda
Author: Anthony J. Dunn
Author: Francis Chikweto