University of Southampton Institutional Repository

Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity

Fallerini, Chiara, Picchiotti, Nicola and Baldassarri, Margherita, et al., WES/WGS Working Group Within the HGI, GenOMICC Consortium and GEN-COVID Multicenter Study (2022) Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity. Human Genetics, 141 (1), 147-173. (doi:10.1007/s00439-021-02397-7).

Record type: Article

Abstract

The combined impact of common and rare exonic variants in COVID-19 host genetics is still insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features according to the absence or presence of variants in each gene. An ensemble of LASSO logistic regression models was then used to identify the Boolean features most informative about the genetic basis of severity. The features selected by these models were combined into an Integrated PolyGenic Score, a synthetic and interpretable index of the contribution of host genetics to COVID-19 severity, as demonstrated through testing in several independent cohorts. The selected features span ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, and might also help guide bedside disease management.
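
As a rough illustration of the first two steps described in the abstract, the sketch below turns per-sample variant calls into gene-level Boolean features and fits one L1-penalised (LASSO) logistic regression by proximal gradient descent. It is a toy under stated assumptions, not the authors' pipeline: the gene names, the eight samples, the penalty strength, and the solver are all hypothetical. It also leaves out the ensemble of models, the separate feature sets per variant-frequency class, the sex-specific analysis, and the Integrated PolyGenic Score formula, which the abstract does not spell out.

```python
import math

def boolean_features(carriers, genes):
    """Return one 0/1 row per sample: 1 if the sample carries any variant in the gene."""
    return [[1 if g in sample else 0 for g in genes] for sample in carriers]

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink towards zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=2000):
    """Proximal gradient descent for logistic loss + lam * ||w||_1 (intercept unpenalised)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical toy cohort: the genes in which each sample carries a variant.
genes = ["GENE_A", "GENE_B"]
carriers = [
    {"GENE_A"}, {"GENE_A", "GENE_B"}, {"GENE_A"}, {"GENE_A", "GENE_B"},
    set(), {"GENE_B"}, set(), {"GENE_B"},
]
y = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = severe, 0 = not severe (made-up labels)

X = boolean_features(carriers, genes)
w, b = fit_l1_logistic(X, y)

# Features whose coefficient survives the L1 penalty count as "selected".
selected = [g for g, wj in zip(genes, w) if wj != 0.0]
print(selected, [round(wj, 3) for wj in w])
```

In this toy cohort the hypothetical GENE_A is enriched among severe cases, while GENE_B is equally common in both groups, so the penalty drives the GENE_B coefficient to exactly zero and only GENE_A is selected. The study applies this kind of sparsity-driven feature selection at exome scale.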

Text
s00439-021-02397-7 - Version of Record
Available under License Creative Commons Attribution.

More information

Published date: 1 January 2022

Identifiers

Local EPrints ID: 490727
URI: http://eprints.soton.ac.uk/id/eprint/490727
ISSN: 0340-6717
PURE UUID: e208f589-6b40-48a5-9b59-543513fe407a
ORCID for Rebecca Cusack: orcid.org/0000-0003-2863-2870

Catalogue record

Date deposited: 04 Jun 2024 16:58
Last modified: 05 Jun 2024 01:55

Contributors

Author: Chiara Fallerini
Author: Nicola Picchiotti
Author: Margherita Baldassarri
Author: Rebecca Cusack
Corporate Author: et al.
Corporate Author: WES/WGS Working Group Within the HGI
Corporate Author: GenOMICC Consortium
Corporate Author: GEN-COVID Multicenter Study
