Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity
147-173
Fallerini, Chiara
Picchiotti, Nicola
Baldassarri, Margherita
Cusack, Rebecca
WES/WGS Working Group Within the HGI
GEN-COVID Multicenter Study
1 January 2022
Fallerini, Chiara, Picchiotti, Nicola and Baldassarri, Margherita, et al., WES/WGS Working Group Within the HGI, GenOMICC Consortium and GEN-COVID Multicenter Study (2022) Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity. Human Genetics, 141 (1), 147-173. (doi:10.1007/s00439-021-02397-7).
Abstract
The combined impact of common and rare exonic variants in COVID-19 host genetics remains poorly understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics to COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, and could also help guide bedside disease management.
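The two data-handling steps named in the abstract (gene-level Boolean features, then a score built from the selected features) can be illustrated with a minimal sketch. This is not the authors' implementation: the gene names, the direction assigned to each selected feature, and the per-class weights below are hypothetical placeholders, and the LASSO feature-selection step is assumed to have already been run elsewhere. The score shown is a simple signed, weighted count, which is one plausible reading of "combined into an Integrated PolyGenic Score", not the published formula.

```python
from typing import Dict, Iterable, Set, Tuple

# A feature is identified by (frequency_class, gene).
Feature = Tuple[str, str]
# Hypothetical per-sample input: for each frequency class, the set of genes
# in which at least one qualifying variant was observed.
Sample = Dict[str, Set[str]]


def boolean_features(sample: Sample, names: Iterable[Feature]) -> Dict[Feature, bool]:
    """Return True for each (class, gene) pair where the sample carries a variant."""
    return {(cls, gene): gene in sample.get(cls, set()) for cls, gene in names}


def polygenic_score(
    features: Dict[Feature, bool],
    selected: Dict[Feature, int],
    class_weight: Dict[str, float],
) -> float:
    """Signed weighted count over the selected features that are present.

    `selected` maps a feature to +1 (assumed severity-associated) or
    -1 (assumed protective); `class_weight` weights each frequency class.
    """
    score = 0.0
    for name, direction in selected.items():
        if features.get(name, False):
            score += direction * class_weight[name[0]]
    return score


# Illustrative values only (assumptions, not taken from the paper).
selected = {
    ("common", "GENE_A"): +1,
    ("rare", "GENE_B"): +1,
    ("ultra_rare", "GENE_C"): -1,
}
class_weight = {"common": 1.0, "low_frequency": 2.0, "rare": 3.0, "ultra_rare": 4.0}

sample = {"common": {"GENE_A"}, "ultra_rare": {"GENE_C"}}
features = boolean_features(sample, selected.keys())
score = polygenic_score(features, selected, class_weight)
```

Keeping the frequency class in the feature key mirrors the abstract's "separate sets of Boolean features", so the same gene can contribute independently through its common and rare variants.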
Text: s00439-021-02397-7 - Version of Record
More information
Published date: 1 January 2022
Identifiers
Local EPrints ID: 490727
URI: http://eprints.soton.ac.uk/id/eprint/490727
ISSN: 0340-6717
PURE UUID: e208f589-6b40-48a5-9b59-543513fe407a
Catalogue record
Date deposited: 04 Jun 2024 16:58
Last modified: 05 Jun 2024 01:55
Contributors
Author: Chiara Fallerini
Author: Nicola Picchiotti
Author: Margherita Baldassarri
Author: Rebecca Cusack
Corporate Author: et al.
Corporate Author: WES/WGS Working Group Within the HGI
Corporate Author: GenOMICC Consortium
Corporate Author: GEN-COVID Multicenter Study