The University of Southampton
University of Southampton Institutional Repository

Ranking the importance of genetic factors by variable‐selection confidence sets


Zheng, Chao, Ferrari, Davide, Zhang, Michael and Baird, Paul (2019) Ranking the importance of genetic factors by variable‐selection confidence sets. Journal of the Royal Statistical Society, Series C (Applied Statistics), 68 (3), 727-749. (doi:10.1111/rssc.12337).

Record type: Article

Abstract

The widespread use of generalized linear models in case-control genetic studies has helped identify many disease-associated risk factors, typically defined as DNA variants or single nucleotide polymorphisms (SNPs). Up to now, most of the literature has focused on selecting a unique best subset of SNPs according to some statistical criterion. When the noise is large compared to the signal, however, a given dataset often supports multiple biological paths. We address the ambiguity related to SNP selection by constructing a list of models C, called a variable selection confidence set (VSCS), which contains all well-supported SNP combinations at a user-specified confidence level. The VSCS extends the familiar notion of confidence intervals to the variable selection setting and gives the practitioner new tools that aid variable selection beyond trusting a single model. Based on the VSCS, we consider natural graphical and numerical statistics measuring the inclusion importance of a SNP based on its frequency in the most parsimonious VSCS models. This work is motivated by available case-control genetic data on age-related macular degeneration, a widespread disease and leading cause of vision loss.
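
The inclusion-importance idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the VSCS has already been computed and is given as a list of SNP sets, and it reads "most parsimonious" as the models that have no proper submodel in the VSCS. The function names and the toy SNP labels are hypothetical.

```python
def most_parsimonious(models):
    """Return the models with no proper subset in the collection.

    Assumption: this is one reading of "most parsimonious VSCS models".
    """
    unique = {frozenset(m) for m in models}
    return [m for m in unique if not any(other < m for other in unique)]


def inclusion_importance(models):
    """Share of the most parsimonious models that contain each SNP."""
    unique = {frozenset(m) for m in models}
    minimal = most_parsimonious(unique)
    if not minimal:
        return {}
    all_snps = sorted(set().union(*unique))
    return {
        snp: sum(snp in m for m in minimal) / len(minimal)
        for snp in all_snps
    }


# Hypothetical VSCS: each set is one well-supported SNP combination.
vscs = [
    {"snp1", "snp2"},
    {"snp1", "snp3"},
    {"snp1", "snp2", "snp3"},
    {"snp1", "snp2", "snp4"},
]
importance = inclusion_importance(vscs)
```

In this toy example only the two smallest models are retained, so a SNP that appears in both scores 1.0 and a SNP that appears only in larger models scores 0.0.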

Text
Submission_JRSSC_rev2 - Accepted Manuscript

More information

Accepted/In Press date: 10 December 2018
e-pub ahead of print date: 21 February 2019
Published date: 1 April 2019

Identifiers

Local EPrints ID: 441568
URI: http://eprints.soton.ac.uk/id/eprint/441568
ISSN: 0035-9254
PURE UUID: 8daa8a1e-f23e-49cf-8672-39c9c2664113
ORCID for Chao Zheng: orcid.org/0000-0001-7943-6349

Catalogue record

Date deposited: 18 Jun 2020 16:30
Last modified: 17 Mar 2024 05:39

Contributors

Author: Chao Zheng
Author: Davide Ferrari
Author: Michael Zhang
Author: Paul Baird
