The mutational constraint spectrum quantified from variation in 141,456 humans
434–443
Karczewski, Konrad J.
b6d81e8d-3586-4016-aeb7-edfa89372d15
Francioli, Laurent C.
4ca1708d-dd00-4eb3-a1c4-44d097f33a4e
Tiao, Grace
ffe34a5f-b15c-4144-bbc3-8489665d7c5c
Seaby, Eleanor
ec948f42-007c-4bd8-9dff-bb86278bf03f
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Genome Aggregation Database Consortium
28 May 2020
Karczewski, Konrad J., Francioli, Laurent C., Tiao, Grace and Seaby, Eleanor, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA and Genome Aggregation Database Consortium (2020) The mutational constraint spectrum quantified from variation in 141,456 humans. Nature, 581, 434–443. (doi:10.1038/s41586-020-2308-7).
Abstract
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
More information
Accepted/In Press date: 26 March 2020
Published date: 28 May 2020
Additional Information:
Addendum to: Nature https://doi.org/10.1038/s41586-020-2308-7 Published online 27 May 2020
Identifiers
Local EPrints ID: 469934
URI: http://eprints.soton.ac.uk/id/eprint/469934
ISSN: 0028-0836
PURE UUID: 4e3d4e71-33e4-496e-b230-b0f86da90ba5
Catalogue record
Date deposited: 28 Sep 2022 17:14
Last modified: 17 Mar 2024 04:05
Contributors
Author: Konrad J. Karczewski
Author: Laurent C. Francioli
Author: Grace Tiao
Author: Eleanor Seaby
Corporate Author: Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Corporate Author: Genome Aggregation Database Consortium