Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations
BACKGROUND: An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution.
RESULTS: We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure.
CONCLUSIONS: WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
1-10
Pengelly, Reuben
Tapper, William
Gibson, Jane
Knut, Marcin
Tearle, Rick
Collins, Andrew
Ennis, Sarah
3 September 2015
Pengelly, Reuben, Tapper, William and Gibson, Jane et al.
(2015)
Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations.
BMC Genomics, 16 (1), [666].
(doi:10.1186/s12864-015-1854-0).
(PMID:26335686)
Text
s12864-015-1854-0.pdf
- Version of Record
More information
Accepted/In Press date: 17 August 2015
Published date: 3 September 2015
Organisations:
Human Development & Health
Identifiers
Local EPrints ID: 381353
URI: http://eprints.soton.ac.uk/id/eprint/381353
ISSN: 1471-2164
PURE UUID: a137ea90-f2d5-4427-b9af-a80ee1e1cd58
Catalogue record
Date deposited: 06 Oct 2015 12:59
Last modified: 15 Mar 2024 03:48
Contributors
Author: Marcin Knut
Author: Rick Tearle