Exploiting large scale computing for the analysis of genetic linkage disequilibrium
University of Southampton
2007
Lau, Winston Wai Shing (2007) Exploiting large scale computing for the analysis of genetic linkage disequilibrium. University of Southampton, Doctoral Thesis.
Record type: Thesis (Doctoral)
Abstract
Linkage disequilibrium (LD) maps are analogous to linkage maps in that map distances are additive but, for disease association mapping, they provide potentially much higher resolution. LDMAP is a program for constructing LD maps from single nucleotide polymorphism (SNP) data of large population samples. It employs a sequential algorithm which consecutively selects the informative SNP pairwise data for each interval between adjacent SNPs, and models the decline of association with distance centered on each interval. From the HapMap project and other sources, millions of SNP genotypes are now available for constructing LD maps at high resolution. The voluminous data imposes a considerable computational load, which can be addressed effectively with the parallel paradigm utilizing large scale computing. Various approaches have been evaluated for the best implementation of the high throughput alternative. The equivalent of 2.8 computing years of work was completed in one month of real time to construct genome-wide LD maps for four HapMap populations, using a parallelized version of the LDMAP program deployed on a local Beowulf cluster. Comparison with lower resolution phase I maps confirms that map distances are essentially marker density independent. The higher resolution of the phase II maps, however, resolves "holes" where LD is low. The developed parallel framework has also been adopted in CHROMSCAN to enable the rapid analysis of disease association data, thus demonstrating the extensibility of the paradigm.
This record has no associated files available for download.
More information
Published date: 2007
Identifiers
Local EPrints ID: 466352
URI: http://eprints.soton.ac.uk/id/eprint/466352
PURE UUID: bdddb4bf-c44e-48b3-852a-ada76d80c538
Catalogue record
Date deposited: 05 Jul 2022 05:12
Last modified: 05 Jul 2022 05:12
Contributors
Author: Winston Wai Shing Lau