Exome sequence read depth methods for identifying copy number changes
Copy number variants (CNVs) play important roles in a number of human diseases and in pharmacogenetics. Powerful methods exist for CNV detection in whole genome sequencing (WGS) data, but such data are costly to obtain. Many disease-causal CNVs span or lie within coding regions of the genome (exons), which makes CNV detection using whole exome sequencing (WES) data attractive. If reliably validated against WGS-based CNVs, exome-derived CNVs have potential applications in a clinical setting. Several algorithms have been developed to exploit exome data for CNV detection, and comparisons have been made to find the most suitable methods for particular data samples, but the results are not consistent across studies. Here, we review some of the exome CNV detection methods based on depth of coverage profiles and examine their performance to identify problems contributing to discrepancies in published results. We also present a streamlined strategy that uses a single metric, the likelihood ratio, to compare exome methods, and we demonstrate its utility by applying the VarScan 2 and eXome Hidden Markov Model (XHMM) programs to paired normal and tumour exome data from chronic lymphocytic leukaemia patients. We use array-based somatic CNV (SCNV) calls as a reference standard to compute prevalence-independent statistics, such as sensitivity, specificity and likelihood ratio, for validation of the exome-derived SCNVs. We also account for factors known to influence the performance of exome read depth methods, such as CNV size and frequency, while comparing our findings with published results.
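The prevalence-independent statistics named in the abstract can be illustrated with a minimal sketch. It assumes that exome-derived calls have already been compared against the array-based reference and tallied into true/false positives and negatives. The unit of comparison (for example, per exon or per region), the function name `diagnostic_stats` and the counts in the usage lines are illustrative assumptions and are not taken from the paper.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from confusion counts.

    tp/fp/fn/tn are counts of exome calls judged against a reference
    standard (hypothetical inputs; how they are tallied is up to the user).
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference positive and one reference negative")

    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    false_positive_rate = fp / (tn + fp)  # equals 1 - specificity
    false_negative_rate = fn / (tp + fn)  # equals 1 - sensitivity

    # Positive likelihood ratio: how much more often a call is made when a
    # reference CNV is present than when it is absent.
    lr_pos = sensitivity / false_positive_rate if fp > 0 else float("inf")
    # Negative likelihood ratio: the same comparison for the absence of a call.
    lr_neg = false_negative_rate / specificity if tn > 0 else float("inf")

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Made-up counts, purely for illustration.
stats = diagnostic_stats(tp=40, fp=10, fn=10, tn=940)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and specificity are each computed within the reference-positive or reference-negative group, the likelihood ratios built from them do not depend on how common CNVs are in the sample, which is what makes a single such metric usable for comparing methods across data sets.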
Keywords: Algorithms; Base Sequence; Chromosome Mapping/methods; DNA Copy Number Variations/genetics; DNA, Neoplasm/genetics; Data Interpretation, Statistical; Exome/genetics; Humans; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Molecular Sequence Data; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity; Sequence Analysis, DNA/methods
380-392
Kadalayil, Latha
92878359-046f-4f1c-b9e9-4fce54607ac4
Rafiq, Sajjad
54722709-929f-4faa-b4d9-863d4d563056
Rose-Zerilli, Matthew J J
29603401-e310-4054-b818-8a542c361b9a
Pengelly, Reuben J
af97c0c1-b568-415c-9f59-1823b65be76d
Parker, Helen
1652d4f8-0e80-4ecc-be24-aebed48c6a19
Oscier, David
c2620a1d-25bb-48f7-9651-f5d023636381
Strefford, Jonathan C
3782b392-f080-42bf-bdca-8aa5d6ca532f
Tapper, William J
9d5ddc92-a8dd-4c78-ac67-c5867b62724c
Gibson, Jane
62ecc833-c348-44a1-be5c-3010d15eccf1
Ennis, Sarah
7b57f188-9d91-4beb-b217-09856146f1e9
Collins, Andrew
dfaf2088-2c1c-44b3-a347-c18b66a2082d
May 2015
Kadalayil, Latha, Rafiq, Sajjad, Rose-Zerilli, Matthew J J, Pengelly, Reuben J, Parker, Helen, Oscier, David, Strefford, Jonathan C, Tapper, William J, Gibson, Jane, Ennis, Sarah and Collins, Andrew (2015) Exome sequence read depth methods for identifying copy number changes. Briefings in Bioinformatics, 16 (3), 380-392, [bbu027]. (doi:10.1093/bib/bbu027).
This record has no associated files available for download.
More information
Accepted/In Press date: 28 August 2014
e-pub ahead of print date: 28 August 2014
Published date: May 2015
Organisations:
Chemistry, Cancer Sciences, Human Development & Health, Centre for Biological Sciences
Identifiers
Local EPrints ID: 368514
URI: http://eprints.soton.ac.uk/id/eprint/368514
ISSN: 1467-5463
PURE UUID: 956e6aa1-5bf4-4fe6-b928-2e4e7bf01f6d
Catalogue record
Date deposited: 09 Sep 2014 13:37
Last modified: 15 Mar 2024 03:48
Contributors
Author: Latha Kadalayil
Author: Sajjad Rafiq
Author: Matthew J J Rose-Zerilli
Author: Helen Parker
Author: David Oscier
Author: Jane Gibson
Author: Andrew Collins