The University of Southampton
University of Southampton Institutional Repository

Detection of long repeat expansions from PCR-free whole-genome sequence data

Dolzhenko, Egor, van Vugt, Joke J.F.A., Shaw, Richard J, Bekritsky, Mitchell A, van Blitterswijk, Marka, Narzisi, Giuseppe, Ajay, Subramanian S., Rajan, Vani, Lajoie, Bryan R., Johnson, Nathan H., Kingsbury, Zoya, Humphray, Sean J., Schellevis, Raymond D, Brands, William J., Baker, Matt, Rademakers, Rosa, Kooyman, Maarten, Tazelaar, Gijs H.P., van Es, Michael A., McLaughlin, Russell, Sproviero, William, Shatunov, Aleksey, Jones, Ashley, Al Khleifat, Ahmad, Pittman, Alan, Morgan, Sarah, Hardiman, Orla, Al-Chalabi, Ammar, Shaw, Chris, Smith, Bradley, Neo, Edmund J., Morrison, Karen, Shaw, Pamela J., Reeves, Catherine, Winterkorn, Lara, Wexler, Nancy S., Housman, David E., Ng, Christopher W., Li, Alina L., Taft, Ryan J., van den Berg, Leonard H., Bentley, David R., Veldink, Jan H. and Eberle, Michael A., US–Venezuela Collaborative Research Group (2017) Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Research, 27 (11), 1895-1903. (doi:10.1101/gr.225672.117).

Record type: Article

Abstract

Identifying large expansions of short tandem repeats (STRs), such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome, is challenging for short-read whole-genome sequencing (WGS) data. A solution to this problem is an important step toward integrating WGS into precision medicine. We developed a software tool called ExpansionHunter that, using PCR-free WGS short-read data, can genotype repeats at the locus of interest, even if the expanded repeat is larger than the read length. We applied our algorithm to WGS data from 3001 ALS patients who have been tested for the presence of the C9orf72 repeat expansion with repeat-primed PCR (RP-PCR). Compared against this truth data, ExpansionHunter correctly classified all (212/212, 95% CI [0.98, 1.00]) of the expanded samples as either expansions (208) or potential expansions (4). Additionally, 99.9% (2786/2789, 95% CI [0.997, 1.00]) of the wild-type samples were correctly classified as wild type by this method with the remaining three samples identified as possible expansions. We further applied our algorithm to a set of 152 samples in which every sample had one of eight different pathogenic repeat expansions, including those associated with fragile X syndrome, Friedreich's ataxia, and Huntington's disease, and correctly flagged all but one of the known repeat expansions. Thus, ExpansionHunter can be used to accurately detect known pathogenic repeat expansions and provides researchers with a tool that can be used to identify new pathogenic repeat expansions.
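
The confidence intervals quoted in the abstract can be sanity-checked with a short calculation. The record does not say which interval method the authors used, so the sketch below assumes an exact (Clopper-Pearson) binomial interval; the function names are illustrative and are not part of ExpansionHunter.

```python
import math


def binom_pmf(i, n, p):
    """Binomial probability P(X = i) for X ~ Bin(n, p), computed in log space."""
    if p <= 0.0:
        return 1.0 if i == 0 else 0.0
    if p >= 1.0:
        return 1.0 if i == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
               + i * math.log(p) + (n - i) * math.log1p(-p))
    return math.exp(log_pmf)


def upper_tail(k, n, p):
    """P(X >= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def solve_increasing(g, iters=100):
    """Bisection on [0, 1] for a function g that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials.

    Assumption: this is one common choice of interval, not necessarily
    the method used in the paper.
    """
    # Lower bound: the p at which seeing k or more successes has probability alpha/2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: upper_tail(k, n, p) - alpha / 2.0)
    # Upper bound: the p at which seeing k or fewer successes has probability alpha/2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: alpha / 2.0 - lower_tail(k, n, p))
    return lower, upper


if __name__ == "__main__":
    # Counts taken from the abstract: expanded samples, then wild-type samples.
    for k, n in [(212, 212), (2786, 2789)]:
        lo, hi = clopper_pearson(k, n)
        print(f"{k}/{n}: proportion={k / n:.4f}, 95% CI=[{lo:.4f}, {hi:.4f}]")
```

When every sample is classified correctly (212/212), the lower bound under this assumption has the closed form 0.025^(1/212), which rounds to 0.98 and agrees with the interval reported in the abstract.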

Text
Genome Res.-2017-Dolzhenko-1895-903 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 28 August 2017
e-pub ahead of print date: 8 September 2017
Published date: November 2017
Keywords: Journal Article

Identifiers

Local EPrints ID: 417051
URI: http://eprints.soton.ac.uk/id/eprint/417051
ISSN: 1088-9051
PURE UUID: 1737088b-a632-4b2e-846f-273755284d46
ORCID for Karen Morrison: orcid.org/0000-0003-0216-5717

Catalogue record

Date deposited: 18 Jan 2018 17:30
Last modified: 07 Oct 2020 02:09

Contributors

Author: Egor Dolzhenko
Author: Joke J.F.A. van Vugt
Author: Richard J Shaw
Author: Mitchell A Bekritsky
Author: Marka van Blitterswijk
Author: Giuseppe Narzisi
Author: Subramanian S. Ajay
Author: Vani Rajan
Author: Bryan R. Lajoie
Author: Nathan H. Johnson
Author: Zoya Kingsbury
Author: Sean J. Humphray
Author: Raymond D Schellevis
Author: William J. Brands
Author: Matt Baker
Author: Rosa Rademakers
Author: Maarten Kooyman
Author: Gijs H.P. Tazelaar
Author: Michael A. van Es
Author: Russell McLaughlin
Author: William Sproviero
Author: Aleksey Shatunov
Author: Ashley Jones
Author: Ahmad Al Khleifat
Author: Alan Pittman
Author: Sarah Morgan
Author: Orla Hardiman
Author: Ammar Al-Chalabi
Author: Chris Shaw
Author: Bradley Smith
Author: Edmund J. Neo
Author: Karen Morrison
Author: Pamela J. Shaw
Author: Catherine Reeves
Author: Lara Winterkorn
Author: Nancy S. Wexler
Author: David E. Housman
Author: Christopher W. Ng
Author: Alina L. Li
Author: Ryan J. Taft
Author: Leonard H. van den Berg
Author: David R. Bentley
Author: Jan H. Veldink
Author: Michael A. Eberle
Corporate Author: US–Venezuela Collaborative Research Group
