Optimizing depth and type of high-throughput sequencing data for microsatellite discovery
PREMISE: Simple sequence repeat (SSR) markers (microsatellites) are a mainstay of many labs, especially when working on a limited budget, carrying out preliminary analyses, and in teaching. Whether SSRs mined from plant genomes or transcriptomes are preferred for certain applications, and the depth of sequencing needed to allow efficient SSR discovery, have not been tested.
METHODS: I used genome and transcriptome high-throughput sequencing data at a range of sequencing depths to compare efficacy of SSR identification. I then tested primers from tomato for amplification, polymorphism, and transferability to related species.
RESULTS: Small assemblies (two million read pairs) identified ca. 200–2000 potential markers from the genome assemblies and ca. 600–3650 from the transcriptome assemblies. Genome-derived contigs were often short, potentially precluding primer design. Genomic SSR primers were less transferable across species but exhibited greater variation (partially explained by being composed of more repeat units) than transcriptome-derived primers.
DISCUSSION: Small high-throughput sequencing resources may be sufficient for identification of hundreds of SSRs. Genomic data may be preferable in species with low polymorphism, but transcriptome data may result in longer loci (more amenable to primer design), and the resulting primers may be more transferable to related species.
Chapman, Mark
8bac4a92-bfa7-4c3c-af29-9af852ef6383
November 2019
Chapman, Mark (2019) Optimizing depth and type of high-throughput sequencing data for microsatellite discovery. Applications in Plant Sciences, 7 (11), [e11298]. (doi:10.1002/aps3.11298).
Text
Chapman-2019-Applications_in_Plant_Sciences
- Version of Record
More information
Accepted/In Press date: 11 October 2019
e-pub ahead of print date: 3 November 2019
Published date: November 2019
Identifiers
Local EPrints ID: 435676
URI: http://eprints.soton.ac.uk/id/eprint/435676
ISSN: 2168-0450
PURE UUID: 5d1d2e45-b4cc-472a-84ae-da04861ab415
Catalogue record
Date deposited: 18 Nov 2019 17:30
Last modified: 17 Mar 2024 03:31