The University of Southampton
University of Southampton Institutional Repository

Visualising the repeat structure of genomic sequences

Whiteford, N., Haslam, N., Weber, G., Prugel-Bennett, A., Essex, J. W. and Neylon, C. (2008) Visualising the repeat structure of genomic sequences. Complex Systems, 17 (4), 381-398.

Record type: Article

Abstract

Repeats are a common feature of genomic sequences and much remains to be understood of their origin and structure. The identification of repeated strings in genomic sequences is therefore of importance for a variety of applications in biology.

In this paper a new method for finding all repeats and visualizing them in a two-dimensional plot is presented. The method is first applied to a set of constructed sequences in order to develop a comparative framework. Several complete genomes are then analyzed, including the whole human genome.

The technique reveals the complex repeat structure of genomic sequences. In particular, interesting differences in the repeat character of the coding and noncoding regions of bacterial genomes are noted.

The method allows fast identification of all repeats and easy inter-genome comparison. In doing so, the plot effectively creates a signature of a sequence, which allows some classes of repeats present in that sequence to be identified by simple visual inspection.

To our knowledge, this is the first time all exact repeats have been visualized in a single plot that highlights the degree to which repeats occur within a genomic sequence, giving an indication of the important role repeats play. From this it is clear that large-scale repeat analysis remains an important and unsolved problem in bioinformatics.

More information

Published date: 1 April 2008
Organisations: Southampton Wireless Group

Identifiers

Local EPrints ID: 270919
URI: http://eprints.soton.ac.uk/id/eprint/270919
PURE UUID: 51e3e861-b69c-487d-960f-9fd59bbdd60e

Catalogue record

Date deposited: 23 Apr 2010 14:52
Last modified: 27 Feb 2019 17:31
