Visualising the repeat structure of genomic sequences

Whiteford, Nava E., Haslam, Niall J., Weber, Gerald, Prügel-Bennett, Adam, Essex, Jonathan W. and Neylon, Cameron (2008) Visualising the repeat structure of genomic sequences. Complex Systems, 17 (4), pp. 381-398.




Repeats are a common feature of genomic sequences, and much remains to be understood about their origin and structure. The identification of repeated strings in genomic sequences is therefore important for a variety of applications in biology. In this paper a new method for finding all repeats and visualising them in a two-dimensional plot is presented. The method is first applied to a set of constructed sequences in order to develop a comparative framework. Several complete genomes are then analysed, including the whole human genome. The technique reveals the complex repeat structure of genomic sequences. In particular, interesting differences in the repeat character of the coding and non-coding regions of bacterial genomes are noted. The method allows fast identification of all repeats and easy intergenome comparison. In doing so, the plot effectively creates a signature of a sequence, which allows some classes of repeat present in that sequence to be identified by simple visual inspection. To our knowledge this is the first time all exact repeats have been visualised in a single plot that highlights the degree to which repeats occur within a genomic sequence, giving an indication of the important role repeats play. From this it is clear that large-scale repeat analysis remains an important and unsolved problem in bioinformatics.
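
To make the notion of an "exact repeat" concrete, the following is a minimal sketch in Python. It is not the method described in the paper, whose details are not given in this record. It is a naive illustration that assumes a repeat means any substring occurring at least twice, and it counts, for each substring length, how many distinct substrings are repeated. The function name repeat_profile and the toy sequence are hypothetical.

```python
from collections import Counter


def repeat_profile(seq, max_len):
    """Map each length k (1..max_len) to the number of distinct
    substrings of length k that occur at least twice in seq.

    Naive illustration only: it rescans the sequence for every k,
    so it is suitable for short toy sequences, not whole genomes.
    """
    profile = {}
    for k in range(1, max_len + 1):
        # Count every (overlapping) substring of length k.
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        # Keep only substrings seen more than once, i.e. exact repeats.
        profile[k] = sum(1 for c in counts.values() if c > 1)
    return profile


if __name__ == "__main__":
    toy = "ACGTACGTTTACG"  # hypothetical toy sequence
    for length, n_repeated in repeat_profile(toy, 5).items():
        print(length, n_repeated)
```

Plotting repeat length against such counts would give one possible two-dimensional summary of a sequence. Whether this corresponds to the axes used in the paper is an assumption that should be checked against the full text.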

Item Type: Article
ePrint ID: 64707
Date Deposited: 09 Jan 2009
Last Modified: 16 Apr 2017 17:18