The University of Southampton
University of Southampton Institutional Repository

Evolving the structure of Hidden Markov Models for biological sequence analysis


Hidden Markov Models (HMMs) are widely used for biological sequence analysis because of their ability to incorporate biological information in their structure.  An automatic method of optimising the structure of HMMs for biological sequence analysis is highly desirable.

In this thesis, we explore the possibility of using a genetic algorithm (GA) to optimise the HMM structure. Baum-Welch training is hybridised into the GA's evolutionary cycle. To prevent overfitting, the HMMs are compared on a dataset separate from the one used for Baum-Welch training.

The proposed GA for hidden Markov models (GA-HMM) allows HMMs with different numbers of states to evolve. The GA-HMM found an HMM comparable to a previously published hand-coded HMM designed for the same task.
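
The hybrid cycle described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the discrete-emission HMM, the add/delete-state mutation, truncation selection, the pseudocount `eps`, and all function names are assumptions made for illustration. What it does take from the abstract is the overall shape: each individual is refined by Baum-Welch on a training set, and individuals with different numbers of states are then compared on a separate dataset.

```python
import math
import random

def normalise(row):
    total = sum(row)
    return [x / total for x in row]

def random_hmm(n_states, n_symbols, rng):
    """Random discrete HMM (pi, A, B); every probability is strictly positive."""
    def row(k):
        return normalise([rng.random() + 0.1 for _ in range(k)])
    return (row(n_states),
            [row(n_states) for _ in range(n_states)],
            [row(n_symbols) for _ in range(n_states)])

def forward(hmm, seq):
    """Scaled forward pass: returns (log-likelihood, scaled alphas, scales)."""
    pi, A, B = hmm
    n = len(pi)
    alphas, scales, prev = [], [], None
    for t, o in enumerate(seq):
        if t == 0:
            a = [pi[i] * B[i][o] for i in range(n)]
        else:
            a = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
        c = sum(a)
        prev = [x / c for x in a]
        alphas.append(prev)
        scales.append(c)
    return sum(math.log(c) for c in scales), alphas, scales

def log_likelihood(hmm, seqs):
    return sum(forward(hmm, s)[0] for s in seqs)

def baum_welch_step(hmm, seqs, eps=1e-3):
    """One Baum-Welch re-estimation; eps is an assumed pseudocount that avoids zeros."""
    pi, A, B = hmm
    n, m = len(pi), len(B[0])
    pi_acc = [eps] * n
    A_acc = [[eps] * n for _ in range(n)]
    B_acc = [[eps] * m for _ in range(n)]
    for seq in seqs:
        _, alphas, scales = forward(hmm, seq)
        beta = [1.0] * n  # scaled backward variable at the last position
        for t in range(len(seq) - 1, -1, -1):
            for i in range(n):
                gamma = alphas[t][i] * beta[i]  # posterior of state i at time t
                B_acc[i][seq[t]] += gamma
                if t == 0:
                    pi_acc[i] += gamma
            if t > 0:
                w = [B[j][seq[t]] * beta[j] / scales[t] for j in range(n)]
                for i in range(n):
                    for j in range(n):
                        A_acc[i][j] += alphas[t - 1][i] * A[i][j] * w[j]
                beta = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    return normalise(pi_acc), [normalise(r) for r in A_acc], [normalise(r) for r in B_acc]

def mutate(hmm, rng):
    """Structural mutation (assumed operator): delete one state or add a random one."""
    pi, A, B = hmm
    n, m = len(pi), len(B[0])
    if n > 1 and rng.random() < 0.5:
        k = rng.randrange(n)
        keep = [i for i in range(n) if i != k]
        return (normalise([pi[i] for i in keep]),
                [normalise([A[i][j] for j in keep]) for i in keep],
                [B[i][:] for i in keep])
    fresh_pi, fresh_A, fresh_B = random_hmm(n + 1, m, rng)
    return (normalise(pi + [fresh_pi[n]]),
            [normalise(A[i] + [fresh_A[i][n]]) for i in range(n)] + [fresh_A[n]],
            [r[:] for r in B] + [fresh_B[n]])

def evolve(train, valid, n_symbols, pop_size=6, generations=5, bw_iters=3, seed=0):
    """GA loop: Baum-Welch on train, fitness on valid, keep the top half, mutate."""
    rng = random.Random(seed)
    pop = [random_hmm(rng.randint(1, 4), n_symbols, rng) for _ in range(pop_size)]
    for _gen in range(generations):
        trained = []
        for hmm in pop:
            for _it in range(bw_iters):
                hmm = baum_welch_step(hmm, train)
            trained.append(hmm)
        trained.sort(key=lambda h: log_likelihood(h, valid), reverse=True)
        parents = trained[: pop_size // 2]
        pop = parents + [mutate(rng.choice(parents), rng)
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=lambda h: log_likelihood(h, valid))
```

A call such as `evolve(train, valid, n_symbols=4)` would take lists of integer-encoded sequences (for example nucleotides mapped to 0..3). Because fitness is the log-likelihood on `valid` rather than on `train`, adding states is only rewarded when it generalises beyond the training sequences.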

We also propose Block-HMMs, in which the HMM topology is assembled from biologically meaningful building blocks. New genetic operators are designed to evolve the HMM structure while preserving the blocks.
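
One way to picture block-preserving operators is to treat the genome as a list of block names and to mutate only whole blocks. The sketch below is purely illustrative: the block library (`linear3`, `loop`, `cycle3`), the rule that chains each block's last state to the next block's first state, and the insert/delete/swap operators are all assumptions, not the block types or operators defined in the thesis.

```python
import random

# Hypothetical block library: name -> (number of states, internal transitions).
BLOCKS = {
    "linear3": (3, [(0, 1), (1, 2)]),         # left-to-right chain of three states
    "loop": (1, [(0, 0)]),                     # single self-looping state
    "cycle3": (3, [(0, 1), (1, 2), (2, 0)]),   # three-state cycle
}

def assemble(genome):
    """Build a 0/1 transition mask by chaining the blocks in genome order."""
    total = sum(BLOCKS[name][0] for name in genome)
    mask = [[0] * total for _ in range(total)]
    offset, prev_exit = 0, None
    for name in genome:
        size, edges = BLOCKS[name]
        for a, b in edges:
            mask[offset + a][offset + b] = 1
        if prev_exit is not None:
            # Assumed linking rule: last state of the previous block
            # connects to the first state of this block.
            mask[prev_exit][offset] = 1
        prev_exit = offset + size - 1
        offset += size
    return mask

def mutate_blocks(genome, rng):
    """Block-level mutation: insert, delete, or swap whole blocks only."""
    g = list(genome)
    op = rng.choice(["insert", "delete", "swap"])
    if op == "insert" or len(g) < 2:
        g.insert(rng.randrange(len(g) + 1), rng.choice(sorted(BLOCKS)))
    elif op == "delete":
        del g[rng.randrange(len(g))]
    else:
        i, j = rng.sample(range(len(g)), 2)
        g[i], g[j] = g[j], g[i]
    return g
```

For example, `assemble(["linear3", "cycle3"])` yields a 6-state mask. Since Baum-Welch re-estimation leaves zero-probability transitions at zero, a mask like this could be used to initialise the transition matrix so that training respects the assembled topology.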

We applied these structure-evolving methods to modelling the promoter and coding regions of a prokaryote and to predicting the secondary structure of proteins. The Block-HMM method generated HMM structures and found a conserved promoter region and a triplet codon model without any prior information on the sequences. When tested on the protein secondary structure prediction problem, the Block-HMM outperformed other HMM-based prediction methods and was comparable to the best known techniques for this problem.

University of Southampton
Won, Kyoung-Jae

Won, Kyoung-Jae (2005) Evolving the structure of Hidden Markov Models for biological sequence analysis. University of Southampton, Doctoral Thesis.

Record type: Thesis (Doctoral)

Text
1011970.pdf - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: 2005

Identifiers

Local EPrints ID: 465873
URI: http://eprints.soton.ac.uk/id/eprint/465873
PURE UUID: 53599d94-5844-4f9c-b012-e7fe4a18b466

Catalogue record

Date deposited: 05 Jul 2022 03:22
Last modified: 16 Mar 2024 20:25

Contributors

Author: Kyoung-Jae Won
