A probabilistic framework for mismatch and profile string kernels
There have recently been numerous applications of kernel methods in the field of bioinformatics. In particular, the problem of protein homology has served as a benchmark for the performance of many new kernels which operate directly on strings (such as amino-acid sequences). Several new kernels have been developed and successfully applied to this type of data, including spectrum, string, mismatch, and profile kernels. In this paper we introduce a general probabilistic framework for string kernels which uses the Fisher-kernel approach and includes spectrum, mismatch and profile kernels, among others, as special cases. The use of a probabilistic model, however, provides additional flexibility both in the definition of features and in their re-weighting through feature selection methods, prior knowledge or semi-supervised approaches which use data repositories such as BLAST. We give details of the framework and preliminary experimental results which show the applicability of the technique.
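For readers unfamiliar with the kernels named in the abstract, the following is a minimal sketch of the standard spectrum and mismatch kernels as they are usually defined in the literature. It is not the Fisher-kernel framework proposed in the paper; the function names, the brute-force neighbourhood enumeration and the toy two-letter alphabet are illustrative assumptions only.

```python
from collections import Counter
from itertools import product


def spectrum_features(seq, k):
    """Count every contiguous k-mer in seq (the k-spectrum feature map)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def mismatch_features(seq, k, m, alphabet):
    """(k, m)-mismatch feature map: each observed k-mer adds its count to
    every k-mer over `alphabet` that differs from it in at most m positions.
    Brute force over all |alphabet|**k candidates, so only for small k."""
    candidates = ["".join(p) for p in product(alphabet, repeat=k)]
    features = Counter()
    for kmer, count in spectrum_features(seq, k).items():
        for cand in candidates:
            if sum(a != b for a, b in zip(kmer, cand)) <= m:
                features[cand] += count
    return features


def kernel(fx, fy):
    """Inner product of two sparse feature vectors stored as Counters."""
    return sum(v * fy[u] for u, v in fx.items())


# Toy sequences over a hypothetical two-letter alphabet.
x, y = "ABAB", "BABA"
k_spectrum = kernel(spectrum_features(x, 2), spectrum_features(y, 2))
k_mismatch = kernel(mismatch_features(x, 2, 1, "AB"),
                    mismatch_features(y, 2, 1, "AB"))
```

With m = 0 the mismatch feature map reduces to the spectrum feature map, which is the sense in which the spectrum kernel is a special case of the mismatch kernel.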
bioinformatics, string-kernel, fisher-kernel
325-330
Vinokourov, A.
Soklakov, A. N.
Saunders, C. J.
Verleysen, Michel
2005
Vinokourov, A., Soklakov, A. N. and Saunders, C. J. (2005) A probabilistic framework for mismatch and profile string kernels. Verleysen, Michel (ed.) 13th European Symposium on Artificial Neural Networks (ESANN 2005), Bruges, Belgium. 27 - 29 Apr 2005.
Record type: Conference or Workshop Item (Paper)
More information
Published date: 2005
Additional Information:
Event Dates: 27-29 April 2005 Address: Bruges, Belgium
Venue - Dates:
13th European Symposium on Artificial Neural Networks (ESANN 2005), Bruges, Belgium, 2005-04-27 - 2005-04-29
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 260835
URI: http://eprints.soton.ac.uk/id/eprint/260835
PURE UUID: db6b07d8-d9b4-49f8-8f11-09ee5b598910
Catalogue record
Date deposited: 04 May 2005
Last modified: 14 Mar 2024 06:44
Contributors
Author: A. Vinokourov
Author: A. N. Soklakov
Author: C. J. Saunders
Editor: Michel Verleysen