The University of Southampton
University of Southampton Institutional Repository

String kernels, Fisher kernels and finite state automata

Saunders, Craig, Vinokourov, Alexei and Shawe-Taylor, John (2003) String kernels, Fisher kernels and finite state automata. In Becker, Suzanna, Thrun, Sebastian and Obermayer, Klaus (eds.) Advances in Neural Information Processing Systems 15 (NIPS 2002). Neural Information Processing Systems 15 (01/01/03). MIT Press.

Record type: Book Section

Abstract

In this paper we show how the generation of documents can be thought of as a k-stage Markov process, which leads to a Fisher kernel from which the n-gram and string kernels can be re-constructed. The Fisher kernel view gives a more flexible insight into the string kernel and suggests how it can be parametrised in a way that reflects the statistics of the training corpus. Furthermore, the probabilistic modelling approach suggests extending the Markov process to consider sub-sequences of varying length, rather than the standard fixed-length approach used in the string kernel. We give a procedure for determining which sub-sequences are informative features and hence generate a Finite State Machine model, which can again be used to obtain a Fisher kernel. By adjusting the parametrisation we can also influence the weighting received by the features. In this way we are able to obtain a logarithmic weighting in a Fisher kernel. Finally, experiments are reported comparing the different kernels using the standard Bag of Words kernel as a baseline.
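As a rough illustration of the n-gram kernel mentioned in the abstract, the sketch below counts contiguous character n-grams in two strings and takes the inner product of the count vectors. It also shows a variant with per-feature weights, which is one simple way to picture how a parametrisation could change the weighting each feature receives. The function names and the `weights` argument are hypothetical, chosen for this example; the record does not reproduce the paper's actual Fisher kernel derivation or its weighting scheme, so this should be read as a sketch under those assumptions rather than the authors' method.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every contiguous substring of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(s: str, t: str, n: int) -> int:
    """Inner product of the n-gram count vectors of s and t."""
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    return sum(count * ct[gram] for gram, count in cs.items())


def weighted_ngram_kernel(s: str, t: str, n: int, weights: dict) -> float:
    """Same inner product with a non-negative weight per n-gram.

    `weights` is a hypothetical caller-supplied mapping (for example,
    derived from corpus statistics); n-grams missing from it get 1.0.
    """
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    return sum(weights.get(gram, 1.0) * count * ct[gram]
               for gram, count in cs.items())


if __name__ == "__main__":
    print(ngram_kernel("abab", "ab", 2))
    print(weighted_ngram_kernel("abab", "ab", 2, {"ab": 0.5}))
```

With all weights equal to 1.0 the weighted version reduces to the plain n-gram kernel, which loosely mirrors the abstract's point that the fixed kernel is a special case of a more flexible parametrised family.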

Text
FSA_NIPS02.pdf - Version of Record
Restricted to Repository staff only

More information

Published date: 26 September 2003
Additional Information: Chapter: 8
Venue - Dates: Neural Information Processing Systems 15, 2003-01-01
Organisations: Electronics & Computer Science

Identifiers

Local EPrints ID: 258977
URI: http://eprints.soton.ac.uk/id/eprint/258977
PURE UUID: 8334a2d4-aa28-4ccd-b7f9-e16c767183c7

Catalogue record

Date deposited: 03 Mar 2004
Last modified: 17 Mar 2024 06:08

Contributors

Author: Craig Saunders
Author: Alexei Vinokourov
Author: John Shawe-Taylor
Editor: Suzanna Becker
Editor: Sebastian Thrun
Editor: Klaus Obermayer
