The University of Southampton
University of Southampton Institutional Repository

Interpretable modelling with sparse kernels

University of Southampton
Kandola, J.S.
c976459a-d502-4688-b741-334c06796ca8
Gunn, S.R.
306af9b3-a7fa-4381-baf9-5d6a6ec89868

Kandola, J.S. (2001) Interpretable modelling with sparse kernels. University of Southampton, Electronics and Computer Science : University of Southampton, Doctoral Thesis, 177pp.

Record type: Thesis (Doctoral)

Abstract

A drawback of many statistical modelling techniques commonly used in machine learning is that the resulting model is difficult to interpret. The principal focus of this thesis is the development of advanced non-linear interpretable models. Interpretable modelling offers a powerful tool with which to understand the structure of a model constructed from data, allowing model validation and assisting in model selection. Gibbs (1997) observes that the easiest way to introduce model interpretability is to use models whose parameters and related hyperparameters have clearly interpretable meanings. The Bayesian methodology of Automatic Relevance Determination (ARD) (MacKay, 1994; Neal, 1995) is one such approach. In this thesis, Laplace approximations, variational learning and Markov Chain Monte Carlo (MCMC) methods for hyperparameter determination are assessed within a Bayesian neural network. Empirical results highlight the numerical instability of the Laplace and variational methods, with convergence to a local rather than a global minimum. Kernel methods have become a popular modelling approach (Vapnik, 1998; Smola, 1998; Williams, 1998). In this thesis, the constructed kernel models are equipped with hyperparameters that make it possible to select important input variables, visualise the model structure and incorporate prior or expert knowledge. Ideas from the Bayesian and signal processing communities have been merged with the representational advantage of a sparse ANOVA decomposition. Interpretability is introduced by using two forms of regularisation: a 1-norm based structural regulariser to enforce interpretability, and a 2-norm based regulariser to control smoothness. The model structure can be visualised, showing the overall effects of different inputs, their interactions, and the strength of those interactions.
The performance of these interpretable learning algorithms is demonstrated on both synthetic and “real” data, notably the AMPG dataset, the Boston house price dataset, and the problem of predicting the mechanical property proof stress of a metal based on its chemical composition. Results from these different approaches are compared in terms of their interpretability by exploiting prior knowledge of the problem, and show the potential of interpretable data models.

Text
Thesis.pdf - Version of Record
Available under License University of Southampton Thesis Licence.
Download (3MB)

More information

Published date: 1 June 2001
Organisations: University of Southampton, Electronic & Software Systems

Identifiers

Local EPrints ID: 256087
URI: http://eprints.soton.ac.uk/id/eprint/256087
PURE UUID: b1de0571-1521-4b8e-9358-f518ec3ceaea

Catalogue record

Date deposited: 29 Nov 2003
Last modified: 14 Mar 2024 05:38

Contributors

Author: J.S. Kandola
Thesis advisor: S.R. Gunn

Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
