Bayesian algorithms for speech enhancement

Andrianakis, I. (2007) Bayesian algorithms for speech enhancement. University of Southampton, Institute of Sound and Vibration Research, Doctoral Thesis, 198pp.

Record type: Thesis (Doctoral)

Abstract

The portability of modern voice processing devices allows them to be used in environments where background noise conditions can be adverse. Background noise can deteriorate the quality of speech transmitted through such devices, but speech enhancement algorithms can ameliorate this degradation to some extent. The development of speech enhancement algorithms that improve the quality of noisy speech is the aim of this thesis, which consists of three main parts.
In the first part, we propose a framework of algorithms that estimate the clean speech Short Time Fourier Transform (STFT) coefficients. The algorithms are derived from the Bayesian theory of estimation and can be grouped according to i) the STFT representation they estimate, ii) the estimator they apply, and iii) the speech prior density they assume. Besides introducing algorithms that outperform similar algorithms in the literature, the framework offers insight into the effect and relative importance of the different components of the algorithms (e.g. prior, estimator) on the quality of the enhanced speech.
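As a rough illustration of one point in such a framework, the sketch below combines a complex Gaussian speech prior with the MMSE estimator, which yields the classical Wiener gain applied to each noisy STFT coefficient. It is not taken from the thesis: the function names `wiener_gain` and `enhance_frame`, the `floor` parameter, and the maximum-likelihood a priori SNR estimate are assumptions made for this example.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """MMSE gain under a complex Gaussian speech prior and Gaussian noise.

    `floor` is a hypothetical lower bound on the gain, added here only to
    avoid zeroing coefficients completely.
    """
    # Maximum-likelihood estimate of the a priori SNR, clipped at zero.
    prior_snr = np.maximum(noisy_power / noise_power - 1.0, 0.0)
    gain = prior_snr / (1.0 + prior_snr)
    return np.maximum(gain, floor)

def enhance_frame(noisy_stft, noise_power):
    """Scale each noisy STFT coefficient by its gain; the phase is unchanged."""
    return wiener_gain(np.abs(noisy_stft) ** 2, noise_power) * noisy_stft
```

Changing the prior (for example to a super-Gaussian density) or the estimator (for example MAP instead of MMSE) changes the gain function while the surrounding structure stays the same.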
In the second part of this thesis, we develop methods for estimating the power of time-varying noise. The main outcome is a method that exploits similarities between the distribution of the noisy speech spectral amplitude coefficients within a single frequency bin and the corresponding distribution of the corrupting noise. These similarities allow samples that are more likely to correspond to noise to be extracted from a window of past spectral amplitude observations. The extracted samples are then used to produce an estimate of the noise power.
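The general idea of selecting likely-noise samples from a window of past observations can be sketched as follows. This is a simplified stand-in, not the method developed in the thesis: it simply keeps the smallest amplitudes in the window, and the function name `noise_power_from_window` and the parameter `keep_fraction` are assumptions made for this example.

```python
import numpy as np

def noise_power_from_window(amplitudes, keep_fraction=0.5):
    """Estimate noise power in one frequency bin from past spectral amplitudes.

    Assumption for this sketch: the smallest `keep_fraction` of the amplitudes
    in the window are the ones most likely to be noise only.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    n_keep = max(1, int(round(keep_fraction * amplitudes.size)))
    likely_noise = np.sort(amplitudes)[:n_keep]
    # Mean squared amplitude of the retained samples.
    return float(np.mean(likely_noise ** 2))
```

Keeping only the smallest samples biases the estimate downwards, so a practical scheme would need some form of correction that accounts for the noise distribution.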
In the final part of this thesis, we are concerned with incorporating the time and frequency dependencies of speech signals into our estimation model. The theoretical framework on which the modelling is based is provided by Markov Random Fields (MRFs). Initially, we develop a MAP estimator of speech based on the Gaussian MRF prior. We then introduce the Chi MRF, which is employed in the development of an improved speech estimator. Finally, the performance of fixed and adaptive schemes for the estimation of the MRF parameters is investigated.

Text

P2515.pdf - Other

More information

Published date: November 2007

Organisations: University of Southampton

Identifiers

Local EPrints ID: 66244

URI: http://eprints.soton.ac.uk/id/eprint/66244

PURE UUID: 039bc728-d12e-4443-bd15-e93d16dc5cf6

ORCID for P.R. White:

orcid.org/0000-0002-4787-8713

Catalogue record

Date deposited: 20 May 2009

Last modified: 14 Mar 2024 02:34

Contributors

Author: I. Andrianakis

Thesis advisor: P.R. White
