University of Southampton Institutional Repository

On acoustic emotion recognition: compensating for covariate shift


Hassan, A., Damper, R.I. and Niranjan, M. (2013) On acoustic emotion recognition: compensating for covariate shift. IEEE Transactions on Audio, Speech and Language Processing, 21 (7), 1458-1468. (doi:10.1109/TASL.2013.2255278).

Record type: Article

Abstract

Pattern recognition tasks often face the situation that training data are not fully representative of test data. This problem is well-recognized in speech recognition, where methods like cepstral mean normalization (CMN), vocal tract length normalization (VTLN) and maximum likelihood linear regression (MLLR) are used to compensate for channel and speaker differences. Speech emotion recognition (SER) is an important emerging field in human-computer interaction and faces the same data shift problems, a fact which has been generally overlooked in this domain. In this paper, we show that compensating for channel and speaker differences can give significant improvements in SER by modelling these differences as a covariate shift. We employ three algorithms from the domain of transfer learning that apply importance weights (IWs) within a support vector machine classifier to reduce the effects of covariate shift. We test these methods on the FAU Aibo Emotion Corpus, which was used in the Interspeech 2009 Emotion Challenge. It consists of two separate parts recorded independently at different schools; hence the two parts exhibit covariate shift. Results show that the IW methods outperform combined CMN and VTLN and significantly improve on the baseline performance of the Challenge. The best of the three methods also improves significantly on the winning contribution to the Challenge.
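The importance-weighting idea in the abstract can be illustrated with a minimal sketch. It is not one of the three algorithms evaluated in the paper, which this record does not name. It assumes one-dimensional features and estimates the weight w(x) = p_test(x) / p_train(x) by fitting a Gaussian to each data set; the function names (`fit_gaussian`, `importance_weights`, `weighted_mean`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_gaussian(xs):
    """Return (mean, std) of a 1-D sample (maximum-likelihood estimates)."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, math.sqrt(var)


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(train_x, test_x):
    """Estimate w(x) = p_test(x) / p_train(x) for every training point.

    Assumption: both densities are approximated by a single Gaussian.
    The weights are rescaled so that their mean over the training set is 1.
    """
    train_mean, train_std = fit_gaussian(train_x)
    test_mean, test_std = fit_gaussian(test_x)
    raw = [
        gaussian_pdf(x, test_mean, test_std) / gaussian_pdf(x, train_mean, train_std)
        for x in train_x
    ]
    scale = sum(raw) / len(raw)
    return [w / scale for w in raw]


def weighted_mean(values, weights):
    """Importance-weighted average of per-sample values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Synthetic covariate shift: the same feature recorded under two conditions
# (for example, two recording sites) with different input distributions.
rng = random.Random(0)
train_x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
test_x = [rng.gauss(1.0, 1.0) for _ in range(2000)]

weights = importance_weights(train_x, test_x)

unweighted = sum(train_x) / len(train_x)
reweighted = weighted_mean(train_x, weights)
test_mean = sum(test_x) / len(test_x)

print("unweighted training mean:", round(unweighted, 3))
print("reweighted training mean:", round(reweighted, 3))
print("test mean:               ", round(test_mean, 3))
```

After reweighting, statistics computed on the training data move towards those of the test data, which is the effect the importance weights are meant to have on a classifier's training loss. In a library such as scikit-learn, per-sample weights of this kind can be supplied through the `sample_weight` argument of `SVC.fit`; whether that matches the authors' implementation is not stated in this record.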

This record has no associated files available for download.

More information

e-pub ahead of print date: 27 March 2013
Published date: July 2013
Organisations: Southampton Wireless Group

Identifiers

Local EPrints ID: 350948
URI: http://eprints.soton.ac.uk/id/eprint/350948
ISSN: 1558-7916
PURE UUID: 3fa3b776-6e18-4a3c-8b98-e194157b06ea
ORCID for M. Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 11 Apr 2013 09:57
Last modified: 15 Mar 2024 03:29

Contributors

Author: A. Hassan
Author: R.I. Damper
Author: M. Niranjan
