Speaker identification using auditory modelling and vector quantization
Pages: 283-303
Iliadi, Konstantina
Bleeck, Stefan
2017
Iliadi, Konstantina and Bleeck, Stefan
(2017)
Speaker identification using auditory modelling and vector quantization.
Advances in Modelling and Analysis B, 60 (2), 283-303.
Abstract
This paper presents an experimental evaluation of different features for speaker identification (SID). The features are tested on speech data from the EUROM1 database in a text-independent, closed-set speaker identification task. The main objective is to present a novel parameterization of speech based on an auditory model, the Auditory Image Model (AIM), and to assess the utility of the features it provides for speaker identification. To find the features that are most informative about a speaker’s identity, the auditory image is divided into rectangular regions. A novel enrolment strategy then identifies the regions of the image whose features discriminate a given speaker from the others. The resulting speaker-specific feature representation is assessed in noisy conditions that simulate a real-world environment. Its performance is compared with that of mel-frequency cepstral coefficient (MFCC) features within a Vector Quantization (VQ) classification system. The identification accuracies suggest that the new parameterization outperforms conventional MFCCs, especially at low signal-to-noise ratios (SNRs).
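As an illustration of the VQ classification scheme mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a plain k-means codebook per speaker and a minimum-average-distortion decision rule, which is the textbook form of closed-set VQ speaker identification; this record does not state the codebook size, training algorithm, or distance measure used in the paper. All function names are hypothetical, and the synthetic 2-D vectors are stand-ins for MFCC or AIM-derived feature vectors.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Return (index, squared distance) of the codeword closest to v."""
    best_i, best_d = 0, sq_dist(codebook[0], v)
    for i in range(1, len(codebook)):
        d = sq_dist(codebook[i], v)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def train_codebook(vectors, size, iters=20, seed=0):
    """Enrol one speaker: fit a k-means codebook to the enrolment vectors."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest(codebook, v)[0]].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def avg_distortion(codebook, vectors):
    """Mean quantization error of the test vectors against one codebook."""
    return sum(nearest(codebook, v)[1] for v in vectors) / len(vectors)


def identify(codebooks, vectors):
    """Closed-set decision: the enrolled speaker with the lowest distortion."""
    return min(codebooks, key=lambda spk: avg_distortion(codebooks[spk], vectors))


def toy_features(center, n, seed):
    """Synthetic stand-in for per-frame features (assumption, not real speech)."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(c, 1.0) for c in center) for _ in range(n)]


# Enrol two hypothetical speakers, then identify an unseen utterance.
codebooks = {
    "speaker_A": train_codebook(toy_features((0.0, 0.0), 200, seed=1), size=8),
    "speaker_B": train_codebook(toy_features((5.0, 5.0), 200, seed=2), size=8),
}
print(identify(codebooks, toy_features((0.0, 0.0), 50, seed=3)))
```

In this sketch the feature extractor is deliberately left out, so the same classifier could in principle be fed either MFCC vectors or features taken from selected regions of the auditory image, which is the comparison the abstract describes.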
Text
Speaker Identification Iliada and Bleeck - final draft
- Accepted Manuscript
More information
Accepted/In Press date: 25 September 2017
e-pub ahead of print date: 25 December 2017
Published date: 2017
Identifiers
Local EPrints ID: 418223
URI: http://eprints.soton.ac.uk/id/eprint/418223
ISSN: 1240-4543
PURE UUID: 29d52f59-12c5-428f-bcb0-1675db44f720
Catalogue record
Date deposited: 23 Feb 2018 17:30
Last modified: 16 Mar 2024 03:49
Contributors
Author:
Konstantina Iliadi
Author:
Stefan Bleeck