The University of Southampton
University of Southampton Institutional Repository

A discrete wavelet transform-based voice activity detection and noise classification with sub-band selection


Abdullah, Salinna, Zamani, Majid and Demosthenous, Andreas (2021) A discrete wavelet transform-based voice activity detection and noise classification with sub-band selection. In 2021 IEEE International Symposium on Circuits and Systems, ISCAS 2021 - Proceedings. vol. 2021-May, IEEE. (doi:10.1109/ISCAS51556.2021.9401647).

Record type: Conference or Workshop Item (Paper)

Abstract

A real-time discrete wavelet transform-based adaptive voice activity detector and sub-band selection for feature extraction are proposed for noise classification, which can be used in a speech processing pipeline. The voice activity detection and sub-band selection rely on wavelet energy features and the feature extraction process involves the extraction of mel-frequency cepstral coefficients from selected wavelet sub-bands and mean absolute values of all sub-bands. The method combined with a feedforward neural network with two hidden layers could be added to speech enhancement systems and deployed in hearing devices such as cochlear implants. In comparison to the conventional short-time Fourier transform-based technique, it has higher F1 scores and classification accuracies (with a mean of 0.916 and 90.1%, respectively) across five different noise types (babble, factory, pink, Volvo (car) and white noise), a significantly smaller feature set with 21 features, reduced memory requirement, faster training convergence and about half the computational cost.
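
As a rough illustration of the wavelet energy features and mean absolute values mentioned in the abstract, the following minimal sketch decomposes one audio frame into sub-bands and derives both quantities. It is not the authors' implementation: the Haar wavelet, the three decomposition levels, the "keep the highest-energy bands" selection rule, the fixed energy-ratio threshold standing in for the adaptive detector, and all function names are assumptions made for illustration. The mel-frequency cepstral coefficient step and the neural network are omitted.

```python
import numpy as np


def haar_dwt_subbands(frame, levels=3):
    """Return [d1, d2, ..., dL, aL]: Haar detail bands plus the final approximation.

    Assumption: Haar wavelet and 3 levels; the record does not state either.
    """
    approx = np.asarray(frame, dtype=float)
    subbands = []
    for _ in range(levels):
        if len(approx) % 2:  # pad odd-length input by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        subbands.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)          # approximation coefficients
    subbands.append(approx)
    return subbands


def wavelet_features(frame, levels=3):
    """Per-sub-band energy and mean absolute value for one frame."""
    bands = haar_dwt_subbands(frame, levels)
    energies = np.array([np.sum(b ** 2) for b in bands])
    mean_abs = np.array([np.mean(np.abs(b)) for b in bands])
    return energies, mean_abs


def select_subbands(energies, keep=2):
    """Hypothetical selection rule: indices of the `keep` highest-energy sub-bands."""
    return np.sort(np.argsort(energies)[::-1][:keep])


def is_speech(energies, noise_floor, ratio=2.0):
    """Hypothetical detector: total wavelet energy above `ratio` times a noise floor."""
    return float(np.sum(energies)) > ratio * noise_floor
```

Because the Haar transform is orthonormal, the sub-band energies of an even-length frame sum to the energy of the frame itself, which gives a quick sanity check on the decomposition.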

This record has no associated files available for download.

More information

Published date: 27 April 2021
Additional Information: Publisher Copyright: © 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
Venue - Dates: 53rd IEEE International Symposium on Circuits and Systems, ISCAS 2021, Daegu, Republic of Korea, 2021-05-22 - 2021-05-28
Keywords: Discrete wavelet transform, Mel-frequency cepstral coefficients, Multilayer perceptron, Noise classification, Sub-band selection, Voice activity detection

Identifiers

Local EPrints ID: 489169
URI: http://eprints.soton.ac.uk/id/eprint/489169
ISSN: 0271-4310
PURE UUID: dba1fc39-768d-4aab-b798-254934b8d9d5
ORCID for Majid Zamani: orcid.org/0009-0007-0844-473X

Catalogue record

Date deposited: 16 Apr 2024 16:36
Last modified: 18 Apr 2024 02:09

Contributors

Author: Salinna Abdullah
Author: Majid Zamani
Author: Andreas Demosthenous
