A discrete wavelet transform-based voice activity detection and noise classification with sub-band selection
Abdullah, Salinna, Zamani, Majid and Demosthenous, Andreas (2021) A discrete wavelet transform-based voice activity detection and noise classification with sub-band selection. In 2021 IEEE International Symposium on Circuits and Systems, ISCAS 2021 - Proceedings. vol. 2021-May, IEEE. (doi:10.1109/ISCAS51556.2021.9401647).
Record type: Conference or Workshop Item (Paper)
Abstract
A real-time discrete wavelet transform-based adaptive voice activity detector and sub-band selection for feature extraction are proposed for noise classification, which can be used in a speech processing pipeline. The voice activity detection and sub-band selection rely on wavelet energy features and the feature extraction process involves the extraction of mel-frequency cepstral coefficients from selected wavelet sub-bands and mean absolute values of all sub-bands. The method combined with a feedforward neural network with two hidden layers could be added to speech enhancement systems and deployed in hearing devices such as cochlear implants. In comparison to the conventional short-time Fourier transform-based technique, it has higher F1 scores and classification accuracies (with a mean of 0.916 and 90.1%, respectively) across five different noise types (babble, factory, pink, Volvo (car) and white noise), a significantly smaller feature set with 21 features, reduced memory requirement, faster training convergence and about half the computational cost.
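The wavelet energy and mean-absolute-value features named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the three decomposition levels, the energy-ratio threshold, the smoothing constant, and all function names (`haar_dwt`, `frame_features`, `is_speech`, `update_noise_floor`). The abstract does not state which wavelet, frame length, or adaptation rule the authors use, and the mel-frequency cepstral coefficient step and the neural network are omitted.

```python
import math

def haar_dwt(signal, levels):
    """Orthonormal Haar DWT; returns [d1, d2, ..., dL, aL] (details, then final approximation)."""
    approx = list(signal)
    bands = []
    for _ in range(levels):
        if len(approx) % 2:               # pad odd lengths by repeating the last sample
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        bands.append([(a - b) / math.sqrt(2) for a, b in pairs])   # detail sub-band
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]        # coarser approximation
    bands.append(approx)
    return bands

def frame_features(frame, levels=3):
    """Per-sub-band energy and mean absolute value for one frame."""
    bands = haar_dwt(frame, levels)
    energies = [sum(x * x for x in band) for band in bands]
    mean_abs = [sum(abs(x) for x in band) / len(band) for band in bands]
    return energies, mean_abs

def is_speech(frame, noise_floor, ratio=4.0, levels=3):
    """Hypothetical energy-ratio rule: flag the frame when its wavelet energy exceeds ratio * noise_floor."""
    energies, _ = frame_features(frame, levels)
    return sum(energies) > ratio * noise_floor

def update_noise_floor(noise_floor, frame, alpha=0.1, levels=3):
    """Assumed adaptation step: exponentially smooth the floor using a frame judged to be non-speech."""
    energies, _ = frame_features(frame, levels)
    return (1 - alpha) * noise_floor + alpha * sum(energies)

quiet = [0.01] * 8            # low-level constant frame
loud = [1.0, -1.0] * 4        # high-energy, rapidly alternating frame
print(is_speech(quiet, noise_floor=0.1), is_speech(loud, noise_floor=0.1))
```

In this sketch the alternating frame puts all of its energy into the finest detail sub-band, while a constant frame puts all of it into the final approximation; a per-band energy comparison of this kind is the sort of information a sub-band selection step could draw on.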
This record has no associated files available for download.
More information
Published date: 27 April 2021
Additional Information:
Publisher Copyright: © 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
Venue - Dates: 53rd IEEE International Symposium on Circuits and Systems, ISCAS 2021, Daegu, Republic of Korea, 2021-05-22 - 2021-05-28
Keywords:
Discrete wavelet transform, Mel-frequency cepstral coefficients, Multilayer perceptron, Noise classification, Sub-band selection, Voice activity detection
Identifiers
Local EPrints ID: 489169
URI: http://eprints.soton.ac.uk/id/eprint/489169
ISSN: 0271-4310
PURE UUID: dba1fc39-768d-4aab-b798-254934b8d9d5
Catalogue record
Date deposited: 16 Apr 2024 16:36
Last modified: 18 Apr 2024 02:09
Contributors
Author: Salinna Abdullah
Author: Majid Zamani
Author: Andreas Demosthenous