Wavelet filter banks for cochlear implants
Cochlear implant (CI) users regularly perform as well as normal-hearing (NH) listeners in quiet conditions. In noise, however, their speech perception is reduced, and they lose more speech intelligibility than NH listeners in the same noisy environment. Speech coding strategies with noise reduction algorithms therefore play an important role in CI devices, allowing CI users to benefit more from their implants. This thesis investigates a wavelet packet-based speech coding strategy with envelope-based noise reduction algorithms to enhance speech intelligibility in noisy conditions.
The advantages of wavelet packet transforms (WPTs) in terms of time-frequency analysis, sparseness, and low computational complexity might make them appropriate for speech coding and denoising in CI devices. When an optimal set of parameters was sought for a wavelet packet-based speech coding strategy, the 23- and 64-band WPTs with the sym8 wavelet and a frame length of 8 ms were found to be more suitable than other combinations. These parameters can optimise speech intelligibility to benefit CI users. However, the standard ACE strategy and the wavelet packet-based strategy provided almost the same results in both quiet and noisy conditions.
Two envelope-based denoising techniques for the wavelet packet-based strategy, namely time-adaptive wavelet thresholding (TAWT) and time-frequency spectral subtraction (TFSS), were developed and evaluated using objective and subjective intelligibility measures, and compared with ideal binary masking (IdBM) as a baseline for denoising performance. IdBM can restore intelligibility to nearly the same level as NH listeners in all noisy conditions. Both TAWT and TFSS showed only slight intelligibility improvements in some noisy conditions. This may result from noise estimation in the denoising techniques: the noise level may be under- or overestimated, which distorts the enhanced speech and makes speech discrimination difficult.
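The effect of noise estimation described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes toy band envelopes that add linearly, a generic envelope subtraction standing in for TFSS, and a textbook ideal binary mask that keeps a time-frequency unit only when its local SNR exceeds a criterion. The function names, the criterion `lc_db`, and all values are hypothetical.

```python
import numpy as np


def ideal_binary_mask(speech_env, noise_env, lc_db=0.0):
    """Return 1.0 where the local SNR (in dB) exceeds lc_db, else 0.0.

    Needs the clean speech and noise envelopes separately, which is why
    this kind of mask is an idealised baseline rather than a usable method.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    local_snr_db = 20.0 * np.log10((speech_env + eps) / (noise_env + eps))
    return (local_snr_db > lc_db).astype(float)


def subtract_noise(noisy_env, noise_estimate, floor=0.0):
    """Generic envelope subtraction: remove the estimate, clamp at a floor."""
    return np.maximum(noisy_env - noise_estimate, floor)


# Toy envelopes: 2 bands x 4 frames (hypothetical values).
speech = np.array([[1.0, 0.0, 2.0, 0.4],
                   [0.0, 3.0, 0.2, 0.0]])
noise = np.full_like(speech, 0.5)
noisy = speech + noise  # assumption: envelopes add linearly

# Ideal binary mask: keep only speech-dominated units.
mask = ideal_binary_mask(speech, noise)
idbm_out = mask * noisy

# Subtraction with a correct, an overestimated and an underestimated noise level.
exact = subtract_noise(noisy, noise)        # recovers the speech envelope here
over = subtract_noise(noisy, 2.0 * noise)   # also removes weak speech (distortion)
under = subtract_noise(noisy, 0.5 * noise)  # leaves residual noise in silent units
```

In this toy case the overestimated noise level zeroes the weak speech units (0.4 and 0.2), while the underestimated level leaves a residual of 0.25 in units that contain no speech.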
Both objective and subjective intelligibility measures can predict the trend in performance of the different denoising techniques for CI users. However, NH listeners can achieve better intelligibility at higher SNR levels without noise reduction, since they are less sensitive to noise but more sensitive to speech distortion than CI listeners. Therefore, denoising techniques may prove effective for CI users in further investigations.
Dachasilaruk, Siriporn
b535096b-4ffb-4833-8188-1c8dc7f83d2d
December 2014
Bleeck, Stefan
c888ccba-e64c-47bf-b8fa-a687e87ec16c
Dachasilaruk, Siriporn (2014) Wavelet filter banks for cochlear implants. University of Southampton, Engineering and the Environment, Doctoral Thesis, 219pp.
Record type:
Thesis (Doctoral)
Text
THESIS_SIRIPORN-Final.pdf
- Other
More information
Published date: December 2014
Organisations:
University of Southampton, Human Sciences Group
Identifiers
Local EPrints ID: 388109
URI: http://eprints.soton.ac.uk/id/eprint/388109
PURE UUID: f4b7a291-479e-4970-9f59-d04ba0e5a588
Catalogue record
Date deposited: 22 Feb 2016 12:29
Last modified: 15 Mar 2024 03:25
Contributors
Author:
Siriporn Dachasilaruk