(2014) Wavelet filter banks for cochlear implants. University of Southampton, Engineering and the Environment, Doctoral Thesis, 219pp.
Abstract
Cochlear implant (CI) users regularly perform as well as normal-hearing (NH) listeners in quiet conditions. However, their speech perception is reduced in noise, and they lose more speech intelligibility than NH listeners in the same noisy environment. Speech coding strategies with noise reduction algorithms therefore play an important role in CI devices, allowing users to benefit more from their implants. This thesis investigates a wavelet packet-based speech coding strategy with envelope-based noise reduction algorithms to enhance speech intelligibility in noisy conditions.
The advantages of wavelet packet transforms (WPTs), namely time-frequency analysis, sparseness, and low computational complexity, might make the WPT appropriate for speech coding and denoising in CI devices. In the search for an optimal set of parameters for a wavelet packet-based speech coding strategy, the 23- and 64-band WPTs with the sym8 wavelet and a frame length of 8 ms were found to be more suitable than other combinations. These parameters can optimise speech intelligibility to benefit CI users. However, the standard ACE strategy and the wavelet packet-based strategy provided almost the same results in both quiet and noisy conditions.
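As a rough illustration of how a uniform 64-band wavelet packet analysis of one 8 ms frame could look, the sketch below splits every node of the tree six times and takes an RMS envelope per band. It is not the thesis implementation: it swaps in the Haar wavelet instead of sym8 to stay dependency-free, and the 16 kHz sampling rate, the test tone, and all function names are assumptions made for the example. The 23-band case would need a non-uniform tree, which is not shown.

```python
import math

def haar_split(x):
    """One analysis step: return (low-pass, high-pass) halves of x."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet(frame, depth):
    """Full wavelet packet tree: split every node at every level."""
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    # 2**depth bands in tree order, which is not sorted by frequency.
    return nodes

def band_envelopes(bands):
    """RMS value of the coefficients in each band."""
    return [math.sqrt(sum(c * c for c in b) / len(b)) for b in bands]

# Assumed setup: 16 kHz sampling, so an 8 ms frame holds 128 samples.
fs = 16000
frame_len = fs * 8 // 1000
frame = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(frame_len)]

bands = wavelet_packet(frame, depth=6)   # 2**6 = 64 bands
envelopes = band_envelopes(bands)
```

Because the Haar filters are orthonormal, the total energy of the coefficients equals the energy of the input frame, which is a convenient sanity check on the decomposition.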
Two envelope-based denoising techniques for the wavelet packet-based strategy, namely time-adaptive wavelet thresholding (TAWT) and time-frequency spectral subtraction (TFSS), were developed and evaluated by objective and subjective intelligibility measures and compared to ideal binary masking (IdBM) as a baseline for denoising performance. IdBM can restore intelligibility to nearly the same level as NH listeners in all noisy conditions. Both TAWT and TFSS showed only slight intelligibility improvements in some noisy conditions. This limited benefit may result from the noise estimation in these techniques: the noise level may be under- or overestimated, which distorts the enhanced speech and makes speech discrimination difficult.
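For readers unfamiliar with ideal binary masking, a minimal sketch follows. It keeps a time-frequency unit when the local speech-to-noise ratio exceeds a threshold and zeroes it otherwise. The mask is "ideal" because it needs the clean speech and the noise separately, which is only possible in a controlled experiment. The 0 dB threshold, the nested-list layout (frames by bands), and the function names are assumptions for illustration, not details taken from the thesis.

```python
import math

def ideal_binary_mask(speech_env, noise_env, threshold_db=0.0):
    """Return a 0/1 mask per time-frequency unit from known envelopes."""
    mask = []
    for s_row, n_row in zip(speech_env, noise_env):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                keep = False          # no speech energy in this unit
            elif n <= 0.0:
                keep = True           # speech present, no noise
            else:
                snr_db = 20.0 * math.log10(s / n)
                keep = snr_db > threshold_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

def apply_mask(noisy_env, mask):
    """Zero the units that the mask marks as noise-dominated."""
    return [[v * m for v, m in zip(v_row, m_row)]
            for v_row, m_row in zip(noisy_env, mask)]

# Hypothetical envelopes: 2 frames x 2 bands.
speech = [[1.0, 0.1], [0.0, 2.0]]
noise = [[0.5, 0.5], [1.0, 0.0]]
noisy = [[1.5, 0.6], [1.0, 2.0]]

mask = ideal_binary_mask(speech, noise)
enhanced = apply_mask(noisy, mask)
```

TAWT and TFSS have no access to the clean speech, so they must estimate the noise level from the noisy signal alone, which is where the under- and overestimation described above can arise.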
Both objective and subjective intelligibility measures can predict the trend in performance of the different denoising techniques for CI users. However, NH listeners can achieve better intelligibility at higher SNR levels without noise reduction, since they are less sensitive to noise but more sensitive to speech distortion than CI listeners. Therefore, denoising techniques may still work well for CI users, which warrants further investigation.
University divisions
- Faculties (pre 2018 reorg) > Faculty of Engineering and the Environment (pre 2018 reorg) > Inst. Sound & Vibration Research (pre 2018 reorg) > Human Sciences Group (pre 2018 reorg)
- Current Faculties > Faculty of Engineering and Physical Sciences > School of Engineering > Institute of Sound and Vibration Research > Inst. Sound & Vibration Research (pre 2018 reorg) > Human Sciences Group (pre 2018 reorg)