A Multiband Excited Waveform-Interpolated 2.35-kbps Speech Codec for Bandlimited Channels
Pages: 766-777
Brooks, F.C.A.
Hanzo, L.
May 2000
Brooks, F.C.A. and Hanzo, L. (2000) A Multiband Excited Waveform-Interpolated 2.35-kbps Speech Codec for Bandlimited Channels. IEEE Transactions on Vehicular Technology, 49 (3), 766-777.
Abstract
Following a brief overview of work on 2.4-kbps speech coding, a wavelet-based pitch detector is introduced, which reduces the complexity of conventional autocorrelation-based pitch detectors while ensuring a smooth pitch trajectory. This scheme is incorporated in a waveform-interpolated codec that uses voiced–unvoiced (V/U) classification and, instead of simple Dirac pulses, employs an unconventional zinc basis function excitation to model the voiced excitation. The required zinc-function parameters are determined in an analysis-by-synthesis loop. For the sake of smooth waveform evolution and reduced complexity, a focused search strategy and a few further suboptimum restrictions are imposed without seriously affecting the speech quality. This baseline codec operates at a rate of 1.9 kbps, but it suffers from slight buzziness during periods of excessive voicing. This impediment is mitigated by invoking a mixed V/U multiband excitation, which slightly increases the bit rate to 2.35 kbps because a 3-bit voicing strength code is transmitted for each of the three excitation bands.
Index Terms: Low-rate speech coding, multiband excitation, waveform-interpolated speech coding, wavelet-based pitch estimation.
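The step from 1.9 kbps to 2.35 kbps can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `side_info_rate_bps` is hypothetical, and the 20 ms frame length is an assumption that the abstract does not state. Under that assumption, three bands with a 3-bit voicing strength code each account for the 0.45 kbps difference between the two quoted rates.

```python
# Illustrative sketch; the 20 ms frame length is an ASSUMPTION, not taken
# from the abstract. The helper name is hypothetical.

def side_info_rate_bps(bits_per_band: int, num_bands: int, frame_ms: float) -> float:
    """Bit rate (bits per second) used by per-frame voicing strength codes."""
    bits_per_frame = bits_per_band * num_bands   # 3 bits x 3 bands = 9 bits
    frames_per_second = 1000.0 / frame_ms        # 50 frames/s for 20 ms frames
    return bits_per_frame * frames_per_second

baseline_bps = 1900.0  # baseline codec rate quoted in the abstract (1.9 kbps)
voicing_bps = side_info_rate_bps(bits_per_band=3, num_bands=3, frame_ms=20.0)
total_bps = baseline_bps + voicing_bps

print(f"voicing side information: {voicing_bps:.0f} bps")
print(f"total: {total_bps / 1000:.2f} kbps")
```

With the assumed frame length, the computed total matches the 2.35 kbps quoted above; a different frame length would give a different side-information rate.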
Text
49vt03-brooks.pdf
- Other
More information
Published date: May 2000
Organisations:
Southampton Wireless Group
Identifiers
Local EPrints ID: 253685
URI: http://eprints.soton.ac.uk/id/eprint/253685
ISSN: 0018-9545
PURE UUID: 7cf0abfe-12bf-4163-aa73-6b5c2512a4af
Catalogue record
Date deposited: 12 Jan 2004
Last modified: 18 Mar 2024 02:33
Contributors
Author: F.C.A. Brooks
Author: L. Hanzo