Performance of the pitch-scaled harmonic filter and applications in speech analysis
The pitch-scaled harmonic filter (PSHF) is a technique for decomposing speech signals into their voiced and unvoiced constituents. In this paper, we evaluate its ability to reconstruct the time series of the two components accurately using a variety of synthetic, speech-like signals, and discuss its performance. These results determine the degree of confidence that can be expected for real speech signals: typically, 5 dB improvement in the signal-to-noise ratio of the harmonic component and approximately 5 dB more than the initial harmonics-to-noise ratio (HNR) in the anharmonic component. A selection of the analysis opportunities that the decomposition offers is demonstrated on speech recordings, including dynamic HNR estimation and separate linear prediction analyses of the two components. These new capabilities provided by the PSHF can facilitate discovering previously hidden features and investigating interactions of unvoiced sources, such as frication, with voicing.
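The decibel figures quoted in the abstract are power ratios between signal components. As an illustration only, the following minimal sketch shows how an HNR and a reconstruction SNR could be computed for a synthetic, speech-like test signal whose harmonic and noise parts are known in advance. It does not implement the PSHF itself; the function names, the sinusoid-plus-white-noise test signal, and all parameter values are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def power_ratio_db(a, b):
    """Return 10*log10 of the mean-power ratio of two equal-length signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 10.0 * np.log10(np.mean(a ** 2) / np.mean(b ** 2))


def synth_components(hnr_db, f0=120.0, fs=16000.0, duration=0.5,
                     n_harmonics=10, seed=0):
    """Build a hypothetical test signal: harmonics of f0 plus white noise.

    The noise is scaled so that the harmonics-to-noise ratio equals hnr_db.
    Returns the two components separately (harmonic, noise).
    """
    t = np.arange(int(fs * duration)) / fs
    # Sum of sinusoids at multiples of f0 with 1/k amplitude roll-off (assumed).
    harmonic = sum(np.sin(2 * np.pi * k * f0 * t) / k
                   for k in range(1, n_harmonics + 1))
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(t.size)
    # Gain that sets mean(harmonic^2) / mean(noise^2) to 10^(hnr_db / 10).
    gain = np.sqrt(np.mean(harmonic ** 2)
                   / (np.mean(noise ** 2) * 10.0 ** (hnr_db / 10.0)))
    return harmonic, gain * noise


def snr_db(reference, estimate):
    """SNR of an estimate relative to the known reference component."""
    reference = np.asarray(reference, dtype=float)
    error = np.asarray(estimate, dtype=float) - reference
    return power_ratio_db(reference, error)


if __name__ == "__main__":
    harmonic, noise = synth_components(hnr_db=10.0)
    print("HNR of the mixture (dB):", power_ratio_db(harmonic, noise))
    # A stand-in "estimate" of the harmonic part with some residual noise.
    print("SNR of the estimate (dB):", snr_db(harmonic, harmonic + 0.1 * noise))
```

With known components like these, the output of any decomposition method can be scored by comparing each estimated component against its reference, which is the kind of evaluation the abstract describes for synthetic signals.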
Pages: 1311-1314
Jackson, P.J.B.
81dc3458-f913-44b4-9829-ecb626df5278
Shadle, C.H.
dc56253d-9926-466f-a27c-b9a8252a5304
June 2000
Jackson, P.J.B. and Shadle, C.H. (2000) Performance of the pitch-scaled harmonic filter and applications in speech analysis. IEEE-ICASSP 2000.
Record type: Conference or Workshop Item (Other)
This record has no associated files available for download.
More information
Published date: June 2000
Additional Information: Organisation: IEEE; Address: Piscataway, NJ, USA
Venue - Dates: IEEE-ICASSP 2000, 2000-05-31
Organisations: Electronics & Computer Science
Identifiers
Local EPrints ID: 253675
URI: http://eprints.soton.ac.uk/id/eprint/253675
PURE UUID: 31187388-2985-471d-8ac0-cd64c382d626
Catalogue record
Date deposited: 29 Jun 2000
Last modified: 08 Jan 2022 05:42
Contributors
Author: P.J.B. Jackson
Author: C.H. Shadle