Analysis of mixed-source speech sounds: aspiration, voiced fricatives and breathiness

Our initial goal was to model the source characteristics of aspiration more accurately. The term is used inconsistently in the literature, but there is general agreement that aspiration is produced by turbulence noise generated in the vicinity of the glottis. To model aspiration, we must therefore refine the concept, and in particular define its relation to other kinds of noise produced near the glottis, such as breathiness and hoarseness. For instance, do similar aeroacoustic processes operate transiently during a plosive release and steadily during a breathy vowel? In unvoiced fricatives, localized sources produce well-defined spectral troughs. We have therefore developed a series of analysis methods that generate spectra for transient and voice-and-noise-excited sounds. These methods include pitch-synchronous decomposition into harmonic and anharmonic components (based on a hoarseness metric of Muta et al., 1988), short-time spectra, ensemble averaging, and short-time harmonics-to-noise ratios (Jackson and Shadle, 1998). They have been applied to a corpus of repeated nonsense words consisting of aspirated stops in three vowel contexts and voiced and unvoiced fricatives, spoken in four voice qualities, thus providing multiple examples of mixed-source and transient-source speech sounds. Ensemble-averaged spectra derived throughout a stop release show evidence of a highly localized noise source becoming more distributed. Variations by place are also apparent, complementing and extending previous work (Stevens and Blumstein, 1978; Stevens, 1993). The coordination of glottal and supraglottal articulation, described and modelled for aspiration by Scully and Mair (1995), is in a sense reversed for voiced fricatives. Use of the decomposition algorithm on voiced fricatives revealed greater complexity than expected: the anharmonic component sometimes appears to be modulated by the harmonic component and sometimes to be independent of it, and it tends to change from one case to the other in the course of the fricative. In sum, we have made some progress in describing not only the spectral but also the time-varying properties of an aspiration model, and in so doing have improved our descriptions of other mixed-source, time-varying speech sounds.
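As a rough illustration of what a harmonic/anharmonic split and a harmonics-to-noise ratio involve, the sketch below averages a signal over its pitch periods to estimate the periodic part and treats the residual as noise. This is a simplified time-domain stand-in, not the pitch-synchronous algorithm referred to in the abstract: it assumes a constant, known pitch period, and the names `decompose`, `hnr_db` and `demo` are hypothetical.

```python
import math
import random


def decompose(signal, period):
    """Split a signal into a periodic estimate and a residual.

    Assumption: the pitch period (in samples) is constant and known.
    Real speech needs pitch tracking; this sketch skips that step.
    """
    n_periods = len(signal) // period
    if n_periods < 2:
        raise ValueError("need at least two full periods")
    used = signal[: n_periods * period]
    # Average the waveform across periods: the periodic part adds
    # coherently, while uncorrelated noise tends to average out.
    template = [
        sum(used[k * period + i] for k in range(n_periods)) / n_periods
        for i in range(period)
    ]
    harmonic = [template[i % period] for i in range(len(used))]
    anharmonic = [x - h for x, h in zip(used, harmonic)]
    return harmonic, anharmonic


def hnr_db(harmonic, anharmonic):
    """Harmonics-to-noise ratio in dB from the two component energies."""
    energy_h = sum(h * h for h in harmonic)
    energy_n = sum(a * a for a in anharmonic)
    if energy_n == 0:
        return math.inf
    return 10 * math.log10(energy_h / energy_n)


def demo(seed=0):
    """Synthetic test case: a sinusoid plus Gaussian noise."""
    rng = random.Random(seed)
    period = 80
    n = period * 50
    voiced = [math.sin(2 * math.pi * i / period) for i in range(n)]
    noise = [rng.gauss(0, 0.1) for _ in range(n)]
    mixed = [v + w for v, w in zip(voiced, noise)]
    harmonic, anharmonic = decompose(mixed, period)
    return hnr_db(harmonic, anharmonic)
```

Calling `demo()` on the synthetic mixture should give a ratio in the region of 17 dB, since the sinusoid's power is about fifty times the noise power. Applying the same functions to successive short frames would give a short-time version of the ratio, again only under the fixed-period assumption.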

Jackson, P.J.B.

Shadle, C.H.

March 1999

Jackson, P.J.B. and Shadle, C.H. (1999) Analysis of mixed-source speech sounds: aspiration, voiced fricatives and breathiness. 2nd Int. Conf. on Voice Phys. and Biomechanics. p. 30.