Aero-acoustic modelling of voiced and unvoiced fricatives based on MRI data
We would like to develop a more realistic production model of unvoiced speech sounds, namely fricatives, plosives and aspiration noise. All three involve turbulence noise generation, with place-dependent source characteristics that vary with time (rapidly, in plosives). In this study, we aimed to produce, using an aero-acoustic model of the vocal-tract filter and source, voiced as well as unvoiced fricatives that provide a good match to analyses of speech recordings. The vocal-tract transfer function (VTTF) was computed by the vocal-tract acoustics program, VOAC [Davies, McGowan and Shadle. Vocal Fold Physiology: Frontiers in Basic Science, ed. Titze, Singular Pub., CA, 93-142, 1993], using geometrical data, in the form of cross-sectional area and hydraulic radius functions, along the length of the tract. VOAC incorporates the effects of net flow into the transmission of plane waves through a tubular representation of the tract, and relaxes assumptions of rigid walls and isentropic propagation. The geometry functions were derived from multiple-slice, dynamic, magnetic resonance images (MRI) [Mohammad. PhD thesis, Dept. ECS, U. Southampton, UK, 1999; Shadle, Mohammad, Carter, and Jackson. Proc. ICPhS, S.F. CA, 1:623-626, 1999], using a method of converting from the pixel outlines that was improved over earlier efforts on vowels. A coloured noise source signal was combined with the VTTF and radiation characteristic to synthesize the unvoiced fricative [s]. For its voiced counterpart [z], many researchers have noted that the noise source appears to be modulated by voicing. Furthermore, the phase of the modulation has been shown to be perceptually significant. Based on our analysis [Jackson and Shadle. Proc. IEEE-ICASSP, Istanbul, 2000] of recordings by the same subject, the frication source of [z] was varied periodically according to fluctuations in the flow velocity at the constriction exit, and the modulation phase was governed by the convection time for the flow perturbation to travel from the constriction to the obstacle. The synthesized fricatives were compared to the speech recordings in a simple listening test, and comparisons of the predicted and measured time series suggested that the model, which brings together physical, aerodynamic and acoustic information, can replicate characteristics of real speech, such as the modulation in voiced fricatives [http://www.isis.ecs.soton.ac.uk/research/projects/nephthys/].
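The modulated frication source described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a sinusoidal amplitude envelope at the voicing frequency, uses white rather than coloured noise, and omits the VTTF and radiation characteristic entirely. The function names, parameters and example values (`f0`, `depth`, `distance_m`, `convection_speed`) are hypothetical and chosen only to show how a convection delay translates into a modulation phase.

```python
import math
import random


def envelope(t, f0, depth, delay):
    """Assumed sinusoidal modulation envelope at time t (seconds).

    The envelope peaks `delay` seconds after each voicing cycle begins,
    standing in for the convection time from constriction to obstacle.
    """
    return 1.0 + depth * math.cos(2.0 * math.pi * f0 * (t - delay))


def modulation_phase(f0, delay):
    """Phase lag (radians, in [0, 2*pi)) implied by the convection delay."""
    return (2.0 * math.pi * f0 * delay) % (2.0 * math.pi)


def modulated_noise(n_samples, fs, f0, depth, distance_m, convection_speed, seed=0):
    """Return (samples, delay): white noise amplitude-modulated at f0.

    delay = distance_m / convection_speed is the illustrative convection time.
    All values are assumptions for demonstration, not data from the study.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie in [0, 1] so the envelope stays non-negative")
    rng = random.Random(seed)  # seeded for reproducible output
    delay = distance_m / convection_speed
    samples = [
        envelope(n / fs, f0, depth, delay) * rng.gauss(0.0, 1.0)
        for n in range(n_samples)
    ]
    return samples, delay
```

For instance, calling `modulated_noise(1600, 16000.0, 120.0, 0.5, 0.01, 20.0)` would give 0.1 s of noise whose envelope lags the voicing cycle by 0.5 ms under these assumed values; a fuller synthesis would then shape this source spectrally and filter it through the tract response.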
185-188
Jackson, P.J.B.
Shadle, C.H.
May 2000
Jackson, P.J.B. and Shadle, C.H. (2000) Aero-acoustic modelling of voiced and unvoiced fricatives based on MRI data. 5th Speech Production Seminar.
Record type:
Conference or Workshop Item
(Other)
More information
Published date: May 2000
Additional Information:
Address: Seeon, FRG
Venue - Dates:
5th Speech Production Seminar, 2000-04-30
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 253337
URI: http://eprints.soton.ac.uk/id/eprint/253337
PURE UUID: ed4334b9-b050-40a0-b463-6732ccd47bbb
Catalogue record
Date deposited: 17 May 2000
Last modified: 10 Dec 2021 20:32
Contributors
Author:
P.J.B. Jackson
Author:
C.H. Shadle