A comparative study of two area function derivation techniques for fricative synthesis
It is still unclear to speech scientists how the brain is able to distinguish fricative sounds from the many cues within the speech signal. This thesis makes an acoustic and articulatory investigation into the problem, using fricative utterances and magnetic resonance (MR) images.
The results suggest that the cues to a fricative’s identity lie in the higher frequency region. For the sibilant fricatives, these are the relatively higher amplitude levels about 3.0 kHz, while the distinct peak in the vicinity of 2.8 kHz gives the sound a “pitch quality” that was significantly perceived by the listeners. For the nonsibilant fricatives, the cues within the frication period did not provide sufficient information as to the fricatives’ identity, even though coarticulatory effects due to vowel context persisted; nevertheless, rough matches in amplitude levels were adequate to produce a synthetic version which listeners decided was comparable to the natural sound. Accurate measurements of the region posterior to the constriction were not essential for synthesis, a finding supported by the fact that the pyriform sinuses do not play an important role in modelling. The sublingual cavity was observed to affect the amplitude of the significant peak in /S/, but it is sufficient to include it as an increase in area. The Mermelstein technique is proposed to be most suitable for fricative synthesis: it is easier to implement, and side branches do not need to be modelled realistically.
University of Southampton
Subari, Khazaimatol Shima
8cec0e06-9499-48bb-8422-91c7057f21ff
2006
Subari, Khazaimatol Shima (2006) A comparative study of two area function derivation techniques for fricative synthesis. University of Southampton, Doctoral Thesis.
Record type: Thesis (Doctoral)
Text: 1019842.pdf (Version of Record)
More information
Published date: 2006
Identifiers
Local EPrints ID: 465963
URI: http://eprints.soton.ac.uk/id/eprint/465963
PURE UUID: bf38d2e2-9c01-4741-884e-b931048b00d3
Catalogue record
Date deposited: 05 Jul 2022 03:48
Last modified: 16 Mar 2024 20:27
Contributors
Author: Khazaimatol Shima Subari