The University of Southampton
University of Southampton Institutional Repository

Automatic tracking of 3D vocal tract features during speech production using MRI



University of Southampton
Avila García, María Susana

Avila García, María Susana (2006) Automatic tracking of 3D vocal tract features during speech production using MRI. University of Southampton, Doctoral Thesis.

Record type: Thesis (Doctoral)

Abstract

Magnetic resonance imaging has many advantages for visualising speech production, but an important disadvantage is its long acquisition time relative to the characteristic time of articulator motion, which is about a tenth of a second. Southampton Dynamic Magnetic Resonance Imaging (SDMRI) is a technique developed in a previous project to solve this problem. It achieves an apparent high temporal resolution suitable for dynamic studies. Consequently, a large number of images can be generated describing the evolution of the vocal tract shape, which makes manual extraction of the vocal tract shape a tedious and time-consuming process. The aim of this project is, firstly, to improve and extend the SDMRI method and, secondly, to determine the outline of the vocal tract automatically. Several feature extraction techniques were analysed, and two of them, active shape models and the Hough transform, were combined to make a new automatic shape extraction tool. Active shape models describe the shape of the articulators, while the Hough transform locates them with no initialisation. Initially, the new algorithm was tested on isolated magnetic resonance images to extract tongue shapes; although the results were satisfactory, the algorithm often failed when multiple solutions were present. A global analysis of the image sequence overcomes these difficulties, and the dynamic Hough transform was adapted for this purpose. Experimental results show that the algorithm finds the correct shape and position of the tongue and that it is robust under noisy conditions. The model was extended to other articulators, namely the lips. This approach leads to a new algorithm for automatic extraction of articulatory shape in magnetic resonance image sequences, as shown by the results presented in this thesis.

Text
1035439.pdf - Version of Record
Available under License University of Southampton Thesis Licence.
Download (7MB)

More information

Published date: 2006

Identifiers

Local EPrints ID: 466088
URI: http://eprints.soton.ac.uk/id/eprint/466088
PURE UUID: d066c58b-365d-4a40-b0da-51a40b0472e4

Catalogue record

Date deposited: 05 Jul 2022 04:16
Last modified: 16 Mar 2024 20:30

Contributors

Author: María Susana Avila García
