University of Southampton Institutional Repository

Towards an understanding of speech and song perception


van Besouw, Rachel M., Howard, David M. and Ternström, Sten (2005) Towards an understanding of speech and song perception. Logopedics Phoniatrics Vocology, 30 (3 & 4), 129-135. (doi:10.1080/14015430500262160).

Record type: Article

Abstract

The human singing voice plays an important role in music of all societies. It is an extremely flexible instrument and is capable of producing a tremendous range of sounds. As such, the human voice can be hard to classify and poses a major challenge for automatic audio discrimination and classification systems. Speech/song discrimination is an implicit goal of speech/music discrimination, where a division is sought between speech and song, such that the singing voice can be grouped together with other musical instruments in the same category. However, the division between speech and song is unclear and even human attempts at speech/song discrimination can be highly subjective and open to discussion. In this paper we present the results of a test that was designed to investigate differences in auditory perception for speech and song. Twenty-four subjects were instructed to attend to either the words or pitch, or both words and pitch of context-free spoken and sung phrases. After presentation of each phrase, subjects were asked to either type the words that they recalled, or select the correct pitch contour from a choice of four graphical representations, or do both, depending on the task specified before presentation of the phrase. The results of the experiment show a decrease in the amount of linguistic information retained by subjects for sung phrases and also a decrease in accuracy of response for the sung phrases when subjects attended to both words and pitch instead of words or pitch alone.

This record has no associated files available for download.

More information

Published date: 2005
Keywords: melodic and language processing, perception, song, speech, word recall
Organisations: Human Sciences Group

Identifiers

Local EPrints ID: 46243
URI: http://eprints.soton.ac.uk/id/eprint/46243
ISSN: 1401-5439
PURE UUID: 41a3fe5d-8ed2-48a6-a69b-4af26b1b5f91

Catalogue record

Date deposited: 07 Jun 2007
Last modified: 15 Mar 2024 09:20

Contributors

Author: Rachel M. van Besouw
Author: David M. Howard
Author: Sten Ternström
