University of Southampton Institutional Repository

A new approach to voice authenticity


Müller, Nicolas M., Kawa, Piotr, Hu, Shen, Neu, Matthias, Williams, Jennifer, Sperl, Philip and Böttinger, Konstantin (2024) A new approach to voice authenticity. Interspeech 2024, Kos Island, Greece. 01 - 05 Sep 2024. 5 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

Voice faking poses significant societal challenges. Currently, the prevailing assumption is that unaltered human speech can always be considered genuine, while fake speech usually comes from text-to-speech (TTS) synthesis. We argue that this type of binary distinction is oversimplified. For instance, altered playback speeds can maliciously deceive listeners, as in the ‘Drunken Nancy Pelosi’ incident. Similarly, editing of audio clips can be done ethically, e.g. for brevity or summarization in news reporting or podcasts, but editing can also create misleading narratives. In this paper, we propose a conceptual shift away from the longstanding binary paradigm of speech audio being either ‘fake’ or ‘real’. Instead, we focus on pinpointing ‘voice edits’, which encompass traditional modifications like filters and cuts, as well as neural synthesis. We delineate six categories of voice edits and curate a new challenge dataset, for which we present baseline voice edit detection systems.

Text
muller24_interspeech - Accepted Manuscript

More information

Published date: 1 September 2024
Venue - Dates: Interspeech 2024, Kos Island, Greece, 2024-09-01 - 2024-09-05

Identifiers

Local EPrints ID: 502676
URI: http://eprints.soton.ac.uk/id/eprint/502676
PURE UUID: 01d256ee-ee75-4fa0-b8f0-7be4a9f4d473
ORCID for Jennifer Williams: orcid.org/0000-0003-1410-0427

Catalogue record

Date deposited: 04 Jul 2025 16:34
Last modified: 22 Aug 2025 02:34

Contributors

Author: Nicolas M. Müller
Author: Piotr Kawa
Author: Shen Hu
Author: Matthias Neu
Author: Jennifer Williams
Author: Philip Sperl
Author: Konstantin Böttinger
