The University of Southampton
University of Southampton Institutional Repository

On the latency of voice conversion in an active voice cloning device

International Commission for Acoustics

Irihose, Obed, Xie, Rong and Shi, Chuang (2022) On the latency of voice conversion in an active voice cloning device. In 24th International Congress on Acoustics Proceedings. International Commission for Acoustics. 7 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

Although significant progress has been made in voice conversion (VC), the presence of the source speaker’s original voice becomes problematic, especially in a real-time voice cloning scenario where listeners are close to the source speaker. The overlap of the source speaker’s voice with the converted voice degrades the listener’s sense of immersion in the converted voice. In this paper, we conceptualize an active voice cloning (AVC) device, which converts one speaker’s voice timbre to another’s while confining the source speaker’s voice with active noise control (ANC). The VC system is realized through a low-latency deep neural network model, and the ANC system uses a feedforward single-channel implementation. The mockup of the AVC device is assembled in a short open tube that can be worn over the source speaker’s mouth. Because the latency of the VC system introduces a phase difference between the converted voice and the residual voice of the source speaker, we further assess its effect on the intelligibility of the converted voice, as well as the overall performance of the AVC device in improving the perceptual experience.
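
The link between latency and phase difference mentioned in the abstract can be illustrated with a minimal sketch. It assumes the VC latency acts as a pure time delay, so that a component at frequency f lags by 2πfτ radians for a latency τ. The function names and the latency values used here are hypothetical and are not taken from the paper.

```python
import math


def phase_difference_rad(latency_s: float, frequency_hz: float) -> float:
    """Unwrapped phase lag (radians) of a pure delay at a single frequency.

    Assumption: the conversion latency behaves as a pure time delay.
    """
    return 2.0 * math.pi * frequency_hz * latency_s


def wrapped_phase_deg(latency_s: float, frequency_hz: float) -> float:
    """The same phase lag, expressed in degrees and wrapped into [0, 360)."""
    return math.degrees(phase_difference_rad(latency_s, frequency_hz)) % 360.0


if __name__ == "__main__":
    # Hypothetical latencies; the paper's measured values are not given here.
    for latency_ms in (2.5, 5.0, 20.0):
        for freq_hz in (100.0, 250.0):
            lag = wrapped_phase_deg(latency_ms / 1000.0, freq_hz)
            print(f"{latency_ms} ms at {freq_hz} Hz: {lag:.1f} degrees")
```

Under this assumption the phase difference grows linearly with both latency and frequency, so a fixed latency affects different parts of the speech spectrum differently.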

Text
ICA24_VC_Submission_ABS-0679 - Accepted Manuscript
Restricted to Repository staff only

More information

Published date: 24 October 2022
Venue - Dates: 24th International Congress on Acoustics, Hwabaek International Convention Center, Gyeongju, Republic of Korea, 2022-10-24 - 2022-10-28

Identifiers

Local EPrints ID: 484696
URI: http://eprints.soton.ac.uk/id/eprint/484696
PURE UUID: 69447eb2-ba55-4cfc-82ce-2371b33c5c8f
ORCID for Chuang Shi: orcid.org/0000-0002-1517-2775

Catalogue record

Date deposited: 20 Nov 2023 17:43
Last modified: 18 Mar 2024 04:13

Contributors

Author: Obed Irihose
Author: Rong Xie
Author: Chuang Shi
