The University of Southampton
University of Southampton Institutional Repository

Self-supervised mean opinion score prediction of phase-vocoder-based virtual bass system


Gou, Jiacheng, Song, Yuheng, Shi, Chuang and Li, Huiyong (2024) Self-supervised mean opinion score prediction of phase-vocoder-based virtual bass system. 32nd European Signal Processing Conference, Lyon Convention Center, Lyon, France. 26 - 30 Aug 2024. (210942025).

Record type: Conference or Workshop Item (Paper)

Abstract


The virtual bass system (VBS) exploits a psychoacoustic phenomenon known as the “missing fundamental” to make listeners perceive a fundamental frequency from its higher harmonics. The VBS is commonly used in consumer electronic devices fitted with miniature and flat-panel loudspeakers, which cannot reproduce low-frequency components satisfactorily. The additional harmonics introduced by the VBS can, however, cause perceptual distortion, so evaluating the perceptual quality of the VBS requires subjective listening tests. Previous studies have attempted to derive objective metrics and to identify combinations of model output variables that predict the perceptual quality of the VBS. However, because only a limited number of subjective test results were used to obtain the combination coefficients, the predictions may be inconsistent. This paper proposes adopting self-supervised deep learning models to predict the mean opinion score (MOS) of the VBS. Experimental results demonstrate a strong linear correlation between the model outputs and the human-rated MOS, indicating that a linear mapping is sufficient to convert a model output into an accurate MOS prediction.
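
The last sentence of the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the self-supervised model emits one scalar score per audio clip and that MOS lies on a 1 to 5 scale; neither detail is stated in this record. The function names and all data below are hypothetical, and the numbers are synthetic placeholders rather than results from the paper.

```python
import numpy as np


def fit_linear_mapping(scores, mos):
    """Least-squares fit of mos ~ a * score + b; returns (a, b)."""
    a, b = np.polyfit(scores, mos, deg=1)
    return float(a), float(b)


def pearson(x, y):
    """Pearson linear correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic placeholder data (assumption): NOT measurements from the paper.
rng = np.random.default_rng(0)
raw_scores = rng.uniform(0.0, 1.0, size=50)  # hypothetical model outputs
human_mos = 1.0 + 4.0 * raw_scores + rng.normal(0.0, 0.1, size=50)

# Fit the linear mapping, then convert raw scores into MOS predictions.
a, b = fit_linear_mapping(raw_scores, human_mos)
predicted_mos = np.clip(a * raw_scores + b, 1.0, 5.0)  # assumed 1-5 MOS scale

# A correlation close to 1 is what makes a purely linear mapping adequate.
r = pearson(raw_scores, human_mos)
print(f"a={a:.2f}, b={b:.2f}, r={r:.3f}")
```

In practice the two coefficients would be fitted on clips with listening-test ratings and then applied to unrated clips; a weak correlation would suggest that a nonlinear mapping is needed instead.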

Text
EUSIPCO24_Gou_v1.9 - Accepted Manuscript

More information

Accepted/In Press date: 22 May 2024
Published date: 23 October 2024
Venue - Dates: 32nd European Signal Processing Conference, Lyon Convention Center, Lyon, France, 2024-08-26 - 2024-08-30

Identifiers

Local EPrints ID: 490871
URI: http://eprints.soton.ac.uk/id/eprint/490871
DOI: 210942025
PURE UUID: 392e7315-3dbe-411f-9d3a-c2010f7d9741
ORCID for Chuang Shi: orcid.org/0000-0002-1517-2775

Catalogue record

Date deposited: 07 Jun 2024 16:40
Last modified: 06 Nov 2025 17:53

Contributors

Author: Jiacheng Gou
Author: Yuheng Song
Author: Chuang Shi
Author: Huiyong Li
