Self-supervised mean opinion score prediction of phase-vocoder-based virtual bass system
The virtual bass system (VBS) leverages a psychoacoustic phenomenon known as the “missing fundamental” to trick listeners into perceiving the fundamental frequency from its higher harmonics. The VBS is commonly used in consumer electronic devices with integrated miniature and flat-panel loudspeakers, which cannot satisfactorily reproduce low-frequency components. The additional harmonics introduced by the VBS can lead to perceptual distortion. Therefore, evaluating the perceptual quality of the VBS necessitates subjective listening tests. Previous studies have attempted to derive objective metrics and identify combinations of model output variables to predict the perceptual quality of the VBS. However, because only a limited number of subjective test results were used to obtain the combination coefficients, the predictions may be inconsistent. This paper proposes adopting self-supervised deep learning models to predict the mean opinion score (MOS) of the VBS. Experimental results demonstrate a strong linear correlation between the model outputs and the human-rated MOS, indicating that a linear mapping is sufficient to convert a model output into an accurate MOS prediction.
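The final step described in the abstract, converting a model output into a MOS prediction with a linear mapping, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 1 to 5 MOS range, and the synthetic data below are assumptions made purely for illustration, and the real model outputs and listening-test scores are not reproduced here.

```python
import numpy as np


def fit_linear_mos_mapping(model_outputs, mos):
    """Least-squares fit of mos ~= a * output + b; returns (a, b)."""
    x = np.asarray(model_outputs, dtype=float)
    y = np.asarray(mos, dtype=float)
    a, b = np.polyfit(x, y, deg=1)  # coefficients, highest degree first
    return float(a), float(b)


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    return float(np.corrcoef(x, y)[0, 1])


def predict_mos(model_outputs, a, b, lo=1.0, hi=5.0):
    """Apply the linear mapping and clip to an assumed 1-5 MOS scale."""
    x = np.asarray(model_outputs, dtype=float)
    return np.clip(a * x + b, lo, hi)


# Hypothetical data: synthetic "model outputs" and noisy "human" MOS values.
rng = np.random.default_rng(0)
outputs = rng.uniform(-1.0, 1.0, size=50)
mos = 3.0 + 1.5 * outputs + rng.normal(0.0, 0.1, size=50)

a, b = fit_linear_mos_mapping(outputs, mos)
r = pearson_r(outputs, mos)
pred = predict_mos(outputs, a, b)
print(f"slope={a:.3f} intercept={b:.3f} pearson_r={r:.3f}")
```

When the correlation is strongly linear, the two fitted coefficients are the only parameters that need to be estimated from subjective test data, which is why a small listening test can be enough for this step.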
210942025
Gou, Jiacheng
4f92c3b1-a1a8-4cf4-805b-4cba2852561d
Song, Yuheng
9d29644f-ed92-4937-8be3-2848755224f2
Shi, Chuang
c46f72bd-54c7-45ee-ac5d-285691fccf81
Li, Huiyong
55372056-e82d-4ca8-93c3-88c7f9d4216c
23 October 2024
Gou, Jiacheng, Song, Yuheng, Shi, Chuang and Li, Huiyong
(2024)
Self-supervised mean opinion score prediction of phase-vocoder-based virtual bass system.
32nd European Signal Processing Conference, Lyon Convention Center, Lyon, France.
26 - 30 Aug 2024.
(210942025).
Record type:
Conference or Workshop Item
(Paper)
Text
EUSIPCO24_Gou_v1.9
- Accepted Manuscript
More information
Accepted/In Press date: 22 May 2024
Published date: 23 October 2024
Venue - Dates:
32nd European Signal Processing Conference, Lyon Convention Center, Lyon, France, 2024-08-26 - 2024-08-30
Identifiers
Local EPrints ID: 490871
URI: http://eprints.soton.ac.uk/id/eprint/490871
DOI: 210942025
PURE UUID: 392e7315-3dbe-411f-9d3a-c2010f7d9741
Catalogue record
Date deposited: 07 Jun 2024 16:40
Last modified: 06 Nov 2025 17:53
Contributors
Author:
Jiacheng Gou
Author:
Yuheng Song
Author:
Chuang Shi
Author:
Huiyong Li