A system architecture for semantically informed rendering of object-based audio
Franck, Andreas, Francombe, Jon, Woodcock, James, Hughes, Richard, Coleman, Philip, Menzies-Gow, Robert, Cox, Trevor J., Jackson, Philip J. B. and Fazi, Filippo
(2019)
A system architecture for semantically informed rendering of object-based audio.
Journal of the Audio Engineering Society, 67 (7/8), 498-509.
(doi:10.17743/jaes.2019.0025).
Abstract
Object-based audio promises format-agnostic reproduction and extensive personalization of spatial audio content. However, in practical listening scenarios, such as in consumer audio, ideal reproduction is typically not possible. To maximize the quality of the listening experience, a different approach is required, for example modifications of metadata to adjust for the reproduction layout or personalization choices. In this paper, we propose a novel system architecture for semantically informed rendering (SIR) that combines object audio rendering with high-level processing of object metadata. In many cases, this processing uses novel, advanced metadata describing the objects to optimally adjust the audio scene to the reproduction system or listener preferences. The proposed system is evaluated with several adaptation strategies, including semantically motivated downmix to layouts with few loudspeakers, manipulation of perceptual attributes, perceptual reverberation compensation, and orchestration of mobile devices for immersive reproduction. These examples demonstrate how SIR can significantly improve the media experience and provide advanced personalization controls, for example by maintaining smooth object trajectories on systems with few loudspeakers, or by providing personalized envelopment levels. An example implementation of the proposed system architecture is described and provided as an open, extensible software framework that combines object-based audio rendering and high-level processing of advanced object metadata.
Text (Author's Original): framework semantically informed rendering preprint. Restricted to Repository staff only.
More information
Accepted/In Press date: 16 May 2019
e-pub ahead of print date: 14 August 2019
Keywords:
object-based audio; spatial audio; audio rendering
Identifiers
Local EPrints ID: 431491
URI: http://eprints.soton.ac.uk/id/eprint/431491
ISSN: 1549-4950
PURE UUID: a0d20654-eece-48a8-b923-38a5aee946c3
Catalogue record
Date deposited: 05 Jun 2019 16:30
Last modified: 13 Jun 2024 01:46
Contributors
Author:
Andreas Franck
Author:
Jon Francombe
Author:
James Woodcock
Author:
Richard Hughes
Author:
Philip Coleman
Author:
Trevor J. Cox
Author:
Philip J. B. Jackson