The University of Southampton
University of Southampton Institutional Repository

Decoding and compression of channel and scene objects for spatial audio

Sound fields can be encoded with a fixed number of signals, using microphones or panning functions. The sound field may later be reproduced approximately by decoding the signals to a loudspeaker array. The Stereo and Ambisonic systems provide examples. A framework is presented for addressing general questions about such encodings. The first problem considered is the conversion between encodings. The solution is applied to the decoding of scene encodings to a loudspeaker array. This is generalised to the decoding of sub-scenes where the resolution is focused in an angular window. Within an object based audio framework such sub-scenes are useful for representing complex objects without using all the channels required for a full scene. The second problem considered is the compression of a scene encoding to a smaller encoding, from which the original can be reconstructed. The spatial distribution of compression error can be controlled.
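As a rough illustration of what "conversion between encodings" can mean, the sketch below fits a least-squares matrix that maps one encoding to another over a set of sampled directions. It is not the method of the paper, whose formulation is not given in this record. The horizontal first-order encoding without channel weighting, the cardioid pair at plus and minus 45 degrees, the 72 sample directions, and all function names are assumptions chosen for the example.

```python
import numpy as np

def first_order_2d(az):
    """Assumed source encoding: gains [1, cos(az), sin(az)] per direction."""
    az = np.atleast_1d(az)
    return np.stack([np.ones_like(az), np.cos(az), np.sin(az)])

def cardioid_pair(az, spread=np.pi / 4):
    """Assumed target encoding: two cardioid patterns aimed at +/- spread."""
    az = np.atleast_1d(az)
    left = 0.5 * (1 + np.cos(az - spread))
    right = 0.5 * (1 + np.cos(az + spread))
    return np.stack([left, right])

def conversion_matrix(src, dst):
    """Least-squares C minimising ||C @ src - dst|| over the sampled directions."""
    return dst @ np.linalg.pinv(src)

# Sample both encodings on a uniform grid of azimuths (hypothetical choice).
directions = np.linspace(0, 2 * np.pi, 72, endpoint=False)
A = first_order_2d(directions)   # shape (3, 72)
B = cardioid_pair(directions)    # shape (2, 72)
C = conversion_matrix(A, B)      # shape (2, 3)
```

Here the target patterns lie in the span of the source patterns, so the fitted matrix reproduces the target encoding exactly at the sampled directions. When the target cannot be expressed in terms of the source, the same fit gives only an approximation, and the residual shows how the error varies with direction.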
Keywords: spatial audio, object based audio, ambisonics, stereo
Menzies, Dylan
Fazi, Filippo

Menzies, Dylan and Fazi, Filippo (2017) Decoding and compression of channel and scene objects for spatial audio. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25 (11). (doi:10.1109/TASLP.2017.2744264).

Record type: Article

Text
paper_micSet - Accepted Manuscript

More information

Accepted/In Press date: 19 August 2017
e-pub ahead of print date: 24 August 2017
Published date: November 2017

Identifiers

Local EPrints ID: 413944
URI: http://eprints.soton.ac.uk/id/eprint/413944
PURE UUID: d6e8924e-c711-4c4f-9898-182fead623b2
ORCID for Dylan Menzies: orcid.org/0000-0003-1475-8798
ORCID for Filippo Fazi: orcid.org/0000-0003-4129-1433

Catalogue record

Date deposited: 11 Sep 2017 16:31
Last modified: 16 Mar 2024 03:59
