The University of Southampton
University of Southampton Institutional Repository

3D audio-visual indoor scene reconstruction and semantics completion for virtual reality from a single 360° RGB-D image

Alawadh, Mona
60613079-426e-425a-81d3-09a6fbb7a92c
Alinaghi, Atiyeh
69c051f1-9b47-4c47-b9e3-52fa3079f9a3
Niranjan, Mahesan
5cbaeea8-7288-4b55-a89c-c43d212ddd4f
Kim, Hansung
2c7c135c-f00b-4409-acb2-85b3a9e8225f

Alawadh, Mona, Alinaghi, Atiyeh, Niranjan, Mahesan and Kim, Hansung (2026) 3D audio-visual indoor scene reconstruction and semantics completion for virtual reality from a single 360° RGB-D image. Virtual Reality, 30, [55]. (doi:10.1007/s10055-026-01312-7).

Record type: Article

Abstract

We introduce a new approach for constructing immersive virtual spaces by generating comprehensive 3D voxelised models that encompass both geometric and semantic scene representations from a single 360° RGB-D input. The proposed approach utilises a deep convolutional neural network for semantic scene completion (SSC), allowing the estimation of the complete semantics and geometry of the scene. We design MDBNet, a dual-head model that simultaneously processes RGB and depth data from a perspective camera. Depth information is encoded using a flipped truncated signed distance function (F-TSDF), capturing essential geometric shape characteristics. We extend the inference capabilities of MDBNet from perspective-camera RGB-D input to 360° RGB-D input by proposing MDBNet360. We employ RGB spherical-to-cubic projection and 3D rotation of depth point clouds, allowing for virtual reality (VR) space design with comprehensive spatial coverage. To our knowledge, this is the first work to extend a pre-trained SSC model, originally using perspective camera RGB-D input, to infer a 3D model from 360° RGB-D input. To assess acoustic properties, we measure parameters such as early decay time (EDT) and reverberation time (RT60) using the exponential sine sweep (ESS) method. We use Unity with the Steam Audio plug-in to conduct simulations in the virtual space. The proposed framework demonstrates better virtual space reconstruction and immersive sound generation than the state-of-the-art (SOTA), advancing semantically rich and spatially accurate virtual environments. Code and rendered sounds are available on GitHub: https://github.com/MonaIA1/Repo360.
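
As a rough illustration of how EDT and RT60 can be read off a room impulse response (for example, one recovered by ESS deconvolution), the sketch below applies Schroeder backward integration followed by a straight-line fit to the decay curve. This is a common approach and is not taken from the record above, so it may differ from the authors' procedure. The function names, the synthetic impulse response, and the -5 to -35 dB fit range (often called T30) are all assumptions made for illustration.

```python
import numpy as np


def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve in dB, normalised to 0 dB at t = 0."""
    energy = np.asarray(ir, dtype=float) ** 2
    # Reverse cumulative sum: energy remaining from each sample to the end.
    edc = np.cumsum(energy[::-1])[::-1]
    return 10.0 * np.log10(edc / edc[0])


def decay_time(edc_db, fs, start_db, end_db):
    """Fit a line to the decay curve between start_db and end_db,
    then extrapolate the fitted slope to a 60 dB drop."""
    t = np.arange(len(edc_db)) / fs
    mask = (edc_db <= start_db) & (edc_db >= end_db)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second, negative
    return -60.0 / slope


# Synthetic impulse response (assumption): noise whose amplitude envelope
# falls by 60 dB after rt60_true seconds.
fs = 16000
rt60_true = 0.5
t = np.arange(fs) / fs  # one second of samples
rng = np.random.default_rng(0)
ir = rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60_true)

edc_db = schroeder_decay_db(ir)
edt = decay_time(edc_db, fs, 0.0, -10.0)    # early decay time
rt60 = decay_time(edc_db, fs, -5.0, -35.0)  # RT60 estimated from a T30 fit
print(f"EDT = {edt:.3f} s, RT60 = {rt60:.3f} s")
```

For a purely exponential decay such as this synthetic one, both estimates should land close to the chosen 0.5 s. In a real room, EDT and RT60 can differ noticeably, which is why the two are usually reported separately.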

Text: Virtual_Reality_jornal_MDBNet360-Accepted - Accepted Manuscript (45MB)
Available under License Creative Commons Attribution.
Text: s10055-026-01312-7 - Version of Record (8MB)
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 5 January 2026
e-pub ahead of print date: 6 February 2026
Published date: 9 February 2026

Identifiers

Local EPrints ID: 509693
URI: http://eprints.soton.ac.uk/id/eprint/509693
PURE UUID: a69db72b-58b5-4618-988a-3b1366cb390f
ORCID for Mona Alawadh: orcid.org/0000-0001-5354-7681
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X
ORCID for Hansung Kim: orcid.org/0000-0003-4907-0491

Catalogue record

Date deposited: 02 Mar 2026 17:58
Last modified: 07 Mar 2026 04:03

Contributors

Author: Mona Alawadh
Author: Atiyeh Alinaghi
Author: Mahesan Niranjan
Author: Hansung Kim
