SliceFormer: deep dense depth estimation from a single indoor omnidirectional image using a slice-based transformer

Wu, Yihong, Heng, Yuwen, Niranjan, Mahesan and Kim, Hansung (2024) SliceFormer: deep dense depth estimation from a single indoor omnidirectional image using a slice-based transformer. In 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024. IEEE. pp. 678-681. (doi:10.1109/ICEIC61013.2024.10457276).

Record type: Conference or Workshop Item (Paper)

Abstract

In this research, we tackle the task of estimating depth from a single indoor omnidirectional image. Because gravity strongly shapes the layout of artificially constructed indoor environments, we divide the equirectangular input into vertical slices. These slices are then used as patch embeddings for the transformer encoder, a design intended to reconstruct an equirectangular depth map. We evaluate our architecture against leading models on the real-world datasets Stanford2D3D and Matterport3D, where it demonstrates superior performance. These results underscore the significance of our gravity-aligned approach to depth estimation in omnidirectional images, especially in man-made settings.
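
The slicing step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `vertical_slices` and `slice_tokens`, the image size, and the slice width of 8 pixels are assumptions chosen for illustration, and the abstract does not say how slices are projected into embeddings. The sketch only shows how an equirectangular image can be cut into full-height vertical strips and flattened into one token per strip.

```python
import numpy as np


def vertical_slices(erp, slice_width):
    """Split an equirectangular image (H, W, C) into vertical strips.

    Returns an array of shape (W // slice_width, H, slice_width, C).
    The slice width is a hypothetical parameter, not taken from the paper.
    """
    h, w, c = erp.shape
    if w % slice_width != 0:
        raise ValueError("image width must be divisible by slice_width")
    n = w // slice_width
    # Group columns into n strips, then move the strip axis to the front.
    return erp.reshape(h, n, slice_width, c).transpose(1, 0, 2, 3)


def slice_tokens(erp, slice_width):
    """Flatten each full-height strip into a single token vector."""
    slices = vertical_slices(erp, slice_width)
    return slices.reshape(slices.shape[0], -1)


# Dummy 256 x 512 RGB panorama standing in for a real input image.
erp = np.random.default_rng(0).random((256, 512, 3), dtype=np.float32)
tokens = slice_tokens(erp, slice_width=8)
print(tokens.shape)  # (64, 6144)
```

Each token here spans the full image height, so it follows one gravity-aligned column band of the scene. In a transformer, a learned linear projection would typically map each flattened strip to the model dimension before the encoder; that step is omitted here because the abstract gives no details about it.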

More information

Published date: 19 March 2024

Venue - Dates: International Conference on Electronics, Information, and Communication, Taipei Marriott Hotel, Taipei, Taiwan, 2024-01-28 - 2024-01-31

Keywords: depth estimation, single omnidirectional image, slice-based transformer