SliceFormer: deep dense depth estimation from a single indoor omnidirectional image using a slice-based transformer
Wu, Yihong, Heng, Yuwen, Niranjan, Mahesan and Kim, Hansung (2024) SliceFormer: deep dense depth estimation from a single indoor omnidirectional image using a slice-based transformer. In 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024. IEEE. pp. 678-681. (doi:10.1109/ICEIC61013.2024.10457276).
Record type: Conference or Workshop Item (Paper)
Abstract
In this research, we tackle the task of estimating depth from a single indoor omnidirectional image. Acknowledging the critical influence of gravity in artificially constructed indoor environments, we process the equirectangular-projection input by dividing it into vertical slices. These slices are then used as patch embeddings for a transformer encoder, a strategy designed to reconstruct an equirectangular depth map. Our architecture is evaluated against leading models on the real-world Stanford2D3D and Matterport3D datasets, demonstrating superior performance. These results underscore the significance of our gravity-aligned approach for depth estimation in omnidirectional images, especially in man-made settings.
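To illustrate the slice-based tokenization described in the abstract, the following is a minimal PyTorch sketch: it embeds each full-height vertical slice of an equirectangular image as one transformer token and decodes the tokens back into an equirectangular depth map. All module names, layer sizes, and the simple linear decoder are assumptions made for illustration only, not the authors' published SliceFormer architecture.

```python
import torch
import torch.nn as nn

class SliceFormerSketch(nn.Module):
    """Illustrative sketch (not the paper's implementation): vertical-slice
    patch embedding for an equirectangular RGB image, a standard transformer
    encoder, and a per-slice linear depth decoder."""

    def __init__(self, in_ch=3, img_h=256, img_w=512, slice_w=4,
                 dim=512, depth=6, heads=8):
        super().__init__()
        assert img_w % slice_w == 0
        self.img_h, self.img_w, self.slice_w = img_h, img_w, slice_w
        self.n_slices = img_w // slice_w
        # Each vertical slice (full height, slice_w columns) becomes one token.
        self.embed = nn.Linear(in_ch * img_h * slice_w, dim)
        self.pos = nn.Parameter(torch.zeros(1, self.n_slices, dim))
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=depth)
        # Project each token back to its depth slice (img_h x slice_w values).
        self.head = nn.Linear(dim, img_h * slice_w)

    def forward(self, x):                                   # x: (B, 3, H, W)
        b = x.size(0)
        # Split the width axis into vertical slices and flatten each slice.
        slices = x.unfold(3, self.slice_w, self.slice_w)    # (B, C, H, n_slices, slice_w)
        slices = slices.permute(0, 3, 1, 2, 4).reshape(b, self.n_slices, -1)
        tokens = self.embed(slices) + self.pos               # (B, n_slices, dim)
        tokens = self.encoder(tokens)
        depth = self.head(tokens)                            # (B, n_slices, H*slice_w)
        depth = depth.view(b, self.n_slices, self.img_h, self.slice_w)
        depth = depth.permute(0, 2, 1, 3).reshape(b, 1, self.img_h, self.img_w)
        return depth

if __name__ == "__main__":
    rgb = torch.randn(1, 3, 256, 512)            # toy equirectangular input
    print(SliceFormerSketch()(rgb).shape)        # torch.Size([1, 1, 256, 512])
```

A forward pass on a 256x512 equirectangular image yields a single-channel depth map at the same resolution; the input resolution and slice width here are placeholders rather than the values used in the paper.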
Text: 131_24ICEIC-Yihong-Published (Restricted to Repository staff only)
More information
Published date: 19 March 2024
Additional Information:
Publisher Copyright: © 2024 IEEE.
Venue - Dates:
International Conference on Electronics, Information, and Communication, Taipei Marriott Hotel, Taipei, Taiwan, 2024-01-28 - 2024-01-31
Keywords:
depth estimation, single omnidirectional image, slice-based transformer
Identifiers
Local EPrints ID: 490528
URI: http://eprints.soton.ac.uk/id/eprint/490528
PURE UUID: 9ee5773c-5029-40c3-b912-33eceed62cb2
Catalogue record
Date deposited: 29 May 2024 16:50
Last modified: 01 Aug 2024 02:01
Contributors
Author: Yihong Wu
Author: Yuwen Heng
Author: Mahesan Niranjan
Author: Hansung Kim