Depth estimation for indoor single omnidirectional images
Omnidirectional cameras are becoming popular in various applications owing to their ability to capture the full surrounding scene in one frame. However, depth estimation for an omnidirectional scene is more difficult than for conventional images because of the camera's different system properties and distortions. Monocular depth estimation from a single view using deep learning can be a good solution, but it requires a large labelled depth dataset covering various scenes. Currently published omnidirectional depth datasets cover limited types of scene and are not suitable for depth estimation across varied real-world scenes. In addition, existing methods are essentially data-driven, and the deep-learning-based depth estimation process remains a black box. To overcome these problems, we first propose a depth estimation architecture for a single omnidirectional image using domain adaptation with only limited labelled real-world scenes. Given the difficulty of obtaining labelled real-world datasets and of stabilising performance, we then update the components of the architecture and propose a reverse-gradient warming-up threshold discriminator (RWTD) to estimate real-world depth maps from synthetic ground truth. It takes labelled synthetic scenes from a source domain and unlabelled real-world scenes from a target domain as inputs and predicts the corresponding depth maps. To open up the black-box depth estimation process, we analyse the role of gravity in depth estimation and propose a slicing method based on the gravity direction. Equally important is our examination of the contributions of different cues to indoor depth estimation. The results show that colour, saturation, local texture and shape contribute to different extents, and that among them the shape feature plays the dominant role in depth estimation performance.
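The abstract does not give implementation details of the RWTD; its "reverse-gradient" component is reminiscent of the gradient reversal layer widely used in domain-adversarial training, which passes features through unchanged in the forward pass but flips the sign of gradients in the backward pass, so the feature extractor learns domain-invariant features that confuse the discriminator. A minimal NumPy sketch of that generic mechanism (class and parameter names here are illustrative, not from the thesis):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the
    backward pass. A minimal autograd-free sketch of the generic
    gradient reversal idea, not the thesis's actual RWTD module."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off weight for the reversed gradient

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        # gradient flowing back to the feature extractor is flipped
        # in sign, pushing it to *maximise* the discriminator's loss
        return -self.lam * grad_output


grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
assert np.allclose(grl.forward(x), x)
g = np.array([0.1, 0.2, -0.3])
assert np.allclose(grl.backward(g), [-0.05, -0.1, 0.15])
```

In a full domain-adaptation pipeline such a layer would sit between the shared feature extractor and the domain discriminator, with labelled synthetic and unlabelled real-world images flowing through the same encoder.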
These works present solutions for depth estimation of omnidirectional images in real-world applications, delve into the critical role of gravity alignment, and explore how machines perceive depth, providing a foundation for subsequent research.
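The gravity-based slicing can be pictured on an equirectangular projection: once the image is gravity-aligned, each image row corresponds to a constant latitude, so partitioning along the image height slices the scene perpendicular to the gravity direction. A hypothetical NumPy sketch of such a partition (the thesis's actual slicing scheme may differ):

```python
import numpy as np

def gravity_slices(equi_img, n_slices=4):
    """Split a gravity-aligned equirectangular image into horizontal
    latitude bands. Illustrative only: in an upright equirectangular
    projection, rows share a latitude, so height-wise bands partition
    the scene relative to the gravity direction."""
    h = equi_img.shape[0]
    # evenly spaced row boundaries from the top (zenith) to the
    # bottom (nadir) of the panorama
    bounds = np.linspace(0, h, n_slices + 1, dtype=int)
    return [equi_img[a:b] for a, b in zip(bounds[:-1], bounds[1:])]


img = np.zeros((512, 1024, 3))   # toy 2:1 equirectangular frame
slices = gravity_slices(img, n_slices=4)
assert len(slices) == 4
assert all(s.shape == (128, 1024, 3) for s in slices)
```

Each band could then be analysed separately to probe how depth cues vary with elevation relative to gravity.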
University of Southampton
Wu, Yihong (2024) Depth estimation for indoor single omnidirectional images. University of Southampton, Doctoral Thesis, 116pp.
Record type: Thesis (Doctoral)
Text: PhD_Thesis_Final_PDFA - Version of Record
Text: Final-thesis-submission-Examination-Mr-Yihong-Wu (Restricted to Repository staff only)
More information
Published date: 2024
Identifiers
Local EPrints ID: 493911
URI: http://eprints.soton.ac.uk/id/eprint/493911
PURE UUID: 77d9e54f-3fea-4543-9ba9-3aa758fab2ce
Catalogue record
Date deposited: 17 Sep 2024 16:50
Last modified: 07 Nov 2024 02:59
Contributors
Author: Yihong Wu
Thesis advisor: Hansung Kim
Thesis advisor: Mahesan Niranjan