Human-centric scene understanding from single view 360 video
Fowler, S., Kim, H. and Hilton, A.
(2018)
Human-centric scene understanding from single view 360 video.
In 2018 International Conference on 3D Vision (3DV). IEEE. pp. 334-342.
(doi:10.1109/3DV.2018.00046).
Record type:
Conference or Workshop Item (Paper)
Abstract
In this paper, we propose an approach to indoor scene understanding from observation of people in single-view spherical video. As input, our approach takes a centrally located spherical video capture of an indoor scene, estimating the 3D localisation of human actions performed throughout the long-term capture. The central contribution of this work is a deep convolutional encoder-decoder network trained on a synthetic dataset to reconstruct regions of affordance from captured human activity. The predicted affordance segmentation is then applied to compose a reconstruction of the complete 3D scene, integrating the affordance segmentation into 3D space. The mapping learnt between human activity and affordance segmentation demonstrates that omnidirectional observation of human activity can be applied to scene understanding tasks such as 3D reconstruction. We show that our approach, using only observation of people, performs well against previous approaches, allowing reconstruction of occluded regions and labelling of scene affordances. © 2018 IEEE.
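The abstract describes a convolutional encoder-decoder that maps observed human activity to per-pixel affordance labels. The following is a minimal PyTorch sketch of that general architecture; the layer configuration, the input encoding (a single-channel equirectangular activity heatmap) and the affordance class set are illustrative assumptions, not the authors' published network.

    import torch
    import torch.nn as nn

    class AffordanceEncoderDecoder(nn.Module):
        """Toy encoder-decoder: activity heatmap in, affordance logits out."""

        def __init__(self, in_channels=1, num_classes=4):
            # in_channels=1: e.g. a heatmap of localised human actions.
            # num_classes: hypothetical affordance labels (e.g. walkable,
            # sittable, reachable, background) -- assumed, not from the paper.
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, 32, 3, stride=2, padding=1),  # H/2, W/2
                nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1),           # H/4, W/4
                nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 3, stride=2, padding=1),          # H/8, W/8
                nn.ReLU(inplace=True),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # x2
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # x2
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(32, num_classes, 4, stride=2, padding=1),  # x2
            )

        def forward(self, x):
            # x: (B, in_channels, H, W) equirectangular activity map.
            # Returns per-pixel affordance logits at the input resolution.
            return self.decoder(self.encoder(x))

    if __name__ == "__main__":
        net = AffordanceEncoderDecoder()
        activity = torch.rand(1, 1, 256, 512)  # toy 2:1 equirectangular input
        logits = net(activity)
        print(logits.shape)  # torch.Size([1, 4, 256, 512])

A segmentation network of this shape would typically be trained with a per-pixel cross-entropy loss against labelled affordance maps; the paper reports training on a synthetic dataset of human activity.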
This record has no associated files available for download.
More information
Published date: 2018
Additional Information:
Cited by: 0
Venue - Dates:
International Conference on 3D Vision, 2018-10-19
Keywords:
3D reconstruction, Affordances, Convolutional encoders, Human activities, Human-centric, Observation of human activities, Scene interactions, Scene understanding, Image reconstruction
Identifiers
Local EPrints ID: 440627
URI: http://eprints.soton.ac.uk/id/eprint/440627
PURE UUID: d376db84-1c4a-4358-87f0-935dd5893c54
Catalogue record
Date deposited: 12 May 2020 16:46
Last modified: 17 Mar 2024 04:01
Contributors
Author: S. Fowler
Author: H. Kim
Author: A. Hilton