University of Southampton Institutional Repository

Spatio-temporal manifold learning for human motions via long-horizon modeling

Wang, He, Ho, Edmond SL, Shum, Hubert PH and Zhu, Zhanxing (2021) Spatio-temporal manifold learning for human motions via long-horizon modeling. IEEE Transactions on Visualization and Computer Graphics, 27 (1), 216-227. (doi:10.1109/TVCG.2019.2936810).

Record type: Article

Abstract

Data-driven modeling of human motions is ubiquitous in computer graphics and computer vision applications, such as synthesizing realistic motions or recognizing actions. Recent research has shown that such problems can be approached by learning a natural motion manifold using deep learning on a large amount of data, to address the shortcomings of traditional data-driven approaches. However, previous deep learning methods can be sub-optimal for two reasons. First, the skeletal information has not been fully utilized for feature extraction. Unlike images, skeletal motions lack a well-defined notion of spatial proximity, which makes it difficult to apply deep networks directly for feature extraction. Second, motion is time-series data with strong multi-modal temporal correlations between frames. On the one hand, a frame could be followed by several candidate frames leading to different motions; on the other hand, long-range dependencies exist, where frames at the beginning of a sequence are correlated with frames much later. Ineffective temporal modeling would either under-estimate the multi-modality and variance, resulting in featureless mean motions, or over-estimate them, resulting in jittery motions, a major source of visual artifacts. In this paper, we propose a new deep network to tackle these challenges by creating a natural motion manifold that is versatile for many applications. The network has a new spatial component for feature extraction. It is also equipped with a new batch prediction model that predicts a large number of frames at once, such that long-term temporally based objective functions can be employed to correctly learn the motion multi-modality and variances. With our system, long-duration motions can be predicted/synthesized in an open-loop setup in which the motion retains its dynamics accurately. It can also be used for denoising corrupted motions and for synthesizing new motions from given control signals. We demonstrate that our system creates superior results compared to existing work in multiple applications.
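To make the batch-prediction idea from the abstract concrete, the following is a minimal, illustrative PyTorch sketch. It is not the paper's implementation: the module sizes, the per-frame MLP standing in for the paper's spatial component, the GRU encoder, and the names BatchMotionPredictor and long_horizon_loss are all assumptions made for illustration. What it demonstrates is the abstract's core point: decoding an entire horizon of future frames at once, so that a long-horizon objective (here a position term plus a frame-to-frame velocity term, which penalises both over-smoothed mean motion and jitter) can be applied across the whole prediction, instead of feeding single-frame predictions back autoregressively.

import torch
import torch.nn as nn

class BatchMotionPredictor(nn.Module):
    """Hypothetical sketch: encode a window of poses, decode a whole horizon."""
    def __init__(self, n_joints=31, feat_dim=3, latent_dim=128, horizon=50):
        super().__init__()
        self.horizon = horizon
        pose_dim = n_joints * feat_dim
        # Spatial feature extraction: a per-frame MLP stands in for the paper's
        # skeletal component, since joints lack an image-like spatial grid.
        self.spatial = nn.Sequential(
            nn.Linear(pose_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Temporal encoder: summarise the observed window into one latent code.
        self.encoder = nn.GRU(latent_dim, latent_dim, batch_first=True)
        # Batch decoder: map the latent code to all `horizon` frames in one shot.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, horizon * pose_dim),
        )

    def forward(self, past):                 # past: (batch, T, pose_dim)
        h = self.spatial(past)               # (batch, T, latent_dim)
        _, z = self.encoder(h)               # z: (1, batch, latent_dim)
        out = self.decoder(z.squeeze(0))     # (batch, horizon * pose_dim)
        return out.view(-1, self.horizon, past.shape[-1])

def long_horizon_loss(pred, target):
    # Position term over the whole predicted horizon ...
    pos = (pred - target).pow(2).mean()
    # ... plus a velocity term: matching frame-to-frame differences discourages
    # both collapsing to a featureless mean motion and jittery over-variance.
    vel = ((pred[:, 1:] - pred[:, :-1]) -
           (target[:, 1:] - target[:, :-1])).pow(2).mean()
    return pos + vel

# Usage on random stand-in data:
model = BatchMotionPredictor()
past = torch.randn(8, 30, 31 * 3)      # 8 clips, 30 observed frames
future = torch.randn(8, 50, 31 * 3)    # 50 ground-truth future frames
loss = long_horizon_loss(model(past), future)
loss.backward()

For open-loop synthesis as described in the abstract, the predicted horizon would be fed back as the next observed window; the paper's actual spatial and temporal components differ from the stand-ins used here.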

This record has no associated files available for download.

More information

Published date: January 2021

Identifiers

Local EPrints ID: 486255
URI: http://eprints.soton.ac.uk/id/eprint/486255
PURE UUID: 96e8d825-8808-48ec-a877-2cb5cebc4c86

Catalogue record

Date deposited: 16 Jan 2024 17:33
Last modified: 17 Mar 2024 06:51

Contributors

Author: He Wang
Author: Edmond SL Ho
Author: Hubert PH Shum
Author: Zhanxing Zhu
