University of Southampton Institutional Repository

Efficient video recognition with convolutional neural networks by exploiting temporal correlation in video data

University of Southampton
Sabetsarvestani, Mohammadamin
Merrett, Geoffrey

Sabetsarvestani, Mohammadamin (2022) Efficient video recognition with convolutional neural networks by exploiting temporal correlation in video data. University of Southampton, Doctoral Thesis, 164pp.

Record type: Thesis (Doctoral)

Abstract

Object detection in images is one of the most successful applications of convolutional neural networks (CNNs). However, applying deep CNNs to large numbers of video frames has recently emerged as a challenge beyond image data because of the high computational requirements. Because consecutive video frames often look alike, CNNs extract largely similar features from them. Conventional video object detection pipelines nevertheless process each frame with a fixed computational effort, resulting in numerous redundant computations and an inefficient use of energy, particularly in edge computing. By exploiting frame-to-frame similarity, this thesis shows that the computational complexity of video object detection pipelines can be reduced. First, similarity-aware CNNs are proposed to identify and skip computations on feature pixels that are unchanged across frames. The proposed similarity-aware quantization scheme (SQS) increases the average number of unchanged feature pixels across frame pairs by up to 85%, with a loss of less than 1% in detection accuracy. Second, the proposed similarity-aware row stationary (SRS) dataflow reduces energy consumption by minimising redundant computations and memory accesses across frame pairs; in simulation experiments, it reduces video frame processing energy by up to 30%. Third, to further improve the efficiency of video object detection, a new temporal early exit module (TEEM) is proposed. A TEEM detects semantic differences between consecutive frames with low computational overhead, avoiding redundant video frame feature extraction. Multiple TEEMs are inserted at various early layers of the pipeline's feature network. TEEM-enabled pipelines expend full computational effort only when a frame is determined to be semantically distinct from its predecessors; otherwise, the previous frame's detection results are reused.
Experiments on ImageNet VID and TVnet demonstrate that TEEMs accelerate state-of-the-art video object detection pipelines by 1.7× with less than a 1% reduction in mean average precision.
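The SQS contribution rests on a simple observation: coarser quantization makes more feature pixels bit-identical across consecutive frames, so their downstream computations can be skipped. The NumPy sketch below is a minimal illustration of that idea only, not the thesis's actual scheme; the uniform quantization step, feature-map shapes, and equality test are all assumptions made for the example.

```python
import numpy as np

def quantize(x, step=0.25):
    # Uniform quantization: coarser steps make more pixels "unchanged" across frames
    return np.round(x / step) * step

def changed_mask(prev_feat, curr_feat, step=0.25):
    # Pixels whose quantized values match the previous frame could reuse cached results
    return quantize(prev_feat, step) != quantize(curr_feat, step)

# Two synthetic "feature maps" from consecutive frames: mostly identical,
# with a small region of change simulating motion.
rng = np.random.default_rng(0)
prev = rng.standard_normal((8, 8))
curr = prev.copy()
curr[2:4, 2:4] += 1.0          # 4 of 64 pixels change

mask = changed_mask(prev, curr)
unchanged_fraction = 1.0 - mask.mean()
print(f"unchanged feature pixels: {unchanged_fraction:.0%}")
```

For mostly static frames the unchanged fraction is high (here 60 of 64 pixels), which is the quantity SQS is designed to maximise while bounding the accuracy loss.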
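The TEEM idea can likewise be sketched as an early-exit gate: score the change between consecutive frames cheaply at an early layer, and run the full pipeline only when the score exceeds a threshold. The mean-absolute-difference score, the threshold value, and the function names below are illustrative assumptions, not the thesis's exact criterion.

```python
import numpy as np

def teem_gate(prev_early, curr_early, threshold=0.05):
    # Cheap change score on early-layer features; True means the frame looks
    # different enough to justify running the rest of the feature network.
    return float(np.abs(curr_early - prev_early).mean()) > threshold

def detect(frame_feats, run_full_pipeline):
    # Process a sequence of early-layer features with the early-exit gate;
    # semantically similar frames reuse the previous frame's detections.
    results, cached, prev = [], None, None
    for feat in frame_feats:
        if prev is None or teem_gate(prev, feat):
            cached = run_full_pipeline(feat)   # full computation effort
        results.append(cached)
        prev = feat
    return results

# Toy run: three near-identical frames followed by a clearly different one.
base = np.zeros((4, 4))
frames = [base, base + 0.001, base + 0.002, base + 1.0]
calls = []
out = detect(frames, lambda f: calls.append(1) or f"det@{len(calls)}")
print(out, "full runs:", len(calls))
```

Only two of the four frames trigger a full run here; the speedup reported in the thesis comes from exactly this kind of skipped computation, with multiple gates placed at different early layers.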

Text
Final Thesis for award - Version of Record
Available under License University of Southampton Thesis Licence.
Download (102MB)
Text
PTD SIGNED
Restricted to Repository staff only

More information

Published date: December 2022

Identifiers

Local EPrints ID: 473997
URI: http://eprints.soton.ac.uk/id/eprint/473997
PURE UUID: 402862d1-dee5-4971-a974-11a086f494e3
ORCID for Geoffrey Merrett: orcid.org/0000-0003-4980-3894

Catalogue record

Date deposited: 08 Feb 2023 17:43
Last modified: 27 Oct 2023 01:46


Contributors

Author: Mohammadamin Sabetsarvestani
Thesis advisor: Geoffrey Merrett

