TransNet: a transfer learning-based network for human action recognition
Alomar, Khaled and Cai, Xiaohao (2023) TransNet: a transfer learning-based network for human action recognition. In: Wani, M. Arif, Boicu, Mihai, Sayed-Mouchaweh, Moamar, Abreu, Pedro Henriques and Gama, João (eds.) 2023 International Conference on Machine Learning and Applications (ICMLA). IEEE, pp. 1825-1832. (doi:10.1109/ICMLA58977.2023.00277).
Record type:
Conference or Workshop Item
(Paper)
Abstract
Human action recognition (HAR) is a significant, high-level research area in computer vision owing to its ubiquitous applications. The main limitations of current HAR models are their complex structures and lengthy training times. In this paper, we propose a simple yet versatile and effective end-to-end deep learning architecture, coined TransNet, for HAR. TransNet decomposes complex 3D-CNNs into 2D- and 1D-CNNs, where the 2D- and 1D-CNN components extract spatial features and temporal patterns in videos, respectively. Benefiting from its concise architecture, TransNet is compatible with any pretrained state-of-the-art 2D-CNN model from other fields, which can be transferred to serve the HAR task. In other words, it naturally leverages the power and success of transfer learning for HAR, bringing substantial advantages in efficiency and effectiveness. Extensive experimental results and comparisons with state-of-the-art models demonstrate the superior performance of the proposed TransNet in terms of flexibility, model complexity, training speed and classification accuracy.
This record has no associated files available for download.
More information
Published date: 19 March 2023
Venue - Dates:
22nd IEEE International Conference on Machine Learning and Applications (ICMLA 2023), Jacksonville, United States, 15-17 December 2023
Identifiers
Local EPrints ID: 491926
URI: http://eprints.soton.ac.uk/id/eprint/491926
PURE UUID: 8382ba62-77d3-40a6-b922-1d3366abb39c
Catalogue record
Date deposited: 08 Jul 2024 16:59
Last modified: 11 Jul 2024 02:07
Contributors
Authors: Khaled Alomar, Xiaohao Cai
Editors: M. Arif Wani, Mihai Boicu, Moamar Sayed-Mouchaweh, Pedro Henriques Abreu, João Gama