The University of Southampton
University of Southampton Institutional Repository

Redundancy-reduced MobileNet acceleration on reconfigurable logic for ImageNet classification



Su, Jiang, Faraone, Julian, Liu, Junyi, Zhao, Yiren, Thomas, David B., Leong, Philip H.W. and Cheung, Peter Y.K. (2018) Redundancy-reduced MobileNet acceleration on reconfigurable logic for ImageNet classification. Voros, Nikolaos, Keramidas, Georgios, Antonopoulos, Christos, Huebner, Michael, Diniz, Pedro C. and Goehringer, Diana (eds.) In Applied Reconfigurable Computing: Architectures, Tools, and Applications - 14th International Symposium, ARC 2018, Proceedings. vol. 10824 LNCS, Springer. pp. 16-28. (doi:10.1007/978-3-319-78890-6_2).

Record type: Conference or Workshop Item (Paper)

Abstract

Modern Convolutional Neural Networks (CNNs) excel in image classification and recognition applications on large-scale datasets such as ImageNet, compared to many conventional feature-based computer vision algorithms. However, the high computational complexity of CNN models can lead to low system performance in power-efficient applications. In this work, we first highlight two levels of model redundancy that widely exist in modern CNNs. We then use MobileNet as a design example and propose an efficient system design for a Redundancy-Reduced MobileNet (RR-MobileNet) in which off-chip memory traffic is used only for input/output transfer, while parameters and intermediate values are stored in on-chip BRAM blocks. Compared to AlexNet, our RR-MobileNet has 25× fewer parameters and 3.2× fewer operations per image inference, yet achieves 9%/5.2% higher Top-1/Top-5 classification accuracy on the ImageNet classification task. The latency of a single image inference is only 7.85 ms.
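
As a rough aid to reading the figures quoted in the abstract, the short Python sketch below converts the reported reduction factors into the fraction of the AlexNet baseline that is removed, and the reported latency into an implied throughput. The helper names `fraction_removed` and `images_per_second` are hypothetical and introduced only for illustration, and the throughput figure assumes images are processed strictly one at a time with no batching or overlap, which the abstract does not state.

```python
# Figures quoted in the abstract (RR-MobileNet relative to AlexNet).
PARAM_REDUCTION_FACTOR = 25.0   # "25x" reduction in parameters
OP_REDUCTION_FACTOR = 3.2       # "3.2x" reduction in operations per inference
LATENCY_MS = 7.85               # latency of a single image inference


def fraction_removed(reduction_factor: float) -> float:
    """Fraction of the baseline eliminated by an N-fold reduction."""
    return 1.0 - 1.0 / reduction_factor


def images_per_second(latency_ms: float) -> float:
    """Implied throughput for strictly sequential inference.

    Assumption (not stated in the abstract): no batching and no
    overlap between consecutive inferences.
    """
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print(f"parameters removed: {fraction_removed(PARAM_REDUCTION_FACTOR):.1%}")
    print(f"operations removed: {fraction_removed(OP_REDUCTION_FACTOR):.2%}")
    print(f"implied throughput: {images_per_second(LATENCY_MS):.1f} images/s")
```

Under these assumptions, a 25-fold reduction corresponds to removing 96% of the baseline parameters, and a 7.85 ms latency corresponds to a little over 127 images per second.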

This record has no associated files available for download.

More information

Published date: 8 April 2018
Additional Information: Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.
Venue - Dates: 14th International Symposium on Applied Reconfigurable Computing, ARC 2018, Santorini, Greece, 2018-05-02 - 2018-05-04
Keywords: Algorithm acceleration, CNN, FPGA, Pruning, Quantization

Identifiers

Local EPrints ID: 453689
URI: http://eprints.soton.ac.uk/id/eprint/453689
ISSN: 0302-9743
PURE UUID: 12bbfb6e-fe4c-4a01-969b-b747d611a544
ORCID for David B. Thomas: orcid.org/0000-0002-9671-0917

Catalogue record

Date deposited: 20 Jan 2022 17:46
Last modified: 06 Jun 2024 02:12

Contributors

Author: Jiang Su
Author: Julian Faraone
Author: Junyi Liu
Author: Yiren Zhao
Author: David B. Thomas
Author: Philip H.W. Leong
Author: Peter Y.K. Cheung
Editor: Nikolaos Voros
Editor: Georgios Keramidas
Editor: Christos Antonopoulos
Editor: Michael Huebner
Editor: Pedro C. Diniz
Editor: Diana Goehringer
