Accuracy to throughput trade-offs for reduced precision neural networks on reconfigurable logic
Su, Jiang, Fraser, Nicholas J., Gambardella, Giulio, Blott, Michaela, Durelli, Gianluca, Thomas, David B., Leong, Philip H.W. and Cheung, Peter Y.K.
(2018)
Accuracy to throughput trade-offs for reduced precision neural networks on reconfigurable logic.
Voros, Nikolaos, Keramidas, Georgios, Antonopoulos, Christos, Huebner, Michael, Diniz, Pedro C. and Goehringer, Diana
(eds.)
In Applied Reconfigurable Computing: Architectures, Tools, and Applications - 14th International Symposium, ARC 2018, Proceedings.
vol. 10824 LNCS, Springer, pp. 29-42.
(doi:10.1007/978-3-319-78890-6_3).
Record type: Conference or Workshop Item (Paper)
Abstract
Modern Convolutional Neural Networks (CNNs) are typically implemented using floating point linear algebra. Recently, reduced precision Neural Networks (NNs) have been gaining popularity as they require significantly less memory and computational resources than floating point implementations. This is particularly important in power-constrained compute environments. However, in many cases a reduction in precision comes at a small cost to the accuracy of the resultant network. In this work, we investigate the accuracy-throughput trade-off for various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that allows reduced precision NN inference with a lower memory footprint and competitive model accuracy. Then, we quantitatively formulate the relationship between data representation and hardware efficiency. Finally, our experiments provide insightful observations. For example, one of our tests shows that 32-bit floating point is more hardware efficient than 1-bit parameters when targeting 99% MNIST accuracy. In general, 2-bit and 4-bit fixed point parameters show better hardware trade-offs on small-scale datasets like MNIST and CIFAR-10, while 4-bit provides the best trade-off in large-scale tasks like AlexNet on the ImageNet dataset, within our tested problem domain.
This record has no associated files available for download.
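Since no implementation files are attached to this record, the low-bit fixed point representation discussed in the abstract can only be illustrated here with a generic sketch. The function below shows a hypothetical uniform, symmetric fixed-point quantization of a weight tensor; the per-tensor scaling and rounding choices are assumptions for illustration and this is not the quantization training strategy proposed in the paper.

# Illustrative only: generic symmetric fixed-point quantization of a weight
# tensor (bit widths of 2 or more). This is NOT the quantization training
# strategy from the paper; the scale and rounding choices are assumptions.
import numpy as np

def quantize_fixed_point(weights, num_bits):
    """Map float weights onto a symmetric fixed-point grid with num_bits bits."""
    levels = 2 ** (num_bits - 1) - 1          # e.g. 1 for 2-bit, 7 for 4-bit
    scale = np.max(np.abs(weights)) / levels  # per-tensor scale (an assumption)
    if scale == 0:
        return weights                        # all-zero tensor: nothing to quantize
    q = np.clip(np.round(weights / scale), -levels, levels)
    return q * scale                          # dequantized values used for inference

# Example: 4-bit roughly preserves the weight distribution, while 2-bit
# collapses it onto three values, trading accuracy for memory and throughput.
w = np.random.randn(4, 4).astype(np.float32)
print(quantize_fixed_point(w, 4))
print(quantize_fixed_point(w, 2))

Sweeping num_bits over, say, 2, 4 and 8 for a trained model's weights gives a rough feel for the accuracy-versus-memory trade-off that the paper quantifies in terms of FPGA throughput.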
More information
Published date: 8 April 2018
Additional Information:
Funding Information:
The authors from Imperial College London would like to acknowledge the support of UK’s research council (RCUK) with the following grants: EP/K034448, P010040 and N031768. The authors from The University of Sydney acknowledge support from the Australian Research Council Linkage Project LP130101034.
Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.
Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.
Venue - Dates:
14th International Symposium on Applied Reconfigurable Computing, ARC 2018, Santorini, Greece, 2018-05-02 - 2018-05-04
Keywords:
Algorithm acceleration, FPGA, Neural networks, Reduced precision
Identifiers
Local EPrints ID: 453676
URI: http://eprints.soton.ac.uk/id/eprint/453676
ISSN: 0302-9743
PURE UUID: a4918b95-606f-4fdd-af1c-835ad2f802b4
Catalogue record
Date deposited: 20 Jan 2022 17:45
Last modified: 06 Jun 2024 02:12
Contributors
Authors: Jiang Su, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Gianluca Durelli, David B. Thomas, Philip H.W. Leong, Peter Y.K. Cheung
Editors: Nikolaos Voros, Georgios Keramidas, Christos Antonopoulos, Michael Huebner, Pedro C. Diniz, Diana Goehringer