Accuracy to throughput trade-offs for reduced precision neural networks on reconfigurable logic
Su, Jiang, Fraser, Nicholas J., Gambardella, Giulio, Blott, Michaela, Durelli, Gianluca, Thomas, David B., Leong, Philip H.W. and Cheung, Peter Y.K.
(2018)
Accuracy to throughput trade-offs for reduced precision neural networks on reconfigurable logic.
Voros, Nikolaos, Keramidas, Georgios, Antonopoulos, Christos, Huebner, Michael, Diniz, Pedro C. and Goehringer, Diana
(eds.)
In Applied Reconfigurable Computing: Architectures, Tools, and Applications - 14th International Symposium, ARC 2018, Proceedings.
vol. 10824 LNCS, Springer, pp. 29-42.
(doi:10.1007/978-3-319-78890-6_3).
Record type: Conference or Workshop Item (Paper)
Abstract
Modern Convolutional Neural Networks (CNNs) are typically implemented using floating point linear algebra. Recently, reduced precision Neural Networks (NNs) have been gaining popularity as they require significantly less memory and computational resources than floating point implementations. This is particularly important in power-constrained compute environments. However, in many cases a reduction in precision comes at a small cost to the accuracy of the resultant network. In this work, we investigate the accuracy-throughput trade-off for various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that allows reduced precision NN inference with a lower memory footprint and competitive model accuracy. Then, we quantitatively formulate the relationship between data representation and hardware efficiency. Finally, our experiments provide insightful observations. For example, one of our tests shows that 32-bit floating point is more hardware efficient than 1-bit parameters when targeting 99% MNIST accuracy. In general, 2-bit and 4-bit fixed point parameters show better hardware trade-offs on small-scale datasets like MNIST and CIFAR-10, while 4-bit provides the best trade-off in large-scale tasks like AlexNet on the ImageNet dataset, within our tested problem domain.
This record has no associated files available for download.
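Since no implementation files are attached to this record, the low-bit fixed point representation discussed in the abstract can only be illustrated here with a generic sketch. The function below shows a hypothetical uniform, symmetric fixed-point quantization of a weight tensor; the per-tensor scaling and rounding choices are assumptions for illustration and this is not the quantization training strategy proposed in the paper.

# Illustrative only: generic symmetric fixed-point quantization of a weight
# tensor (bit widths of 2 or more). This is NOT the quantization training
# strategy from the paper; the scale and rounding choices are assumptions.
import numpy as np

def quantize_fixed_point(weights, num_bits):
    """Map float weights onto a symmetric fixed-point grid with num_bits bits."""
    levels = 2 ** (num_bits - 1) - 1          # e.g. 1 for 2-bit, 7 for 4-bit
    scale = np.max(np.abs(weights)) / levels  # per-tensor scale (an assumption)
    if scale == 0:
        return weights                        # all-zero tensor: nothing to quantize
    q = np.clip(np.round(weights / scale), -levels, levels)
    return q * scale                          # dequantized values used for inference

# Example: 4-bit roughly preserves the weight distribution, while 2-bit
# collapses it onto three values, trading accuracy for memory and throughput.
w = np.random.randn(4, 4).astype(np.float32)
print(quantize_fixed_point(w, 4))
print(quantize_fixed_point(w, 2))

Sweeping num_bits over, say, 2, 4 and 8 for a trained model's weights gives a rough feel for the accuracy-versus-memory trade-off that the paper quantifies in terms of FPGA throughput.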
More information
Published date: 8 April 2018
Additional Information:
Funding Information:
The authors from Imperial College London would like to acknowledge the support of UK’s research council (RCUK) with the following grants: EP/K034448, P010040 and N031768. The authors from The University of Sydney acknowledge support from the Australian Research Council Linkage Project LP130101034.
Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.
Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.
Venue - Dates:
14th International Symposium on Applied Reconfigurable Computing, ARC 2018, Santorini, Greece, 2018-05-02 - 2018-05-04
Keywords:
Algorithm acceleration, FPGA, Neural networks, Reduced precision
Identifiers
Local EPrints ID: 453676
URI: http://eprints.soton.ac.uk/id/eprint/453676
ISSN: 0302-9743
PURE UUID: a4918b95-606f-4fdd-af1c-835ad2f802b4
Catalogue record
Date deposited: 20 Jan 2022 17:45
Last modified: 06 Jun 2024 02:12
Contributors
Authors: Jiang Su, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Gianluca Durelli, David B. Thomas, Philip H.W. Leong, Peter Y.K. Cheung
Editors: Nikolaos Voros, Georgios Keramidas, Christos Antonopoulos, Michael Huebner, Pedro C. Diniz, Diana Goehringer