Increasing network size and training throughput of FPGA restricted Boltzmann machines using dropout
Su, Jiang, Thomas, David B. and Cheung, Peter Y.K. (2016) Increasing network size and training throughput of FPGA restricted Boltzmann machines using dropout. In Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016. IEEE, pp. 48-51. (doi:10.1109/FCCM.2016.23).
Record type:
Conference or Workshop Item (Paper)
Abstract
Restricted Boltzmann Machines (RBMs) are widely used in modern machine learning tasks. Existing FPGA implementations are limited in network size and training throughput by the available DSP resources. In this work we propose a new algorithm and architecture for FPGAs called the dropout-RBM (dRBM) system. Compared to state-of-the-art designs on the same FPGA, dRBM with a dropout rate of 0.5 doubles the maximum affordable network size while using only half of the DSP and BRAM resources. This is achieved by applying dropout, a relatively recent technique for avoiding overfitting; here we instead use dropout to reduce the required DSP and BRAM resources, with the side-effect of making training more robust. To further improve processing throughput, we propose a multi-mode matrix multiplication module that maximizes DSP efficiency. For the MNIST classification benchmark, a Stratix IV EP4SGX530 FPGA running dRBM is 34x faster than a single-precision Matlab implementation running on an Intel i7 2.9 GHz CPU.
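The resource-saving idea in the abstract can be illustrated with a minimal sketch: one contrastive-divergence (CD-1) update for a binary RBM with dropout applied to the hidden layer. This is not the paper's FPGA implementation; all names (cd1_dropout_step and so on) are illustrative assumptions. The point it shows is that with a dropout rate of 0.5, only about half of the weight matrix's columns are read, multiplied, and updated in any given step, which is the source of the DSP and BRAM savings.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_dropout_step(W, b_v, b_h, v0, rng, dropout_rate=0.5, lr=0.1):
    """One CD-1 update on batch v0, dropping hidden units at dropout_rate.

    W: (n_vis, n_hid) weights; b_v, b_h: visible/hidden biases.
    """
    n_hid = W.shape[1]
    # Sample the hidden units kept for this step; dropped units (and the
    # corresponding columns of W) are never touched in this update.
    keep = rng.random(n_hid) >= dropout_rate
    W_k, b_k = W[:, keep], b_h[keep]

    # Positive phase: hidden probabilities and samples given the data.
    h0_prob = sigmoid(v0 @ W_k + b_k)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)

    # Negative phase: one Gibbs step down to visible and back up.
    v1_prob = sigmoid(h0 @ W_k.T + b_v)
    v1 = (rng.random(v1_prob.shape) < v1_prob).astype(float)
    h1_prob = sigmoid(v1 @ W_k + b_k)

    # Gradient update touches only the kept columns of W.
    batch = v0.shape[0]
    W[:, keep] += lr * (v0.T @ h0_prob - v1.T @ h1_prob) / batch
    b_v += lr * (v0 - v1_prob).mean(axis=0)
    b_h[keep] += lr * (h0_prob - h1_prob).mean(axis=0)
    return v1_prob

For example, for an MNIST-sized network one might call cd1_dropout_step(W, b_v, b_h, batch, np.random.default_rng(0)) with W of shape (784, n_hid); at a dropout rate of 0.5, each step performs roughly half the multiply-accumulates of a dense CD-1 step.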
More information
Published date: 18 August 2016
Venue - Dates:
24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016, Washington, United States, 2016-05-01 - 2016-05-03
Keywords:
Algorithm Acceleration, Dropout, Restricted Boltzmann Machine
Identifiers
Local EPrints ID: 453679
URI: http://eprints.soton.ac.uk/id/eprint/453679
PURE UUID: b818abc4-af2a-4e56-8bde-58caaccfa422
Catalogue record
Date deposited: 20 Jan 2022 17:45
Last modified: 17 Mar 2024 04:10
Contributors
Author: Jiang Su
Author: David B. Thomas
Author: Peter Y.K. Cheung