Weight Fixing Networks
Subia-Waud, Christopher and Dasmahapatra, Srinandan (2022) Weight fixing networks. In: Avidan, Shai, Brostow, Gabriel, Cissé, Moustapha, Farinella, Giovanni Maria and Hassner, Tal (eds.) Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI. Lecture Notes in Computer Science, vol. 13671, pp. 415-431. (doi:10.1007/978-3-031-20083-0_25)
Record type: Conference or Workshop Item (Paper)
Abstract
Modern iterations of deep learning models contain millions (or billions) of unique parameters, each represented by a b-bit number. Popular attempts at compressing neural networks (such as pruning and quantisation) have shown that many of the parameters are superfluous, which we can remove (pruning) or express with b′ < b bits (quantisation) without hindering performance. Here we look to go much further in minimising the information content of networks. Rather than a channel or layer-wise encoding, we look to lossless whole-network quantisation to minimise the entropy and number of unique parameters in a network. We propose a new method, which we call Weight Fixing Networks (WFN), designed to realise four model outcome objectives: i) very few unique weights, ii) low-entropy weight encodings, iii) unique weight values which are amenable to energy-saving versions of hardware multiplication, and iv) lossless task-performance. Some of these goals are conflicting. To best balance these conflicts, we combine a few novel (and some well-trodden) tricks: a novel regularisation term (i, ii), a view of clustering cost as relative distance change (i, ii, iv), and a focus on whole-network re-use of weights (i, iii). Our ImageNet experiments demonstrate lossless compression using 56x fewer unique weights and a 1.9x lower weight-space entropy than SOTA quantisation approaches. Code and model saves can be found at github.com/subiawaud/Weight_Fix_Networks.
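The two headline quantities in the abstract, the number of unique weights and the entropy of the weight encoding, can be measured for any trained network, and the abstract's "clustering cost as relative distance change" suggests snapping a weight to a shared value only when the relative change it causes is small. The sketch below is a minimal NumPy illustration of those ideas, not the authors' implementation: the tolerance `delta`, the example cluster centres, and the single-pass snapping rule are all hypothetical simplifications of the iterative method described in the paper.

```python
import numpy as np

def weight_entropy(weights: np.ndarray) -> float:
    """Shannon entropy (in bits) of the empirical distribution over unique weight values."""
    _, counts = np.unique(weights, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def unique_weight_count(weights: np.ndarray) -> int:
    """Number of distinct parameter values shared across the whole network."""
    return int(np.unique(weights).size)

def snap_to_clusters(weights: np.ndarray, centres: np.ndarray, delta: float) -> np.ndarray:
    """Map each weight to its nearest shared cluster centre, but only when the
    relative distance change |w - c| / |w| stays below `delta`; weights that would
    move too much (relative to their magnitude) are left untouched."""
    nearest = centres[np.abs(weights[:, None] - centres[None, :]).argmin(axis=1)]
    rel_change = np.abs(weights - nearest) / np.maximum(np.abs(weights), 1e-12)
    return np.where(rel_change <= delta, nearest, weights)

# Toy example: a flattened "network" of weights and a small hypothetical codebook.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=10_000)
centres = np.array([-0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.2])

print("before:", unique_weight_count(w), "unique values,",
      f"{weight_entropy(w):.2f} bits")
w_fixed = snap_to_clusters(w, centres, delta=0.5)
print("after: ", unique_weight_count(w_fixed), "unique values,",
      f"{weight_entropy(w_fixed):.2f} bits")
```

In this toy run the unique-value count collapses from thousands of distinct floats towards the handful of shared centres, and the entropy drops accordingly, which is the kind of reduction (fewer unique weights, lower weight-space entropy) that the paper reports at network scale.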
Text: Weight Fixing Networks - Accepted Manuscript
More information
e-pub ahead of print date: 3 November 2022
Published date: 2022
Additional Information:
Funding Information: This work was supported by the UK Research and Innovation Centre for Doctoral Training in Machine Intelligence for Nano-electronic Devices and Systems [EP/S024298/1]. Thank you to Sulaiman Sadiq for insightful discussions.
Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Venue - Dates: European Conference on Computer Vision, Tel Aviv, Israel, 2022-10-23 - 2022-10-27
Keywords: Compression, Deep learning accelerators, Minimal description length, Quantization
Identifiers
Local EPrints ID: 472693
URI: http://eprints.soton.ac.uk/id/eprint/472693
ISSN: 0302-9743
PURE UUID: 1fd9195d-37b9-4347-b47e-adc35431b3eb
Catalogue record
Date deposited: 14 Dec 2022 17:48
Last modified: 05 Jun 2024 18:48
Contributors
Author: Christopher Subia-Waud
Author: Srinandan Dasmahapatra
Editor: Shai Avidan
Editor: Gabriel Brostow
Editor: Moustapha Cissé
Editor: Giovanni Maria Farinella
Editor: Tal Hassner