Probabilistic weight fixing: large-scale training of neural network weight uncertainties for quantisation
Subia-Waud, Christopher and Dasmahapatra, Srinandan (2023) Probabilistic weight fixing: large-scale training of neural network weight uncertainties for quantisation. In Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M. and Levine, S. (eds.) NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems. Curran Associates, Inc., pp. 59410-59424. (doi:10.5555/3666122.3668718).
Record type: Conference or Workshop Item (Paper)
Abstract
Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods often treat weights solely according to their value, neglecting the unique role of weight position. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster center, and to what degree, based on their individual position-specific learned uncertainty distributions. We introduce a new initialization setting and a regularization term, enabling the training of BNNs on complex dataset-model combinations. By exploiting the flexibility of weight values drawn from probability distributions, we enhance noise resilience and compressibility. Our iterative clustering procedure demonstrates superior compressibility and higher accuracy than state-of-the-art methods on both ResNet models and more complex transformer-based architectures. In particular, our method outperforms the state-of-the-art quantization method in top-1 accuracy by 1.6% on ImageNet using DeiT-Tiny, with its 5 million+ weights represented by only 296 unique values.
Code available at https://github.com/subiawaud/PWFN.
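To make the abstract's central idea concrete, the following is a minimal NumPy sketch of uncertainty-guided weight clustering: a weight may only be moved ("fixed") to a shared cluster value if the move is small relative to that weight's own learned standard deviation, so position-specific uncertainty decides which weights can be quantized. This is an illustrative sketch under assumed names (snap_weights_to_clusters and max_sigmas are hypothetical), not the authors' implementation, which is available in the linked repository.

import numpy as np

def snap_weights_to_clusters(mu, sigma, centers, max_sigmas=1.0):
    """Assign each weight to its nearest cluster center, but only if the
    move is small relative to that weight's learned uncertainty.

    mu         : (N,) posterior means of the weights
    sigma      : (N,) posterior standard deviations (position-specific uncertainty)
    centers    : (K,) candidate shared values
    max_sigmas : how many standard deviations a weight is allowed to move

    Returns the (possibly partially) quantized weights and a mask of
    which weights were fixed to a shared value.
    """
    # Distance from every weight to every center, measured in units of
    # each weight's own uncertainty: |mu_i - c_k| / sigma_i.
    dist = np.abs(mu[:, None] - centers[None, :]) / sigma[:, None]
    nearest = dist.argmin(axis=1)
    movable = dist[np.arange(len(mu)), nearest] <= max_sigmas

    quantized = mu.copy()
    quantized[movable] = centers[nearest[movable]]
    return quantized, movable

# Toy example: high-uncertainty weights snap to shared values, while
# low-uncertainty (i.e. sensitive) weights keep their exact values.
rng = np.random.default_rng(0)
mu = rng.normal(0.0, 0.1, size=8)
sigma = np.array([0.05, 0.001, 0.05, 0.05, 0.001, 0.05, 0.05, 0.05])
centers = np.array([-0.1, 0.0, 0.1])
q, fixed = snap_weights_to_clusters(mu, sigma, centers)
print(fixed.sum(), "of", len(mu), "weights fixed to shared values")

As a rough storage estimate of the compressibility claim: with 296 shared values, each weight index needs at most ceil(log2 296) = 9 bits (about 8.21 bits of entropy), plus a 296-entry codebook, versus 32 bits for each float32 weight.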
More information
Published date: 10 December 2023
Identifiers
Local EPrints ID: 502548
URI: http://eprints.soton.ac.uk/id/eprint/502548
PURE UUID: 46888aad-33ce-4d3d-be5e-d6bd3cf1cc28
Catalogue record
Date deposited: 30 Jun 2025 18:48
Last modified: 03 Jul 2025 01:45
Contributors
Author: Christopher Subia-Waud
Author: Srinandan Dasmahapatra
Editor: A. Oh
Editor: T. Naumann
Editor: A. Globerson
Editor: K. Saenko
Editor: M. Hardt
Editor: S. Levine