The University of Southampton
University of Southampton Institutional Repository

Probabilistic weight fixing: large-scale training of neural network weight uncertainties for quantisation

Subia-Waud, Christopher and Dasmahapatra, Srinandan (2023) Probabilistic weight fixing: large-scale training of neural network weight uncertainties for quantisation. Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M. and Levine, S. (eds.) In NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems. Curran Associates, Inc. pp. 59410-59424. (doi:10.5555/3666122.3668718).

Record type: Conference or Workshop Item (Paper)

Abstract

Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks by constraining their weights to a limited set of values. However, existing methods often treat weights solely on the basis of their value, neglecting the unique role of weight position. This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster center, and to what degree, based on their individual position-specific learned uncertainty distributions. We introduce a new initialization setting and a regularization term, enabling the training of BNNs with complex dataset-model combinations. Leveraging the flexibility of weight values drawn from probability distributions, we enhance noise resilience and compressibility. Our iterative clustering procedure demonstrates superior compressibility and higher accuracy compared to state-of-the-art methods on both ResNet models and the more complex transformer-based architectures. In particular, our method outperforms the state-of-the-art quantization method by 1.6% in top-1 accuracy on ImageNet using DeiT-Tiny, with its 5 million+ weights now represented by only 296 unique values.

Code available at https://github.com/subiawaud/PWFN.
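
The idea in the abstract, that two weights with the same value may be treated differently depending on their position-specific uncertainty, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes each weight has a learned mean and standard deviation, and that a weight is snapped to its nearest cluster center only when that center lies within `delta` standard deviations of the mean. The function name `fix_weights`, the parameter `delta`, and the example centers are hypothetical choices made for illustration.

```python
import numpy as np

def fix_weights(mu, sigma, centers, delta=1.0):
    """Illustrative rule (an assumption, not the paper's exact procedure):
    snap a weight to its nearest center if that center is within
    `delta` standard deviations of the weight's mean; otherwise leave it."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from every weight mean to every center, in units of sigma.
    dist = np.abs(mu[:, None] - centers[None, :]) / sigma[:, None]
    nearest = dist.argmin(axis=1)
    # A weight is fixed only if its nearest center is close enough.
    fixed = dist[np.arange(mu.size), nearest] <= delta
    out = np.where(fixed, centers[nearest], mu)
    return out, fixed

# Two weights share the value 0.26 but differ in uncertainty.
mu = [0.26, 0.26, -0.9]
sigma = [0.05, 0.001, 0.2]
centers = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0]
out, fixed = fix_weights(mu, sigma, centers)
print(out, fixed)
```

In this toy input the first weight (large uncertainty) moves to 0.25, while the second (same value, very small uncertainty) stays at 0.26. An iterative procedure of the kind the abstract describes would presumably retrain the unfixed weights and repeat, but that loop is beyond this sketch.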

This record has no associated files available for download.

More information

Published date: 10 December 2023

Identifiers

Local EPrints ID: 502548
URI: http://eprints.soton.ac.uk/id/eprint/502548
PURE UUID: 46888aad-33ce-4d3d-be5e-d6bd3cf1cc28
ORCID for Srinandan Dasmahapatra: orcid.org/0000-0002-9757-5315

Catalogue record

Date deposited: 30 Jun 2025 18:48
Last modified: 03 Jul 2025 01:45

Contributors

Author: Christopher Subia-Waud
Author: Srinandan Dasmahapatra
Editor: A. Oh
Editor: T. Naumann
Editor: A. Globerson
Editor: K. Saenko
Editor: M. Hardt
Editor: S. Levine
