University of Southampton Institutional Repository

Improving the robustness of neural multiplication units with reversible stochasticity

arXiv
Mistry, Bhumika
36ac2f06-1a50-4c50-ab5e-a57c3faab549
Farrahi, Katayoun
bc848b9c-fc32-475c-b241-f6ade8babacb
Hare, Jonathon
65ba2cda-eaaf-4767-a325-cd845504e5a9

Record type: UNSPECIFIED

Abstract

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures, with gains in extrapolation, interpretability and convergence speed, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. The causes of failure are linked to inductive and input biases that encourage convergence towards undesirable optima. We propose a solution, the stochastic NMU (sNMU), which applies reversible stochasticity, encouraging avoidance of such optima whilst still converging to the true solution. Empirically, we show that this stochasticity improves robustness and has the potential to improve the learned representations of upstream networks on numerical and image tasks.
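The mechanism described in the abstract can be illustrated concisely. Below is a minimal, hypothetical PyTorch sketch (not the authors' released code): an NMU computes a selective product of its inputs, and the sNMU scales the inputs by sampled noise and then divides the same noise back out of the output, so the noise cancels exactly only at discrete weight solutions. The uniform noise range U[1, 5] and the tensor shapes here are illustrative assumptions.

import torch

def nmu(x, W):
    # Neural Multiplication Unit: y[o] = prod_i (W[i,o] * x[i] + 1 - W[i,o]),
    # with weights clamped to [0, 1]. Shapes: x (batch, d_in), W (d_in, d_out).
    W = W.clamp(0.0, 1.0)
    return torch.prod(W * x.unsqueeze(-1) + 1.0 - W, dim=1)

def snmu(x, W, low=1.0, high=5.0):
    # Reversible stochasticity (assumed form): scale the inputs by sampled
    # noise, apply the NMU, then divide out the same noise passed through the
    # same weights. The noise cancels exactly only when W is discrete (a pure
    # selective product), discouraging undesirable intermediate optima.
    n = torch.empty_like(x).uniform_(low, high)
    return nmu(x * n, W) / nmu(n, W)

# Toy check: multiplying two inputs with the ideal discrete weights W = [1, 1].
x = torch.tensor([[2.0, 3.0]])
W = torch.ones(2, 1)
print(nmu(x, W))   # tensor([[6.]])
print(snmu(x, W))  # tensor([[6.]]) -- the noise cancels at the discrete solution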

Text
2211.05624v1 - Author's Original
Available under License Creative Commons Attribution.
Download (2MB)

More information

Published date: 10 November 2022
Additional Information: 26 pages (10-page main body)
Keywords: cs.LG, cs.NE

Identifiers

Local EPrints ID: 489972
URI: http://eprints.soton.ac.uk/id/eprint/489972
PURE UUID: d993d841-3df6-419e-8a7e-24e9f3c22671
ORCID for Bhumika Mistry: orcid.org/0000-0003-4555-0121
ORCID for Katayoun Farrahi: orcid.org/0000-0001-6775-127X
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 09 May 2024 16:31
Last modified: 10 May 2024 01:56

Contributors

Author: Bhumika Mistry
Author: Katayoun Farrahi
Author: Jonathon Hare
