Improving the robustness of neural multiplication units with reversible stochasticity
Abstract
Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. Causes of failure are linked to inductive and input biases which encourage convergence to solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is proposed to apply reversible stochasticity, encouraging avoidance of such optima whilst converging to the true solution. Empirically, we show that stochasticity provides improved robustness with the potential to improve learned representations of upstream networks for numerical and image tasks.
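The reversible stochasticity the abstract describes can be sketched minimally. In this illustration (not the authors' reference implementation), an NMU computes the product of gated terms $w_i x_i + 1 - w_i$, and the stochastic variant multiplies each input by sampled noise and divides it back out through the same gating, so the noise cancels exactly at binary (converged) weights; the noise range and the plain-Python style are assumptions for the sketch:

```python
import random

def nmu(x, w):
    # Neural Multiplication Unit: product of (w_i * x_i + 1 - w_i).
    # w_i -> 1 selects input i for the product; w_i -> 0 ignores it.
    out = 1.0
    for xi, wi in zip(x, w):
        out *= wi * xi + 1.0 - wi
    return out

def snmu(x, w, low=1.0, high=5.0, rng=random):
    # Stochastic NMU sketch: scale each input by sampled noise n_i, then
    # divide the noisy product by the noise passed through the same gates.
    # At binary weights the noise cancels exactly, so the stochasticity
    # is reversible and does not bias the converged solution.
    n = [rng.uniform(low, high) for _ in x]
    noisy = nmu([xi * ni for xi, ni in zip(x, n)], w)
    return noisy / nmu(n, w)
```

For example, with weights `[1.0, 1.0]` and inputs `[3.0, 4.0]`, `snmu` returns 12.0 for any sampled noise, since each factor reduces to `x_i * n_i / n_i`; during training, non-binary weights leave residual noise that discourages settling in the undesirable optima the abstract mentions.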
Text: 2211.05624v1 (Author's Original)
More information
Published date: 10 November 2022
Additional Information:
26 pages (10 page main body)
Keywords:
cs.LG, cs.NE
Identifiers
Local EPrints ID: 489972
URI: http://eprints.soton.ac.uk/id/eprint/489972
PURE UUID: d993d841-3df6-419e-8a7e-24e9f3c22671
Catalogue record
Date deposited: 09 May 2024 16:31
Last modified: 10 May 2024 01:56
Contributors
Author: Bhumika Mistry
Author: Katayoun Farrahi
Author: Jonathon Hare