The University of Southampton
University of Southampton Institutional Repository

An investigation into neural arithmetic logic modules

University of Southampton
Mistry, Bhumika
36ac2f06-1a50-4c50-ab5e-a57c3faab549
Farrahi, Kate
bc848b9c-fc32-475c-b241-f6ade8babacb
Hare, Jonathon
65ba2cda-eaaf-4767-a325-cd845504e5a9

Mistry, Bhumika (2023) An investigation into neural arithmetic logic modules. University of Southampton, Doctoral Thesis, 214pp.

Record type: Thesis (Doctoral)

Abstract

The human ability to learn and reuse skills in a systematic manner is critical to our daily routines. For example, having the skills to execute the basic arithmetic operations (+, −, ×, ÷) allows a person to perform a variety of tasks, including budgeting expenses, scaling measurements to the desired proportions when cooking or baking, and planning travel schedules. Machine Learning (ML) can reduce the manual workload for humans by inferring underlying relations within the data without the need for heavy feature engineering. However, getting such models to extrapolate and generalise to unseen data in an interpretable manner remains challenging. With this challenge in mind, Neural Arithmetic Logic Modules (NALMs) have been developed. Such parameterised modules, specialised for arithmetic operations, are designed to guarantee generalisation if their weights are correctly learned and to be interpretable in what they learn. This thesis thoroughly investigates the proposition that such specialised differentiable modules with inductive biases toward arithmetic can be learned, and uncovers the limitations which remain. In this work, we begin by studying the extent to which NALMs are able to learn arithmetic. We first provide a comprehensive review of existing NALMs and then extend the analysis with empirical results on a new benchmark whose evaluation metrics specifically measure extrapolation performance. From this, we identify two arithmetic operations to investigate further, namely multiplication and division. For multiplication, we show how stochasticity can be applied to alleviate the problem of falling into local minima which cannot extrapolate. For division, we show through an extensive set of empirical results the mechanisms which can aid and hinder robustness. Factors other than the architecture are also investigated, including using images as the input modality, using a different loss criterion, and feature scaling.
In the final chapter, we draw inspiration from a human cognitive theory, the Global Workspace Theory (GWT), to develop an end-to-end architecture that combines different NALMs for compositional arithmetic.
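
The claim that correctly learned weights both extrapolate and stay interpretable can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins and are not the modules studied in the thesis. They assume a unit whose weights have settled on values in {-1, 0, 1}, and the input ranges and weight vectors are invented for illustration.

```python
import math

def additive_unit(weights, inputs):
    # Weighted sum: with weights in {-1, 0, 1} this is exact addition/subtraction.
    return sum(w * x for w, x in zip(weights, inputs))

def multiplicative_unit(weights, inputs):
    # Product of powers: with weights in {-1, 0, 1} this is multiplication/division.
    # Assumes non-zero inputs wherever a weight is -1.
    return math.prod(x ** w for w, x in zip(weights, inputs))

# Hypothetical "correctly learned" weights; each vector can be read off directly.
sub_weights = [1, -1, 0]   # selects x0 - x1, ignores x2
div_weights = [1, 0, -1]   # selects x0 / x2, ignores x1

small = [3.0, 2.0, 4.0]    # assumed to lie inside a training range
large = [3e6, 2e6, 4e6]    # assumed to lie far outside that range

for name, xs in (("small", small), ("large", large)):
    print(name, additive_unit(sub_weights, xs), multiplicative_unit(div_weights, xs))
```

Because the discrete weights encode the operation itself rather than a fit to a particular input range, the same weights give the right answer on both input sets, up to floating-point rounding. Reading the weight vector also tells you which inputs are used and how.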

Text
Doctoral Thesis PDFA: An Investigation into Neural Arithmetic Logic Modules by Mistry - Version of Record
Available under License University of Southampton Thesis Licence.
Download (7MB)
Text
Final-thesis-submission-Examination-Miss-Bhumika-Mistry
Restricted to Repository staff only

More information

Published date: July 2023

Identifiers

Local EPrints ID: 478926
URI: http://eprints.soton.ac.uk/id/eprint/478926
PURE UUID: a0203875-10ea-42ba-8d8a-af42901361f4
ORCID for Bhumika Mistry: orcid.org/0000-0003-4555-0121
ORCID for Kate Farrahi: orcid.org/0000-0001-6775-127X
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 14 Jul 2023 16:31
Last modified: 18 Mar 2024 03:03

Contributors

Author: Bhumika Mistry
Thesis advisor: Kate Farrahi
Thesis advisor: Jonathon Hare
