An investigation into neural arithmetic logic modules

The human ability to learn and reuse skills in a systematic manner is critical to our daily routines. For example, having the skills to execute the basic arithmetic operations (+, −, ×, ÷) allows a person to perform a variety of tasks, including budgeting expenses, scaling measurements to the desired proportions when cooking or baking, and planning travel schedules. Machine Learning (ML) can reduce the manual workload for humans by inferring underlying relations within the data without the need for heavy feature engineering. However, getting such models to extrapolate and generalise to unseen data in an interpretable manner remains challenging. With this challenge in mind, Neural Arithmetic Logic Modules (NALMs) have been developed. Such parameterised modules, specialised for arithmetic operations, are designed to guarantee generalisation if their weights are correctly learned and to be interpretable in what they learn. This thesis seeks to thoroughly investigate the proposition that such specialised differentiable modules with inductive biases toward arithmetic can be learned, uncovering the limitations which remain. In this work, we begin by studying the extent to which NALMs are able to learn arithmetic. We first provide a comprehensive review of existing NALMs and then extend our analysis with empirical results on a new benchmark with evaluation metrics designed specifically for measuring extrapolation performance. From this, we identify two arithmetic operations to investigate further, namely multiplication and division. For multiplication, we show how stochasticity can be applied to alleviate the problem of falling into local minima which cannot extrapolate. For division, we show through an extensive set of empirical results the mechanisms which can aid and hinder robustness. Factors other than the architecture are also investigated, including using images as the input modality, using a different loss criterion, and feature scaling. In the final chapter, we draw inspiration from a human cognitive theory, the Global Workspace Theory (GWT), to develop an end-to-end architecture that combines different NALMs for compositional arithmetic.

University of Southampton

Mistry, Bhumika

July 2023

Farrahi, Kate

Hare, Jonathon

Mistry, Bhumika (2023) An investigation into neural arithmetic logic modules. University of Southampton, Doctoral Thesis, 214pp.

Record type: Thesis (Doctoral)

Doctoral Thesis PDFA: An Investigation into Neural Arithmetic Logic Modules by Mistry - Version of Record

Available under License University of Southampton Thesis Licence.

More information

Published date: July 2023

Identifiers

Local EPrints ID: 478926

URI: http://eprints.soton.ac.uk/id/eprint/478926

PURE UUID: a0203875-10ea-42ba-8d8a-af42901361f4

ORCID for Bhumika Mistry: orcid.org/0000-0003-4555-0121

ORCID for Kate Farrahi: orcid.org/0000-0001-6775-127X

ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283

Catalogue record

Date deposited: 14 Jul 2023 16:31

Last modified: 18 Mar 2024 03:03

Contributors

Author: Bhumika Mistry

Thesis advisor: Kate Farrahi

Thesis advisor: Jonathon Hare
