University of Southampton Institutional Repository

Towards an understanding of generalisation in deep learning: an analysis of the transformation of information in convolutional neural networks

Belcher, Dominic (3ab2a3bc-8594-4eee-ae21-2df69a8d1721)
Prugel-Bennett, Adam (b107a151-1751-4d8b-b8db-2c395ac4e14e)
Dasmahapatra, Srinandan (eb5fd76f-4335-4ae9-a88a-20b9e2b3f698)

Belcher, Dominic (2025) Towards an understanding of generalisation in deep learning: an analysis of the transformation of information in convolutional neural networks. University of Southampton, Masters Thesis, 107pp.

Record type: Thesis (Masters)

Abstract

Despite their enormous size, Deep Neural Networks are able to achieve exceptional performance across a wide variety of machine learning problems, and have become a de facto standard in many areas of machine learning. The ability of such large models to reliably achieve good generalisation is difficult to reconcile with conventional theory on machine learning, which bounds the generalisation capability based on the size of the model, implying that more complex models should not — but importantly not that they cannot — reliably generalise.

In this work, I investigate generalisation within the specific domain of Convolutional Neural Networks (CNNs) applied to image classification problems. I investigate the way in which the layers of a CNN transform the data, and how this may entail the good generalisation performance these models exhibit. I investigate how margins between classes manifest and change, showing that the different operations in the network can increase or decrease the margin, as well as change the shape of the data in relation to the margin. I combine this with a replication and extension of the use of hidden layer probes to investigate how the classification problem changes through the network, showing that linear separability emerges through the networks, to an extent that almost matches the full classification performance of the network. I show how this linear separability aligns with some of the patterns seen in the class margins, and how the convolutions and activations work in tandem to both increase the margin and the linear separability. Finally, I extend the existing work on hidden layer probes to investigate globally pooled features within the model, showing that the information distilled by the network at each stage is primarily in coarse features, rather than at the pixel level.
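The hidden-layer probes mentioned above can be sketched as follows. This is a minimal illustration assuming a least-squares linear probe on frozen activations; the synthetic "shallow" and "deep" activations and all parameter choices below are illustrative stand-ins, not the thesis's actual models, datasets, or training procedure.

```python
# Hedged sketch of a hidden-layer linear probe: fit a linear classifier on
# frozen activations and use its accuracy as a proxy for linear separability.
import numpy as np

rng = np.random.default_rng(0)

def probe_accuracy(activations: np.ndarray, labels: np.ndarray, n_classes: int) -> float:
    """Fit a one-vs-all linear classifier on frozen activations by least
    squares and return its accuracy on the same data, a rough proxy for
    how linearly separable the classes are at this layer."""
    Y = np.eye(n_classes)[labels]                                 # one-hot targets
    X = np.hstack([activations, np.ones((len(activations), 1))])  # add a bias column
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)                     # solve min ||XW - Y||
    preds = (X @ W).argmax(axis=1)
    return float((preds == labels).mean())

# Synthetic stand-ins for a shallow and a deep layer: each class's mean is
# pushed further from the others at the "deeper" layer, mimicking linear
# separability emerging as data flows through the network.
n, d, n_classes = 300, 16, 3
labels = rng.integers(0, n_classes, size=n)
shallow = rng.normal(size=(n, d))
shallow[np.arange(n), labels] += 0.5   # weakly separated classes
deep = rng.normal(size=(n, d))
deep[np.arange(n), labels] += 3.0      # strongly separated classes

acc_shallow = probe_accuracy(shallow, labels, n_classes)
acc_deep = probe_accuracy(deep, labels, n_classes)
```

Applied to real CNN activations taken layer by layer, the same measurement would be expected to show probe accuracy rising with depth, which is the pattern of emerging linear separability the abstract describes.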

Text: MPhil-4 - Version of Record. Available under License University of Southampton Thesis Licence. Download (8MB)
Text: Final-thesis-submission-Examination-Mr-Dominic-Belcher. Restricted to Repository staff only

More information

Published date: 2025

Identifiers

Local EPrints ID: 502042
URI: http://eprints.soton.ac.uk/id/eprint/502042
PURE UUID: 49cb30bb-846a-4ef4-9ea1-8c4a04aab537
ORCID for Srinandan Dasmahapatra: orcid.org/0000-0002-9757-5315

Catalogue record

Date deposited: 13 Jun 2025 17:38
Last modified: 11 Sep 2025 01:59

Contributors

Author: Dominic Belcher
Thesis advisor: Adam Prugel-Bennett
Thesis advisor: Srinandan Dasmahapatra



Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton and freely available for anyone to use.
