Towards an understanding of generalisation in deep learning: an analysis of the transformation of information in convolutional neural networks
Belcher, Dominic (2025) Towards an understanding of generalisation in deep learning: an analysis of the transformation of information in convolutional neural networks. University of Southampton, Masters Thesis, 107pp.
Record type: Thesis (Masters)
Abstract
Despite their enormous size, deep neural networks achieve exceptional performance across a wide variety of problems and have become a de facto standard in many areas of machine learning. The ability of such large models to reliably achieve good generalisation is difficult to reconcile with conventional machine learning theory, which bounds generalisation capability by model size, implying that more complex models should not (though, importantly, not that they cannot) reliably generalise.
In this work, I investigate generalisation within the specific domain of Convolutional Neural Networks (CNNs) applied to image classification. I examine how the layers of a CNN transform the data, and how this transformation may account for the good generalisation these models exhibit. I study how margins between classes manifest and change, showing that the different operations in the network can increase or decrease the margin, as well as change the shape of the data relative to the margin. I combine this with a replication and extension of the use of hidden layer probes to investigate how the classification problem changes through the network, showing that linear separability emerges progressively, to an extent that almost matches the full classification performance of the network. I show how this linear separability aligns with some of the patterns seen in the class margins, and how the convolutions and activations work in tandem to increase both the margin and the linear separability. Finally, I extend the existing work on hidden layer probes to investigate globally pooled features within the model, showing that the information distilled by the network at each stage resides primarily in coarse features rather than at the pixel level.
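As a concrete illustration of the probing methodology summarised above, the following is a minimal sketch, not code from the thesis: the toy model, the layer choice, and the training loop are assumptions made for illustration only. It freezes a small CNN and trains a linear classifier (a probe) on the activations of a chosen hidden layer, with an option to globally average-pool the activations first, mirroring the pooled-feature probes described in the abstract.

```python
# Minimal sketch of hidden-layer linear probes (illustrative, not the thesis code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Toy CNN for 32x32 RGB inputs; forward() can return hidden activations."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x, return_hidden=False):
        h1 = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 32 x 16 x 16
        h2 = F.max_pool2d(F.relu(self.conv2(h1)), 2)  # 64 x 8 x 8
        logits = self.fc(h2.flatten(1))
        if return_hidden:
            return logits, [h1, h2]
        return logits

def fit_probe(model, loader, layer_idx, num_classes=10, pooled=False,
              epochs=5, device="cpu"):
    """Train a linear probe on the activations of one hidden layer.

    With pooled=True the activations are globally average-pooled first,
    so the probe sees only coarse per-channel features, not per-pixel ones.
    """
    model.eval()  # the network itself stays frozen
    probe, opt = None, None
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            with torch.no_grad():
                _, hidden = model(x, return_hidden=True)
                h = hidden[layer_idx]
                h = h.mean(dim=(2, 3)) if pooled else h.flatten(1)
            if probe is None:  # lazily size the probe to the feature dimension
                probe = nn.Linear(h.shape[1], num_classes).to(device)
                opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
            loss = F.cross_entropy(probe(h), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return probe
```

Training one probe per layer and comparing their held-out accuracies traces how linear separability develops through the network; comparing pooled=True against pooled=False at the same layer tests whether the linearly decodable information is carried by coarse, spatially pooled features.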
Text: MPhil-4 - Version of Record
Text: Final-thesis-submission-Examination-Mr-Dominic-Belcher (Restricted to Repository staff only)
More information
Published date: 2025
Identifiers
Local EPrints ID: 502042
URI: http://eprints.soton.ac.uk/id/eprint/502042
PURE UUID: 49cb30bb-846a-4ef4-9ea1-8c4a04aab537
Catalogue record
Date deposited: 13 Jun 2025 17:38
Last modified: 11 Sep 2025 01:59
Contributors
Author: Dominic Belcher
Thesis advisor: Adam Prugel-Bennett
Thesis advisor: Srinandan Dasmahapatra