University of Southampton Institutional Repository

Deep cascade learning

Marquez, Enrique Salvador, Hare, Jonathon and Niranjan, Mahesan (2018) Deep cascade learning. IEEE Transactions on Neural Networks and Learning Systems, 29 (11), 5475-5485, [8307262]. (doi:10.1109/TNNLS.2018.2805098).

Record type: Article

Abstract

In this paper, we propose a novel approach for efficient training of deep neural networks in a bottom-up fashion using a layered structure. Our algorithm, which we refer to as Deep Cascade Learning, is motivated by the Cascade-Correlation approach of Fahlman, who introduced it in the context of perceptrons. We demonstrate our algorithm on networks of convolutional layers, though its applicability is more general. Training deep networks in such a cascade directly circumvents the well-known vanishing gradient problem by ensuring that the output is always adjacent to the layer being trained. We present empirical evaluations comparing our deep cascade training with standard end-to-end training using backpropagation, for two convolutional neural network architectures on benchmark image classification tasks (CIFAR-10 and CIFAR-100). We then investigate the features learned by the approach and find that better, domain-specific representations are learned in early layers than under end-to-end training. This is partially attributable to the vanishing gradient problem, which inhibits early-layer filters from changing significantly from their initial settings. While both networks perform similarly overall, under cascade training recognition accuracy increases progressively with each added layer, with discriminative features learned at every stage of the network, whereas no such systematic feature representation was observed under end-to-end training. We also show that cascade training has significant computational and memory advantages over end-to-end training, and that it can be used as a pre-training algorithm to obtain better performance.
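
The layer-by-layer procedure the abstract describes can be sketched compactly. The following is a minimal, hypothetical illustration, not the authors' released code: it assumes PyTorch, takes a user-supplied list of layers, and attaches a fresh auxiliary classification head per stage. The head design, the Adam optimizer, and the epochs_per_layer parameter are illustrative choices, not details taken from this record.

import torch
import torch.nn as nn

def train_cascade(layers, data_loader, num_classes, epochs_per_layer=10, device="cpu"):
    """Train the layers one at a time, bottom-up. Each stage gets a fresh
    auxiliary head so the loss is always adjacent to the layer being trained,
    sidestepping vanishing gradients; finished layers are frozen."""
    loss_fn = nn.CrossEntropyLoss()
    trained = []
    for layer in layers:
        layer = layer.to(device)
        # A lazy linear head infers its input size on the first forward pass.
        head = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes)).to(device)
        # Materialize the lazy parameters with one dry forward pass before
        # handing them to the optimizer (lazy modules must be initialized first).
        x0, _ = next(iter(data_loader))
        with torch.no_grad():
            h = x0.to(device)
            for prev in trained:
                h = prev(h)
            head(layer(h))
        opt = torch.optim.Adam(list(layer.parameters()) + list(head.parameters()))
        for _ in range(epochs_per_layer):
            for x, y in data_loader:
                x, y = x.to(device), y.to(device)
                with torch.no_grad():          # frozen, already-trained stages
                    for prev in trained:
                        x = prev(x)
                logits = head(layer(x))
                opt.zero_grad()
                loss_fn(logits, y).backward()  # gradients reach only this stage
                opt.step()
        layer.requires_grad_(False)            # freeze before the next stage
        trained.append(layer.eval())
    return nn.Sequential(*trained)             # auxiliary heads are discarded

Each stage's head is discarded once the next layer is added, so only the cascade of feature layers is kept; this mirrors the abstract's point that the output is always adjacent to the layer being trained.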

Text: FINAL VERSION - Accepted Manuscript. Available under License Creative Commons Attribution. Download (3MB).
Text: 08307262 - Version of Record. Available under License Creative Commons Attribution. Download (2MB).

More information

Accepted/In Press date: 30 January 2018
e-pub ahead of print date: 6 March 2018
Published date: 19 October 2018

Identifiers

Local EPrints ID: 417849
URI: http://eprints.soton.ac.uk/id/eprint/417849
ISSN: 2162-237X
PURE UUID: c35c0048-7c2f-4ec9-9d54-3c9ad90a3b86
ORCID for Jonathon Hare: orcid.org/0000-0003-2921-4283
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 15 Feb 2018 17:30
Last modified: 16 Mar 2024 03:55

Contributors

Author: Enrique Salvador Marquez
Author: Jonathon Hare
Author: Mahesan Niranjan

