The University of Southampton
University of Southampton Institutional Repository

Cascade learning localises discriminant features in visual scene classification

cs.CV
arXiv
Wang, Junwen
fea12e84-8be0-4c8e-bd53-186dd353d55f
Farrahi, Katayoun
bc848b9c-fc32-475c-b241-f6ade8babacb

Record type: UNSPECIFIED

Abstract

Lack of interpretability of deep convolutional neural networks (DCNNs) is a well-known problem, particularly in the medical domain, where clinicians want trustworthy automated decisions. One way to improve trust is to demonstrate the localisation of feature representations with respect to expert-labelled regions of interest. In this work, we investigate the localisation of features learned via two different learning paradigms and demonstrate the superiority of one learning approach with respect to localisation. Our analysis on medical and natural datasets shows that the traditional end-to-end (E2E) learning strategy has a limited ability to localise discriminative features across multiple network layers. We show that a layer-wise learning strategy, namely cascade learning (CL), results in more localised features. Considering localisation accuracy, we show not only that CL outperforms E2E but also that it is a promising method for predicting regions. On the YOLO object detection framework, our best result shows that CL outperforms the E2E scheme by 2% in mAP.

Text
2311.12704v2 - Author's Original
Available under License Creative Commons Attribution.

More information

Published date: 21 November 2023
Keywords: cs.CV

Identifiers

Local EPrints ID: 489971
URI: http://eprints.soton.ac.uk/id/eprint/489971
PURE UUID: 65c56843-6a29-4b6a-ad01-ef8cd17c8a57
ORCID for Katayoun Farrahi: orcid.org/0000-0001-6775-127X

Catalogue record

Date deposited: 09 May 2024 16:31
Last modified: 10 May 2024 01:51

Contributors

Author: Junwen Wang
Author: Katayoun Farrahi
