On properties of deep cascade learning on medical imaging problems
This dissertation applies deep neural networks to inference problems in medical imaging, exploiting recent advances in computer vision. Two issues motivate the work: models must be trained with very little data, and their decisions must be interpretable. To address both, we explore Cascade Learning (CL), in which a neural network is trained layer by layer, cast in a transfer learning setting in which a model trained on natural image data is adapted (fine-tuned) to the medical domain. A cascade-trained network extracts image features in a coarse-to-fine way. Hence, as the empirical work reported here shows, its features are better suited to transfer learning than those of similar networks trained end-to-end (E2E). Empirical results on nine benchmark datasets covering multiple diseases and five modalities (chest X-ray, whole slide imaging, dermoscopy, endoscopy, and eye fundus imaging) show that CL with transfer (TCL) outperforms not only equivalent E2E-trained models but also more sophisticated models such as ResNet, as well as models that take features from self-supervised learning trained on medical images. Compared with a ResNet with 25.6M parameters, TCL achieves the same (and often higher) levels of performance with as few as 6.6M parameters (roughly a 75% reduction). Localisation of features is one important effect of CL. Using X-ray data for which human annotation of relevant regions is available, we show that cascade-trained features overlap far better with the annotated bounding boxes. To quantify the compactness of features, we use granulometry, a metric from morphological image processing, and show that CL features are often significantly more localised. We also show that cascade-trained models are better calibrated and more robust to additive noise in the input, both of which matter in medical imaging problems.
Another consequence of the localisation of discriminant features is that models trained in this way may be easier to attack in an adversarial setting. Our data confirm this: cascade-trained models are more vulnerable than E2E models under adversarial attack. However, when trained adversarially, both types of model recover their performance to the same extent.
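As an illustration of the layer-by-layer training idea described in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes a toy fully connected network on synthetic two-class data rather than a convolutional network on medical images, and it omits the transfer (pre-training and fine-tuning) step entirely. All function names (`make_toy_data`, `train_stage`, `cascade_train`, `predict`) and hyperparameters are hypothetical. Each stage trains one new hidden layer together with a temporary classifier head; earlier layers are frozen, and the head of the last stage is kept for prediction.

```python
import numpy as np

def make_toy_data(n_per_class=100, seed=0):
    """Two Gaussian blobs in 2-D (synthetic stand-in data, an assumption)."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(-1.0, 0.3, size=(n_per_class, 2))
    x1 = rng.normal(1.0, 0.3, size=(n_per_class, 2))
    x = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return x, y

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_stage(h, y, width, n_classes, rng, epochs=300, lr=0.05):
    """Train one new ReLU layer plus a temporary softmax head on fixed inputs h."""
    n, d = h.shape
    W = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, width))  # new layer
    b = np.zeros(width)
    V = rng.normal(0.0, np.sqrt(1.0 / width), size=(width, n_classes))  # head
    c = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        pre = h @ W + b
        a = np.maximum(pre, 0.0)
        p = softmax(a @ V + c)
        g_logits = (p - Y) / n        # gradient of the mean cross-entropy
        g_a = g_logits @ V.T          # backpropagate before updating V
        g_pre = g_a * (pre > 0)
        V -= lr * (a.T @ g_logits)
        c -= lr * g_logits.sum(axis=0)
        W -= lr * (h.T @ g_pre)
        b -= lr * g_pre.sum(axis=0)
    return (W, b), (V, c)

def cascade_train(x, y, widths, n_classes, seed=0):
    """Add and train layers one at a time; earlier layers are never updated."""
    rng = np.random.default_rng(seed)
    layers, head = [], None
    h = x
    for width in widths:
        layer, head = train_stage(h, y, width, n_classes, rng)
        layers.append(layer)  # frozen from here on
        h = np.maximum(h @ layer[0] + layer[1], 0.0)  # inputs to the next stage
    return layers, head

def predict(layers, head, x):
    h = x
    for W, b in layers:
        h = np.maximum(h @ W + b, 0.0)
    V, c = head
    return (h @ V + c).argmax(axis=1)

if __name__ == "__main__":
    x, y = make_toy_data()
    layers, head = cascade_train(x, y, widths=[8, 8], n_classes=2)
    print("training accuracy:", (predict(layers, head, x) == y).mean())
```

Because each stage only ever sees the outputs of the already trained layers, adding a further stage leaves the earlier weights unchanged, which is the property that distinguishes this scheme from end-to-end training.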
University of Southampton
Wang, Junwen
November 2023
Farrahi, Kate
Niranjan, Mahesan
Wang, Junwen (2023) On properties of deep cascade learning on medical imaging problems. University of Southampton, Doctoral Thesis, 105pp.
Record type: Thesis (Doctoral)
More information
Published date: November 2023
Identifiers
Local EPrints ID: 486653
URI: http://eprints.soton.ac.uk/id/eprint/486653
PURE UUID: 6bdb7a36-8fbd-4c9c-98f1-ad64081f45ca
Catalogue record
Date deposited: 30 Jan 2024 17:57
Last modified: 21 May 2024 04:01
Contributors
Author: Junwen Wang
Thesis advisor: Kate Farrahi
Thesis advisor: Mahesan Niranjan