Machine learning techniques for medical image analysis with data scarcity

Medical imaging is integral to modern healthcare for diagnosing, treating, and monitoring various conditions. There is considerable interest in leveraging machine learning to derive insights from medical imaging, particularly in tasks such as image classification, segmentation, and anomaly detection. Unlike natural image recognition, medical image analysis is constrained by the limited availability of annotated data and by the complexity of integrating multi-modal medical information.

This thesis aims to develop machine/deep learning techniques to address these challenges, improving diagnostic efficiency and reducing the labour-intensive nature of current medical practices. We design a data-based few-shot learning scheme to investigate the use of pre-trained deep learning models for extracting meaningful data representations, focusing on scenarios where the number of samples is small relative to the feature dimension. We explore a novel approach using non-negative matrix factorization (NMF), particularly discriminative variants such as DNMF and SCNMFS, for dimensionality reduction in the low-data settings typical of medical inference tasks.
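As a rough illustration of the dimensionality-reduction idea, the following is a minimal sketch of plain NMF using the standard Lee-Seung multiplicative updates. It is not the thesis's method and does not implement the discriminative variants (DNMF, SCNMFS) named above. The function name `nmf`, the random matrix standing in for non-negative pre-trained features, and the chosen sizes (20 samples, 512 features, rank 5) are all assumptions made for this example.

```python
import numpy as np


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF (illustrative only): factor X ~= W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    X must be non-negative, shape (n_samples, n_features).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n_samples, rank)) + eps
    H = rng.random((rank, n_features)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical stand-in for non-negative features from a pre-trained network
# (for example, post-ReLU activations): few samples, many feature dimensions.
rng = np.random.default_rng(42)
X = rng.random((20, 512))

W, H = nmf(X, rank=5)
# Each row of W is a 5-dimensional non-negative code for one sample.
print(W.shape, H.shape)
```

In a sketch like this, the rows of `W` would serve as the low-dimensional representation passed to a downstream classifier, while `H` holds the non-negative basis shared across samples.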

Additionally, we propose a method for integrating multi-modal medical data to generate standardized medical reports. The proposed 'data-text-data' transformation strategy enhances interpretability and accuracy by converting input indicators into sequential word-embedded representations and then reconstructing them into their original format, ensuring clinically relevant outcomes.

Moreover, to address the scarcity of pixel-level annotations in medical imaging, we introduce a diffusion model with discrepancy-based features. This approach translates inconsistencies in image-level annotations into distribution discrepancies among heterogeneous samples while preserving information within homogeneous samples. Unlike traditional segmentation methods that rely heavily on pairwise annotations, this method enhances segmentation accuracy by implicitly leveraging annotation distributions and generative learning paradigms within medical data.

Overall, these contributions aim to advance the application of machine/deep learning in medical imaging, addressing challenges related to data scarcity and the complexity of integrating multi-modal medical information in clinical settings.

University of Southampton

Fan, Keqiang

0b1613e0-0167-425e-9ab9-986054928dd2

January 2025

Cai, Xiaohao

de483445-45e9-4b21-a4e8-b0427fc72cee

Niranjan, Mahesan

5cbaeea8-7288-4b55-a89c-c43d212ddd4f

Fan, Keqiang (2025) Machine learning techniques for medical image analysis with data scarcity. University of Southampton, Doctoral Thesis, 116pp.

Record type: Thesis (Doctoral)

Abstract

Text

Machine Learning Techniques for Medical Image Analysis with Data Scarcity - Version of Record

Available under License University of Southampton Thesis Licence.

Text

Final-thesis-submission-Examination-Mr-Keqiang-Fan

Restricted to Repository staff only

More information

Published date: January 2025

Identifiers

Local EPrints ID: 497373

URI: http://eprints.soton.ac.uk/id/eprint/497373

PURE UUID: 9364e48c-81a2-4667-8f00-c847c799e50e

ORCID for Keqiang Fan:

orcid.org/0000-0002-9411-2892

ORCID for Xiaohao Cai:

orcid.org/0000-0003-0924-2834

ORCID for Mahesan Niranjan:

orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 21 Jan 2025 17:43

Last modified: 22 Aug 2025 02:29

Contributors

Author: Keqiang Fan

Thesis advisor: Xiaohao Cai

Thesis advisor: Mahesan Niranjan
