Applications of machine learning for the diagnosis of sleep apnoea and the causal analysis of sleep studies

This thesis explores applications of machine learning (ML) to problems relating to sleep apnoea. A person with sleep apnoea suffers apnoeas (cessations of breathing) and hypopnoeas (less severe events, in which breathing is partially reduced) during sleep. It is a common condition that is associated with negative health outcomes. The aims of the study are to reduce the cost and difficulty of diagnosing sleep apnoea and to uncover the mechanisms of sleep using causal discovery. Arising from this work, contributions are also made to methods of causal discovery.

Patients with suspected sleep apnoea are required to undertake a sleep study, in which a number of continuous measurements are taken overnight. The data from the study are then assessed by a specialist. This carries a significant cost, due to the equipment required and the demands on clinicians' time, limiting the capacity of healthcare providers to deal with this serious and underdiagnosed condition. Automating this assessment using machine learning would save resources. In addition, clinicians would like to use a standalone pulse oximeter for diagnosis, rather than the full sleep study kit. In this thesis, ML models are trained to classify windows of a pulse oximetry trace according to whether or not they contain an apnoea or hypopnoea, and whether or not the patient is asleep. It is demonstrated that models trained on data from the full kit can achieve good results on data from the standalone pulse oximeter. Contrary to expectations, the prediction error from these experiments is not associated with the perfusion index (PI), a measure used to quantify signal quality.

Greater understanding of the physiological mechanisms of sleep could improve treatment. Accordingly, causal reasoning is applied to this field of study. Firstly, Granger causality is applied to a large sleep study dataset. DYNOTEARS, a recently introduced method of structure learning by continuous optimisation, is then applied to the same data, and the results are compared. With both methods, an association is found between the girth of a person's waist and the presence of a causal link between the heart and the brain.

DYNOTEARS is highly dependent on the choice of hyperparameters, a fact which is often overlooked in the literature. This dependence is demonstrated, and it is shown that Bayesian optimisation can find suitable hyperparameters quickly and reliably, making the method easier to use.

The development of time-series structure learning models is limited by the lack of benchmarking data. This data must be synthetic so that the true causal structure is known, but it is challenging to create stable models for this purpose. Initial work is presented towards a new technique to randomly generate stable models for benchmarking structure learning methods.
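
To make the stability requirement concrete, the sketch below draws a random sparse VAR(1) model whose true structure is known and which cannot diverge. It is not the technique developed in the thesis. It relies on a simple and deliberately conservative sufficient condition: if every row of the coefficient matrix has absolute values summing to less than 1, the spectral radius is also less than 1. The names `random_stable_var1` and `simulate`, and all parameter values, are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def random_stable_var1(
    n_vars: int, edge_prob: float, max_row_sum: float, rng: random.Random
) -> Matrix:
    """Draw a sparse random VAR(1) coefficient matrix that is stable.

    A[i][j] != 0 means variable j at time t-1 influences variable i at
    time t. Each row is rescaled so that its absolute values sum to at
    most max_row_sum (< 1). The largest absolute row sum bounds the
    spectral radius from above, so the process is stable.
    """
    if not 0.0 < max_row_sum < 1.0:
        raise ValueError("max_row_sum must lie strictly between 0 and 1")
    A: Matrix = []
    for _ in range(n_vars):
        row = [
            rng.uniform(-1.0, 1.0) if rng.random() < edge_prob else 0.0
            for _ in range(n_vars)
        ]
        total = sum(abs(v) for v in row)
        if total > max_row_sum:
            row = [v * max_row_sum / total for v in row]
        A.append(row)
    return A


def simulate(
    A: Matrix, n_steps: int, noise_scale: float, rng: random.Random
) -> List[List[float]]:
    """Simulate x_t = A x_{t-1} + Gaussian noise, starting from zero."""
    n = len(A)
    x = [0.0] * n
    series: List[List[float]] = []
    for _ in range(n_steps):
        x = [
            sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, noise_scale)
            for i in range(n)
        ]
        series.append(x)
    return series


rng = random.Random(0)
A = random_stable_var1(n_vars=5, edge_prob=0.4, max_row_sum=0.9, rng=rng)
series = simulate(A, n_steps=500, noise_scale=1.0, rng=rng)
```

The non-zero pattern of `A` is the ground-truth graph against which a structure learning method could be scored. The row-sum condition rules out many matrices that are in fact stable, which is one reason less restrictive generation schemes are of interest.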

10.5258/SOTON/T0086

University of Southampton

Thomas, Alexander Edward

cf9b912d-19f4-41f0-9b3c-47eddb240ae6

2026

Niranjan, Mahesan

5cbaeea8-7288-4b55-a89c-c43d212ddd4f

Legg, Julian

72afc985-52a4-4504-84c4-50a8e018891f

MacArthur, Ben

4fa2fa9d-b8e5-48b3-b98d-930e1c7f7fff

Thomas, Alexander Edward (2026) Applications of machine learning for the diagnosis of sleep apnoea and the causal analysis of sleep studies. University of Southampton, Doctoral Thesis, 134pp.

Record type: Thesis (Doctoral)