Optimizing supervised machine learning algorithms with practical applications.
Machine learning (ML) and artificial intelligence (AI) are rapidly growing fields with applications in many scientific domains. In this work we focus on supervised learning algorithms for prediction tasks in practical applications. Specifically, we consider two applications: predicting adverse events in low-carbon energy production, and screening patients’ eligibility for implantable defibrillators. We present optimised pre-processing methods, optimise hyperparameter selection, and present our own bilevel optimisation model for simultaneous training and hyperparameter tuning. We first consider the problem of predicting infrequently occurring adverse events from time series data provided by Andigestion Ltd, a UK-based anaerobic digestion company, and by a civil nuclear power plant in the UK. We propose a framework for modelling this problem as an imbalanced classification task, and we construct and compare numerous models and sampling techniques. The models developed here could be integrated into a decision support tool that provides advance warning of adverse events, which could lead to significant commercial benefits. We then propose an AI tool for automated, prolonged screening of patients’ eligibility for implantable defibrillators, built using real ECG data provided by our partners at University Hospital Southampton. As we demonstrate in our experiments, this tool is capable of predicting patients’ T:R ratios (a major indicator of implantation eligibility) to within 0.0461 of their true values. This level of accuracy is sufficient to facilitate the automation of the measurement process. We show how this tool can enable cardiologists to perform 24-hour automated screenings, allowing them to better determine patients’ eligibility for implantation.
Finally, we formulate the bilevel problem of hyperparameter selection for non-linear kernel support vector machines via k-fold cross-validation and propose an algorithm for solving it, for which we demonstrate some convergence properties. We illustrate this algorithm on several real data sets from the UCI repository.
University of Southampton
Dunn, Anthony James
161d9c8e-6813-4909-95ea-6c11bbbca287
11 June 2023
Zemkoho, Alain
30c79e30-9879-48bd-8d0b-e2fbbc01269e
Coniglio, Stefano
03838248-2ce4-4dbc-a6f4-e010d6fdac67
Qi, Hou-Duo
e9789eb9-c2bc-4b63-9acb-c7e753cc9a85
Dunn, Anthony James (2023) Optimizing supervised machine learning algorithms with practical applications. University of Southampton, Doctoral Thesis, 172pp.
Record type: Thesis (Doctoral)
Text: Anthony_Dunn_Doctoral_Thesis - Version of Record
Text: Final-thesis-submission-Examination-Mr-Anthony-Dunn - Restricted to Repository staff only
More information
Published date: 11 June 2023
Identifiers
Local EPrints ID: 478127
URI: http://eprints.soton.ac.uk/id/eprint/478127
PURE UUID: bbb170b4-e659-4219-91d6-5bd9401d4d41
Catalogue record
Date deposited: 22 Jun 2023 16:34
Last modified: 17 Mar 2024 03:40
Contributors
Author: Anthony James Dunn