The University of Southampton
University of Southampton Institutional Repository

Development of childhood asthma prediction models using machine learning and data integration


Kothalawala, Dilini Mahesha (2021) Development of childhood asthma prediction models using machine learning and data integration. University of Southampton, Doctoral Thesis, 285pp.

Record type: Thesis (Doctoral)

Abstract

Childhood asthma is a chronic respiratory disease with substantial heterogeneity in its pathophysiology, presentation, trajectory and risk factors, particularly in early life. With the difficulty of obtaining an objective diagnosis before the age of five, the ability to predict childhood asthma could facilitate the identification of high-risk children, reduce misdiagnoses of probable asthmatics or encourage the implementation of primary prevention strategies and personalised asthma management. To promote the prediction of childhood asthma, a systematic review of existing prognostic prediction models for childhood asthma was conducted and demonstrated that current models have mainly been developed using traditional regression-based methods, with few independently validated and none being used in routine clinical practice. With the exploration of regression-based methods suggested to have been exhausted, this thesis aimed to explore novel approaches of data integration to improve current childhood asthma predictions using machine learning methods.
Using data from the Isle of Wight Birth Cohort (IOWBC, n=1456), the Childhood Asthma Prediction in Early-life (CAPE) and Childhood Asthma Prediction at Preschool-age (CAPP) models were developed to predict school-age asthma at 10 years using state-of-the-art machine learning methods. The CAPE and CAPP models used clinical and environmental data available from the first two years and first four years of life, respectively. Genome-wide genotype and methylation data were used to develop, respectively, a polygenic risk score (PRS) and two novel methylation risk scores (MRS; a newborn MRS, nMRS, and a childhood MRS, cMRS) to predict childhood asthma. These genomic models were subsequently incorporated with the CAPE and CAPP models using a step-wise approach. The generalisability of all developed models was evaluated using data from the Manchester Asthma and Allergy Study (MAAS).
The CAPE and CAPP models demonstrated superior performance against their respective benchmark regression-based models based on area under the curve (AUC), with the CAPP model also surpassing the current best-performing validated model, the Paediatric Asthma Risk Score (AUC: CAPE=0.71 vs. 0.64, CAPP=0.82 vs. PARS=0.80). The models generalised well in MAAS and showed excellent sensitivity in predicting a subgroup of individuals presenting with a persistent wheeze phenotype. Individually, the PRS and novel MRSs demonstrated moderate predictive ability (AUC: PRS=0.64, nMRS=0.61, cMRS=0.61). The integration of these genomic risk scores with the CAPE and CAPP models showed marginal improvement in performance (AUC: integrated CAPE=0.75, integrated CAPP=0.84). Overall, the incorporation of genetic and epigenetic data to predict the broad phenotype of asthma offered limited predictive improvement.
Using machine learning approaches, the CAPE and CAPP models were able to improve upon the current regression-based models for the prediction of childhood asthma. Coupled with the excellent sensitivity of the CAPE and CAPP models to predict a subgroup of individuals presenting with a persistent wheeze phenotype, this thesis suggests further exploration of the utility of machine learning methods focused on predicting asthma endotypes is warranted.
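
The abstract compares models by area under the curve (AUC). As a reading aid, the minimal sketch below computes AUC as the probability that a randomly chosen asthma case receives a higher predicted risk than a randomly chosen non-case, with ties counted as half. This record does not show the thesis's evaluation code, so the function name `auc` and the toy labels and scores are assumptions for illustration only and are not IOWBC or MAAS data.

```python
def auc(labels, scores):
    """Rank-based AUC: fraction of (case, non-case) pairs in which the
    case has the higher score; tied pairs count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (1 = asthma at follow-up, 0 = no asthma).
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases, which is the scale on which the AUC values reported in the abstract are read.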

Text
Development of Childhood Asthma Prediction Models using Machine Learning and Data Integration - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: November 2021

Identifiers

Local EPrints ID: 474335
URI: http://eprints.soton.ac.uk/id/eprint/474335
PURE UUID: cb427d48-3004-4179-94f1-55520a23c1e2
ORCID for John Holloway: orcid.org/0000-0001-9998-0464

Catalogue record

Date deposited: 20 Feb 2023 17:50
Last modified: 17 Mar 2024 02:45

Contributors

Author: Dilini Mahesha Kothalawala
Thesis advisor: John Holloway
