Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis
BACKGROUND: Idiopathic pulmonary fibrosis (IPF) is a debilitating, progressive disease with a median survival time of 3-5 years. Diagnosis remains challenging and disease progression varies greatly, suggesting the possibility of distinct subphenotypes.
METHODS AND RESULTS: We analysed publicly available peripheral blood mononuclear cell expression datasets for 219 IPF, 411 asthma, 362 tuberculosis, 151 healthy, 92 HIV and 83 other disease samples, totalling 1318 patients. We integrated the datasets and split them into train (n=871) and test (n=477) cohorts to investigate the utility of a machine learning model (support vector machine) for predicting IPF. A panel of 44 genes predicted IPF in a background of healthy, tuberculosis, HIV and asthma with an area under the curve of 0.9464, corresponding to a sensitivity of 0.865 and a specificity of 0.89. We then applied topological data analysis to investigate the possibility of subphenotypes within IPF. We identified five molecular subphenotypes of IPF, one of which corresponded to a phenotype enriched for death/transplant. The subphenotypes were molecularly characterised using bioinformatic and pathway analysis tools identifying distinct subphenotype features including one which suggests an extrapulmonary or systemic fibrotic disease.
CONCLUSIONS: Integration of multiple datasets, from the same tissue, enabled the development of a model to accurately predict IPF using a panel of 44 genes. Furthermore, topological data analysis identified distinct subphenotypes of patients with IPF which were defined by differences in molecular pathobiology and clinical characteristics.
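The area under the curve, sensitivity and specificity quoted above are standard summaries of a binary classifier's output scores. The following is a minimal sketch of how such figures are computed in general. It is not the authors' pipeline: the function names are hypothetical, and the labels and scores are invented toy values rather than data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when calling score >= threshold positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Sensitivity = true positive rate; specificity = true negative rate.
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative sample (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 1 = disease of interest, 0 = background sample.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

A single sensitivity/specificity pair corresponds to one decision threshold on the score, whereas the area under the curve summarises performance across all thresholds.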
idiopathic pulmonary fibrosis
682-689
Shapanis, Andrew
98b07884-92a9-4c00-afad-12194e339cbc
Jones, Mark G
a6fd492e-058e-4e84-a486-34c6035429c1
Schofield, James
529d3c88-857e-4431-93c2-e76577377ba7
Skipp, Paul
1ba7dcf6-9fe7-4b5c-a9d0-e32ed7f42aa5
1 July 2023
Shapanis, Andrew, Jones, Mark G, Schofield, James and Skipp, Paul
(2023)
Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis.
Thorax, 78 (7), 682-689.
(doi:10.1136/thorax-2022-219731).
Text
thorax-2022-219731.full
- Version of Record
More information
Accepted/In Press date: 19 January 2023
e-pub ahead of print date: 20 February 2023
Published date: 1 July 2023
Additional Information:
© Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Keywords:
idiopathic pulmonary fibrosis
Identifiers
Local EPrints ID: 476706
URI: http://eprints.soton.ac.uk/id/eprint/476706
ISSN: 0040-6376
PURE UUID: 72e2e274-6e13-4db4-b501-6cc28672ad3d
Catalogue record
Date deposited: 11 May 2023 16:59
Last modified: 01 May 2024 01:59
Contributors
Author:
James Schofield