Zhou, Yilu (2023) Large-scale data analysis and integration to advance precision prognosis, therapy stratification and understanding of human disease. University of Southampton, Doctoral Thesis, 243pp.
Abstract
Because the traditional one-size-fits-all approach ignores differences among individuals, a treatment that succeeds in some patients may fail in others. In contrast, precision medicine, an approach that accounts for each individual's genetic, environmental and lifestyle differences, has received increasing attention. With sequencing breakthroughs and rapid decreases in sequencing costs, the explosion of medical data is further accelerating the evolution of precision medicine. Large-scale data will provide extensive information to doctors and researchers, allowing them to identify subgroups of clinically or mechanistically similar patients, take more effective preventive measures and offer optimal strategies, ultimately achieving the goal of providing the right treatment for the right patient at the right time. This data-driven approach is particularly pivotal for some rare diseases, such as idiopathic pulmonary fibrosis (IPF) and neuroblastoma. IPF is a chronic interstitial lung disease characterised by abnormal deposition of extracellular matrix (ECM), but its exact mechanism remains unclear. For example, the role in IPF of epithelial-mesenchymal transition (EMT), a biological process whereby epithelial cells lose cell polarity and cell junctions and acquire a mesenchymal phenotype, remains controversial. In addition, the lack of suitable models for predicting the prognosis of patients with IPF is another urgent issue. Neuroblastoma is a malignant childhood tumour of sympathetic origin. Although a range of oncogenes has been identified, research has not yet explained its most striking feature: significant heterogeneity, ranging from spontaneous regression to extensive metastasis leading to death. Therefore, large-scale data analysis was performed to address the unmet clinical needs in both diseases.
Through proteomic analysis of RAS- or TGF-β-activated alveolar type II (ATII) cells, we confirm that RAS activation induces an EMT signature, while activation of TGF-β signalling alone induces only a partial EMT under the same conditions. In parallel, activation of the pseudohypoxic hypoxia-inducible factor (HIF) pathway has been demonstrated to influence pathogenetic collagen structure-function in IPF. By analysing microarray data of bronchoalveolar lavage (BAL) and peripheral blood mononuclear cells (PBMC) from several independent cohorts, we further identify increased HIF pathway activation in IPF that carries prognostic value. For neuroblastoma, systematic integration of transcriptional data reveals that MYCN non-amplified neuroblastomas can be divided into three subgroups with distinct clinical features and molecular patterns, in which patients may benefit from different therapeutic approaches. Together, these findings illustrate the unique insights generated from large-scale data analysis with regard to mechanistic understanding, precise prognosis, and subgroup stratification. With better data quality, increased data quantity, and more advanced analytical strategies, precision medicine can make a lasting contribution to the health of society.