Classification and regression analysis of lung tumors from multi-level gene expression data
We study classification and regression problems in lung tumors where high-throughput gene expression is measured at multiple levels: epigenetics, transcription and protein. We uncover the correlates of smoking and gender-specificity in lung tumors. Different genes are indicative of smoking levels, gender and survival rates at these different levels. We also carry out an integrative analysis by feature selection from the pool of all three levels of features. Our results show that the epigenetic information in DNA methylation is a better marker for smoking status than gene expression at either the transcript or the protein level. Further, surprisingly, integrative analysis using multi-level gene expression offers no significant advantage over the individual levels in the classification and survival prediction problems considered.
Keywords: Lung cancer, Survival Analysis, Multi-omics, Smoking and gene expression, transcriptome, Proteome, Methylation, TCGA, integrative analysis
1-8
Jeyananthan, Pratheeba
f4c533ad-d3f4-43c5-ae6e-5f201729b71d
Niranjan, Mahesan
5cbaeea8-7288-4b55-a89c-c43d212ddd4f
Jeyananthan, Pratheeba and Niranjan, Mahesan (2019) Classification and regression analysis of lung tumors from multi-level gene expression data. In Classification and regression analysis of lung tumors from multi-level gene expression data. IEEE. (doi:10.1109/IJCNN.2019.8852282).
Record type: Conference or Workshop Item (Paper)
Text
Classification and Regression Analysis of Lung Tumors from Multi-level Gene Expression Data
- Accepted Manuscript
More information
Accepted/In Press date: 14 July 2019
e-pub ahead of print date: 30 September 2019
Identifiers
Local EPrints ID: 434897
URI: http://eprints.soton.ac.uk/id/eprint/434897
ISSN: 2161-4393
PURE UUID: 39983022-0a56-4454-9f00-f145c9e96969
Catalogue record
Date deposited: 15 Oct 2019 16:30
Last modified: 17 Mar 2024 03:11
Contributors
Author: Pratheeba Jeyananthan
Author: Mahesan Niranjan