The University of Southampton
University of Southampton Institutional Repository

In epigenetic studies including cell type adjustments in regression models can introduce multicollinearity, resulting in apparent reversal of direction of association

Barton, Sheila, Melton, Phillip E., Titcombe, Philip, Murray, Robert, Rauschert, Sebastian, Lillycrop, Karen, Huang, Rae-Chi, Holbrook, Joanna D. and Godfrey, Keith (2019) In epigenetic studies including cell type adjustments in regression models can introduce multicollinearity, resulting in apparent reversal of direction of association. Frontiers in Genetics, 10, [816]. (doi:10.3389/fgene.2019.00816).

Record type: Article

Abstract

Background: Association studies of epigenome-wide DNA methylation and disease can inform biological mechanisms. DNA methylation is often measured in peripheral blood, which contains heterogeneous cell types with different methylation profiles. Influences such as adiposity-associated inflammation can change cell type proportions, altering measured blood methylation levels. To determine whether associations between locus-specific methylation and outcomes result from cellular heterogeneity, many studies adjust for estimated blood cell proportions, but high correlations between methylation and cell type proportions could violate the statistical assumption of no multicollinearity. We examined these assumptions in a population-based study.

Methods: CDKN2A promoter CpG methylation was measured in peripheral blood from 812 adolescents aged 17 years (Western Australian RAINE mother-offspring cohort). Natural-log-transformed (log_e) adolescent BMI was used as the outcome in a regression analysis with DNA methylation as the predictor, adjusting for age and sex. Further regression analyses additionally adjusted for cell type proportions estimated using the reference-based Houseman method, and simulations modelled the effects of varying levels of correlation between cell proportions and methylation. Correlations between estimated cell proportions and CpG methylation from the Illumina 450K array were measured.

Results: Lower DNA methylation was associated with higher BMI when cell type adjustment was not included; for CpG4, β = -0.004 log_e BMI/% methylation (95% CI -0.0065, -0.001; p = 0.003). The direction of association reversed when adjustment for 6 cell types was made; for CpG4, β = 0.004 log_e BMI/% methylation (95% CI -0.0002, 0.0089; p = 0.06). Correlations between CpG methylation and cell type proportions were high, and Variance Inflation Factors (VIFs) were extremely high (25 to 113.7). Granulocyte count was correlated with BMI, and removing granulocytes from the regression model reduced all VIFs to <3.1, while the positive association between methylation and BMI persisted (CpG4 β = 0.004 log_e BMI/% methylation; 95% CI -0.0002, 0.0088; p = 0.06). Simulations supported major effects of multicollinearity on regression results.

Conclusions: Where cell types are highly correlated with other covariates in regression models, the statistical assumption of no multicollinearity may be violated. This can result in reversal of the direction of association, particularly when examining associations with phenotypes related to inflammation, as CpG methylation may associate with changes in cell type proportions. Removing highly correlated predictors from regression models may remove the multicollinearity. However, this might hinder biological interpretability.

Text: Multicollinearity_Frontiers_22.07.2019_Revision2 (00000002) - Accepted Manuscript (812kB)
Text: fgene-10-00816 - Version of Record (1MB). Available under License Creative Commons Attribution.
Text: Multicollinearity Supplementary Info 09.02.2018. Restricted to Repository staff only.
Spreadsheet: Copy of Supp Tables 6789 (23kB). Restricted to Registered users only.

More information

Accepted/In Press date: 7 August 2019
Published date: 10 September 2019

Identifiers

Local EPrints ID: 433363
URI: http://eprints.soton.ac.uk/id/eprint/433363
ISSN: 1664-8021
PURE UUID: d3a04f26-3f7c-425c-bd79-9d5cec373b9d
ORCID for Sheila Barton: orcid.org/0000-0003-4963-4242
ORCID for Philip Titcombe: orcid.org/0000-0002-7797-8571
ORCID for Karen Lillycrop: orcid.org/0000-0001-7350-5489
ORCID for Joanna D. Holbrook: orcid.org/0000-0003-1791-6894
ORCID for Keith Godfrey: orcid.org/0000-0002-4643-0618

Catalogue record

Date deposited: 15 Aug 2019 16:30
Last modified: 22 Nov 2021 07:36

Contributors

Author: Sheila Barton
Author: Phillip E. Melton
Author: Philip Titcombe
Author: Robert Murray
Author: Sebastian Rauschert
Author: Karen Lillycrop
Author: Rae-Chi Huang
Author: Joanna D. Holbrook
Author: Keith Godfrey
