The University of Southampton
University of Southampton Institutional Repository

Adjustment for gene expression PCA scores may induce reproducible false positive associations in eQTL analysis partly due to endogenous selection bias

Motivation: Expression quantitative trait loci (eQTL) analysis has been widely applied to map cis and trans regulatory elements of gene expression. Gene expression analysis is plagued by observed and latent biases such as batch effects and cell composition. Principal component analysis (PCA) has the potential to capture these latent variables associated with global gene expression levels. Adjusting the eQTL model for principal components (PCs) of gene expression increases the yield of statistically significant eQTLs and has consequently been widely adopted in the field. The explanation that usually accompanies the large increase in reproducible eQTLs is that adjustment for PCs reduces the variation induced by technical and biological latent factors affecting global gene expression.

Results: We here report that such practice may induce reproducible false positive results, partly due to endogenous selection bias. That is, adjusting for PCs may open a path between the expression of a gene and a SNP that is not associated with that expression but is correlated with a PC that is in turn associated with it. Our simulation shows that false positive results induced by PCA adjustment can be reproducible. Real dataset analysis suggests that, in regression models containing a SNP not associated with expression but correlated with a PC, the two variables may act as mutual suppressor variables, thereby inflating SNP effect sizes. Similarly, adjustment for multiple PCs increases the R² of the regression model and thereby reduces the standard errors of the beta coefficients. Taken together, these two effects make p-values more significant and may induce false positives. We propose a simple procedure to detect whether SNPs and PCs are acting as suppressor variables.
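
The mechanism described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the simulation used in the abstract: the "PC" here is a hypothetical stand-in built directly as a noisy combination of the SNP and the gene's expression (in practice it would be a score from PCA of the expression matrix), and the effect sizes 0.5 and 0.8, the allele frequency 0.3 and the helper names `ols` and `simulate` are illustrative choices only.

```python
import numpy as np


def ols(y, X):
    """Ordinary least squares for y ~ X; returns coefficients and t-statistics.

    X is expected to already contain an intercept column.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se


def simulate(n=2000, seed=0):
    """Return the SNP t-statistic without and with adjustment for the toy PC."""
    rng = np.random.default_rng(seed)
    snp = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    expr = rng.normal(size=n)                         # expression, independent of the SNP
    # Assumption: the PC score depends on both the SNP and the expression,
    # which makes it a collider on the path SNP -> PC <- expression.
    pc = 0.5 * snp + 0.8 * expr + rng.normal(scale=0.5, size=n)

    ones = np.ones(n)
    _, t_unadj = ols(expr, np.column_stack([ones, snp]))
    _, t_adj = ols(expr, np.column_stack([ones, snp, pc]))
    return t_unadj[1], t_adj[1]


if __name__ == "__main__":
    # Two independent samples play the role of discovery and replication sets.
    for seed in (1, 2):
        t_unadj, t_adj = simulate(seed=seed)
        print(f"seed={seed}: t(SNP) unadjusted={t_unadj:.2f}, PC-adjusted={t_adj:.2f}")
```

Under these assumptions the unadjusted SNP t-statistic stays near zero, while the PC-adjusted one is large and has the same sign in both independent samples, which is the sense in which such a false positive can replicate. Comparing the SNP coefficient before and after adding the PC is one generic way to spot suppression; it is not necessarily the detection procedure the abstract proposes.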

Conclusions: We recommend a few techniques to deal with false positive associations in eQTL and gene expression association analyses that arise from adjusting models for gene expression PCs.

Couto Alves, Alexessander (2015) Adjustment for gene expression PCA scores may induce reproducible false positive associations in eQTL analysis partly due to endogenous selection bias. Cordell, Heather (ed.) In Human Heredity: 44th European Mathematical Genetics Meeting (EMGM) 2016. Newcastle upon Tyne, UK, May 11-12, 2016: Abstracts. vol. 80, Karger. p. 106. (doi:10.1159/000445228).

Record type: Conference or Workshop Item (Paper)


More information

Accepted/In Press date: 26 April 2015
Published date: 26 April 2015

Identifiers

Local EPrints ID: 509575
URI: http://eprints.soton.ac.uk/id/eprint/509575
PURE UUID: d0a91dde-f96b-4189-b995-890131131fdc
ORCID for Alexessander Couto Alves: orcid.org/0000-0001-8519-7356

Catalogue record

Date deposited: 25 Feb 2026 17:57
Last modified: 26 Feb 2026 03:12

Contributors

Author: Alexessander Couto Alves
Editor: Heather Cordell
