Covariates... what to do with them and how to handle them?
8.2 years ago
Floris Brenk ★ 1.0k

Hi all,

I'm performing an eQTL analysis on about 120 samples combined with 5 million genotypes. As covariates I initially took RIN, age, processing day and gender; the results looked nice and had a replication rate of about 40% compared to previous studies. A colleague then told me that I should also include the first five components of a PCA of the gene expression as covariates. So I did that, and then someone warned me that covariates could be orthogonal and that I could be over-correcting. I have now redone the analysis with four different sets of covariates (the 4 original ones, the first 5 PCs, the first 20 PCs and the first 40 PCs) and I get more significant results when I include more covariates, so now I am a bit confused. I have three questions and was hoping someone could help me out with them:

What is the best number of covariates to include in the analysis? Is there some kind of optimal number of covariates that you can calculate? Looking at the results, the number of covariates doesn't seem to matter that much: about 75% of the results look consistent.

How can I check whether covariates are orthogonal? Is a simple Pearson correlation above 0.50 or below -0.50 already enough evidence for orthogonal covariates?
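To make the correlation check concrete, here is a minimal sketch in Python with NumPy. It assumes the covariates sit in a samples x covariates array; the helper `correlated_pairs`, the 0.5 cutoff and the simulated values for age, RIN and PC1 are all illustrative assumptions, not data from this analysis. Note that for mean-centered covariates, orthogonal means a Pearson correlation of zero, so a large |r| flags correlated (collinear) covariates rather than orthogonal ones.

```python
import numpy as np

def correlated_pairs(covariates, names, threshold=0.5):
    """Return (name_i, name_j, r) for covariate pairs with |Pearson r| > threshold."""
    # rowvar=False: each column is one covariate, each row one sample
    corr = np.corrcoef(covariates, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Simulated covariates for 120 samples (hypothetical values)
rng = np.random.default_rng(0)
n_samples = 120
age = rng.normal(50, 10, n_samples)
rin = rng.normal(7, 1, n_samples)
# Hypothetical expression PC that partly tracks RIN
pc1 = 0.9 * (rin - rin.mean()) + rng.normal(0, 0.3, n_samples)

X = np.column_stack([age, rin, pc1])
flagged = correlated_pairs(X, ["age", "RIN", "PC1"])
print(flagged)
```

Expression PCs are uncorrelated with each other by construction, so the pairs worth inspecting are those between a known covariate (RIN, age, processing day, gender) and a PC.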

Why does the number of significant results increase when including more covariates? I was thinking more along the lines that the more you correct, the fewer significant results you should get.
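One possible mechanism behind this can be sketched with a small simulation. A covariate that explains expression variance but is unrelated to genotype lowers the residual variance, which shrinks the standard error of the genotype effect and so raises the test statistic. Everything below is a simulated assumption: the helper `genotype_t_stat`, the effect sizes (0.5 for genotype, 3.0 for the hidden factor), the |t| > 2 cutoff and the single shared hidden factor are illustrative and not taken from this dataset.

```python
import numpy as np

def genotype_t_stat(y, genotype, covariates=None):
    """OLS t-statistic for the genotype effect, optionally adjusting for covariates."""
    n = len(y)
    cols = [np.ones(n), genotype]          # intercept + genotype
    if covariates is not None:
        cols.append(covariates)            # 1-D or 2-D covariate array
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof           # residual variance estimate
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov_beta[1, 1])

rng = np.random.default_rng(1)
n_samples, n_genes = 120, 200
hidden = rng.normal(size=n_samples)        # e.g. a batch effect shared by all genes

n_sig_unadj = 0
n_sig_adj = 0
for _ in range(n_genes):
    # One hypothetical cis-SNP per gene, coded 0/1/2, independent of the hidden factor
    genotype = rng.integers(0, 3, n_samples).astype(float)
    y = 0.5 * genotype + 3.0 * hidden + rng.normal(size=n_samples)
    n_sig_unadj += int(abs(genotype_t_stat(y, genotype)) > 2)
    n_sig_adj += int(abs(genotype_t_stat(y, genotype, hidden)) > 2)

print("significant without covariate:", n_sig_unadj)
print("significant with covariate:   ", n_sig_adj)
```

The sketch only covers the case where the covariate is independent of genotype. Over-correction would be the opposite situation, where a covariate (for instance an expression PC) captures variation that is itself driven by genotype, and the simulation above does not model that.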

covariates gene expression • 2.1k views