Question

DE analysis with edgeR and possibly batch effect

1

5.1 years ago

seta ★ 1.9k

Hi all,

I'm using edgeR to identify differentially expressed genes between control and treatment groups of mice. Here, vd_s is the control and vd_r is the treatment, sampled at 1 day (vd_s1 and vd_r1) and 7 days (vd_s7 and vd_r7) after birth, with two replicates each. I used the following edgeR code:

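library(edgeR)
# Read the count table; the first column is used as row names
count <- read.delim("vd_count.txt", row.names = 1)
# Group labels, two replicates each
group <- factor(c(rep("vd_s1", 2), rep("vd_r1", 2), rep("vd_r7", 2), rep("vd_s7", 2)))
y <- DGEList(counts = count, group = group)
# Keep genes with CPM > 0.5 in at least 2 samples
keep <- rowSums(cpm(y) > 0.5) >= 2
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
plotMDS(y, col = as.numeric(y$samples$group))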

Here is the MDS plot. I expected the samples to fall into two groups, treatment and control, but as you can see, they separate by time point instead: vd_s1 and vd_r1 are separated from vd_s7 and vd_r7. Could you please let me know if this is a sign of a batch effect? If so, could you kindly advise me how to specify it in R and remove it?

MDS plot

Thanks in advance

analysis edgeR batch-effect DE • 2.1k views

updated 1 day ago by Ram 43k • written 5.1 years ago by seta ★ 1.9k

0

I do not know. Should there be a batch effect, based on how you performed your laboratory work? Were you not expecting time to be the main source of variation among the samples? We are talking about the post-birth period, as the newborn adjusts to the real world. What does a PCA bi-plot show?

Even if you do adjust for what you perceive as a batch effect, you will then 'drown out' (eliminate) most or all of the effect of time.

5.1 years ago by Kevin Blighe 87k

0

We don't know which treatment you applied or which effects you expect from it. I am not surprised, though, to see a bigger difference between 1-day-old and 7-day-old mice.

5.1 years ago by WouterDeCoster 47k

score 2 · Answer 1 · 2019-03-15

2

5.1 years ago

Devon Ryan 104k

I see no evidence of a batch effect in your data. The difference between days seems biologically expected, and day should be added to your model.

ADD COMMENT • link 5.1 years ago by Devon Ryan 104k

0

Thank you very much for all the comments. Devon, do you mean using something like lrt <- glmLRT(fit, coef=2)?

5.1 years ago by seta ★ 1.9k

0

You would need a factorial design: ~day + treatment or similar.

ADD REPLY • link 5.1 years ago by Devon Ryan 104k
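
The ~day + treatment design suggested above could be set up roughly as follows. This is only a sketch under stated assumptions: the names day and treatment, the sample order (taken from the group vector in the question), and the simulated counts standing in for vd_count.txt are illustrative and not from the original analysis. The edgeR steps run only if the package is installed.

```r
# Sample annotation, assuming the column order from the question:
# vd_s1 x2, vd_r1 x2, vd_r7 x2, vd_s7 x2.
day <- factor(c(1, 1, 1, 1, 7, 7, 7, 7))
treatment <- factor(c("s", "s", "r", "r", "r", "r", "s", "s"),
                    levels = c("s", "r"))  # control "s" is the reference level

# Additive model: a day effect plus a treatment effect.
design <- model.matrix(~ day + treatment)
print(colnames(design))

# The edgeR part is skipped when the package is not available.
if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated counts (hypothetical data) in place of vd_count.txt.
  counts <- matrix(rnbinom(1000 * 8, mu = 50, size = 10), ncol = 8,
                   dimnames = list(paste0("gene", 1:1000),
                                   paste0("sample", 1:8)))
  y <- DGEList(counts = counts)
  y <- calcNormFactors(y)
  y <- estimateDisp(y, design)
  fit <- glmFit(y, design)
  # Test the treatment coefficient, adjusted for day.
  lrt <- glmLRT(fit, coef = "treatmentr")
  print(topTags(lrt, n = 5))
}
```

With this design, the second coefficient is the day effect, so coef=2 would test day 7 against day 1 rather than treatment against control; selecting the coefficient by name avoids that mix-up. If the question is whether the treatment effect differs between days, a model with an interaction (~ day * treatment) would be the usual alternative.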