Hi all,
I'm using edgeR to identify differentially expressed genes between control and treatment groups of mice. Here, vd_s is control and vd_r is treatment, sampled at 1 day (vd_s1 and vd_r1) and 7 days (vd_s7 and vd_r7) after birth, with two replicates per group. I used the following edgeR code:
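library(edgeR)
# Read the count table: genes in rows, one column per sample
count <- read.delim("vd_count.txt", row.names = 1)
# Group labels; these must follow the column order of the count table
group <- factor(c(rep("vd_s1", 2), rep("vd_r1", 2), rep("vd_r7", 2), rep("vd_s7", 2)))
y <- DGEList(counts = count, group = group)
# Keep genes with CPM > 0.5 in at least 2 samples
keep <- rowSums(cpm(y) > 0.5) >= 2
y <- y[keep, , keep.lib.sizes = FALSE]
# TMM normalization (the calcNormFactors default)
y <- calcNormFactors(y)
# MDS plot, colored by group
plotMDS(y, col = as.numeric(y$samples$group))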
Here is the MDS plot. I expected the samples to form two groups, treatment and control, but as you can see they separate by time point instead: vd_s1 and vd_r1 cluster apart from vd_s7 and vd_r7. Could you please let me know whether this is a sign of a batch effect? If so, could you kindly advise me how to specify it in R and remove it?
Thanks in advance
I do not know. Should there be a batch effect, based on how you performed your laboratory work? Were you not expecting time to be the main source of variation among the samples? We are talking about the post-birth period, as the newborn adjusts to the real world. What does a PCA bi-plot show? Even if you want to adjust for what you conceive as a batch effect, you will then 'drown out' (eliminate) most or all of the effect of time.
We don't know which treatment you applied or which effects you expect from it. I am not surprised, though, to see a bigger difference between 1-day-old and 7-day-old mice.
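If time is a real biological factor rather than a batch effect, one option is to keep it in the model instead of removing it. Below is a minimal sketch of an additive design matrix, assuming the samples are in the same order as the `group` vector in the question (vd_s1, vd_r1, vd_r7, vd_s7, two replicates each). The `treatment` and `time` factors and their level names are hypothetical; they are not in the original post.

```r
# Hypothetical sample annotation, assuming the column order used in the
# question: vd_s1, vd_s1, vd_r1, vd_r1, vd_r7, vd_r7, vd_s7, vd_s7
treatment <- factor(rep(c("s", "r", "r", "s"), each = 2), levels = c("s", "r"))
time <- factor(rep(c("d1", "d1", "d7", "d7"), each = 2), levels = c("d1", "d7"))

# Additive model: the treatment effect is estimated after adjusting for time
design <- model.matrix(~ time + treatment)
print(design)
```

With this design, the `treatmentr` column is the treatment-versus-control effect adjusted for time. In edgeR it could then be passed along the lines of `estimateDisp(y, design)`, `glmQLFit(y, design)` and `glmQLFTest(fit, coef = "treatmentr")`. Whether an additive model is appropriate, or whether the treatment effect should be allowed to differ between day 1 and day 7, depends on the experiment.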