Error in differential analysis for samples with different time points
5.1 years ago
Biologist ▴ 290

I have 4 samples: 2 Control and 2 Gene_OE (overexpression) samples.

I want to run a differential expression analysis between the Gene_OE and Control samples. My sample column data looks like this:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleD Gene_OE Day1
SampleE Gene_OE Day2

Using edgeR, I did the following:

library(edgeR)
group <- factor(coldata$Type)

Then I created the design matrix as follows:

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

    Control Gene_OE day1 day2
1       1        0    0    0
2       1        0    1    0
3       0        1    0    0
4       0        1    1    0

I get a warning message:

y <- estimateDisp(y, design2, robust=TRUE)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  No residual df: setting dispersion to NA

followed by this error:

fit <- glmQLFit(y, design2, robust=TRUE)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset,  : 
  Design matrix not of full rank.  The following coefficients not estimable:
 day2

What could be causing this error, and how can I resolve it?

RNA-Seq r edger differential analysis • 2.9k views
5.1 years ago
h.mon 35k

The design matrix is not full rank because you have only one sample (no biological replicates) per combination of treatment (type+time). You may either drop day from the analysis, or add more samples per treatment.
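As a rough check of the points above, the sketch below uses base R only. It retypes the design matrix exactly as printed in the question (so it assumes the printout is what was passed to edgeR), compares its rank with its number of columns, and then builds the "drop day" alternative. The names `zero_cols`, `type` and `design_type` are made up for illustration. In the printout the `day2` column is all zeros, which is the coefficient the error reports as not estimable, and four samples against four coefficients leaves nothing over, consistent with the "No residual df" warning.

```r
# Design matrix retyped from the printout in the question (an assumption:
# this is the matrix that was actually passed to estimateDisp/glmQLFit).
design2 <- matrix(
  c(1, 0, 0, 0,
    1, 0, 1, 0,
    0, 1, 0, 0,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(as.character(1:4), c("Control", "Gene_OE", "day1", "day2"))
)

# Full rank means the rank equals the number of columns.
design_rank <- qr(design2)$rank
design_rank < ncol(design2)   # TRUE here: 3 independent columns out of 4

# Columns containing only zeros carry no information and cannot be estimated.
zero_cols <- colnames(design2)[colSums(abs(design2)) == 0]
zero_cols                     # "day2"

# Samples minus coefficients: what is left to estimate dispersion from.
resid_df <- nrow(design2) - ncol(design2)

# Alternative suggested above: drop Time and model Type only, so the two
# samples of each type act as replicates (hypothetical object names).
type <- factor(c("Control", "Control", "Gene_OE", "Gene_OE"))
design_type <- model.matrix(~ 0 + type)
type_resid_df <- nrow(design_type) - qr(design_type)$rank
```

Dropping Time treats Day1 and Day2 samples of the same type as replicates, so any real day effect ends up in the dispersion estimate.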


But if my coldata looks like this, I don't see any error:

coldata:

Samples Type    Time
SampleA Control Day1
SampleB Control Day2
SampleC Control Day3
SampleD Gene_OE Day1
SampleE Gene_OE Day2
SampleF Gene_OE Day3

design2 <- model.matrix(~ 0 + group + coldata$Time)
design2

    Control Gene_OE Day2 Day3
1       1        0    0    0
2       1        0    1    0
3       1        0    0    1
4       0        1    0    0
5       0        1    1    0
6       0        1    0    1

Is this right?


The above has the very same problem as your original post. You need more replicates per treatment: each treatment has only one sample per day, so you need several samples per treatment on the same day.
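To make "more replicates" concrete, here is a sketch of what a replicated layout could look like, again in base R only. Everything in it is hypothetical: the table `coldata_rep`, the replicate labels and the choice of two replicates per Type and Time combination are assumptions for illustration, not data from this thread.

```r
# Hypothetical layout: 2 replicates for every Type x Time combination
# (8 samples in total). Names and counts are illustrative only.
coldata_rep <- expand.grid(
  Rep  = c("rep1", "rep2"),
  Time = c("Day1", "Day2"),
  Type = c("Control", "Gene_OE"),
  stringsAsFactors = TRUE
)

group <- coldata_rep$Type
time  <- coldata_rep$Time

# Same additive formula as in the question, built from the replicated table.
design_rep <- model.matrix(~ 0 + group + time)
colnames(design_rep)

# Every Type x Time cell now holds more than one sample.
cell_counts <- table(coldata_rep$Type, coldata_rep$Time)

# Full-rank check and the degrees of freedom left for dispersion estimation.
is_full_rank <- qr(design_rep)$rank == ncol(design_rep)
resid_df <- nrow(design_rep) - ncol(design_rep)
```

With a table like this, `design_rep` could be passed to `estimateDisp()` and `glmQLFit()` in place of `design2`, provided the count matrix columns are in the same order as the rows of `coldata_rep`.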


Could you please tell me whether the approach above is correct, or whether I should add more samples per treatment?
