Making a design matrix for samples with one wild type and 4 different results
5.7 years ago
moru_all • 0

Hello, I'm a newcomer to bioinformatics and am learning R to find differentially expressed genes (DEGs) in my samples. (I apologize for my English; I'm not used to writing in it.)

I have four samples from the same cell line that received the same treatment, but they are assumed to have different characteristics because of other factors. (This is just how the experiment is set up.)

I know that I need to use the model.matrix() function in R to specify the experimental design for edgeR or voom (limma).

Because my knowledge of statistics and bioinformatics is limited, I'm not sure how to set up the design matrix properly.

Can anybody give me an idea of how to design the matrix?

I have thought of two designs, but I'm not sure about them because there is no replicate for the wild type.

(1) First try:

Sample       | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.2 |    1 |  0
ResultCell.3 |    1 |  0
ResultCell.4 |    1 |  0
WildTypeC    |    0 |  1

The code:

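library(edgeR)
# YY is assumed to be an existing DGEList with the 5 samples in the order of the table above
TAG = factor(c(rep("DIFF",4),"WT"))
design = model.matrix(~0+TAG)   # columns: TAGDIFF, TAGWT
# Estimate common, trended, and tagwise dispersions
YY = estimateGLMCommonDisp(YY, design, verbose=TRUE)
YY = estimateGLMTrendedDisp(YY, design)
YY = estimateGLMTagwiseDisp(YY, design)
FIT = glmFit(YY, design)
# Contrast c(1,-1) tests TAGDIFF - TAGWT
LRT = glmLRT(FIT, contrast = c(1,-1))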

(2) Second try:

Sample       | DIFF | WT
ResultCell.1 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.2 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.3 |    1 |  0
WildTypeCel  |    0 |  1
ResultCell.4 |    1 |  0
WildTypeCel  |    0 |  1

The code:

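library(edgeR)
# y is assumed to be an existing DGEList with 8 columns in the order of the table above
SAMPLES = factor(c(1,1,2,2,3,3,4,4))
GROUP = factor(c(rep(c("DIFF","WT"),4)))
# columns: (Intercept), GROUPWT, SAMPLES2, SAMPLES3, SAMPLES4
design = model.matrix(~GROUP+SAMPLES)
# Estimate common, trended, and tagwise dispersions
y = estimateGLMCommonDisp(y, design, verbose=TRUE)
y = estimateGLMTrendedDisp(y, design)
y = estimateGLMTagwiseDisp(y, design)
FIT = glmFit(y, design)
# Contrast is -1 * GROUPWT, i.e. DIFF relative to WT
LRT = glmLRT(FIT, contrast = c(0,-1,0,0,0))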

I tried both, and the results were really different. The plot from the second design matrix looked much better to me because it was more organized. I tried to follow the edgeR help pages and searched around, but I could not find a case that compares several samples against only one wild type (as far as I could tell).

RNA-Seq R deg edgeR voom • 1.1k views
5.7 years ago
Mark ★ 1.5k

You do not have enough replicates for DE analysis. There is probably a way to perform a 1-vs-1 comparison, but it would not be statistically sound. Do you have a second replicate of WildTypeCel? Depending on what you aim to do, you will also need more replicates for each of the other ResultCell.# samples.
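One quick way to see the replicate problem is to count residual degrees of freedom (samples minus estimable coefficients) for a candidate design. The sketch below uses only base R and made-up sample labels; `residual_df` is a hypothetical helper, not an edgeR function.

```r
# Hypothetical helper: residual degrees of freedom of a design matrix.
residual_df <- function(design) nrow(design) - qr(design)$rank

# Case A: the four ResultCells pooled into one group, plus a single wild type.
group_pooled <- factor(c(rep("DIFF", 4), "WT"))
design_pooled <- model.matrix(~ 0 + group_pooled)

# Case B: each ResultCell treated as its own group, plus a single wild type.
group_separate <- factor(c(paste0("ResultCell.", 1:4), "WT"))
design_separate <- model.matrix(~ 0 + group_separate)

residual_df(design_pooled)    # 3 (5 samples - 2 coefficients)
residual_df(design_separate)  # 0 (5 samples - 5 coefficients)
```

In case A all three residual degrees of freedom come from the DIFF group, so the variability of the wild type is never observed directly. In case B there is nothing left to estimate dispersion from at all.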

If you had the minimum number of replicates (two per condition in the example below) and ResultCell.1/2/3/4 were all different, your design matrix would look like this:

Sample       | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.1 |    1 |  0
WildTypeC    |    0 |  1
WildTypeC    |    0 |  1

and so on for the rest of the ResultCell.# samples. There is a way to perform more complicated comparisons using model formulas, but I won't go into that here.
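As a rough sketch of how the "all different" case could be put into a single design, assuming two replicates per condition (the layout and labels here are hypothetical, not the poster's data), each condition gets its own column and a contrast picks out one comparison:

```r
# Hypothetical layout: two replicates for each ResultCell and for the wild type.
conditions <- c(paste0("ResultCell.", 1:4), "WT")
group <- factor(rep(conditions, each = 2), levels = conditions)

# One column per condition, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Build the contrast by column name to avoid position mistakes:
# ResultCell.1 minus WT.
contrast <- setNames(numeric(ncol(design)), colnames(design))
contrast["ResultCell.1"] <- 1
contrast["WT"] <- -1
contrast
```

A vector like this can be passed as the `contrast` argument of edgeR's `glmLRT()`; changing which entry is set to 1 gives the comparison for another ResultCell.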

If the ResultCell.# samples are all very similar and essentially replicates of each other, then your matrix would look like this:

Sample       | DIFF | WT
ResultCell.1 |    1 |  0
ResultCell.2 |    1 |  0
ResultCell.3 |    1 |  0
ResultCell.4 |    1 |  0
WildTypeC    |    0 |  1
WildTypeC    |    0 |  1

Note the 2 replicates for WildTypeC.
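For reference, a matrix of this shape can be generated with base R as below. The sample names (including the `.1`/`.2` suffixes on the wild type) are placeholders. The second half shows an alternative parameterization with WT as the reference level, which describes the same model.

```r
# Placeholder sample names: four pooled ResultCells and two wild-type replicates.
samples <- c(paste0("ResultCell.", 1:4), "WildTypeC.1", "WildTypeC.2")
group <- factor(c(rep("DIFF", 4), rep("WT", 2)))

# Group-means parameterization: one indicator column per group.
design <- model.matrix(~ 0 + group)
dimnames(design) <- list(samples, levels(group))
design

# Same model with WT as the reference level: the second coefficient is then
# the DIFF-vs-WT effect directly.
group_ref <- relevel(group, ref = "WT")
design_ref <- model.matrix(~ group_ref)
colnames(design_ref)  # "(Intercept)" "group_refDIFF"
```

With the first form the comparison is a contrast such as `c(1, -1)`; with the second it is simply the second coefficient (`coef = 2` in `glmLRT()`).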

Currently, you do not have enough replicates. Unfortunately, this means you won't be able to perform a DE analysis. There is a section in the edgeR manual on what to do if you have no replicates. The manual is fantastic, and I highly recommend that you read it thoroughly.
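One of the options in that section of the manual is to supply a dispersion value yourself instead of estimating it. A minimal sketch on simulated counts is below. The BCV of 0.2 is an arbitrary assumption that would have to be justified from comparable data, and the resulting p-values are only exploratory.

```r
library(edgeR)

set.seed(1)
bcv <- 0.2  # assumed biological coefficient of variation, NOT estimated from data

# Simulated counts: 20 genes, 2 unreplicated samples (illustration only).
counts <- matrix(rnbinom(40, size = 1 / bcv^2, mu = 10), nrow = 20, ncol = 2)
y <- DGEList(counts = counts, group = 1:2)

# Exact test with a fixed, user-supplied dispersion (BCV squared).
et <- exactTest(y, dispersion = bcv^2)
topTags(et)
```

The results depend heavily on the chosen value, so this is no substitute for real replicates.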

Good luck
