DESeq2- R commands for multiple groups
1
1
Entering edit mode
6.2 years ago
pixie@bioinfo ★ 1.5k

My RNA-seq read count table consists of samples fro two tissues (root and shoot), different time-points (1hr, 3hr, 12hr, 1day etc). and stress and no-stress conditions all in one table. For example:

root_ctrl_rep1 root_ctrl_rep2 root_1hr_rep1 root_1hr_rep2 root_3hr_rep1 root_3hr_rep2 shoot_ctrl_rep1 shoot_ctrl_rep2 shoot_1hr_rep1 shoot_1hr_rep2 shoot_3hr_rep1 shoot_3hr_rep2

As of now, I am interested only in a basic DEG analysis between the following pairs

root_ctrl vs root_1hr

root_ctrl vs root_3hr

shoot_ ctrl vs shoot_1hr

shoot_ctrl vs shoot_3hr

----and so on

How do go about this ?.

library("DESeq2")

count_table<-read.table("counts.txt", header=T, row.names=1, sep="\t")

exp_design<-data.frame(row.names=colnames(count_table), condition = c("root_ctrl","root_1hr","root_6hr","root_12hr","shoot_ctrl","shoot_1hr","shoot_6hr","shoot_12hr","shoot_day1","root_ctrl","root_3hr","root_6hr","root_12hr","root_day1","shoot_ctrl","shoot_1hr","shoot_3hr","shoot_6hr","shoot_12hr","shoot_12hr","shoot_day1","root_1hr","root_3hr","root_day1","shoot_3hr","shoot_3hr"))

conditions=exp_design$condition

ds <- DESeqDataSetFromMatrix(countData=count_table, exp_design, formula(~condition))

I am not sure if the above is right and what to do next. Any suggestion is greatly appreciated. Thanks

RNA-Seq deseq2 • 4.3k views
ADD COMMENT
3
Entering edit mode
6.2 years ago

You'll probably want ~0+condition, since I'm not sure if DESeq2's extended model matrix will be used in your model or not (I presume not).

Then for each comparison: res = results(ds, contrast=c("condition", "root_1hr", "root_ctrl")). Note that you'll want to include lfcShrink(), since that's not automatically done anymore.

ADD COMMENT
0
Entering edit mode

Thanks for the pointers. I will try that out

ADD REPLY
0
Entering edit mode

Hi Devon, I was going through the DESeq (and DESeq2) papers to understand the models they are using, and also because I am a novice in this field. In the paper, they are calculating the size-factor which is explained here : http://seqanswers.com/forums/showpost.php?p=16468&postcount=13

So in my case, if I want to compare root_ctrl (2 replicates) with root_12hr(2 replicates), the library size should be calculated over 4 samples or all the samples in the count matrix (26 in this case) ? Thanks

ADD REPLY
0
Entering edit mode

DESEq2 does size-factor normalization automatically. So you don't have to calculate the size factors unless you need it for some specific reason. I would strongly suggest reading the DESeq2 pipeline, which will help to clarify many of your doubts

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

ADD REPLY
0
Entering edit mode

Yes, I started with the DESeq2 paper, but the kept referring to the previous paper so had to read that one to get an idea.

ADD REPLY
0
Entering edit mode

You can calculate the size factors over all of the samples. The only time that that's problematic is if there's a large (>10x) difference in coverage between some of the groups (in my experience that tends to throw things off). So don't worry about computing size factors only once unless you have funky data like that.

ADD REPLY
0
Entering edit mode

Thanks a lot for the input !

ADD REPLY
0
Entering edit mode

Hi Devon, one last query. I am also interested in constructing co-expression networks with the rlog transformed data.

rlogged_data <- rlog(ds2)

How do I export rlogged_data object as a matrix (text or csv) for further analysis in WGCNA ? Thanks

ADD REPLY
0
Entering edit mode

Presumably writing assay(rlogged_data) to a file would suffice.

ADD REPLY

Login before adding your answer.

Traffic: 2660 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6