Efficient way to run a DESeq2 analysis with many design formulae
6.6 years ago
cdiener • 0

Hi,

We have read counts from several hundred 16S rRNA samples on which we would like to run differential testing. We have many response variables we would like to test, so we need to run the following:

  • build a DESeqDataSet with one design formula (few confounders + response)
  • get results
  • change design and repeat
  • assemble results and readjust FDR (using IHW)

As far as I understand DESeq2, the size factor and dispersion estimates should not depend on the actual design formula, so it should be possible to run those calculations only once for all tests. However, we also have a lot of missing data for each response, so I would need to subset the count matrix and column data to only those samples that have non-NA entries in the response. Will that preserve the previous estimates for size factors and dispersions?

If not, is there a way to achieve that behavior?

Thanks a lot! Chris

microbiome RNA-Seq DESeq2 • 2.8k views
6.6 years ago
Michael Love ★ 2.6k

hi cdiener!

Dispersion estimation does depend on the design, but size factors do not.


Thanks Mike! Makes sense, since dispersions depend on the grouping ^_^'. So for now I'm running it by building a new DESeqDataSet for each design.
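For anyone landing here later, a minimal sketch of that per-design loop might look like the following. It runs on simulated counts from makeExampleDESeqDataSet(); the response columns resp_a and resp_b, the helper run_one(), and the use of condition as the only confounder are all made up for illustration. The pooled adjustment uses plain Benjamini-Hochberg via p.adjust() as a stand-in, not the IHW step mentioned in the question.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real count matrix and sample table
dds0 <- makeExampleDESeqDataSet(n = 500, m = 12)
cts  <- counts(dds0)
meta <- as.data.frame(colData(dds0))

# Hypothetical responses, each with some missing values
meta$resp_a <- rnorm(12); meta$resp_a[c(2, 9)] <- NA
meta$resp_b <- rnorm(12); meta$resp_b[5] <- NA

# Hypothetical helper: fit one design (confounder + one response)
# using only the samples where that response is observed
run_one <- function(response) {
  keep <- !is.na(meta[[response]])
  dds <- DESeqDataSetFromMatrix(
    countData = cts[, keep],
    colData   = meta[keep, , drop = FALSE],
    design    = reformulate(c("condition", response))
  )
  dds <- DESeq(dds, quiet = TRUE)  # dispersions are re-estimated per design
  res <- as.data.frame(results(dds, name = response))
  res$feature  <- rownames(res)
  res$response <- response
  res
}

# Stack the per-response tables and adjust across all tests at once
all_res <- do.call(rbind, lapply(c("resp_a", "resp_b"), run_one))
all_res$padj_pooled <- p.adjust(all_res$pvalue, method = "BH")
```

Each call fits its own dispersions, which is the part that cannot be shared across designs.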


Also, from this preprint:

http://www.biorxiv.org/content/early/2017/06/30/157982

We found that estimateSizeFactors() with type="poscounts" is better than the default size factor estimation when there are many zeros. So just run that before DESeq(), and DESeq() will use the existing size factors instead of re-estimating them.
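A small sketch of that order of operations, again on simulated data from makeExampleDESeqDataSet() rather than real 16S counts (the object names are placeholders):

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Estimate size factors up front with the zero-tolerant estimator
dds <- estimateSizeFactors(dds, type = "poscounts")
sf_before <- sizeFactors(dds)

# DESeq() finds the pre-existing size factors and keeps them
dds <- DESeq(dds, quiet = TRUE)
sf_after <- sizeFactors(dds)

# Sanity check: the size factors were not changed by DESeq()
stopifnot(isTRUE(all.equal(sf_before, sf_after)))
```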

(That paper also has new software which improves on the NB methods when there are many zeros.)


Edit: found it, never mind :)

Great, will do. Is there a way to calculate the size factors once for the full count matrix and then only subset that matrix for different designs? For instance, if I already have a DESeqDataSet with estimated size factors for the full matrix, can I get a smaller data set with only a subset of the samples without re-estimating the size factors?


Yes, you don't need to re-estimate the size factors after subsetting.
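As a rough illustration of why this works: the size factors live in the column data, so column-subsetting the DESeqDataSet carries them along. The sketch below uses simulated data, and the response column and its values are invented for the example.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical numeric response with missing entries
dds$response <- c(1.2, NA, 0.7, 2.1, NA, 1.5, 0.3, 1.9, 2.4, 0.8, NA, 1.1)

# Size factors estimated once, on the full set of samples
dds <- estimateSizeFactors(dds, type = "poscounts")

# Keep only samples with an observed response; size factors come along
keep    <- !is.na(dds$response)
dds_sub <- dds[, keep]
stopifnot(isTRUE(all.equal(sizeFactors(dds_sub), sizeFactors(dds)[keep])))

# Swap in the design for this response and fit; DESeq() reuses the
# existing size factors but estimates dispersions for this design
design(dds_sub) <- ~ condition + response
dds_sub <- DESeq(dds_sub, quiet = TRUE)
res <- results(dds_sub, name = "response")
```

The same full-matrix object can be subset again for the next response, changing only keep and the design.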
