How does DESeq2 encode factors with more than 2 levels?
6.8 years ago
-_- ★ 1.1k

When a factor has two levels, it can be encoded as 0 and 1. What about a factor with 3 levels, then? Does DESeq2 use one-hot encoding or something like it when it fits a generalized linear model over the factors? I haven't found this information in the paper or the user guide yet.

If you could even point me to the relevant source code, that would be even better. Thanks.

RNA-Seq DESeq2 differential expression • 1.5k views
6.8 years ago
-_- ★ 1.1k

DESeq2 uses model.matrix to build the design matrix, so you can just pass your design formula and colData to this base R function to see how the factors will be encoded.

Quoted from https://support.bioconductor.org/p/77620/#97059

> model.matrix(~participant+sampleType, coldata)
             (Intercept) participantX8326 participantX8329 sampleTypetumor
X8324_normal           1                0                0               0
X8324_tumour           1                0                0               1
X8326_normal           1                1                0               0
X8326_tumour           1                1                0               1
X8329_normal           1                0                1               0
X8329_tumour           1                0                1               1

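So it's not really one-hot encoding but treatment (dummy) coding, which is R's default for unordered factors: a factor with k levels gets k - 1 indicator columns, and the reference level (here participant X8324) is represented by zeros in all of them, i.e. [0, 0], with its effect absorbed into the intercept.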
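To try this without the original data, here is a minimal sketch in base R. The `coldata` table is a hypothetical reconstruction that only mirrors the shape of the quoted example (three participants, two sample types); it is not the data from the linked thread, and the row names are made up for illustration.

```r
# Hypothetical sample table: three participants, each with a normal
# and a tumor sample. Factor levels default to alphabetical order.
ids   <- rep(c("X8324", "X8326", "X8329"), each = 2)
types <- rep(c("normal", "tumor"), times = 3)
coldata <- data.frame(
  participant = factor(ids),
  sampleType  = factor(types),
  row.names   = paste(ids, types, sep = "_")
)

# Default treatment (dummy) coding: the first level of each factor is the
# reference and gets no column of its own.
mm <- model.matrix(~ participant + sampleType, coldata)
print(mm)

# The contrast matrix of the three-level factor shows the same pattern:
# 3 levels become 2 indicator columns, with the reference row all zeros.
print(contrasts(coldata$participant))

# Changing the reference level changes which level is coded as all zeros.
coldata$participant <- relevel(coldata$participant, ref = "X8329")
mm_releveled <- model.matrix(~ participant + sampleType, coldata)
print(colnames(mm_releveled))
```

If you need a different baseline in an actual analysis, releveling the factor in colData before building the dataset, as in the last step, is the usual way to control which level ends up as the reference.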
