Question

Calculate fold change in edgeR with one sample per condition

1

Entering edit mode

2.9 years ago

mohammedtoufiq91 ▴ 250

Hi,

We have run a pilot RNA-Seq study with one sample per condition, this is just a test run. I understand there is no valid statistical test in this case, however just curious to obtain differential expression through edgeR package in R assuming dispersion = 0.4 for the human data. I have a normal (baseline) sample followed by 5 different samples. When I run the edgeR package, I want to indicate my normal sample (as baseline), however, I am unsure what sample here is taken as a baseline for calculation of fold change.

FC calculation

FC = Normal/Test_1 (OR) 
FC = Test_1/Normal

Samples

Normal (baseline) = Normal
Test (Treated) = Test_1

Data

dput(df_data)
structure(list(Normal = c(0L, 184L, 60L, 0L, 7L, 0L, 87L, 0L, 
0L, 21L, 193L, 29L, 0L, 0L, 3L, 50L), Test_1 = c(0, 140.5, 64, 
0, 4, 0, 83, 0, 1, 51.5, 199, 25, 0, 0, 5, 62)), class = "data.frame", row.names = c("Gene1", 
"Gene2", "Gene3", "Gene4", "Gene5", "Gene6", "Gene7", "Gene8", 
"Gene9", "Gene10", "Gene11", "Gene12", "Gene13", "Gene14", "Gene15", 
"Gene16"))


dput(df_metadata)
structure(list(SampleID = c("xxxx1", "xxxx2"), CoreLabID = c("Normal", 
"Test_1")), class = "data.frame", row.names = c("Normal", "Test_1"
))

Here is the code that I am running

bcv <- 0.4
y <- DGEList(counts=df_data, group=df_metadata$CoreLabID)
et <- exactTest(y, dispersion=bcv^2)

View(et$table)
structure(list(logFC = c(0, -0.67280976110796, -0.190706878123648, 
0, -1.06592239047733, 0), logCPM = c(0.456451013758882, 6.84518828528986, 
5.46338499556895, 0.456451013758882, 2.37389406911164, 0.456451013758882
), PValue = c(1, 0.433579402199822, 0.851984371429117, 1, 0.542580328250669, 
1)), row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5", 
"Gene6"), class = "data.frame")

View(et$comparison)
c("Test_1", "Normal")

Thank you,

Toufiq

R edgeR expression rna-seq differential • 1.5k views

ADD COMMENT • link 2.7 years ago by mohammedtoufiq91 ▴ 250

1

Entering edit mode

2.9 years ago

Michael 54k

How about this work-flow:

dge <- DGEList(counts)
dge <- calcNormFactors(dge)
my.cpm <- cpm(dge, log = F)
my.cpm.log <- cpm (dge, log = T, prior.count = 2)
allFC <- my.cpm[,1] / my.cpm[,2]
allLogFC <- my.cpm.log[,1] - my.cpm.log[,2]

ADD COMMENT • link 2.9 years ago by Michael 54k

0

Entering edit mode

@Michael Dondrup, thank you. This looks good!

Along with the fold change, I would also like to output the p-values (as additional column). In my question using edgeR, how does the fold change gets calculated in this package?

bcv <- 0.4
y <- DGEList(counts=df_data, group=df_metadata$CoreLabID)
et <- exactTest(y, dispersion=bcv^2)

View(et$table)
structure(list(logFC = c(0, -0.67280976110796, -0.190706878123648, 
0, -1.06592239047733, 0), logCPM = c(0.456451013758882, 6.84518828528986, 
5.46338499556895, 0.456451013758882, 2.37389406911164, 0.456451013758882
), PValue = c(1, 0.433579402199822, 0.851984371429117, 1, 0.542580328250669, 
1)), row.names = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5", 
"Gene6"), class = "data.frame")

View(et$comparison)
c("Test_1", "Normal")

ADD REPLY • link 2.9 years ago by mohammedtoufiq91 ▴ 250

0

Entering edit mode

This is not the same as the edgeR logFC calculation.

ADD REPLY • link 2.7 years ago by Gordon Smyth ★ 7.0k

score 2 · Accepted Answer · 2021-08-22

2

Entering edit mode

2.7 years ago

Gordon Smyth ★ 7.0k

edgeR has compared Test_1 to Normal so, yes, the normal sample has been taken as the baseline. et$comparison records the numerator and denominator for the fold-change respectively.

You can find this out for yourself by reading the help page for exactTest, help(exactTest). See the documentation for the pair argument.

If you use the standard edgeR pipeline

topTags(et)

then the results will be presented in a easy to read form and the output will tell you which sample has been compared to which.

ADD COMMENT • link 2.7 years ago by Gordon Smyth ★ 7.0k

0

Entering edit mode

Gordon Smyth

Thank you very much for the quick response and suggestions. This is very helpful.

ADD REPLY • link 2.7 years ago by mohammedtoufiq91 ▴ 250