edgeR library size
8.7 years ago
schelarina ▴ 50

Hello everyone, a quick question about edgeR. I have a count matrix file, a samples data.frame file, and an annotation file.

The manual says "The data.frame samples contains a column lib.size for the library size or sequencing depth for each sample. If not specified by the user, the library sizes will be computed from the column sums of the counts. For classic edgeR the data.frame samples must also contain a column group, identifying the group membership of each sample."

I tried adding a lib.size column to the samples file, like this:

             group        lib.size    
sample X     1            8094363
sample X     1            5005492
sample Y     2            7094693
sample Y     2            6094693

etc

Then I run the following:

x <- read.delim("counts.txt", stringsAsFactors=FALSE)
group <- c(1,1,2,2)
genes <- read.delim("genes.txt")
y <- DGEList(counts=x, group=group, genes=genes)
y <- calcNormFactors(y)
y$samples

but then edgeR recalculates the library size, reporting a different number from the one I supplied, and adds a normalization factor.

How and where do I specify the library size correctly, so that it is not replaced?

Thanks for your help

RNA-Seq R • 10k views
8.7 years ago
Irsan ★ 7.8k

If you pass the lib.size argument to DGEList(), the library sizes will not be recalculated from the counts. So use

DGEList(counts=x, group=group, genes=genes, lib.size=c(1,2,3,4))

instead, replacing 1, 2, 3, 4 with your real library sizes (one per sample, in the same order as the columns of the counts).
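
To make this concrete, here is a minimal sketch with made-up numbers. The toy count matrix, the sample names X1, X2, Y1, Y2 and the `samples` data frame are assumptions for illustration, not the poster's real files. The edgeR part runs only if the package is installed. It shows the column sums that DGEList() falls back to when no lib.size is given, and then passes the user-supplied values instead.

```r
# Toy count matrix: 3 genes x 4 samples (made-up numbers for illustration).
counts <- matrix(c(10, 20, 30,
                    5, 15, 25,
                    8, 12, 40,
                    7,  9, 11),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3),
                                 c("X1", "X2", "Y1", "Y2")))

# Hypothetical samples table, in the same order as the columns of `counts`.
samples <- data.frame(group    = c(1, 1, 2, 2),
                      lib.size = c(8094363, 5005492, 7094693, 6094693),
                      row.names = colnames(counts))

# Default behaviour when lib.size is not supplied: the column sums of the counts.
default_lib_size <- colSums(counts)

if (requireNamespace("edgeR", quietly = TRUE)) {
  # Pass the user-supplied library sizes explicitly so they are kept as given.
  y <- edgeR::DGEList(counts   = counts,
                      group    = samples$group,
                      lib.size = samples$lib.size)
  # calcNormFactors() fills in norm.factors; it does not overwrite lib.size.
  y <- edgeR::calcNormFactors(y)
  print(y$samples)
}
```

If the library sizes live in a separate samples file, reading it with read.delim() and passing its lib.size column as above should have the same effect, provided the rows are in the same order as the count columns.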


Thanks! It works perfectly now.
