Biostar Beta. Not for public use.
Question: Best way to address different batches of RNA-seq
0
Entering edit mode

I did two batches of RNA-seq upon same drug treatment but at different time points on the same cell line. For each time point, I have corresponding DMSO controls. All my Fold Change values are calculated over the corresponding DMSO control. I'm trying to trace the gene expression dynamics upon my drug treatment. I found that if I simply merge two datasets into one master table and plot the RPKM and Fold Change values, the numbers do not reflect the real dynamics I saw by q-PCR. For example: I knew one gene (and a few more) that can be induced highest upon 4-day treatment but the RPKM values and Fold Changes do not show the same trend. So my question is:

What's the best way to address this issue?

Appreciate your help.

Hang

ADD COMMENTlink 2.8 years ago tud55122 • 20 • updated 2.7 years ago Biostar 20
Entering edit mode
1

Below are the scripts I used.

> samples=read.table("Samples.txt",sep="\t",header=T)
> counts = readDGE(samples$countf)$counts
> noint = rownames(counts) %in% c("__no_feature","__ambiguous","__too_low_aQual"                               ,"__not_aligned","__alignment_not_unique","no_feature","ambiguous","too_low_aQua                               l","not_aligned","alignment_not_unique")
> counts = counts[!noint,]
> dim(counts)
> cpms = cpm(counts)
> keep = rowSums(cpms>0)>=1
> counts = counts[keep,]
> dim(counts)
> d = DGEList(counts=counts, group=samples$condition)
> d = calcNormFactors(d)
> design <- model.matrix(~0+samples$condition)
> d <- estimateGLMCommonDisp(d,design)
> d <- estimateGLMTrendedDisp(d,design)
Loading required package: splines
> d <- estimateGLMTagwiseDisp(d,design)
> fit <- glmFit(d,design)
> et <- glmLRT(fit, contrast=c(-1,1,0,0,0,0,0,0,0,0))
> tt = topTags(et, n=nrow(d),sort.by="none")
> write.table(tt,"EdgeR/test.txt",sep="\t",row.names=T, quote=F)
ADD REPLYlink 2.8 years ago
tud55122
• 20
• updated 2.8 years ago
genomax
68k
Entering edit mode
0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

ADD REPLYlink 2.8 years ago
genomax
68k
Entering edit mode
0

Ok, then perhaps you could try to add the batch into the model/design. Although if you processed the DMSO/drug samples at the same time, it shouldn't be very impactful.

ADD REPLYlink 2.8 years ago
Carlo Yague
♦ 4.5k
Entering edit mode
0

Is your fold change also computed based on RPKM values ? RPKM is not a proper between sample normalization metric so I suggest that you try a different method such as edgeR/DESEQ.

ADD REPLYlink 2.8 years ago
Carlo Yague
♦ 4.5k
Entering edit mode
0

Thanks for the reply. My fold change is computed based on counts values. I used edgeR to calculate counts and generated RPKM values.

ADD REPLYlink 2.8 years ago
tud55122
• 20
Entering edit mode
0

What was different between the two batches? In addition, as said above you shouldn't use RPKM, use statistics as in DESeq2

ADD REPLYlink 2.7 years ago
Asaf
5.6k

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0