Working with PVCA
7.0 years ago
firestar ★ 1.6k

I have RNA-Seq counts and the factors of my experimental setup (condition, strain, pool, etc.). I would like to know which of these effects matter, and how much, so that I can decide how to model them in a GLM.

I have found a package called pvca, which seems to show the proportion of variance explained by each factor and by each interaction between factors.

If counts is my count table and met is my metadata table, I use:

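library(pvca)
library(Biobase)  # provides ExpressionSet and AnnotatedDataFrame
# row names of met must match the column names of counts, in the same order
eset <- ExpressionSet(as.matrix(counts), phenoData = new("AnnotatedDataFrame", data = met))
pvcaobj <- pvcaBatchAssess(eset, batch.factors = c("bias", "diet", "line"), threshold = 0.6)
# one row per effect: its label and its proportion of variance, rounded to 2 decimals
df <- data.frame(label = as.character(pvcaobj$label), wmpv = round(as.numeric(pvcaobj$dat), 2))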

This returns something like:

      label wmpv
1 diet:line  0.04
2 bias:line  0.02
3      line  0.02
4 bias:diet  0.02
5      bias  0.02
6      diet  0.02
7     resid  0.86

So here are my questions.

Which dataset should I use as counts? They all produce different results.

  1. raw filtered counts
  2. cpm transformed counts
  3. cpm log transformed counts
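To make the three options concrete, here is a minimal base R sketch of how such matrices could be derived. The toy counts, the filtering rule (CPM > 1 in at least 2 samples) and the pseudocount of 1 are illustrative assumptions, not necessarily what was used for the results above.

```r
# Toy count matrix: 6 genes x 4 samples (made-up numbers, for illustration only)
counts <- matrix(
  c(10, 0, 250, 3, 40, 1200,
    12, 1, 300, 0, 35, 1100,
     8, 0, 150, 5, 60,  900,
    15, 0, 400, 1, 20, 1500),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), paste0("s", 1:4))
)

# Counts per million: scale each sample (column) by its library size
lib_size <- colSums(counts)
cpm <- t(t(counts) / lib_size) * 1e6

# 1. Raw filtered counts: keep genes with CPM > 1 in at least 2 samples
#    (the cutoff is an arbitrary assumption)
keep <- rowSums(cpm > 1) >= 2
counts_filt <- counts[keep, , drop = FALSE]

# 2. CPM of the retained genes (library sizes computed before filtering)
cpm_filt <- cpm[keep, , drop = FALSE]

# 3. log2 CPM, with a pseudocount of 1 to avoid log(0)
logcpm_filt <- log2(cpm_filt + 1)
```

Each of the three matrices can then be passed to ExpressionSet() in place of counts, which is how the differing results arise.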

What does the threshold=0.6 in pvcaBatchAssess() do?
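One possible reading of the help page is that threshold is the minimum proportion of total variability that the retained principal components must explain. The base R sketch below illustrates that reading only, on random toy data. It is not the package's code, and whether pvcaBatchAssess() selects components exactly this way is an assumption.

```r
set.seed(1)
# Toy expression matrix: 50 genes x 8 samples of random values (illustration only)
expr <- matrix(rnorm(50 * 8), nrow = 50)

# Centre each gene, then eigen-decompose the sample-by-sample correlation matrix
centered <- expr - rowMeans(expr)
eig <- eigen(cor(centered), symmetric = TRUE)

# Cumulative proportion of variance explained by the leading components
cum_prop <- cumsum(eig$values) / sum(eig$values)

# Smallest number of components whose cumulative proportion reaches the threshold
threshold <- 0.6
n_pc <- which(cum_prop >= threshold)[1]
```

Under this reading, a higher threshold would keep more components in the analysis.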

Are there any other tools or methods to assess batch effects?

RNA-Seq PVCA DGE