Hi All,
I am plotting PCA via pcaplot in Deseq2. I am interested to look at other pca plots like pca3 vs. pca4 as well. How could I have this information in Deseq2?
Thanks,
Hi All,
I am plotting PCA via pcaplot in Deseq2. I am interested to look at other pca plots like pca3 vs. pca4 as well. How could I have this information in Deseq2?
Thanks,
Straight from plotPCA code:
dds <- rlog( dds, blind = T )
rv <- rowVars(assay(dds))
# select the ntop genes by variance
select <- order(rv, decreasing=TRUE)[seq_len(min(ntop, length(rv)))]
# perform a PCA on the data in assay(x) for the selected genes
pca <- prcomp(t(assay(dds)[select,]))
Now you can access more components:
PC1=pca$x[,1]
PC2=pca$x[,2]
PC3=pca$x[,3]
See the pcaPlot source for more ideas.
Here is my version of plotPCA
## Modified plotPCA from DESeq2 package. Shows the Names of the Samples (the first col of SampleTable), and uses ggrepel pkg to plot them conveniently.
# @SA 10.02.2017
library(genefilter)
library(ggplot2)
library(ggrepel)
plotPCA.san <- function (object, intgroup = "condition", ntop = 500, returnData = FALSE)
{
rv <- rowVars(assay(object))
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop,
length(rv)))]
pca <- prcomp(t(assay(object)[select, ]))
percentVar <- pca$sdev^2/sum(pca$sdev^2)
if (!all(intgroup %in% names(colData(object)))) {
stop("the argument 'intgroup' should specify columns of colData(dds)")
}
intgroup.df <- as.data.frame(colData(object)[, intgroup, drop = FALSE])
group <- if (length(intgroup) > 1) {
factor(apply(intgroup.df, 1, paste, collapse = " : "))
}
else {
colData(object)[[intgroup]]
}
## Select the PCAs and percentVar that you like instead of 1 and 2
d <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group,
intgroup.df, name = colData(rld)[,1])
if (returnData) {
attr(d, "percentVar") <- percentVar[1:2]
return(d)
}
ggplot(data = d, aes_string(x = "PC1", y = "PC2", color = "group", label = "name")) + geom_point(size = 3) + xlab(paste0("PC1: ", round(percentVar[1] * 100), "% variance")) + ylab(paste0("PC2: ", round(percentVar[2] * 100), "% variance")) + coord_fixed() + geom_text_repel(size=3)
}
Select the PCs and percentVars you like by changing the argument below and also later in the ggplot (x="PC3" etc..)
## Select the PCAs and percentVar that you like instead of 1 and 2
d <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group,
intgroup.df, name = colData(rld)[,1])
if (returnData) {
attr(d, "percentVar") <- percentVar[1:2]
return(d)
}
Yet more possibilities via base R functions: A: PCA plot from read count matrix from RNA-Seq
DESeq2's PCA functionality automatically filters out a bunch of your transcripts based on low variance (biased / supervised). The code to which I have linked you does not (unbiased / unsupervised).
Kevin
May be you can store plotPCA output (with option returnData=TRUE) in an object and plot PCs independently with qplot.
Thanks, everyone. For those who are interested in plotting the PCA component 3, 4 and ... the following lines could help too.
library("DESeq2")
library(ggfortify)
project.pca <- prcomp(t(assay(dds)[select,]))
pdf("PCA34.pdf")
autoplot(prcomp(t(assay(dds)[select,])), x=3, y=4, data=sampleTable ,colour = 'genotype')
dev.off()
pdf("PCA56.pdf")
autoplot(prcomp(t(assay(dds)[select,])), x=5, y=6, data=sampleTable ,colour = 'genotype')
dev.off()
you can also add shape to the plot.
autoplot(prcomp(t(assay(dds)[select,])), data=sampleTable ,colour = 'genotype', shape='diet', size=3)
You can look at here for further discussion.
Have fun!
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
plotPCA( ..., returnData=TRUE)
returns just the first two PCs, hence the need to useprcomp
- as is done internally by plotPCA - or some other function to perform the PCA.you are correct @ h.mon. It stores only pc1 and 2 (https://rdrr.io/bioc/DESeq2/src/R/plots.R) or
getMethod("plotPCA","DESeqTransform")