Question: Is it possible to make a PCA plot for samples using TPM expression values in R?

I have a table with TPM expression values for several samples (10-12), and I want to create a PCA plot to estimate how similar the replicates of each condition are. If this is possible, could you please suggest some pipelines or commands for R?

Asked 17 months ago by rimgubaev • updated 17 months ago by andrew.j.skelton73

I think you can do it like this with scater:

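library(scater)  # Bioconductor; also loads SingleCellExperiment

# raw_counts is a placeholder: your genes x samples matrix of raw values
example_sce <- SingleCellExperiment(
    assays = list(counts = raw_counts))

cpm(example_sce) <- calculateCPM(example_sce)

# Recent scater releases use logNormCounts() in place of normalize()
example_sce <- logNormCounts(example_sce)

# Recent releases also expect the PCA to be computed before plotting
example_sce <- runPCA(example_sce)

plotPCA(example_sce)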

The scater vignette covers many more plotting options:

https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette-dataviz.html#generating-pca-plots

Replied 17 months ago by jivaraj

You should transform your data to a log-like scale before running PCA. If you're analysing in DESeq2, look at the vst or rlog methods; alternatively, if you're using limma-voom, the log2-CPM values it produces should be good to go. Have a look at the tximport package if you're confused about these different input metrics.
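For a plain TPM table, one common log-like transform is log2(TPM + 1). The following is a minimal base-R sketch of that route, not the only way to do it. The `tpm` matrix is simulated here purely for illustration; its name, its dimensions, and the pseudocount of 1 are assumptions, so substitute your own genes-by-samples matrix.

```r
# Simulated stand-in for a real TPM table: 1000 genes x 12 samples.
# Replace `tpm` with your own genes-by-samples numeric matrix.
set.seed(1)
n_genes   <- 1000
n_samples <- 12
tpm <- matrix(rexp(n_genes * n_samples, rate = 1 / 50),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("SAM", seq_len(n_samples))))

# log2(TPM + 1): the pseudocount keeps zero values finite
log_tpm <- log2(tpm + 1)

# Drop genes that do not vary across samples; they carry no information for PCA
gene_var <- apply(log_tpm, 1, var)
log_tpm  <- log_tpm[gene_var > 0, , drop = FALSE]

# prcomp() expects samples in rows, so transpose the matrix
pca     <- prcomp(t(log_tpm), center = TRUE, scale. = FALSE)
pct_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 2)

# Quick base-graphics view of the first two components
plot(pca$x[, 1], pca$x[, 2], pch = 19,
     xlab = paste0("PC1 (", pct_var[1], "%)"),
     ylab = paste0("PC2 (", pct_var[2], "%)"))
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3, cex = 0.7)
```

With real data, replicates of the same condition would be expected to sit close together on this plot; the simulated values here have no group structure.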

When you've got your data in the correct scale, here's a nice bit of code to produce a PCA plot. Note that I'm using dummy data in this case.

library(tidyverse) #CRAN - install.packages("tidyverse")
library(ggrepel)   #CRAN - install.packages("ggrepel")

# Generate some fake data
set.seed(73)
mat.row      <- 1000
mat.col      <- 15
data.pheno   <- data.frame(SampleID   = paste0("SAM", 1:mat.col),
                           SampleType = rep(c("A","B","C"), times = mat.col / 3),
                           stringsAsFactors = FALSE)
foo          <- rnorm(mat.row * mat.col, mean = 300) %>% 
                log2 %>% 
                matrix(., ncol = mat.col) %>% 
                `colnames<-`(data.pheno$SampleID)
# 

# Generate PCA Data & Proportion of variability
pca          <- foo %>% t %>% prcomp
d            <- pca$x %>% as.data.frame %>% 
                rownames_to_column("SampleID") %>%  # add_rownames() is deprecated
                left_join(data.pheno, by = "SampleID") 
pcv          <- round((pca$sdev)^2 / sum(pca$sdev^2)*100, 2)
# 

# Make a pretty Picture
plot.pca    <- ggplot(d, aes(PC1,PC2,colour = SampleType)) +
               geom_point() +
               xlab(label=paste0("PC1 (", pcv[1], "%)")) +
               ylab(label=paste0("PC2 (", pcv[2], "%)")) +
               theme_bw() +
               geom_label_repel(aes(label = SampleType), show.legend = FALSE) +
               theme(axis.title.x = element_text(size=15),
                     axis.title.y = element_text(size=15)) +
               labs(title    = "My Fake PCA",
                    subtitle = "With some random data",
                    caption  = "Coloured by my random phenotype")
print(plot.pca)
#

[Figure: PCA plot]
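The percentages on the axis labels come from the squared standard deviations of the components, each divided by their sum. As a quick sanity check, this small sketch rebuilds the same dummy matrix in base R and compares that calculation with the "Proportion of Variance" row reported by summary(); the variable names `x`, `imp`, and `max_diff` are just illustrative.

```r
# Rebuild dummy data of the same shape: 1000 genes x 15 samples, log2 scale
set.seed(73)
x   <- matrix(log2(rnorm(1000 * 15, mean = 300)), ncol = 15)

# PCA with samples in rows
pca <- prcomp(t(x))

# Percent of variance per component, computed by hand
pcv <- pca$sdev^2 / sum(pca$sdev^2) * 100

# The same quantity as reported by summary(), converted to percent
imp <- summary(pca)$importance["Proportion of Variance", ] * 100

# The two should agree up to the rounding applied inside summary()
max_diff <- max(abs(pcv - imp))
print(max_diff)
```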

Answered 17 months ago by andrew.j.skelton73

Very nice!

Replied 17 months ago by Kevin Blighe
