PCA of expression data: which PCs to use?
8.3 years ago
Floris Brenk ★ 1.0k

Hi all,

I'm looking for covariates in my expression dataset (CAGE-seq count data). I have 120 samples and about 20,000 expression values. I found this tutorial, which is really easy to use:

https://tgmstat.wordpress.com/2013/11/28/computing-and-visualizing-pca-in-r/

So I adapted it to my data, which has 120 samples as columns and 20,000 rows.

# transpose so that samples are rows and features are columns
pca_data <- t(log(norm.data + 1))
dim(pca_data)
# [1]   120 20000
cage.pca <- prcomp(pca_data,
                 center = TRUE,
                 scale. = TRUE) 

# plot method
plot(cage.pca, type = "l")

# summary method
summary(cage.pca)

It looks to me as if the first 6 PCs capture most of the variation.

However, summary() reports 119 principal components. I am a bit confused now and don't know which components I should use as covariates. Also, the cumulative proportion at PC6 is only 0.43128, whereas from the plot I would have expected a lot more. Could anyone help with this?

R PCA expression • 2.2k views

You don't need to use 6 principal components. From the scree plot you can see that the first 3 capture most of the variability; adding 3 more doesn't really add that much.


OK, thanks. To extract these components, I can just do this, right?

first = cage.pca$x[,1]
second = cage.pca$x[,2]
third = cage.pca$x[,3]
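
For what it's worth, the same scores can be pulled out in one step and attached to a sample table. This is only a minimal sketch on simulated data: `toy.pca`, `sample_info` and `condition` are hypothetical names, not objects from the post above.

```r
set.seed(1)
# Simulated stand-in for the real data: 12 samples (rows) x 200 features (columns)
pca_data <- matrix(rnorm(12 * 200), nrow = 12,
                   dimnames = list(paste0("sample", 1:12), NULL))
toy.pca <- prcomp(pca_data, center = TRUE, scale. = TRUE)

# Scores for the first k components: one row per sample, one column per PC
k <- 3
pc_scores <- toy.pca$x[, seq_len(k), drop = FALSE]

# Hypothetical sample table; the PC scores are added as extra covariate columns
sample_info <- data.frame(condition = rep(c("A", "B"), each = 6),
                          row.names = rownames(pca_data))
covariates <- cbind(sample_info, pc_scores)
head(covariates)
```

Keeping the scores in one matrix (rather than separate `first`, `second`, `third` vectors) makes it easier to change the number of components later.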

Were you concerned because plot() only showed 10 components while summary() showed 119? The reason is that, by default, plot() shows at most 10 components. So although the plot suggests that the first 3-6 components explain a large amount of variance, that is a bit misleading, because a lot of the variance is also captured by the components not shown. summary() reports the cumulative variance explained and tells you that the first 6 components only explain ~43% of the variance.
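
To make this concrete, here is a rough sketch on simulated data (not the CAGE data from the question). It computes the cumulative variance by hand, compares it with what summary() reports, and draws the scree plot with all components instead of the default 10. It also counts the components with non-negligible variance: after centering, n samples leave at most n - 1 such components, which would be consistent with 119 for 120 samples. The 80% threshold and the `1e-8` cutoff are arbitrary choices for illustration.

```r
set.seed(1)
# Simulated stand-in: 12 samples (rows) x 200 features (columns)
toy_data <- matrix(rnorm(12 * 200), nrow = 12)
toy.pca <- prcomp(toy_data, center = TRUE, scale. = TRUE)

# Proportion of variance per component and its running total
var_explained <- toy.pca$sdev^2 / sum(toy.pca$sdev^2)
cum_var <- cumsum(var_explained)

# The same quantity as reported by summary() (rounded there)
summary_cum <- summary(toy.pca)$importance["Cumulative Proportion", ]

# Smallest number of components reaching a chosen threshold (80% is arbitrary)
n_pcs <- which(cum_var >= 0.80)[1]

# Number of components whose variance is not numerically zero
n_nonzero <- sum(toy.pca$sdev > 1e-8)

# Scree plot showing every component rather than the default of at most 10
screeplot(toy.pca, npcs = length(toy.pca$sdev), type = "lines")
```

On real data, `n_pcs` gives one possible answer to "how many PCs", but the threshold is a judgment call, not a rule.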
