Question: How to create a PCA Plot of Proteomics Data in R?
4 months ago
ishackm • 60

Hi all,

I hope you're well.

I have the following dataset:

  QE1_Jo_Exp1_AOCS1_R QE1_Jo_Exp1_G33 QE1_Jo_Exp1_G33_R QE1_Jo_Exp1_G164
1           1027.9600       1434.3834         1774.4618         892.7630
2           1075.0975       1692.0633         1014.8056         537.9152
3           1031.2545       1377.9725         1181.1430        3983.6936
4           3257.5661       3433.5130         3644.4593         933.2016
5            535.0528        839.5253          523.3276        3708.1248
6           6259.4604      23886.0483         9353.2122       29776.4997

I used the following script to carry out PCA analysis:

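library(matrixStats)  # rowVars() is assumed to come from matrixStats

a <- myPr

# Variance of each protein (row) across samples
rv <- rowVars(as.matrix(a))

# Keep the ntop most variable proteins (or all of them, if there are fewer)
ntop <- 12596
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# PCA with samples as rows and proteins as columns
pca1 <- prcomp(t(a[select, ]))

# Sample scores on all PCs, without and with the sample names
scores <- data.frame(pca1$x[,1:ncol(pca1$rotation)])

scores.df <- data.frame(colnames(a), pca1$x[,1:ncol(pca1$rotation)])

pca1
summary(pca1)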

The result is:

> summary(pca1)
Importance of components:
                             PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8       PC9      PC10      PC11
Standard deviation     5.441e+05 1.097e+05 6.167e+04 1.720e+04 1.293e+04 9.306e+03 8.535e+03 7.128e+03 4.357e+03 3.526e+03 2.666e+03
Proportion of Variance 9.470e-01 3.853e-02 1.217e-02 9.500e-04 5.300e-04 2.800e-04 2.300e-04 1.600e-04 6.000e-05 4.000e-05 2.000e-05
Cumulative Proportion  9.470e-01 9.855e-01 9.977e-01 9.987e-01 9.992e-01 9.995e-01 9.997e-01 9.999e-01 9.999e-01 1.000e+00 1.000e+00
                            PC12      PC13      PC14      PC15     PC16      PC17      PC18      PC19      PC20      PC21      PC22
Standard deviation     2.349e+03 1.851e-06 2.318e-07 2.279e-07 2.25e-07 2.173e-07 2.151e-07 2.148e-07 2.081e-07 2.065e-07 2.002e-07
Proportion of Variance 2.000e-05 0.000e+00 0.000e+00 0.000e+00 0.00e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.00e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00
                            PC23      PC24
Standard deviation     1.942e-07 4.214e-11
Proportion of Variance 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00

But when I do:

biplot(pca1)

I get the following:

[Image: the current PCA plot]

[Image: the desired PCA plot]

This is my first time doing a PCA plot so any help will be greatly appreciated.

Many Thanks,

Ishack

4 months ago
ishackm • 60

Please see "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).

4 months ago
RamRS • 21k

Are your plots from the same dataset? How do you know that such a plot is possible with your dataset? They're both PC1 vs PC2 plots; maybe the nature of your data prevents the plot from looking like #2?

4 months ago
RamRS • 21k

Hi RamRS,

Thank you for your quick response.

The second plot is from a different dataset, but I would like the plot for the first dataset to look similar to the second PCA plot, please.

4 months ago
ishackm • 60

Check @Kevin's PCAtools package: "PCA plot from read count matrix from RNA-Seq". While this refers to RNA-seq, the principle should be the same.

4 months ago
genomax • 68k

Hi genomax, thanks for the link.

I have created the following plot from this code:

library(factoextra)

# Scree plot: variance explained by each component
fviz_eig(pca1)

fviz_pca_ind(pca1,
             col.ind = "cos2", # Color by the quality of representation
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE      # Avoid text overlapping
)

[Image: the new PCA plot]

Are Dim 1 and Dim 2 the same as PC1 and PC2?
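
One way to check this directly is sketched below. For a prcomp object, the coordinates that factoextra labels "Dim" are expected to be the sample scores in pca$x, so Dim 1 and Dim 2 should correspond to PC1 and PC2. The matrix here is simulated and purely hypothetical (it is not the dataset from this post), and the comparison only runs if factoextra is installed.

```r
# Hypothetical stand-in data: 50 features (rows) x 8 samples (columns)
set.seed(42)
mat <- matrix(rnorm(50 * 8), nrow = 50,
              dimnames = list(NULL, paste0("S", 1:8)))

# PCA with samples as rows, as in the script above
pca <- prcomp(t(mat))

# Sample scores on PC1 and PC2, and percent variance per component
pc_scores <- pca$x[, 1:2]
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Compare with the coordinates factoextra plots as Dim 1 and Dim 2
if (requireNamespace("factoextra", quietly = TRUE)) {
  dim_coords <- factoextra::get_pca_ind(pca)$coord[, 1:2]
  same <- isTRUE(all.equal(unname(as.matrix(dim_coords)),
                           unname(pc_scores)))
  print(same)  # TRUE would mean Dim 1/Dim 2 are PC1/PC2
}
```

The percentages in pct_var[1:2] are computed the same way as the "Proportion of Variance" row of summary(), so they can be compared with the percentages shown on the plot axes.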

4 months ago
ishackm • 60

As genomax says, you can just use my code from my other thread: "PCA plot from read count matrix from RNA-Seq".

Also, PCAtools (https://bioconductor.org/packages/release/bioc/html/PCAtools.html) can be used; it was just released with Bioconductor 3.9.
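
For a basic PC1 vs PC2 scatter plot of the samples, a minimal base-R sketch could look like the following. It is not the code from the linked thread and does not use PCAtools. The matrix, sample names, and grouping are simulated and hypothetical. The log2 transform and the scaling are assumptions too: they are common choices for intensity data, but may not suit every dataset.

```r
# Hypothetical stand-in for an intensity matrix: 200 proteins x 6 samples
set.seed(1)
samples <- paste0("S", 1:6)
group <- factor(rep(c("A", "B"), each = 3))  # assumed sample grouping
mat <- matrix(rlnorm(200 * 6, meanlog = 8, sdlog = 1.5), nrow = 200,
              dimnames = list(NULL, samples))

# Assumption: log2-transform the intensities (pseudocount of 1 guards against zeros)
log_mat <- log2(mat + 1)

# Drop zero-variance proteins, which would break scaling to unit variance
keep <- apply(log_mat, 1, var) > 0

# PCA with samples as rows; centre and scale each protein
pca <- prcomp(t(log_mat[keep, ]), center = TRUE, scale. = TRUE)

# Percent variance explained by each component, for the axis labels
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# Scores plot: one point per sample, coloured by group
plot(pca$x[, 1], pca$x[, 2], col = as.integer(group), pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"),
     main = "PCA of log2 intensities")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3, cex = 0.7)
legend("topright", legend = levels(group),
       col = seq_along(levels(group)), pch = 19)
```

Unlike biplot(), this draws only the sample scores and omits the variable loadings, which keeps the plot readable when there are thousands of proteins.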

4 months ago
Kevin Blighe • 43k
