R package used for PCA plotting in a paper (rice RNA-Seq)
2.0 years ago
Ann ♦ 2.2k
Concord NC USA

I'm looking for an R package that can do principal component analysis and make a 3-D plot of the principal components, as shown in Fig. 1 in this paper:

Comparative transcriptome analysis of transporters, phytohormone and lipid metabolism pathways in response to arsenic stress in rice (Oryza sativa), 2012

Figure link:

Does anyone recognize the plot? (Please tell me it's not Excel :-)


I'd suggest that if good separation into groups is achievable by a 2D plot (try plotting any two of the first 3 PCs against each other), then 3D may be superfluous.


I'd use the base R function pairs() or its ggplot2-style counterpart, ggpairs() from the GGally package. In the referenced paper's figure it is hard to see the depth.
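A minimal sketch of the 2-D approach, using simulated data in place of a real expression matrix. The matrix `expr`, the group labels and the colours are all made up for illustration, and prcomp() is just one way to get the components:

```r
# Simulated stand-in for real data: 12 samples x 200 genes, two groups of 6.
set.seed(1)
group <- rep(c("control", "treated"), each = 6)
expr <- matrix(rnorm(12 * 200), nrow = 12)
# Shift 20 genes in the treated group so the groups actually differ.
expr[group == "treated", 1:20] <- expr[group == "treated", 1:20] + 3

pca <- prcomp(expr, center = TRUE)
scores <- pca$x[, 1:3]                 # samples x (PC1, PC2, PC3)

# One 2-D scatter plot per pair of components, coloured by group.
group_colors <- c(control = "steelblue", treated = "firebrick")
pairs(scores, col = group_colors[group], pch = 19)
```

If the groups already separate in one of these panels, that panel alone may be enough for the figure.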

14 months ago
Seattle, WA USA

The princomp() function (in base R's stats package; prcomp() is an alternative) can compute the component scores, which give you points in three-dimensional space.

Once you have those in a data frame with columns, say, PC1, PC2, PC3, name, and rColor — corresponding to the first, second and third components, the experiment name, and the R color name, respectively — you could use the rgl library to make a PDF file to annotate in Adobe Illustrator (which is probably what the authors did, to highlight the two classes).
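As a rough sketch of how such a data frame could be assembled: the input matrix `mat`, the class labels and the two colours below are hypothetical placeholders. I use prcomp() here because princomp() refuses to run when there are fewer rows than columns, which is the usual situation with expression data.

```r
# Hypothetical input: rows are experiments, columns are measured features.
set.seed(42)
mat <- matrix(rnorm(8 * 50), nrow = 8,
              dimnames = list(paste0("exp", 1:8), NULL))
classes <- rep(c("A", "B"), each = 4)   # assumed class of each experiment

pca <- prcomp(mat, center = TRUE)       # scores end up in pca$x
df <- data.frame(
  PC1    = pca$x[, 1],
  PC2    = pca$x[, 2],
  PC3    = pca$x[, 3],
  name   = rownames(mat),
  rColor = ifelse(classes == "A", "red", "blue"),  # any name from colors() works
  stringsAsFactors = FALSE
)
str(df)
```

With scores on this small a scale, the sphere radius used for plotting would need to be chosen to match the range of the PC columns.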

For example:

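library(rgl)  # provides par3d(), spheres3d(), texts3d(), rgl.postscript(), etc.

# df must already hold the columns PC1, PC2, PC3, name and rColor.
featureRadius <- 15        # sphere radius, in the units of the data
featureShininess <- 20
featureTransparency <- 1   # alpha: 1 is fully opaque
thetaStart <- 45           # initial rotation angle, in degrees
offset <- 50               # window position on screen, in pixels

# Open a 1280 x 1280 window and set the camera.
par3d(windowRect=c(offset, offset, 1280+offset, 1280+offset))
rgl.viewpoint(theta=thetaStart, phi=30, fov=30, zoom=1)

# Draw one sphere per experiment, then add axis titles and point labels.
spheres3d(df$PC1, df$PC2, df$PC3, radius=featureRadius, color=df$rColor, alpha=featureTransparency, shininess=featureShininess)
aspect3d(1, 1, 1)
title3d("", "", "PC1", "PC2", "PC3", col='black', line=1)
texts3d(df$PC1, df$PC2, df$PC3, texts=df$name, color="black", adj=c(0,0))

# Add two light sources, then export the scene as a PDF.
rgl.light(-45, 20, ambient='black', diffuse='#dddddd', specular='white')
rgl.light(60, 30, ambient='#dddddd', diffuse='#dddddd', specular='black')
filename <- "PCA.labeled.pdf"
rgl.postscript(filename, fmt="pdf")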

Printing a 3D cube on a 2D piece of paper can hide depth details. But this can be addressed with some more work.

One technique I found useful when using this to visualize principal coordinates analysis (PCoA; not PCA, but the code is basically the same) was to write an R script that loops the theta value in the rgl.viewpoint() call from 0 to 359 and, at every step, writes a differently named PNG with rgl.snapshot() instead of rgl.postscript().

I used to use ImageMagick to convert the set of 360 PNGs to equivalent GIFs, and then gifsicle to combine them into an animated GIF. Viewing the animation in a web browser or OS X Preview gave me a truer picture of cluster dispersion, so I could explore and pick the best angle from which to render a publication-quality PDF.
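The rotation loop described above might look roughly like this. The file naming scheme is my own invention, and the sketch assumes the scene (spheres, labels, lights) has already been drawn in an open rgl window. The drawing part is skipped when rgl is missing or R is not running interactively.

```r
# One frame per degree of rotation; zero-padded names keep the files in order.
thetas <- 0:359
frames <- sprintf("PCA.rotation.%03d.png", thetas)   # hypothetical naming scheme

# Only take snapshots when rgl is installed and a window can be opened.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  for (i in seq_along(thetas)) {
    # Same camera settings as the static plot, with only theta changing.
    # (Newer rgl versions also offer view3d(), which takes the same arguments.)
    rgl::rgl.viewpoint(theta = thetas[i], phi = 30, fov = 30, zoom = 1)
    rgl::rgl.snapshot(frames[i])
  }
}
```

The resulting PNGs can then be handed to whatever tool you prefer for assembling the animation.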

More recently, I wrote a WebGL tool called Cubemaker that automates all of this manual labour. The end user imports a text file with three, four or more columns containing data points, names and category assignments. The browser renders the data and offers an interface to rotate and zoom the cube, as well as export PNG, PDF and animated GIF files.


How do we get PC1, PC2 and PC3 from gene expression data?
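One common route, sketched under assumptions: the count matrix below is simulated, with genes in rows and samples in columns, and the log2 transform and variance filter are typical preprocessing choices rather than anything taken from the paper.

```r
# Hypothetical count matrix in the usual layout: genes in rows, samples in columns.
set.seed(7)
counts <- matrix(rpois(500 * 6, lambda = 50), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0("sample", 1:6)))

logged <- log2(counts + 1)                       # compress the dynamic range
keep <- apply(logged, 1, var) > 0                # drop genes that never vary
pca <- prcomp(t(logged[keep, ]), center = TRUE)  # transpose: samples become rows

pcs <- pca$x[, 1:3]                              # PC1..PC3, one row per sample
var_explained <- pca$sdev^2 / sum(pca$sdev^2)    # fraction of variance per component
round(var_explained[1:3], 3)
```

The three columns of `pcs` are the coordinates to plot, and `var_explained` is handy for labelling the axes with the share of variance each component captures.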

