I'm running a PCA on a time-course protein expression study with three time points. I want to find the proteins that best represent each time point. In other words, I want a list of the proteins that fall into the different "sectors" of time, as in this example plot/link:
PCA analysis of the time course response to benomyl of the wild type strain
I used the following R script to run the PCA and generate the biplot:
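rm(list = ls())
# Proteins in rows, time points in columns; values are log2 ratios
my.df <- read.table("expression_log2ratio.txt", row.names = 1, header = TRUE, sep = "\t", check.names = FALSE)
# Drop proteins with missing values; columns are centered but not scaled to unit variance
prot.pca <- prcomp(na.omit(my.df), scale. = FALSE)
summary(prot.pca)
biplot(prot.pca, col = c("blue", "red"), cex = c(0.5, 0.5))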
My input data has the following format, with around 4000 proteins. The numerical values are log2 ratios.
T1 T2 T3
p1 -0.071396303 0.006385917 0.088535769
p2 -0.115839104 0.043409057 -0.035812972
p3 -0.01593602 -0.02627361 0.014833213
.....
My Biplot looks as follows:
Can anybody suggest how to extract the proteins in the different 'sectors' using R? Any help is highly appreciated.
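To make the notion of a 'sector' concrete, here is a minimal sketch of one possible reading, not an established method: each protein is assigned to the time point whose loading arrow points in the most similar direction in the PC1/PC2 plane. The toy data, the helper `unit()`, and the 90% distance cutoff are all assumptions made for illustration. The raw scores and loadings are used, which should correspond to the geometry of `biplot(prot.pca, scale = 0)` rather than the default rescaled biplot.

```r
# Toy stand-in for the real table (assumption): proteins in rows, time points in columns
set.seed(1)
my.df <- data.frame(T1 = rnorm(50, sd = 0.1),
                    T2 = rnorm(50, sd = 0.1),
                    T3 = rnorm(50, sd = 0.1),
                    row.names = paste0("p", 1:50))

prot.pca <- prcomp(na.omit(my.df), scale. = FALSE)

# Protein coordinates (scores) and time-point arrows (loadings) on PC1/PC2
scores   <- prot.pca$x[, 1:2]
loadings <- prot.pca$rotation[, 1:2]

# Hypothetical helper: rescale every row of a matrix to unit length
unit <- function(m) m / sqrt(rowSums(m^2))

# Cosine of the angle between each protein and each time-point arrow
cos.sim <- unit(scores) %*% t(unit(loadings))   # proteins x time points

# Sector = time point whose arrow is closest in angle to the protein
sector <- colnames(cos.sim)[max.col(cos.sim, ties.method = "first")]

# Proteins near the origin carry little signal; keep only the most distant ones.
# The 0.90 quantile is an arbitrary cutoff chosen for this sketch.
radius <- sqrt(rowSums(scores^2))
keep   <- radius > quantile(radius, 0.90)

# One character vector of protein names per sector
sector.list <- split(rownames(scores)[keep], sector[keep])
print(sector.list)
```

With the real data, `my.df` would come from `read.table()` as in the script above, and the cutoff would need tuning. Angles are measured in the PC1/PC2 plane only, so a time point that is poorly represented in that plane may get unreliable assignments.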