How to check if first two components of PCA are separated without visualisation?
19 months ago
fernardo • 130
Italy

Hi Everyone,

In the following examples, the second one gives a better result: its groups are more clearly separated.

Is it possible to detect this separation from the data matrix of the PCA result, without plotting? For example, with some kind of score such as the mean or median of the components, or in any other way?

PCA_example 1:

PCA_example 2:

The code used:

```r
library(ggplot2)  # provides qplot() and theme()
pca <- prcomp(dataMatrix, scale. = TRUE)
scores <- data.frame(Groups, pca$x[, 1:3])
pc1.2 <- qplot(x = PC1, y = PC2, data = scores, colour = factor(Groups)) + theme(legend.position = "right")
```

14 months ago
raunakms ♦ 1.1k

First get the scores (the sample coordinates, not the eigenvalues) on the first two principal components (PC1 and PC2) using `pca$x[,1:2]`. Then calculate the in-class distance (i.e. the pairwise distances between samples belonging to the same class) as well as the out-class distance (i.e. the pairwise distances between a sample belonging to one class and each of the samples in the other class). If the average out-class distance is greater than the average in-class distance, you are most likely to get distinct clusters of sample groups.
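
As a rough sketch of that comparison in R, something along these lines should work. The toy `dataMatrix` and `Groups` below are made up so the snippet runs on its own, Euclidean distance is an assumption (the idea does not depend on a particular metric), and `separation_ratio` is just a hypothetical name for a convenience summary, not an established statistic.

```r
# Toy data standing in for the real dataMatrix and Groups (both invented here).
set.seed(1)
dataMatrix <- rbind(matrix(rnorm(50, mean = 0), nrow = 10),
                    matrix(rnorm(50, mean = 3), nrow = 10))
Groups <- rep(c("A", "B"), each = 10)

pca <- prcomp(dataMatrix, scale. = TRUE)
pcs <- pca$x[, 1:2]                    # sample coordinates on PC1 and PC2

d     <- as.matrix(dist(pcs))          # Euclidean distance between every pair of samples
same  <- outer(Groups, Groups, "==")   # TRUE where both samples share a class
pairs <- upper.tri(d)                  # count each pair once and skip the diagonal

in_class  <- mean(d[pairs & same])     # average distance within classes
out_class <- mean(d[pairs & !same])    # average distance between classes

# Hypothetical summary score: values above 1 mean samples sit closer to
# their own class than to the other class, on average.
separation_ratio <- out_class / in_class
```

Comparing `separation_ratio` between two PCA results (e.g. the two examples in the question) gives a plot-free way to rank them; how far above 1 counts as "well separated" is a judgment call that depends on the data.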


Thanks a lot. That looks like a solution to me, and I am going to try it. But to make sure I fully understand your point:

1- In-class distance: do you mean calculating the pairwise correlation between PC1 and PC2 within a condition (class)?

2- Out-class distance: this one depends on the first point, and I didn't actually get it.

Thanks