Need Cluster of cluster analysis steps
3.1 years ago
Mamta • 410
United States

Hi all,

I have read several papers on multi-platform cancer genomics data that used cluster-of-clusters analysis to look for consensus across the different platforms.

I want to do the same, but I am confused about the steps:

1. I cluster each dataset separately and get a cluster assignment for every sample (a label from 1 to k).
2. The paper then says:

"Briefly, subtype calls from each of the following 4 platforms: mRNA, miRNA, methylation, and copy number were used to identify relationships between the different data type’s classifications. Subtypes defined from each platform were coded into a series of indicator variables, resulting in a matrix of 1s and 0s. Hierarchical clustering of this matrix was performed using the ConsensusClusterPlus R-package"

I am not sure how this matrix ends up containing only 0s and 1s.

Anyone familiar who can help?

Thanks!!! Mamta


I am not completely certain about this, but I guess they meant something like this: if a subtype is identified by miRNA, it is coded as 1 in the miRNA column. In the end we would have something like:

             miRNA   mRNA   methylation   CNV
Subtype A      1       0         1          0
Subtype B      0       0         1          1


This basically indicates whether each subtype can be identified by a particular platform. However, I am not 100% certain, so please correct me if I am wrong.


Thanks Sam! What you said makes sense, and I think that is probably what they did. However, the integrated heatmap had samples instead of subtypes. I will update if I learn more about this.

Mamta

2.6 years ago
United States

Hi Mamta,

CoC analysis is based on the cluster assignments from the various platforms (miRNA, mRNA, methylation, CNV, etc.). The basic idea is that you should have the same samples in each platform and some way of clustering them. It can be NMF, heatmap clustering, k-means, etc.

Say the miRNA data reveals 3 clusters (miRNA1, miRNA2, miRNA3), the mRNA data shows 2 clusters (mRNA1 and mRNA2), and so on.

We then create a matrix with the samples as columns and the clusters from each platform as rows, i.e. miRNA1, miRNA2, miRNA3, mRNA1, mRNA2, etc. For each sample, you assign 1 to the cluster it belongs to in each platform and 0 to every other cluster.

         S1   S2   S3   S4   S5
miRNA1    1    0    1    1    0
miRNA2    0    1    0    0    0
miRNA3    0    0    0    0    1
mRNA1     0    1    1    1    0
mRNA2     1    0    0    0    1
meth1     1    0    0    0    1
meth2     0    0    0    1    0
meth3     0    1    0    0    0
meth4     0    0    1    0    0
CNV1      1    0    0    0    0
CNV2      0    1    1    1    1
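
The construction described above can be sketched in Python as follows. This is only a minimal illustration, not the code used in the paper: the `calls` dictionary and the `build_indicator_matrix` helper are hypothetical names, and the cluster labels simply mirror the example table.

```python
# Hypothetical per-platform cluster calls for five samples
# (chosen to reproduce the example table; not real data).
calls = {
    "miRNA": {"S1": 1, "S2": 2, "S3": 1, "S4": 1, "S5": 3},
    "mRNA":  {"S1": 2, "S2": 1, "S3": 1, "S4": 1, "S5": 2},
    "meth":  {"S1": 1, "S2": 3, "S3": 4, "S4": 2, "S5": 1},
    "CNV":   {"S1": 1, "S2": 2, "S3": 2, "S4": 2, "S5": 2},
}


def build_indicator_matrix(calls):
    """Return (samples, rows): one 0/1 row per platform-specific cluster."""
    samples = sorted(next(iter(calls.values())))
    rows = {}
    for platform, assignment in calls.items():
        # Every platform must provide a call for the same set of samples.
        if sorted(assignment) != samples:
            raise ValueError(f"{platform} does not cover the same samples")
        for cluster in sorted(set(assignment.values())):
            rows[f"{platform}{cluster}"] = [
                1 if assignment[s] == cluster else 0 for s in samples
            ]
    return samples, rows


samples, rows = build_indicator_matrix(calls)

# Print the matrix with samples as columns and clusters as rows.
print(" " * 8 + "".join(f"{s:>5}" for s in samples))
for name, values in rows.items():
    print(f"{name:<8}" + "".join(f"{v:>5}" for v in values))
```

Because each sample belongs to exactly one cluster per platform, every column sums to the number of platforms (4 here), which is a quick sanity check on the matrix.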


You then perform clustering on this matrix using the ConsensusClusterPlus R package.
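
To see why clustering this matrix groups samples that agree across platforms, here is a small Python sketch. It is not ConsensusClusterPlus and does not reproduce its resampling procedure; it only computes, for each pair of samples, the fraction of platforms in which they share a cluster. The `columns` dictionary and the `agreement` helper are hypothetical names, and the 0/1 vectors are copied from the example table.

```python
from itertools import combinations

# One 0/1 column per sample, copied from the example table.
# Row order: miRNA1-3, mRNA1-2, meth1-4, CNV1-2.
columns = {
    "S1": [1, 0, 0,  0, 1,  1, 0, 0, 0,  1, 0],
    "S2": [0, 1, 0,  1, 0,  0, 0, 1, 0,  0, 1],
    "S3": [1, 0, 0,  1, 0,  0, 0, 0, 1,  0, 1],
    "S4": [1, 0, 0,  1, 0,  0, 1, 0, 0,  0, 1],
    "S5": [0, 0, 1,  0, 1,  1, 0, 0, 0,  0, 1],
}
N_PLATFORMS = 4  # miRNA, mRNA, methylation, CNV


def agreement(a, b):
    """Fraction of platforms in which two samples fall in the same cluster."""
    shared = sum(x & y for x, y in zip(a, b))  # rows where both columns are 1
    return shared / N_PLATFORMS


pairs = {
    (s, t): agreement(columns[s], columns[t])
    for s, t in combinations(sorted(columns), 2)
}

# List the pairs from most to least similar.
for (s, t), score in sorted(pairs.items(), key=lambda kv: -kv[1]):
    print(s, t, score)
```

In this toy example S3 and S4 share a cluster in three of the four platforms, so they are the pair most likely to end up together in the integrated clustering, while S1 and S2 never share a cluster.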