Question: heat map clustering

Hello everyone,

During an RNA-seq analysis I ran into a problem: I need to draw a hierarchically clustered heatmap, and I got the result shown below using the following code:

library(gplots)        # heatmap.2
library(RColorBrewer)  # brewer.pal

# i, stri and colors are defined elsewhere in my script
# (number of genes to keep, plot title, one colour per sample)
var_genes <- apply(logCounts, 1, var)
select_var <- names(sort(var_genes, decreasing = TRUE))[1:i]
highly_variable_lcpm <- logCounts[select_var, ]
par(mfrow = c(1, 2), mar = c(5, 4, 10, 2))

mypalette <- brewer.pal(11, "RdYlBu")
morecols <- colorRampPalette(mypalette)

heatmap.2(highly_variable_lcpm, col = rev(morecols(50)), offsetRow = 0, offsetCol = -0.2, cexCol = 0.6, trace = "none", main = stri, ColSideColors = colors, scale = "row")

[image: the resulting heatmap]

So the question is: how can I change my plot to cluster my genes (there should be three green and three purple in a row together)?

P.S. I tried using fewer samples, but it didn't help. Thanks.

15 months ago Nick • 0 • updated 15 months ago ahmad mousavi • 430

Your genes are already clustered, as indicated by the dendrogram at left. However, there does not appear to be any discernible pattern of expression. To reveal more patterns of expression, I would actually include more genes in the heatmap.

Other things that you can try:

  • use different distance metrics
  • use different agglomeration functions
  • use different linkage metrics
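
As a rough sketch of the first two suggestions, the snippet below swaps in a correlation-based distance and Ward linkage using base R only. The toy matrix and the helper names `cor_dist` and `ward_clust` are made up for illustration; they are not from the original post.

```r
# Toy log-expression matrix: 100 genes x 6 samples (3 per group); values are made up.
set.seed(1)
mat <- matrix(rnorm(600, sd = 0.5), nrow = 100,
              dimnames = list(paste0("gene", 1:100),
                              c("A1", "A2", "A3", "B1", "B2", "B3")))
mat[1:50, 4:6] <- mat[1:50, 4:6] + 3   # first 50 genes are higher in group B

# Centre each gene so the clustering sees relative, not absolute, expression.
centred <- mat - rowMeans(mat)

# Distance: 1 - Pearson correlation between the rows of x (hypothetical helper).
cor_dist <- function(x) as.dist(1 - cor(t(x)))

# Linkage: Ward's method instead of hclust's default complete linkage (hypothetical helper).
ward_clust <- function(d) hclust(d, method = "ward.D2")

# Cluster the samples (columns), so transpose before computing distances.
sample_tree <- ward_clust(cor_dist(t(centred)))
sample_groups <- cutree(sample_tree, k = 2)
print(sample_groups)
```

The same two functions can be handed to `heatmap.2` through its `distfun` and `hclustfun` arguments, for example `heatmap.2(centred, distfun = cor_dist, hclustfun = ward_clust, trace = "none")`, so that both the row and column dendrograms use them.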

I go over some of these, here: A: How to cluster the upregulated and downregulated genes in heatmap?

Also, what is the source of your data? Is it just genes that have high variance among your 6 samples? Is there a particular reason for showing these genes and not those that are statistically significantly differentially expressed?

15 months ago
Kevin Blighe


For a log2-transformed count matrix, try this:

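      library(pheatmap); library(gplots)  # gplots provides greenred()
      pheatmap(log2(exp+1), show_rownames = FALSE)  # exp is the count matrix
      pheatmap(log2(exp+1), show_rownames = FALSE, color = greenred(75))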

You can use the following code to draw a heatmap of the top 40 genes based on a DESeq2 result:

      # Heatmap of the 40 top genes
      library(DESeq2); library(pheatmap); library(gplots)
      # cds is the DESeqDataSet and res the DESeq2 results table
      rld = rlogTransformation(cds)
      mat = assay(rld)[ head(order(res$padj),40), ] # select the 40 genes with the lowest padj
      mat = mat - rowMeans(mat) # Subtract the row means from each value
      # Optional, but to make the plot nicer:
      # (assumes the sample table in colData(rld) has a "condition" column)
      df = as.data.frame(colData(rld)[,c("condition")]) # Create a dataframe with a column of the conditions
      colnames(df) = "Condition" # Rename the column header
      rownames(df) = colnames(mat) # add rownames
      # and plot the actual heatmap

      fnh40 <- paste("Heatmap_Top40_",fn,sep="") # output name; fn is defined elsewhere and fnh40 is not used below

      pheatmap(mat, annotation_col=df)
      pheatmap(mat, annotation_col=df, color=greenred(75))
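
To see what the gene selection and row-centering steps do without running DESeq2, here is a small base-R sketch. `expr` and `padj` are hypothetical stand-ins for `assay(rld)` and `res$padj`, filled with made-up values.

```r
# Hypothetical stand-ins for assay(rld) and res$padj; all values are made up.
set.seed(42)
expr <- matrix(rnorm(60, mean = 8), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))
padj <- c(0.20, 0.001, NA, 0.03, 0.50, 0.0004, 0.90, NA, 0.01, 0.07)

# order() sorts ascending and puts NA last, so genes without an
# adjusted p-value are never picked ahead of genes that have one.
top <- head(order(padj), 4)
print(rownames(expr)[top])   # "gene6" "gene2" "gene9" "gene4"

# Row-centre: each gene now has mean 0, so the colours show deviation
# from that gene's own average rather than absolute expression.
centred <- expr[top, ] - rowMeans(expr[top, ])
print(round(rowMeans(centred), 10))
```

Because DESeq2 can report `NA` for `padj` on some genes, the NA-last behaviour of `order()` matters here: those genes only reach the selection if there are fewer genes with a defined `padj` than requested.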
15 months ago ahmad mousavi • 430