Biostars beta testing.
Question: heat map clustering

Hello everyone,

During RNA-seq analysis I ran into a problem: I need to produce a heatmap with hierarchical clustering, and I got the result below using the following code:

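library(gplots)        # heatmap.2
library(RColorBrewer)  # brewer.pal

# Per-gene variance across samples; keep the i most variable genes
# (i, stri and colors are not defined in this snippet)
var_genes <- apply(logCounts, 1, var)
select_var <- names(sort(var_genes, decreasing = TRUE))[1:i]
highly_variable_lcpm <<- logCounts[select_var,]
par(mfrow=c(1,2), mar=c(5,4,10,2))

# Reversed RdYlBu palette interpolated to 50 colours
mypalette <- brewer.pal(11,"RdYlBu")
morecols <- colorRampPalette(mypalette)

heatmap.2(highly_variable_lcpm, col=rev(morecols(50)), offsetRow=0, offsetCol=-0.2, cexCol=0.6, trace="none", main=stri, ColSideColors=colors, scale="row")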

[Image: resulting heatmap]

So the question is: how can I change my plot to cluster my genes (there should be three green and three purple together in a row)?

P.S. I tried using fewer samples, but it didn't help. Thanks.

15 months ago, Nick • 0 (updated 15 months ago by ahmad mousavi • 430)

Your genes are already clustered, as indicated by the dendrogram at left. However, there does not appear to be any discernible pattern of expression. To reveal more patterns of expression, I would actually include more genes in the heatmap.

Other things that you can try:

  • use different distance metrics
  • use different agglomeration functions
  • use different linkage metrics

I go over some of these, here: A: How to cluster the upregulated and downregulated genes in heatmap?
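
As a rough illustration of swapping the distance metric and the linkage method, here is a minimal sketch in base R. The matrix is simulated and only stands in for highly_variable_lcpm; cor_dist and ward_clust are hypothetical helper names, and correlation distance with Ward linkage is just one possible combination, not a recommendation for this particular data set.

```r
# Toy genes-x-samples matrix standing in for highly_variable_lcpm (simulated values).
set.seed(1)
mat <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))
mat[1:5, 1:3]  <- mat[1:5, 1:3] + 4   # genes 1-5 high in samples s1-s3
mat[6:10, 4:6] <- mat[6:10, 4:6] + 4  # genes 6-10 high in samples s4-s6

# Distance metric: 1 - Pearson correlation between the rows of x.
cor_dist <- function(x) as.dist(1 - cor(t(x)))

# Linkage (agglomeration) method: Ward instead of hclust's default "complete".
ward_clust <- function(d) hclust(d, method = "ward.D2")

# Cluster the samples directly: transpose so that samples are the rows.
sample_tree <- ward_clust(cor_dist(t(mat)))
sample_groups <- cutree(sample_tree, k = 2)

# The same two functions plugged into a heatmap (applied to rows and columns).
heatmap(mat, distfun = cor_dist, hclustfun = ward_clust, scale = "row")
```

gplots::heatmap.2 accepts the same distfun and hclustfun arguments, so both functions can be passed to the original call unchanged. Whether the samples then separate by group depends on the data; the toy matrix was built so that they do.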

Also, what is the source of your data? Just genes that have high variance among your 6 samples? Is there a particular reason for showing these genes and not those that are statistically significantly differentially expressed?

15 months ago, Kevin Blighe • 43k

Hi

For a log2-transformed count matrix, try this:

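library(pheatmap)
library(gplots)   # greenred()

# exp is the raw count matrix; fnh is the output file name without extension
pdf(paste(fnh, ".pdf", sep=""))
pheatmap(log2(exp+1), show_rownames = FALSE)                        # default palette
pheatmap(log2(exp+1), show_rownames = FALSE, color = greenred(75))  # green-red palette
dev.off()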

You can use the following code for a heatmap of the top 40 genes based on the DESeq2 results:

# Heatmap of the top 40 genes
library(DESeq2)   # rlogTransformation(), assay(), colData()

rld = rlogTransformation(cds)
mat = assay(rld)[ head(order(res$padj), 40), ] # select the top 40 genes with the lowest padj
mat = mat - rowMeans(mat) # Subtract the row means from each value
# Optional, but to make the plot nicer:
df = as.data.frame(colData(rld)[,c("condition")]) # Create a dataframe with a column of the conditions
colnames(df) = "Condition" # Rename the column header
rownames(df) = colnames(mat) # Add rownames
# and plot the actual heatmap

fnh40 <- paste("Heatmap_Top40_", fn, sep="")

pdf(paste(fnh40, ".pdf", sep=""))
pheatmap(mat, annotation_col=df)
pheatmap(mat, annotation_col=df, color=greenred(75))
dev.off()
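
The row-centering step (mat - rowMeans(mat)) is not the same thing as the scale="row" option used in the question, which also divides by the row standard deviation. A small sketch on made-up numbers (the gene and sample names are purely illustrative) shows the difference:

```r
# Two toy genes with the same shape but a tenfold difference in magnitude.
mat <- matrix(c(1, 2, 3,
                10, 20, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# Centering only: each row gets mean 0 but keeps its original spread.
centered <- mat - rowMeans(mat)

# Row z-scores: centre and divide by the row standard deviation.
# scale() works on columns, hence the double transpose.
zscored <- t(scale(t(mat)))

print(centered)
print(zscored)
```

After centering, geneB still spans a range ten times wider than geneA, so it dominates the colour scale; after z-scoring, both rows are identical.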
15 months ago, ahmad mousavi • 430
