Heatmap for RNA-seq data
3.7 years ago
Rob ▴ 170

Hi friends, I want to plot a heatmap. Whatever software I use, including R, I do not get any visible gene expression pattern, even though I have differentially expressed genes.

I have tried these versions of the data: raw, log10 transformed, log2 transformed, z-score, and winsorized z-score. How can I solve the problem? Thanks.


You should put the code you are using here so we know what you are doing. Preferably a reproducible example using a smaller chunk of your data.


Without any example it's more difficult to help you.

Although I'm not familiar with the TCGA KIRC data, if it is distributed like other omics data (RNA-seq, amplicon, etc.), with a small number of highly expressed features/genes, most of them lowly expressed, and a lot of zeros, then a row Z-score applied to a log2(raw values + pseudocount) transformed data table could give you what you're looking for.

Basically, the log2(raw values + pseudocount) transformation reduces the long right tail. The pseudocount handles the zeros, because log2 of zero is undefined. At this point, the values should be spread more evenly around the median. Then a row Z-score (assuming the genes are on the rows) highlights the under- and over-expressed features/genes across the samples. The Z-score subtracts the mean and divides by the standard deviation, so within each gene the over-expressed samples are those above the mean and the under-expressed samples are those below it. Applied over the rows, it highlights differences within each gene, i.e., in which samples each gene is more or less expressed/abundant.
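The two steps described above can be sketched in base R. This is only a minimal illustration on a made-up count matrix: the values, the object names (`counts`, `log_counts`, `row_z`) and the pseudocount of 1 are assumptions, not taken from the question.

```r
# Toy count matrix (hypothetical values): genes on rows, samples on columns.
counts <- matrix(
  c(0,    5,     12,  3,
    1500, 30000, 800, 12000,
    0,    0,     7,   2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene_", 1:3), paste0("samp_", 1:4))
)

# log2 of zero is -Inf, which is why a pseudocount is added first.
log2(0)

# log2(raw values + pseudocount): compresses the long right tail.
pseudocount <- 1  # assumed value, a common default
log_counts <- log2(counts + pseudocount)

# Row Z-score: (value - row mean) / row standard deviation.
# scale() works on columns, so transpose, scale, and transpose back.
row_z <- t(scale(t(log_counts)))

# Each gene now has mean 0 and standard deviation 1 across samples.
round(rowMeans(row_z), 10)
apply(row_z, 1, sd)
```

After this, the colour of a cell only says whether that sample is above or below the gene's own average, so genes with very different absolute expression become comparable on one colour scale.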

I hope this helps,

António


Thank you, António.

I still do not get any pattern.

Please give me your comments. Thanks.


You can use the code sample button to highlight the code and data. For instance, it's difficult for me to understand the structure of your data. I assume that the first column is gene names and the remaining columns are gene abundance for the different samples. Is this right?
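If the table really has that layout, one way to turn it into the numeric matrix that heatmap functions expect might look like the sketch below. The data frame `expr_df` and its column names are hypothetical stand-ins for the real file, which would normally be read with something like `read.delim()`.

```r
# Hypothetical table laid out as assumed above: first column = gene names,
# remaining columns = one abundance column per sample.
expr_df <- data.frame(
  gene   = c("gene_1", "gene_2", "gene_3"),
  samp_1 = c(10, 0, 250),
  samp_2 = c(12, 3, 90),
  samp_3 = c(8, 1, 400),
  stringsAsFactors = FALSE
)

# Move the gene names into row names and keep only the numeric columns.
expr_mat <- as.matrix(expr_df[, -1])
rownames(expr_mat) <- expr_df$gene

# Sanity checks before plotting: the matrix must be numeric,
# with genes on rows and samples on columns.
stopifnot(is.numeric(expr_mat))
dim(expr_mat)
```

If the gene-name column is left inside the matrix, `as.matrix()` silently converts everything to character, and the transformation and scaling steps then fail or give meaningless results.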

António


First you need to log2 transform your data. pheatmap can scale the data by rows or by columns through the parameter scale = "row" or scale = "column", so there is no need to scale the data yourself. Below is an example. The first part is just a few lines that create a fake matrix.

## Import libraries
library("OmicsMarkeR")
library("pheatmap")

## create fake matrix
set.seed(1024)
m <- create.random.matrix(nvar = 6, 
                          nsamp = 10,
                          st.dev = 1, 
                          perturb = 0.2)
m <- m + sample(1:100000, 60)
colnames(m) <- paste0("samp_", 1:6)
rownames(m) <- paste0("gene_", 1:10)

## log2 transform + pseudocount
m_tr <- log2(m + 1)

## row Z-score
m_tr_z_score <- t(scale(t(m_tr)))

## plot heatmap
pheatmap(m_tr_z_score)

## or, equivalently, let pheatmap do the row scaling

pheatmap(m_tr, scale = "row")

The resulting figure:

[Image: heatmap produced by the example code]

As you can see, in each row some samples are more reddish and others more bluish. Of course, the intensity of the color depends on how much each gene deviates from its mean. This works better with real-world data.
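When no pattern shows up on real data, two things that may be worth checking before plotting are genes with zero variance (their row Z-score is 0/0, i.e. NaN) and whether the plotted rows are restricted to the genes of interest. The sketch below uses simulated values and hypothetical names (`log_expr`, `n_top`), and picks the most variable genes only as a stand-in for an actual list of differentially expressed genes.

```r
set.seed(1)

# Simulated log2 expression (hypothetical): 50 genes x 6 samples,
# samples 1-3 = group A, samples 4-6 = group B.
log_expr <- matrix(rnorm(50 * 6, mean = 8, sd = 0.3), nrow = 50,
                   dimnames = list(paste0("gene_", 1:50),
                                   paste0("samp_", 1:6)))
log_expr[1:5, 4:6] <- log_expr[1:5, 4:6] + 3  # 5 genes higher in group B
log_expr[50, ] <- 8                            # a constant gene

# Drop zero-variance genes: their row Z-score would be NaN.
row_sd <- apply(log_expr, 1, sd)
log_expr <- log_expr[row_sd > 0, , drop = FALSE]

# Keep the most variable genes (or, preferably, your own DE gene list).
n_top <- 10
top <- order(apply(log_expr, 1, var), decreasing = TRUE)[seq_len(n_top)]
row_z <- t(scale(t(log_expr[top, ])))

# Optional: clip extreme Z-scores so a few outliers do not wash out the colours.
row_z[row_z > 2] <- 2
row_z[row_z < -2] <- -2

# Plot with pheatmap if it is installed:
# pheatmap::pheatmap(row_z)
```

In this simulation the five shifted genes end up among the selected rows, and their group A and group B samples take opposite colours. Plotting all 50 genes instead would surround them with rows that carry only noise.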

I hope this helps, António


Hello rhasanvandj!

We believe that this post does not fit the main topic of this site.

I am closing the post as a duplicate.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!


rhasanvandj: Don't delete threads once they have received comments/answers. Also, don't post the same question in separate threads.
