Question

Help fixing my DGE heatmap?

0

Entering edit mode

6.3 years ago

Tania ▴ 180

Hi Everyone

I need your help fixing my heat map. My top 10 differential expressed genes are not the ones that show up in the heatmap. The ones that show in the heat map are the first 10 ordered in the samples counts, although they are not significant at all.
Below is a snapshot of my code and I added my heatmap too: https://ibb.co/gPKepm

labels=c("C1", "C2", "C3", "C4", "C5", "C6","C7", "C8", "C9", "C10", "T1","T2","T3","T4","T5","T6","T7","T8","T9","T10")
dge = DGEList(counts=cts, genes= rownames(cts), group=group)
countsPerMillion <- cpm(dge, prior.count=2, log=TRUE)
CPM = countsPerMillion
colnames(CPM) = labels
o <- order(etp$table$PValue)
CPM <- CPM[o[1:10],]
CPM <- t(scale(t(CPM)))
col.pan <- colorpanel(50, "blue", "white", "red")
heatmap.2(CPM, col=col.pan, Rowv=TRUE, scale="none",trace="none", dendrogram="none", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6))

Thanks

RNA-Seq Heatmap • 2.3k views

ADD COMMENT • link updated 6.3 years ago by Kevin Blighe 87k • written 6.3 years ago by Tania ▴ 180

1

Entering edit mode

Which numbers are the first ten of "o"? Did you try to associate them to gene names? Which gene names were associated to them? Are those the same that you see in the heatmap? Since I am not an expert, it is really hard to help more by only looking at the code.

ADD REPLY • link 6.3 years ago by Fabio Marroni ★ 3.0k

0

Entering edit mode

The o contains 1 to 10 and those are the genes which show up in the heatmap. The first 10 in etp before order which are not significant. This is what I can't fix. So the order in o is wrong I guess.

ADD REPLY • link 6.3 years ago by Tania ▴ 180

0

Entering edit mode

When I changed O to:

o = order(etp$table$PValue, na.last=TRUE,decreasing= TRUE, method="radix")

I got another list of genes that are significant but not the top either.

ADD REPLY • link 6.3 years ago by Tania ▴ 180

1

Entering edit mode

Try without decreasing=TRUE (you want the smaller p being the first one, not the last one).

ADD REPLY • link 6.3 years ago by Fabio Marroni ★ 3.0k

0

Entering edit mode

I tried it and still picks the wrong order. It doesn't even pick from top significant or least significant so I think of reversing, it picks some genes from the middle.

ADD REPLY • link 6.3 years ago by Tania ▴ 180

score 4 · Accepted Answer · 2017-12-16

4

Entering edit mode

6.3 years ago

Kevin Blighe 87k

This doesn't work because the gene/transcript indices in the etp object won't match those in your CPM object. You should instead subset CPM based on gene/transcript names.

Wherever you ran the topTags function, the genes should automatically be ordered by PValues there (lowest to highest). You can then just get the rownames of the resulting object and use those to subset CPM.

For example:

topGenes <- rownames(topTags(et, n=10))

CPM[topGenes,]

Take a look at the tutorial here: RNA Sequence Analysis in R: edgeR

ADD COMMENT • link 6.3 years ago by Kevin Blighe 87k

0

Entering edit mode

Thanks Kevin and Fabio. I think it works now, although it is kinda sparse (not sure if this is okay). [https://ibb.co/b0kbn6]

ADD REPLY • link 6.3 years ago by Tania ▴ 180

1

Entering edit mode

It looks like better scaling of the data would help. You may want to take some ideas from my various postings on heatmaps:

You already scale your data to Z scores but perhaps specifying breaks (like -1, +1) in addition would help.

ADD REPLY • link 6.3 years ago by Kevin Blighe 87k

0

Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY • link 6.3 years ago by Tania ▴ 180

1

Entering edit mode

Hi Tania. You are welcome

ADD REPLY • link 6.3 years ago by Kevin Blighe 87k

0

Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY • link 6.3 years ago by Tania ▴ 180

0

Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY • link 6.3 years ago by Tania ▴ 180