cannot replicate the pheatmap scale function
5.5 years ago
lessismore ★ 1.3k

Dear all,

I am using pheatmap to generate some heatmaps with scale="row", but I cannot replicate the result (at least visually, because I didn't manage to export the scaled matrix) when I scale the matrix manually with t(scale(t(my.mat))). The transposes are needed because scale() works by column; it would normally be called as scale(x, center = TRUE, scale = TRUE). I noticed this because when I replot the manually scaled matrix with scale="none", the heatmaps look different.
If somebody has any idea about why this happens, it would be really appreciated.

p.s. i've checked the exact function they used:

scale_rows = function(x){
    m = apply(x, 1, mean, na.rm = T)
    s = apply(x, 1, sd, na.rm = T)
    return((x - m) / s)
}

the clustering in fact is exactly the same but it looks like a difference in the colour bar.
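
One way to check this numerically rather than by eye is to compare the two scaled matrices directly. The following is only a minimal sketch: the contents of my.mat are made-up stand-in data, and scale_rows is the function quoted above, copied into the session.

```r
# Stand-in data; replace with the real my.mat (assumed to be a numeric matrix)
set.seed(42)
my.mat <- matrix(rnorm(60, mean = 10, sd = 3), nrow = 6)

# Row-scaling function as quoted above from the pheatmap source
scale_rows <- function(x) {
    m <- apply(x, 1, mean, na.rm = TRUE)
    s <- apply(x, 1, sd, na.rm = TRUE)
    (x - m) / s
}

via_scale_rows <- scale_rows(my.mat)
via_scale <- t(scale(t(my.mat)))

# all.equal() compares with a numeric tolerance; check.attributes = FALSE
# ignores the "scaled:center" / "scaled:scale" attributes added by scale()
same <- isTRUE(all.equal(via_scale_rows, via_scale, check.attributes = FALSE))
print(same)

# The scaled matrix can then be exported, e.g. (file name is arbitrary):
# write.csv(via_scale, "scaled_matrix.csv")
```

If this comparison holds on the real data, any visible difference between the two heatmaps has to come from the plotting step rather than from the scaling.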

pheatmap r • 12k views

the clustering in fact is exactly the same but it looks like a difference in the colour bar.

It should be the opposite. The scaling is the same, but the clustering is different due to the order of operations. See this previous thread: Clustering differences between heatmap.2 and pheatmap


Hey igor, the clustering is the same in my case because I am comparing a manually scaled matrix passed to pheatmap with scale="none" against the unscaled matrix passed to pheatmap with scale="row".

5.5 years ago

The row scaling functions in both pheatmap() (pheatmap) and heatmap.2() (gplots) should produce the same results as t(scale(t(x))). Here is a demonstration using the scaling code from these packages:

random data

randomdata <- matrix(rexp(200, rate=.1), ncol=20)

heatmap.2 (gplots) row scaling

heatmap.2.scale <- function(x, na.rm) {
    # centre each row on its mean
    rm <- rowMeans(x, na.rm = na.rm)
    x <- sweep(x, 1, rm)
    # divide each centred row by its standard deviation
    sx <- apply(x, 1, sd, na.rm = na.rm)
    x <- sweep(x, 1, sx, "/")
    return(x)
}

randomdata.scaled1 <- round(heatmap.2.scale(randomdata, na.rm=TRUE), 3)

pheatmap row scaling

pheatmap.scale <- function(x) {
    # per-row mean and standard deviation, ignoring missing values
    m <- apply(x, 1, mean, na.rm = TRUE)
    s <- apply(x, 1, sd, na.rm = TRUE)
    return((x - m) / s)
}

randomdata.scaled2 <- round(pheatmap.scale(randomdata), 3)

manual row scaling

randomdata.scaled3 <- round(data.frame(t(scale(t(randomdata)))), 3)

test if there are differences

all(randomdata.scaled1 == randomdata.scaled2)
[1] TRUE

all(randomdata.scaled1 == randomdata.scaled3)
[1] TRUE

all(randomdata.scaled2 == randomdata.scaled3)
[1] TRUE

You should check your data for missing values.
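
A minimal sketch of such a check follows, using made-up data with one missing value and one constant row planted on purpose. Constant rows are included as an additional assumption of what could go wrong: they have a standard deviation of 0, so row scaling produces NaN for them.

```r
# Made-up example matrix with two deliberately planted problems
set.seed(1)
x <- matrix(rexp(200, rate = 0.1), ncol = 20)
x[2, 5] <- NA   # a missing value in row 2
x[4, ] <- 3     # a constant row: its standard deviation is 0

# Same row scaling as used by pheatmap
scale_rows <- function(x) {
    m <- apply(x, 1, mean, na.rm = TRUE)
    s <- apply(x, 1, sd, na.rm = TRUE)
    (x - m) / s
}

# Rows containing at least one missing value
rows_with_na <- which(rowSums(is.na(x)) > 0)

# Rows with zero variance, which cannot be scaled
rows_constant <- which(apply(x, 1, sd, na.rm = TRUE) == 0)

# Rows that end up with NA, NaN or Inf after scaling
scaled <- scale_rows(x)
rows_nonfinite <- which(rowSums(!is.finite(scaled)) > 0)

print(rows_with_na)
print(rows_constant)
print(rows_nonfinite)
```

Rows flagged here are worth inspecting before plotting, since they may be handled differently by the clustering and colour mapping steps.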

Kevin


You edited the post while I was answering. The two plots may be using different breaks, which would directly affect the colour bar and the colour shading.
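
To rule this out, both plots can be given the same breaks and colours explicitly. This is a sketch under assumptions: the data is random, the symmetric break range and the three-colour palette are arbitrary choices, and the pheatmap calls only run if the package is installed. The breaks vector has to be one element longer than the colour vector.

```r
# Made-up data; replace with the real matrix
set.seed(1)
mat <- matrix(rexp(200, rate = 0.1), ncol = 20)
scaled <- t(scale(t(mat)))   # manual row scaling

# One set of breaks, symmetric around 0, covering the scaled values
n_colours <- 100
limit <- max(abs(scaled), na.rm = TRUE)
shared_breaks <- seq(-limit, limit, length.out = n_colours + 1)
heat_colours <- colorRampPalette(c("navy", "white", "firebrick3"))(n_colours)

if (requireNamespace("pheatmap", quietly = TRUE)) {
    # scaling done by pheatmap
    pheatmap::pheatmap(mat, scale = "row",
                       breaks = shared_breaks, color = heat_colours)
    # scaling done manually beforehand
    pheatmap::pheatmap(scaled, scale = "none",
                       breaks = shared_breaks, color = heat_colours)
}
```

With identical breaks and colours, the two heatmaps should share the same colour bar, so any remaining difference would point back to the data or the clustering.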


That's the only explanation! Thanks
