Question

cannot replicate the pheatmap scale function

5.5 years ago

lessismore ★ 1.3k

Dear all,

I am using pheatmap to generate some heatmaps with scale="row", but I cannot replicate the results (at least visually, because I didn't manage to export the scaled matrix) when I scale the matrix manually with t(scale(t(my.mat))). The scale function scales by column, hence the transposes; normally it would be called as scale(x, center = TRUE, scale = TRUE). I noticed this because when I replot the manually scaled matrix with scale="none", the heatmaps look different.
If somebody has any idea about why this happens, it would be really appreciated.

P.S. I've checked the exact function pheatmap uses for row scaling:

scale_rows = function(x){
    m = apply(x, 1, mean, na.rm = T)
    s = apply(x, 1, sd, na.rm = T)
    return((x - m) / s)
}

the clustering in fact is exactly the same but it looks like a difference in the colour bar.

pheatmap r • 12k views

the clustering in fact is exactly the same but it looks like a difference in the colour bar.

It should be the opposite. The scaling is the same, but the clustering is different due to the order of operations. See this previous thread: Clustering differences between heatmap.2 and pheatmap

5.5 years ago by igor 13k

Hey igor, it's the same clustering in my case because I am comparing two plots:
the manually scaled matrix passed to pheatmap with scale="none" VS the unscaled matrix passed to pheatmap with scale="row"

5.5 years ago by lessismore ★ 1.3k
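
One thing that may be worth checking here, and which this thread does not confirm: if identical scaled values are drawn with different colour breaks, the colour bars will differ even though the clustering matches. The sketch below uses only base R to build two sets of breaks for a manually scaled matrix, one spanning the raw range and one symmetric around zero. The matrix my.mat, the colour count n_colors and the names range_breaks and sym_breaks are all hypothetical. The commented-out pheatmap call is an assumed usage of its breaks argument and is not run here.

```r
# Hypothetical example matrix; my.mat stands in for the poster's data.
set.seed(42)
my.mat <- matrix(rexp(200, rate = 0.1), ncol = 20)

# Manual row scaling, as described in the question.
scaled <- t(scale(t(my.mat)))

# Number of colours in the palette (assumption for this sketch).
n_colors <- 100

# Breaks spanning the raw range of the scaled values.
range_breaks <- seq(min(scaled), max(scaled), length.out = n_colors + 1)

# Breaks symmetric around zero, so the middle colour maps to a z-score of 0.
lim <- max(abs(scaled))
sym_breaks <- seq(-lim, lim, length.out = n_colors + 1)

# Assumed usage (requires the pheatmap package; not run here):
# pheatmap::pheatmap(scaled, scale = "none", breaks = sym_breaks)
```

If the two plots were drawn with breaks as different as range_breaks and sym_breaks, the same z-score would land on a different colour in each, which would look like a colour bar mismatch rather than a scaling mismatch.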

score 3 · Answer 1 · 2018-10-24

The row scaling functions from both pheatmap() (pheatmap) and heatmap.2() (gplots) should produce the same results as t(scale(t(x))). Here is the proof using the functions from these packages:

random data

randomdata <- matrix(rexp(200, rate=.1), ncol=20)

heatmap.2 (gplots) row scaling

heatmap.2.scale <- function(x, na.rm) {
    # subtract each row's mean
    rm <- rowMeans(x, na.rm = na.rm)
    x <- sweep(x, 1, rm)
    # divide each row by its standard deviation
    sx <- apply(x, 1, sd, na.rm = na.rm)
    x <- sweep(x, 1, sx, "/")
    # return the scaled matrix explicitly
    x
}

randomdata.scaled1 <- round(heatmap.2.scale(randomdata, na.rm=TRUE), 3)

pheatmap row scaling

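pheatmap.scale <- function(x) {
    # per-row mean and standard deviation, ignoring missing values
    m = apply(x, 1, mean, na.rm = TRUE)
    s = apply(x, 1, sd, na.rm = TRUE)
    # m and s are recycled down the columns, so row i uses m[i] and s[i]
    return((x - m) / s)
}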

randomdata.scaled2 <- round(pheatmap.scale(randomdata), 3)

manual row scaling

randomdata.scaled3 <- round(data.frame(t(scale(t(randomdata)))), 3)

test if there are differences

all((randomdata.scaled1 == randomdata.scaled2) == TRUE)
[1] TRUE

all((randomdata.scaled1 == randomdata.scaled3) == TRUE)
[1] TRUE

all((randomdata.scaled2 == randomdata.scaled3) == TRUE)
[1] TRUE
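
As a complement to the rounded comparison above, here is a minimal sketch that compares the unrounded matrices with a numerical tolerance instead. It assumes a fresh random matrix x (with a fixed seed, which the original example does not set) and re-declares the row scaling function under the name scale_rows. The names by_row_function, by_scale, same and max_diff are illustrative only.

```r
# Reproducible random data (the seed is an addition for this sketch).
set.seed(1)
x <- matrix(rexp(200, rate = 0.1), ncol = 20)

# Row scaling as quoted in the question.
scale_rows <- function(x) {
  m <- apply(x, 1, mean, na.rm = TRUE)
  s <- apply(x, 1, sd, na.rm = TRUE)
  (x - m) / s
}

by_row_function <- scale_rows(x)
by_scale <- t(scale(t(x)))

# all.equal() compares within a small numerical tolerance;
# check.attributes = FALSE ignores any attributes left over from scale().
same <- isTRUE(all.equal(by_row_function, by_scale, check.attributes = FALSE))

# Largest absolute element-wise difference between the two results.
max_diff <- max(abs(by_row_function - by_scale))

print(same)
print(max_diff)
```

Comparing without rounding avoids the small risk that two nearly identical values fall on opposite sides of a rounding boundary.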

You should check your data for missing values.
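
A quick way to do that check, sketched in base R on a small hypothetical matrix (my.mat here is made up, not the poster's data). It also flags rows with zero standard deviation, which is an extra check not mentioned above: such rows become NaN after row scaling because of the division by zero.

```r
# Hypothetical matrix with one constant row and one missing value.
my.mat <- matrix(c(1, 2, 3, 4,
                   5, 5, 5, 5,
                   2, NA, 6, 8), nrow = 3, byrow = TRUE)

# Is there any missing value at all?
has_na <- anyNA(my.mat)

# Which rows contain at least one NA?
na_rows <- which(rowSums(is.na(my.mat)) > 0)

# Which rows have zero standard deviation (constant rows)?
row_sd <- apply(my.mat, 1, sd, na.rm = TRUE)
const_rows <- which(row_sd == 0)

print(has_na)
print(na_rows)
print(const_rows)
```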

Kevin