Can you add a label key to a heatscatter plot in R?
2.7 years ago
james.lloyd • 80
United States

I have made a heatscatter plot in R (I am new to R) and it looks good but I would like to add a Key to label what the colours represent (datapoint depth). This would be a really good feature to help the reader of the figure to understand what it means quickly but I cannot see this feature in the documentation.

Any help would be great!

For_R_PSI_only <- read.table('For_R_PSI_only.txt', sep="\t", header=TRUE)
library(LSD)
heatscatter(For_R_PSI_only$WT_exp2,For_R_PSI_only$WT_exp3, colpal="bl2gr2rd", xlab='1',ylab='1')

Were you asking about legend by any chance?

I think OP is indeed referring to the legend.

Yes, a legend to indicate what density level the colours in the plot represent.

An alternative to a heatscatter plot is a plot with Hexagonal Binning. You can create such a plot using stat_binhex() function in the ggplot2 package or hexbinplot() function in the hexbin package. You may also find this useful.
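
A minimal sketch of the hexagonal-binning alternative, assuming the ggplot2 and hexbin packages are installed. The data are simulated stand-ins for the two columns in the question, and the blue-green-red fill scale only approximates bl2gr2rd. It uses stat_bin_hex(), which I believe is the same function as stat_binhex() under its newer name. The relevant point is that ggplot2 draws the count legend on its own:

```r
library(ggplot2)  # stat_bin_hex() also needs the hexbin package installed

set.seed(1)
# simulated stand-in for For_R_PSI_only$WT_exp2 and For_R_PSI_only$WT_exp3
df <- data.frame(x = rnorm(1000), y = rnorm(1000))

p <- ggplot(df, aes(x = x, y = y)) +
  stat_bin_hex(bins = 30) +
  scale_fill_gradientn(colours = c("blue", "green", "red"), name = "count")

print(p)  # the fill legend ("count") is added automatically
```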

The script you're using to generate the plot would help us get an idea of your approach and where a potential solution could be plugged in. Could you add that maybe?

Ah, sorry about that, here is the code (added to original post).

Please also add the line library(LSD) just so people are clear on the library in play here. The authors seem to have given no pointers to adding a legend. Maybe look for a different library that supports heat maps better (like MASS: http://cran.r-project.org/web/packages/MASS/index.html ).

These links might be of help:

https://www.biostars.org/p/73193/

http://stats.stackexchange.com/questions/31726/scatterplot-with-contour-heat-overlay

14 months ago
seidel 6.8k
United States

I'm not really sure how you'd match up the colors (though it looks like you can hand heatscatter a palette), but you can add a legend to a plot using the legend function:

legend("bottomright", legend=letters[1:8], pch=19, col=rainbow(8))

If you want to draw the legend outside the plot, you can use coordinates (rather than a keyword) and you might have to adjust the margins before drawing the plot with:

par(mar=c(5.1, 4.1, 4.1, 5.1))

and then draw the plot, then draw the legend.
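
A rough sketch of that sequence in base R, under a few assumptions of mine: the right margin is widened to 7.1 lines rather than 5.1, xpd=TRUE is set so that drawing outside the plot region is not clipped, and the points are coloured by an arbitrary binning of x purely so the legend has something to show:

```r
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)
cols <- rainbow(8)

# widen the right margin (4th value) and allow drawing outside the plot region
op <- par(mar = c(5.1, 4.1, 4.1, 7.1), xpd = TRUE)

# illustrative colouring only: 8 equal-width bins of x
plot(x, y, pch = 19, col = cols[cut(x, 8, labels = FALSE)])

# par("usr") holds the plot-region limits as c(x1, x2, y1, y2);
# anchoring at (x2, y2) puts the legend's top-left corner at the
# top-right corner of the plot region, i.e. in the right margin
usr <- par("usr")
legend(x = usr[2], y = usr[4], legend = letters[1:8],
       pch = 19, col = cols, bty = "n")

par(op)  # restore the previous graphics settings
```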

edit: Here's a simple example. I added the contour to help the visualization.

library(LSD)
library(MASS)        # provides kde2d()
library(colorRamps)  # provides blue2green2red()
# create some data
x <- rnorm(100)
y <- rnorm(100)
# draw the plot
heatscatter(x, y, add.contour=TRUE, ncol=8, colpal="bl2gr2rd")
# get the density values
d <- kde2d(x, y)
# d is a list, the z element determines color
# Make a sequence of labels reflecting what's in z
ColorLevels <- round(seq(min(d$z), max(d$z), length.out=8), 2)
# draw the legend
legend("bottomright", legend=ColorLevels, pch=19, col=blue2green2red(8))

I think the OP's main goal is to create a legend that matches up with (and truly represents) the colors in the heatscatter plot. Otherwise creating a legend is pretty simple. That is why the question says Key (similar to a heatmap.2 key).

So there's more than one question here. If you're "new to R", nothing is simple :) The code above references bl2gr2rd, which is shorthand for blue2green2red, a color palette from the colorRamps package. That function could be used to recreate the palette used in the plot, but it would need to be scaled according to the values returned by the kde2d function, and if one examines those, they seem non-trivial to me. I don't know how to easily whip up a scaled legend for that, and indeed I don't even know what the labels would represent: does kde2d return a meaningful density scale or range of numbers one could easily map to a color gradient? If so, you might consider using the "layout" functionality of R to divide up the plot space, draw your scatter plot in the large portion, and use "image" in a smaller portion to draw a color gradient matching the range of numbers. It's a bit complicated.
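
For what it's worth, the layout/image idea might be sketched as below. This is not what heatscatter does internally; it is an approximation built on several assumptions of mine: simulated data, a colorRampPalette() stand-in for bl2gr2rd, 30 colour classes, and each point coloured by the kde2d grid cell it falls in. The values on the key are the kernel density estimates returned in d$z.

```r
library(MASS)  # kde2d()

set.seed(42)
x <- rnorm(500)
y <- rnorm(500)

ncols <- 30
pal <- colorRampPalette(c("blue", "green", "red"))(ncols)  # stand-in for bl2gr2rd

# 2D kernel density estimate on a 100 x 100 grid spanning the data
d <- kde2d(x, y, n = 100)

# look up the density of the grid cell each point falls in
ix <- findInterval(x, d$x, all.inside = TRUE)
iy <- findInterval(y, d$y, all.inside = TRUE)
dens <- d$z[cbind(ix, iy)]

# cut the density range into ncols equal-width classes, one per colour
brks <- seq(min(d$z), max(d$z), length.out = ncols + 1)
cols <- pal[cut(dens, breaks = brks, include.lowest = TRUE, labels = FALSE)]

# left panel: scatter plot; right panel: narrow colour key
layout(matrix(1:2, nrow = 1), widths = c(4, 1))
par(mar = c(5.1, 4.1, 4.1, 1.1))
plot(x, y, pch = 19, cex = 0.5, col = cols)

par(mar = c(5.1, 1.1, 4.1, 4.1))
mids <- (brks[-1] + brks[-length(brks)]) / 2
image(x = c(0, 1), y = brks, z = matrix(mids, nrow = 1),
      col = pal, breaks = brks, axes = FALSE, xlab = "", ylab = "")
axis(4)
mtext("density", side = 4, line = 3)
box()

layout(1)  # back to a single panel
```

Because the same breaks drive both the point colours and the key, the two cannot drift apart, which is the main reason for building the key by hand rather than calling legend().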

Thank you for your advice. I do not mind whether I use the bl2gr2rd or the default 'heat' but I doubt that makes a difference to my problem.

It does not sound like an easy fix, which is what I was hoping for. It is just surprising to me that such a feature was not built in to start with. I will have to try to get another method working, but so far the other graphs I have made are not as pretty and do not reveal the patterns as clearly as these do.

I added an example that seems to work, but I took a shortcut and narrowed down the number of colors to match the number of labels to draw. One could make it much more complicated by drawing a gradient scale, using many more colors, or allowing a broad range of colors but then partitioning it to fit a set of labels. You might consider writing to the author and suggesting they add a legend option, or at least requesting that the heatscatter() function return some values you can use to draw your own, rather than returning NULL and leaving you to call kde2d independently. For instance, the hist() function draws a histogram and also returns a nice little data structure (if you assign the function call to a variable) that can be very handy.
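
To illustrate the kind of return value meant here, base R's hist() hands back an object you can inspect and reuse. A small sketch with simulated data:

```r
set.seed(1)
h <- hist(rnorm(100), plot = FALSE)  # plot = FALSE: compute only, draw nothing

class(h)   # "histogram"
names(h)   # includes "breaks", "counts", "density" and "mids"
h$counts   # number of observations in each bin
```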

OK. That would probably do the trick but I am having problems with NA values in the columns I am interested in. The heatscatter doesn't mind them but

d <- kde2d(x,y)

does care about NA. The data frame has many columns and I don't want to do na.omit on the whole data frame as it will remove nearly all the data due to other columns not important for this having incomplete or NA values. I could break it down into lots of smaller data frames but that would be laborious. Perhaps it is the only way.

df <- na.omit(For_R$WT_exp2,For_R$WT_exp3)

But this did not return a useful data frame.

Oh that's just because you're handing na.omit() two vectors instead of a data frame (even though they each come from a data frame). Try it this way:

kde2d(na.omit(data.frame(For_R$WT_exp2,For_R$WT_exp3)))

Dear Seidel, this code throws an error message:

Error in kde2d(na.omit(data.frame(For_R$WT_exp2, For_R$WT_exp3))) : 
  argument "y" is missing, with no default

Also when I run through your above information with a data frame without NA values I get a legend but for each colour it is labelled 0.

My mistake, trying to do two steps in one. kde2d doesn't take a data frame, it takes two vectors.

foo <- na.omit(data.frame(x=For_R$WT_exp2,y=For_R$WT_exp3))

kde2d(foo$x, foo$y)

As for the legend with the data in your data frame, the likely reason is that I rounded the values to 2 digits, so very small density values all become 0. If you get rid of the round() part of the call, the legend should populate with the unrounded values representing the range of the density. You could also visualize those numbers with the add.contour=T parameter set, so you can see if there's a range you'd like to round them to.
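
Putting the two fixes together, here is a hedged sketch on simulated data. The For_R table below is a made-up stand-in with the same column names, and the `other` column is hypothetical, there only to mimic unrelated columns full of NA. It uses complete.cases() on just the two columns of interest, and signif() instead of round() so that small densities keep two significant digits:

```r
library(MASS)  # kde2d()

set.seed(7)
n <- 200
# simulated stand-in for the real table; 'other' mimics an unrelated, patchy column
For_R <- data.frame(WT_exp2 = runif(n, 0, 100),
                    WT_exp3 = runif(n, 0, 100),
                    other   = NA_real_)
For_R$WT_exp2[c(3, 50)] <- NA
For_R$WT_exp3[c(50, 120)] <- NA

# keep rows that are complete in the two columns of interest only
keep <- complete.cases(For_R[, c("WT_exp2", "WT_exp3")])
d <- kde2d(For_R$WT_exp2[keep], For_R$WT_exp3[keep])

# signif() keeps 2 significant digits, so tiny densities are not rounded to 0
ColorLevels <- signif(seq(min(d$z), max(d$z), length.out = 8), 2)
ColorLevels
```

With values spread over a 0 to 100 range, the estimated densities are far below 0.01, which is why round(..., 2) would label every colour 0.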

Great, thanks for all your help. When I removed the round() part it works, although the numbers are super small so I am not sure how helpful they actually are. Perhaps just the contour lines will be helpful, or I can play around with it. I now need to understand how to get legends outside the plot area as well, but thanks again! (Apparently the developers are planning to add the feature but they are not sure when.)
