Biostar Beta. Not for public use.
Question: Set point colour based on threshold in scatter plot
Entering edit mode


I have a set of samples and log10 total read counts of them in a data frame like below

> head(a)
A1            6.468503
A10           6.565213
A11           6.752139
A12           5.078598
A2            6.277342
A3            6.473411

enter image description here

For instance red dots have less than 1000000 counts and black dots have more thsn 1000000 counts.

How could I repreduce this plot please?

ADD COMMENTlink 14 months ago F ♦ 3.4k • updated 14 months ago cpad0112 11k
Entering edit mode

Something like the following would work with base R functions:

# make some data
a <- data.frame(sample=paste0("A", 1:50), log10_total_counts=rnorm(50, 6))

# get x points
x <- 1:length(a$log10_total_counts)

# create plot, drop x axis
plot(x, a$log10_total_counts, ylim=c(0,8), pch=19, xaxt="n", xlab="sample")
# add dashed line and X-axis labels with rotation
abline(h=6, lty="dashed")
axis(1,1:50, labels=a$sample, las=3)

# select points to color differently
LessThan6 <- a$log10_total_counts < 6
points(x[LessThan6], a$log10_total_counts[LessThan6], pch=19, col="red")
ADD COMMENTlink 14 months ago seidel 6.8k
Entering edit mode

Use ggplot2.

Something is missing in your example; the labels at the bottom do not match you data frame, and it's unclear how your sample dataframe would count up to values near 1000000. You might want to precompute/add a QC vector based on that to your dataframe; basically have your dataframe in a state that's ready to be plotted with minimal calculations.

Here's an example; I generate a dataframe with normally distributed values centered on 6.0 (seemed close to what your example had). I plot a dotted line at 6; and anything below 6 becomes red.

Example ggplot call with geom_points and two colours.

For the style, I directly took the function theme_Publication() from this link.

Here's the code:


# Themes for style
theme_Publication <- function(base_size=14, base_family="helvetica") { ... }
scale_fill_Publication <- function(...){ ... }
scale_colour_Publication <- function(...){ ... }

# Function to plot points as described
plotReads <- 
  function(DF) {
    ggplot(DF) + 
      theme_Publication() + # Style it
      geom_point(mapping = aes(x = as.factor(1:nrow(DF)), # Sequentially
                               y = log10_total_counts, # Y is the count values
                               color = QC)) + # Color it by the QC vector in the dataframe
      scale_color_manual(values = c("PASS" = "black", "FAIL" = "red")) + # Tell it what colours to use.
      geom_hline(yintercept = 6.0, linetype="dotted") + # Add the dotted line at 6.0
      xlab("Sample") + # Label the X
      ylab("log10(Total Reads)") + # Label the Y
      ylim(c(-0.2, 8.2)) # Extend the Y axis to be between this range.

DF = data.frame( 
  log10_total_counts = rnorm(30, mean = 6, sd = 0.5), # Randomized data
  QC = as.factor(c( rep("FAIL",15), rep("PASS", 15) )) # A vector that is just half FAIL and half PASS

DF$QC <- as.factor(ifelse(DF$log10_total_counts < 6.0, "FAIL", "PASS")) # Now actually add some logic to what should be PASS or FAIL.
ADD COMMENTlink 14 months ago manuel.belmadani • 830
Entering edit mode

None of the code uses anything from gridExtra - is it being loaded for code within the _Publication functions?

ADD REPLYlink 14 months ago
Entering edit mode

You're right, it's not needed here. Edited, and thanks!

ADD REPLYlink 14 months ago
• 830
Entering edit mode

Note: I cleaned up a discussion that sprang off an unwarranted comment of mine. Apologies!

ADD REPLYlink 14 months ago
Entering edit mode

another plot in ggplot2 with simulated data:

df=data.frame(genes=paste0("gene_", seq_along(1:75)), reads=round(rnorm(75, 110,2),0))

ggplot(df,aes(genes, reads, color = ifelse( reads < 110, "Fail", "Pass")))+
    geom_point() +
  scale_y_continuous(limits = c(0, 200))+
  scale_color_manual(name="QC", values = c("red","darkgreen"))+
  geom_hline(yintercept = 110, color="red")+
  labs(y=expression("log"[10]("Total reads")))+
  theme(axis.text.x = element_text(angle = 45,hjust = 1),
        legend.position = "bottom")


ADD COMMENTlink 14 months ago cpad0112 11k
Entering edit mode

BIOSTARS I am so lucky that I can ask for your help and obtain such a nice solutions THANK YOU

ADD REPLYlink 14 months ago
♦ 3.4k

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0