Biostars beta testing.
Question: Set point colour based on threshold in scatter plot
1
Entering edit mode

Hi,

I have a set of samples and log10 total read counts of them in a data frame like below

> head(a)
    log10_total_counts
A1            6.468503
A10           6.565213
A11           6.752139
A12           5.078598
A2            6.277342
A3            6.473411
>

enter image description here

For instance red dots have less than 1000000 counts and black dots have more thsn 1000000 counts.

How could I repreduce this plot please?

ADD COMMENTlink 12 months ago F ♦ 3.4k • updated 12 months ago cpad0112 11k
7
Entering edit mode

Something like the following would work with base R functions:

# make some data
a <- data.frame(sample=paste0("A", 1:50), log10_total_counts=rnorm(50, 6))

# get x points
x <- 1:length(a$log10_total_counts)

# create plot, drop x axis
plot(x, a$log10_total_counts, ylim=c(0,8), pch=19, xaxt="n", xlab="sample")
# add dashed line and X-axis labels with rotation
abline(h=6, lty="dashed")
axis(1,1:50, labels=a$sample, las=3)

# select points to color differently
LessThan6 <- a$log10_total_counts < 6
points(x[LessThan6], a$log10_total_counts[LessThan6], pch=19, col="red")
ADD COMMENTlink 12 months ago seidel 6.8k
6
Entering edit mode

Use ggplot2.

Something is missing in your example; the labels at the bottom do not match you data frame, and it's unclear how your sample dataframe would count up to values near 1000000. You might want to precompute/add a QC vector based on that to your dataframe; basically have your dataframe in a state that's ready to be plotted with minimal calculations.

Here's an example; I generate a dataframe with normally distributed values centered on 6.0 (seemed close to what your example had). I plot a dotted line at 6; and anything below 6 becomes red.

Example ggplot call with geom_points and two colours.

For the style, I directly took the function theme_Publication() from this link.

Here's the code:

library("ggplot2")

# Themes for style
theme_Publication <- function(base_size=14, base_family="helvetica") { ... }
scale_fill_Publication <- function(...){ ... }
scale_colour_Publication <- function(...){ ... }

# Function to plot points as described
plotReads <- 
  function(DF) {
    ggplot(DF) + 
      theme_Publication() + # Style it
      geom_point(mapping = aes(x = as.factor(1:nrow(DF)), # Sequentially
                               y = log10_total_counts, # Y is the count values
                               color = QC)) + # Color it by the QC vector in the dataframe
      scale_color_manual(values = c("PASS" = "black", "FAIL" = "red")) + # Tell it what colours to use.
      geom_hline(yintercept = 6.0, linetype="dotted") + # Add the dotted line at 6.0
      xlab("Sample") + # Label the X
      ylab("log10(Total Reads)") + # Label the Y
      ylim(c(-0.2, 8.2)) # Extend the Y axis to be between this range.
  }

DF = data.frame( 
  log10_total_counts = rnorm(30, mean = 6, sd = 0.5), # Randomized data
  QC = as.factor(c( rep("FAIL",15), rep("PASS", 15) )) # A vector that is just half FAIL and half PASS
)
plotReads(DF)

DF$QC <- as.factor(ifelse(DF$log10_total_counts < 6.0, "FAIL", "PASS")) # Now actually add some logic to what should be PASS or FAIL.
plotReads(DF)
ADD COMMENTlink 12 months ago manuel.belmadani • 830
Entering edit mode
1

None of the code uses anything from gridExtra - is it being loaded for code within the _Publication functions?

ADD REPLYlink 12 months ago
RamRS
21k
Entering edit mode
0

You're right, it's not needed here. Edited, and thanks!

ADD REPLYlink 12 months ago
manuel.belmadani
• 830
Entering edit mode
0

Note: I cleaned up a discussion that sprang off an unwarranted comment of mine. Apologies!

ADD REPLYlink 12 months ago
RamRS
21k
4
Entering edit mode

another plot in ggplot2 with simulated data:

df=data.frame(genes=paste0("gene_", seq_along(1:75)), reads=round(rnorm(75, 110,2),0))
library(ggplot2)

ggplot(df,aes(genes, reads, color = ifelse( reads < 110, "Fail", "Pass")))+
    geom_point() +
  scale_y_continuous(limits = c(0, 200))+
  scale_color_manual(name="QC", values = c("red","darkgreen"))+
  geom_hline(yintercept = 110, color="red")+
  theme_bw()+
  xlab("Sample")+
  labs(y=expression("log"[10]("Total reads")))+
  theme(axis.text.x = element_text(angle = 45,hjust = 1),
        legend.position = "bottom")

Rplot01

ADD COMMENTlink 12 months ago cpad0112 11k
Entering edit mode
1

BIOSTARS I am so lucky that I can ask for your help and obtain such a nice solutions THANK YOU

ADD REPLYlink 12 months ago
F
♦ 3.4k

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0