How can I reproduce such a plot?
0
Entering edit mode
12 months ago
F ♦ 3.4k
Iran

Hi,

I have a list of differentially expressed genes (DEGs) from single-cell RNA-seq between two clusters of cells. I also have a list of differentially expressed proteins (DEPs) from proteomics. I want to classify the DEGs, the DEPs, and their overlap into functional groups, something like the picture below, but I don't know how. I know how to classify each list individually, but I need a single figure that shows them all together: for example, GO terms for the DEGs, GO terms for the DEPs, and GO terms for their overlap. [image]

Any idea?

ADD COMMENTlink
0
Entering edit mode

Divide it into its components:

  1. stacked bar-plot, rotated horizontally
  2. Venn diagram
ADD REPLYlink
0
Entering edit mode

Thank you. Suppose there are 100 DEGs, 200 DEPs, and 70 in the overlap. They are classified into many different terms, so how do I select which terms to plot?

ADD REPLYlink
0
Entering edit mode

I'm not sure what the message behind that plot is. What do you want to show?

ADD REPLYlink
0
Entering edit mode

The relationship between the transcriptome and proteome data

ADD REPLYlink
4
Entering edit mode
12 months ago
SMK ♦ 1.3k
Ghent, Belgium

Hi F,

They can be reproduced using ggplot and VennDiagram::draw.pairwise.venn:

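library(tidyverse)
library(VennDiagram)  # provides calculate.overlap() and draw.pairwise.venn()
library(GO.db)        # provides Term() to look up GO term descriptions

# Grab some example data from E. coli (UniProt accession plus GO IDs).
# This is the URL from the original post; if it no longer resolves,
# substitute any two-column table of IDs and "; "-separated GO IDs.
gene2go <- read_tsv("https://www.uniprot.org/uniprot/?query=organism:83333&format=tab&columns=id,go-id")
colnames(gene2go) <- c("Gene", "GO")
set.seed(1)  # make the random example reproducible
DECs <- gene2go[sample(nrow(gene2go), 500),]
DEPs <- gene2go[sample(nrow(gene2go), 500),]

# Calculate sets: a1 = all DECs, a2 = all DEPs, a3 = their intersection
sets <- calculate.overlap(x = list("DECs" = DECs$Gene,
                                   "DEPs" = DEPs$Gene))
Overlap <- sets$a3
DECs_only <- setdiff(sets$a1, Overlap)
DEPs_only <- setdiff(sets$a2, Overlap)
df_sets <- rbind(
  data.frame(Type = rep("Overlap", length(Overlap)), Gene = Overlap,
             stringsAsFactors = FALSE),
  data.frame(Type = rep("DECs_only", length(DECs_only)), Gene = DECs_only,
             stringsAsFactors = FALSE),
  data.frame(Type = rep("DEPs_only", length(DEPs_only)), Gene = DEPs_only,
             stringsAsFactors = FALSE)
)

# Combine with GO data, flatten GO to one row per gene/term pair,
# and drop genes without any GO annotation
df_sets_go <- left_join(df_sets, gene2go, by = "Gene") %>%
  separate_rows("GO", sep = "; ") %>%
  dplyr::filter(!is.na(GO), GO != "")
df_sets_go$Description <- unname(Term(df_sets_go$GO))

# Relabel the groups explicitly. Assigning to levels() directly would
# rename the levels by position and attach the wrong label to each group.
df_sets_go$Type <- factor(df_sets_go$Type,
                          levels = c("DECs_only", "DEPs_only", "Overlap"),
                          labels = c("DECs", "DEPs", "Overlap"))

# Only look at the 20 most frequent GO terms
GO_top20 <- sort(table(df_sets_go$GO)) %>% tail(20) %>% names()

# Barplot
ggplot(dplyr::filter(df_sets_go, GO %in% GO_top20), aes(str_to_sentence(Description))) +
  geom_bar(color = "black", aes(fill = Type)) +
  coord_flip() +
  theme_bw() +
  scale_fill_manual(values = c(
    "DECs" = "black",
    "DEPs" = "white",
    "Overlap" = "grey"
  )) +
  scale_y_continuous(expand = c(0, 0)) +
  xlab("") +
  ylab("Number of DECs or DEPs") +
  theme(legend.position = "top",
        legend.title = element_blank())

# Venn diagram for the whole sets (not only the genes in the GO barplot).
# area1 and area2 are the full set sizes, including the overlap.
grid.newpage()  # start a fresh page so the Venn is not drawn over the barplot
draw.pairwise.venn(
  area1 = length(sets$a1),
  area2 = length(sets$a2),
  cross.area = length(Overlap),
  category = c("DECs", "DEPs")
)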

barplot

venn-diagram

Hope it helps.

ADD COMMENTlink
0
Entering edit mode

Thank you, this looks amazing, but how do I provide gene2go?

ADD REPLYlink
0
Entering edit mode

Start from a table that looks like this:

> head(as.data.frame(DECs))
    Gene                                                                                             GO
1 P0A7S9                         GO:0000049; GO:0003735; GO:0005829; GO:0006412; GO:0019843; GO:0022627
2 P0AFW0 GO:0001000; GO:0001073; GO:0001124; GO:0003677; GO:0005829; GO:0008494; GO:0031564; GO:0045727
3 P76000                                                                                     GO:0019867
4 P0A953                                                 GO:0004315; GO:0005829; GO:0006633; GO:0008610
5 Q9JMT8                                                                         GO:0003677; GO:0006355
6 P0AE34                                     GO:0005886; GO:0005887; GO:0022857; GO:0055052; GO:0097638

Please run the code step by step and check what each data frame looks like.
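
For any other annotation source, a table in this layout can be built by hand and flattened to one row per gene/term pair. The following is only a minimal base-R sketch: the three rows are copied from the `head()` output above, the name `gene2go_long` is hypothetical, and the splitting step is meant to mirror what `separate_rows()` does in the answer.

```r
# Toy gene2go table in the same two-column layout as the head() output above.
# The three rows are copied from that output; a real table would come from
# your own annotation source.
gene2go <- data.frame(
  Gene = c("P76000", "Q9JMT8", "P0A953"),
  GO   = c("GO:0019867",
           "GO:0003677; GO:0006355",
           "GO:0004315; GO:0005829; GO:0006633; GO:0008610"),
  stringsAsFactors = FALSE
)

# Split the "; "-separated GO column and repeat each gene once per term.
go_list <- strsplit(gene2go$GO, "; ", fixed = TRUE)
gene2go_long <- data.frame(
  Gene = rep(gene2go$Gene, lengths(go_list)),
  GO   = unlist(go_list),
  stringsAsFactors = FALSE
)

print(gene2go_long)
```

Any data frame with a `Gene` column and a `GO` column of `"; "`-separated IDs should be usable in place of the UniProt download.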

ADD REPLYlink
0
Entering edit mode

Sorry, I meant: which tool did you use to produce the source of gene2go? Which functional annotation tool?

Also, I am getting this error:

> sets <- calculate.overlap(x = list("DECs" = DECs$Gene,
+                                    "DEPs" = DEPs$Gene))
Error in calculate.overlap(x = list(DECs = DECs$Gene, DEPs = DEPs$Gene)) : 
  could not find function "calculate.overlap"

Which package gives this function?
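
`calculate.overlap` is exported by the VennDiagram package, so this error usually means that `library(VennDiagram)` was not run or that the package is not installed. If installing it is not an option, the same three sets can be derived with base R alone. A minimal sketch, where the vectors `decs` and `deps` are hypothetical stand-ins for `DECs$Gene` and `DEPs$Gene`:

```r
# Hypothetical accession vectors standing in for DECs$Gene and DEPs$Gene.
decs <- c("P0A7S9", "P0AFW0", "P76000", "P0A953")
deps <- c("P0A953", "Q9JMT8", "P76000", "P0AE34")

# Base-R equivalents of Overlap, DECs_only and DEPs_only in the answer.
overlap   <- intersect(decs, deps)
decs_only <- setdiff(decs, overlap)
deps_only <- setdiff(deps, overlap)

print(list(overlap = overlap, decs_only = decs_only, deps_only = deps_only))
```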

ADD REPLYlink
0
Entering edit mode

You can use InterProScan.

ADD REPLYlink
0
Entering edit mode

Thank you so much. I don't have a list of protein sequences; rather, I have a list of protein IDs that I have converted to gene symbols. I also have a list of genes from single-cell RNA-seq. The goal is to see the relationship between the proteomics and single-cell RNA-seq data, for example how many GO terms or pathways are shared by both data sets.
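
One way to put a number on "how many GO terms are shared" is to compare the sets of GO terms annotated to each list. This is only a sketch under assumptions: the gene symbols and the two annotation tables `rna_go` and `prot_go` are made up for illustration, and the Jaccard index is one possible summary among several, not something proposed in the thread.

```r
# Hypothetical long-format annotations: one row per gene/GO term pair.
rna_go <- data.frame(
  Gene = c("geneA", "geneA", "geneB", "geneC"),
  GO   = c("GO:0005829", "GO:0006412", "GO:0003677", "GO:0005829"),
  stringsAsFactors = FALSE
)
prot_go <- data.frame(
  Gene = c("geneA", "geneD", "geneD"),
  GO   = c("GO:0005829", "GO:0003677", "GO:0005886"),
  stringsAsFactors = FALSE
)

# Distinct GO terms seen in each data set.
rna_terms  <- unique(rna_go$GO)
prot_terms <- unique(prot_go$GO)

# Terms present in both data sets, and the fraction of all terms they make up.
shared_terms <- intersect(rna_terms, prot_terms)
jaccard <- length(shared_terms) / length(union(rna_terms, prot_terms))

print(shared_terms)
print(jaccard)
```

The same comparison works for pathway IDs in place of GO IDs, as long as both lists are annotated against the same source.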

ADD REPLYlink
