Creating a Venn diagram from RNA seq data
5.4 years ago
Bolesaem ▴ 10

First post here. Hello everyone!

This is the first time I have had to use bioinformatics, so I apologize in advance: this is going to be a very basic question. I would like to create a Venn diagram to see which genes are commonly expressed between two or three treatments. I have Excel sheets with p-values and gene IDs of differentially expressed genes between treatments, one Excel file per comparison.

I don't seem to understand how it works: I upload the file with gene IDs to BioVenn (as an example), but it doesn't give me any response (this might be a technical issue). More importantly, I don't understand how I can create a Venn diagram by uploading a single file in which the comparison of two conditions has already been done. I thought I should use individual gene counts, but I don't know how to start.

Sorry for the dumb question but I am utterly confused.

Thanks!

RNA-Seq Venn diagram basic • 19k views

A Venn diagram isn't really going to tell you which genes are commonly expressed between conditions, just how many. Are you sure a Venn diagram is what you want? Or are you trying to determine common sets of differentially expressed genes between the different treatment comparisons? Regardless, we need more info as to what you've tried. Did the list you uploaded for each set contain only the IDs, one per row?


OK, sorry, I realized my question wasn't very clear.

I would like to know how many genes are commonly or differentially expressed in different treatments (it's about stem cell biology; I would like to show how similar two different developmental stages are). So I thought of a Venn diagram. I have several Excel tables obtained after DESeq2 analysis, where different pairwise comparisons were done. Yes, each row contains the IDs and the values for each sample (one triplicate per condition). Here is a snapshot of what the table looks like: Snapshot


Oh, okay. If your only real goal is to show similarity between the two stages, something as simple as mentioning the number of differentially expressed genes between the two stages should really suffice, honestly. Venn diagrams aren't really a great construct for gene expression data, in my mind. Heatmaps generally look better and are immediately interpretable. Most people don't care about the genes that are similarly expressed between two conditions/developmental stages/whatever, the differentially expressed genes are the real meat that you should focus on in most cases.


or DrawVenn if you have more than 4 lists


Hello, I am also trying to create a Venn diagram to show overlapping RNA-seq data. During my search I found this helpful discussion, which matches my problem, and I am also not used to bioinformatics tools.

You suggested DrawVenn for more than 4 lists (I have 7 lists from my samples). I tried this online tool, but it can't create the Venn diagram, maybe because there are too many lists. Please recommend any other tool that can handle all of my samples.

Sorry, but I am really confused.

Thank you


You should use UpSet plots if you have 7 lists.
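
To make the suggestion concrete: an UpSet plot typically shows, as one bar per combination, how many items fall in each exact combination of lists. The sketch below only computes those counts and does not draw anything. The list names and gene IDs are invented for illustration, and three lists stand in for seven to keep it short.

```python
from collections import Counter

def exclusive_intersections(gene_lists):
    """Count genes by the exact combination of lists they appear in."""
    membership = {}
    for name, genes in gene_lists.items():
        for gene in set(genes):
            membership.setdefault(gene, set()).add(name)
    # Each gene contributes to exactly one combination of list names.
    return Counter(frozenset(names) for names in membership.values())

# Hypothetical gene lists; replace with your own DEG lists.
lists = {
    "stage1": ["g1", "g2", "g3", "g6"],
    "stage2": ["g2", "g3", "g4", "g6"],
    "stage3": ["g3", "g5"],
}

counts = exclusive_intersections(lists)
# Largest combinations first, as the bars in an UpSet plot usually are.
for combo, n in sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(" & ".join(sorted(combo)), n)
```

The same function works unchanged for 7 lists; only the number of possible combinations grows.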


I have a general question regarding the type of data to be used for generation of the Venn diagram: would it make sense to use the genes identified via Gene Ontology? I have several comparisons being done AvsB and BvsA, or AvsC and CvsA, and BvsC and CvsB.

Could the gene list from these comparisons be used to check how similar samples A, B, and C are to each other?

Thank you very much!


Sounds to me like this is sufficiently different from your original question, so you may want to open a new thread for this.

5.4 years ago

It's possible but it will require a few steps.

  • First, as you already did, do the differential expression analysis for each pairwise comparison you are interested in.
  • Take from each of those lists the gene IDs you are interested in, e.g. the down-regulated ones.
  • Put those in a text file (or copy and paste them).
  • Upload them to one of the online tools for drawing Venn diagrams.

Of course, make sure that you use the same criteria for selecting the gene IDs from each DEG comparison.
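
The steps above could be sketched in Python roughly as follows. This is a minimal sketch, assuming each comparison has been exported from Excel as CSV with columns named `gene_id`, `log2FoldChange` and `padj`. The last two are the column names DESeq2 uses in its results tables; `gene_id`, the 0.05 cutoff and the toy data are assumptions made for illustration.

```python
import csv
import io

def select_gene_ids(csv_text, padj_cutoff=0.05, direction="down"):
    """Apply the same selection criteria to one DE results table.

    Assumes columns 'gene_id', 'log2FoldChange' and 'padj'
    (hypothetical layout; adjust the names to match your export).
    """
    selected = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Genes without an adjusted p-value (NA or empty) are skipped.
        if row["padj"] in ("", "NA"):
            continue
        padj = float(row["padj"])
        lfc = float(row["log2FoldChange"])
        wanted = lfc < 0 if direction == "down" else lfc > 0
        if padj < padj_cutoff and wanted:
            selected.add(row["gene_id"])
    return selected

# Toy tables standing in for two exported comparisons.
A_VS_B = """gene_id,log2FoldChange,padj
geneA,-2.1,0.001
geneB,-1.4,0.03
geneC,1.8,0.002
geneD,-0.9,0.20
geneE,-3.0,NA
"""
A_VS_C = """gene_id,log2FoldChange,padj
geneA,-1.7,0.004
geneB,0.5,0.04
geneC,-2.2,0.01
geneF,-1.1,0.02
"""

down_ab = select_gene_ids(A_VS_B)
down_ac = select_gene_ids(A_VS_C)

# One ID per line is the format to paste into an online Venn tool.
print("\n".join(sorted(down_ab)))

# The regions of a two-set Venn diagram, as actual gene lists.
shared = down_ab & down_ac
only_ab = down_ab - down_ac
only_ac = down_ac - down_ab
print(len(only_ab), len(shared), len(only_ac))
```

Computing the regions directly also gives the gene names in each overlap, not just the counts a diagram would show.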


Thank you so much! This is actually what I was struggling with (again, I have really basic questions...).

About the criteria for selecting gene IDs, I think this is key in order to have useful information. I would like to understand how similar the different samples are to each other. It's about developmental biology, so I'm trying to understand how each differentiation stage differs from the other ones.

Could a criterion simply be to take the 300 most highly expressed genes in each set? Or is that too naive? I also have some gene ontology comparisons. Is there a way to take the genes that are represented in the most relevant GO entries and use them to see whether they are commonly expressed?


Yes, that is a criterion, but perhaps not the best one. Personally I would use a significance threshold, something like all genes with p < 0.05, but I assume you will find more and better advice on this specific topic from people within this field.

5.4 years ago

If you are talking about overlapping differentially expressed genes, you could try Vennerable or VennDiagram.

If you have more than 3 or 4 comparisons, finding the same method to use for all of the comparisons may be difficult (although not necessarily impossible, or at least clear differences may be OK if you pick a favorite method). You may want to see if you can set up some sort of multivariate comparison with more samples; or, occasionally, what needs to be done is to use different methods for each comparison, even within one paper.

That said, differential expression is a little different from "commonly expressed." For example, you may have a gene with high expression in all samples. In microarrays, you could have some sort of background signal. However, even with differential expression, there will probably be some false negatives among the genes that don't overlap (which is why I think you need to take time to critically assess your data, ideally trying to find some sort of new question to ask and address, and then try to determine a strategy that fairly represents your overall conclusions).


What exactly do you mean by finding the "same method"?

Well, the idea would be to identify a footprint or molecular signature for each condition, and then compare the signatures with each other to understand how similar the conditions are.
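
One simple way to put a number on "how similar" two signatures are is the Jaccard index (shared genes divided by all genes in either set). Nobody in this thread proposed it; it is shown here only as an illustration, and the condition names and gene IDs are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical signatures, e.g. the significant DEGs chosen per condition.
signatures = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5"},
    "C": {"g4", "g6"},
}

# Pairwise similarity between all conditions.
similarity = {
    (x, y): jaccard(signatures[x], signatures[y])
    for x, y in combinations(sorted(signatures), 2)
}
for (x, y), score in similarity.items():
    print(f"{x} vs {y}: {score:.2f}")
```

A value of 1 means identical gene sets and 0 means no overlap. The result depends heavily on how the signatures were selected, so the same criteria should be used for every condition.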


In general, I would recommend testing at least edgeR, limma-voom, and DESeq2 for differential expression.


OK! Yes all samples were analyzed with DESeq2 prior to the analysis!

4.3 years ago
Liang Sun ▴ 10

You can use the online tool DiVenn, which will generate a nice graph: https://divenn.noble.org
