Question: Pathway analysis and Gene enrichment Analysis Queries

Hi all,

I am working with RNA-seq samples and plan to do pathway analysis and gene enrichment analysis. I don't have much background in these analyses yet and am currently doing some background reading. If you know of any useful resources, please share them.

  • To begin with, why do we do pathway analysis and gene enrichment analysis?
  • I have a set of genes that are upregulated and downregulated between wild type and mutant. How do I get an enrichment score for the upregulated genes and for the downregulated genes?
  • How do I identify which pathways are enriched in the wild type and mutant samples?
  • How do I identify which pathways are enriched in the upregulated or downregulated genes?
14 months ago, bioinforesearchquestions • 230 • updated 14 months ago, dz2353 • 70

To begin with, why do we do pathway analysis and gene enrichment analysis?

Sorry, but please do your own background reading to understand this. Go to a search engine, type in the keywords "ncbi enrichment pathway analysis", and start reading.

I have a set of genes that are upregulated and downregulated between wild type and mutant. How do I get an enrichment score for the upregulated genes and for the downregulated genes?

Perform the enrichment separately, using the direction of fold-change to determine up- and down-regulation.
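
As a rough illustration of what "perform the enrichment separately" can mean, the sketch below splits genes by the sign of their log2 fold-change and runs a one-sided hypergeometric (over-representation) test on each list. The gene names, the example gene set, and the background size are all made up, and the code assumes that a positive log2 fold-change means "up in the condition of interest" (flip the signs if your ratio is the other way round). Dedicated tools do this across many gene sets and correct for multiple testing, so treat this only as a toy version of the idea.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of at least k gene-set hits when drawing n genes
    from a background of N genes, of which K belong to the gene set."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def split_by_direction(log2fc):
    """Split genes into up and down sets by the sign of the log2 fold-change."""
    up = {gene for gene, fc in log2fc.items() if fc > 0}
    down = {gene for gene, fc in log2fc.items() if fc < 0}
    return up, down


# Hypothetical differential expression results (gene -> log2 fold-change).
log2fc = {
    "Cacna1c": 2.1, "Camk2a": 1.8, "Calm1": 1.5, "Actb": 1.2,
    "Gapdh": -1.4, "Myc": -2.0,
}
# Hypothetical gene set and assumed number of genes in the background.
gene_set = {"Cacna1c", "Camk2a", "Calm1", "Ryr2"}
background_size = 100

up, down = split_by_direction(log2fc)
for label, genes in (("up", up), ("down", down)):
    overlap = len(genes & gene_set)
    p = hypergeom_pvalue(overlap, len(genes), len(gene_set), background_size)
    print(label, "overlap:", overlap, "p-value:", p)
```

The background should be the set of genes that could have been detected in the experiment, not the whole genome, which is why it is an explicit parameter here.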

How do I identify which pathways are enriched in the wild type and mutant samples?

There are different ways to do this. It could be the same as my answer to the previous point, or you could define a Z-score threshold for 'expressed' versus 'not expressed' (using the entire unfiltered dataset), and then perform the enrichment and/or pathway analysis separately on the genes passing the threshold in wild type and then in mutant.
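
One possible reading of the Z-score idea, sketched under stated assumptions: expression values are log2-transformed with a pseudocount, Z-scores are computed across all genes within a sample, and genes above a cutoff are called "expressed". The cutoff of 0, the pseudocount of 1, and the expression values are illustrative choices, not recommendations.

```python
from math import log2
from statistics import mean, pstdev


def expressed_genes(expression, z_cutoff=0.0, pseudocount=1.0):
    """Return genes whose within-sample Z-score (on log2 scale) exceeds z_cutoff.

    expression: dict of gene -> expression value for ONE sample, using the
    entire unfiltered dataset so the mean and SD reflect all genes.
    """
    logged = {gene: log2(value + pseudocount) for gene, value in expression.items()}
    mu = mean(logged.values())
    sd = pstdev(logged.values())
    if sd == 0:
        return set()  # no variation, so no gene stands out
    return {gene for gene, x in logged.items() if (x - mu) / sd > z_cutoff}


# Hypothetical expression values for each condition.
wild_type = {"GeneA": 120.0, "GeneB": 0.0, "GeneC": 35.0, "GeneD": 0.5}
mutant = {"GeneA": 2.0, "GeneB": 80.0, "GeneC": 40.0, "GeneD": 0.0}

wt_expressed = expressed_genes(wild_type)
mut_expressed = expressed_genes(mutant)
print("expressed in wild type:", sorted(wt_expressed))
print("expressed in mutant:", sorted(mut_expressed))
```

Each resulting gene list would then go into the enrichment or pathway tool separately, one run for wild type and one for mutant.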

How do I identify which pathways are enriched in the upregulated or downregulated genes?

Perform the pathway analysis separately, using the direction of fold-change to determine up- and down-regulation.

-----------------------------

Some resources to get you started:

Kevin

14 months ago, Kevin Blighe 43k

Adding to the list,

Command-line based: Gene Set Clustering based on Functional annotation (GeneSCF)

14 months ago, EagleEye 6.4k

I'm also going to recommend my very recent answer to a similar question for why we do enrichment analyses and how they work.

Other resources include clusterProfiler (R) and enrichR (web-based and R).

14 months ago, jared.andrews07 ♦ 2.4k

Good answer on the other thread, jared - had not seen it. Thanks!

14 months ago, Kevin Blighe 43k

Hi Kevin, Sample1 is the mutant and Sample2 is the wild type. The cuffdiff output file given to me contains 680 genes. Just to check my understanding, I computed log2(Value_2/Value_1), i.e. wild type/mutant, and got the same logFC as in the cuffdiff output. As you suggested, I have now categorized the genes by log fold change.

For 110 genes, the logFC values are positive, ranging from 1.02 to 4.8, so these genes are downregulated in the mutant sample.
For 570 genes, the logFC values are negative, ranging from -9 to -1, so these genes are upregulated in the mutant sample. Is my understanding correct?
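
The sign convention described above can be checked with a small sketch. It assumes, as stated in this post, that value_1 is the mutant and value_2 is the wild type, so log2(value_2/value_1) is positive when a gene is lower in the mutant. The function name and the expression values are hypothetical.

```python
from math import log2


def classify(value_1, value_2):
    """Return (log2 fold-change, direction relative to the mutant).

    Assumes value_1 = mutant and value_2 = wild type, so the ratio is
    wild type / mutant and a POSITIVE value means lower in the mutant.
    """
    if value_1 <= 0 or value_2 <= 0:
        # The ratio is undefined or infinite when one sample has zero expression.
        return None, "undefined (zero expression in one sample)"
    lfc = log2(value_2 / value_1)
    if lfc > 0:
        return lfc, "down in mutant"
    if lfc < 0:
        return lfc, "up in mutant"
    return lfc, "unchanged"


# Hypothetical expression values (mutant, wild type).
print(classify(5.0, 20.0))   # wild type is 4x higher than mutant
print(classify(64.0, 8.0))   # mutant is 8x higher than wild type
```

If the samples were supplied in the opposite order, the same logFC values would carry the opposite meaning, so it is worth confirming the order before labelling the lists.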

I am planning to use GSEA and have prepared three ranked gene list files (sorted by logFC, descending):

1) all 680 genes with their logFC values

2) the 570 upregulated genes with their logFC values

3) the 110 downregulated genes with their logFC values

Should I run GSEA separately on the upregulated and downregulated gene lists, or on the full gene list?
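
For reference, a minimal sketch of how the three ranked lists could be generated. It assumes the two-column, tab-separated gene/score layout that GSEA's pre-ranked mode reads, and it keeps this thread's sign convention (logFC is wild type/mutant, so negative means up in the mutant). The gene names and scores are placeholders.

```python
def to_rnk(scores):
    """Format gene -> score pairs as tab-separated lines, highest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return "".join(f"{gene}\t{score}\n" for gene, score in ranked)


# Hypothetical logFC values, computed as log2(wild type / mutant).
log_fc = {"GeneA": 4.8, "GeneB": 1.02, "GeneC": -1.0, "GeneD": -9.0}

all_genes = to_rnk(log_fc)
# Negative logFC = higher in mutant under the convention above.
up_in_mutant = to_rnk({g: s for g, s in log_fc.items() if s < 0})
down_in_mutant = to_rnk({g: s for g, s in log_fc.items() if s > 0})

print(all_genes, end="")
```

Each string can then be written to its own file, for example with `open(path, "w").write(all_genes)`, where the path is whatever name you choose.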

14 months ago, bioinforesearchquestions • 230

I would likely run all three lists, as you can make different statements about each. For the full list, you can say that enriched pathways are perturbed or deregulated. Maybe the genes are split between up/down regulated. It still provides you something to hypothesize about, though actual effects would have to be measured more directly.

The up/down lists yield more direct observations. For instance, maybe many genes involved in calcium signaling are upregulated in the mutant, which might allow you to speculate something about the mutant phenotype. Perhaps something that could be easily experimentally validated.

Either way, running an additional list is easy, so there's no reason not to do all 3 sets.

14 months ago, jared.andrews07 ♦ 2.4k

Thanks, Jared. I have run GSEA on all three, but I am not sure which one is most meaningful to interpret.

For instance, when I ran GSEA on the upregulated gene list (570 genes), I selected the gene set database "Mouse_GOBP_AllPathways_no_GO_iea_October_01_2018_symbol.gmt". GSEA finished successfully. The GSEA report for the upregulated gene list shows:

100/648 gene sets are upregulated in phenotype na_pos

42 gene sets are significant at FDR < 25%

32 gene sets are significantly enriched at nominal pvalue < 1%

548/648 gene sets are upregulated in phenotype na_neg

35 gene sets are significantly enriched at FDR < 25%

24 gene sets are significantly enriched at nominal pvalue < 1%

What are na_pos and na_neg? Do they correspond to mutant and wild type? How do I know which is which?

How should I interpret these values?

14 months ago, bioinforesearchquestions • 230

Hi, maybe you can try Metascape (web-based). For pathway analysis I used IPA, but it is commercial software.

14 months ago, dz2353 • 70
