Pathway Analysis and Gene Enrichment Analysis Queries
2.1 years ago
@bioinforesearchquestions20943

Hi All,

I am working on RNA-seq samples and plan to do pathway analysis and gene enrichment analysis. I don't have much background in these analyses yet and am currently doing some background research. If you know of any useful resources, please share them with me.

• To begin with, why do we do pathway analysis and gene enrichment analysis?
• I have a set of genes that are upregulated and downregulated between wild type and mutant. How do I get an enrichment score for the upregulated genes and for the downregulated genes?
• How do I identify which pathways are enriched in the wild-type and mutant samples?
• How do I identify which pathways are enriched among the upregulated or downregulated genes?
RNAseq Pathway analysis Enrichment score GSEA • 486 views
2.1 years ago
@Kevin Blighe41557

To begin with, why do we do pathway analysis and gene enrichment analysis?

Sorry, please do your own background reading in order to understand this. Go to a search engine, type in the keywords ncbi enrichment pathway analysis, and then start to read.

I have a set of genes that are upregulated and downregulated between wild type and mutant. How do I get an enrichment score for the upregulated genes and for the downregulated genes?

Perform the enrichment separately for each group, using the direction (sign) of the fold-change to define up- and down-regulation.

How do I identify which pathways are enriched in the wild-type and mutant samples?

There are different ways to do this. It could be the same as the answer that I gave to the previous point, or you could define a threshold Z-score for 'expressed' versus 'not expressed' (using the entire unfiltered dataset) and then perform the enrichment and/or pathway analysis separately on the genes passing the threshold in wild type and, then, in mutant.
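A minimal sketch of the Z-score idea, assuming expression values for one sample are held in a plain dictionary. The log2(x + 1) transform, the cutoff of 0, the function name `expressed_genes`, and the gene names are all illustrative assumptions, not part of the original answer.

```python
from math import log2
from statistics import mean, pstdev


def expressed_genes(expr, z_cutoff=0.0):
    """Return the genes whose within-sample Z-score exceeds z_cutoff.

    expr maps gene name -> expression value for ONE sample, taken from
    the entire unfiltered dataset. Values are log2(x + 1) transformed
    before the Z-score is computed (an assumption of this sketch).
    """
    logged = {gene: log2(value + 1) for gene, value in expr.items()}
    mu = mean(logged.values())
    sd = pstdev(logged.values())
    return {gene for gene, x in logged.items() if (x - mu) / sd > z_cutoff}


# Hypothetical toy expression values for each condition.
wild_type = {"GeneA": 0, "GeneB": 3, "GeneC": 15, "GeneD": 255}
mutant = {"GeneA": 255, "GeneB": 15, "GeneC": 3, "GeneD": 0}

expressed_wt = expressed_genes(wild_type)
expressed_mut = expressed_genes(mutant)
print(sorted(expressed_wt), sorted(expressed_mut))
```

Each resulting gene set would then go into the enrichment or pathway tool on its own, one run for wild type and one for mutant.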

How do I identify which pathways are enriched among the upregulated or downregulated genes?

Perform the pathway analysis separately for each group, using the direction (sign) of the fold-change to define up- and down-regulation.
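The "split by direction, then test each list separately" approach in the answers above can be sketched as follows. This is only an illustration using a one-sided hypergeometric (over-representation) test. The gene names, pathway names, the |log2FC| >= 1 cutoff, and the assumption that the fold-change is mutant over wild type are all hypothetical, and dedicated tools additionally correct for multiple testing.

```python
from math import comb


def split_by_direction(log2fc, min_abs=1.0):
    """Split genes into up/down lists by the sign of the log2 fold-change."""
    up = sorted(g for g, fc in log2fc.items() if fc >= min_abs)
    down = sorted(g for g, fc in log2fc.items() if fc <= -min_abs)
    return up, down


def hypergeom_tail(k, n, K, N):
    """P(overlap >= k) when n genes are drawn from N, K of them in the pathway."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def over_representation(genes, pathways, background):
    """Return an uncorrected p-value per pathway for one gene list."""
    background = set(background)
    genes = set(genes) & background
    pvalues = {}
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(genes & members)
        pvalues[name] = hypergeom_tail(
            overlap, len(genes), len(members), len(background)
        )
    return pvalues


# Hypothetical toy data; log2fc is assumed to be log2(mutant / wild type).
log2fc = {"GeneA": 2.5, "GeneB": 1.8, "GeneC": -3.0, "GeneD": -1.2, "GeneE": 0.3}
background = ["Gene" + letter for letter in "ABCDEFGHIJ"]
pathways = {
    "toy_pathway_1": ["GeneA", "GeneB", "GeneF"],
    "toy_pathway_2": ["GeneC", "GeneD", "GeneG"],
}

up, down = split_by_direction(log2fc)
up_p = over_representation(up, pathways, background)
down_p = over_representation(down, pathways, background)
print(up_p, down_p)
```

If the fold-change is computed the other way round (wild type over mutant), the two lists simply swap roles.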

## -----------------------------

Some resources to get you started:

Kevin


Command-line based: Gene Set Clustering based on Functional annotation (GeneSCF)


I'm also going to recommend my very recent answer to a similar question for why we do enrichment analyses and how they work.

Other resources include clusterProfiler (R) and enrichR (web-based and R).


Hi Kevin, Sample 1 is the mutant and Sample 2 is the wild type. The cuffdiff output file given to me lists 680 genes. Just to check my understanding, I computed log2(value_2/value_1), i.e. wild type/mutant, and got the same logFC as in the cuffdiff output. As you suggested, I have now categorized the genes by log fold change.

For 110 genes, the logFC values are positive and range from 1.02 to 4.8, so these genes are downregulated in the mutant sample.
For 570 genes, the logFC values are negative and range from -9 to -1, so these genes are upregulated in the mutant sample. Is my understanding correct?
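The sign convention described above can be checked with a small worked example. This is a sketch that assumes, as stated in this comment, that value_1 is the mutant and value_2 is the wild type; the function names and the numbers are made up for illustration.

```python
from math import log2


def log2_fold_change(value_1, value_2):
    """log2(value_2 / value_1), the ratio described in the comment above."""
    return log2(value_2 / value_1)


def direction_in_mutant(log2fc):
    """Interpret the sign, assuming value_1 = mutant and value_2 = wild type."""
    if log2fc > 0:
        return "downregulated in mutant"  # wild type is higher
    if log2fc < 0:
        return "upregulated in mutant"    # mutant is higher
    return "unchanged"


# Hypothetical expression values: mutant = 2.0, wild type = 8.0.
fc = log2_fold_change(2.0, 8.0)
print(fc, direction_in_mutant(fc))

# Swapped values: mutant = 8.0, wild type = 2.0.
fc_swapped = log2_fold_change(8.0, 2.0)
print(fc_swapped, direction_in_mutant(fc_swapped))
```

Under that sample assignment, a positive logFC means the wild-type value is larger, so the gene is lower in the mutant, which matches the reading given above.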

I am planning to use GSEA. I have prepared three ranked gene list files (sorted by logFC in descending order):

1) all 680 genes with their logFC values

2) the 570 upregulated genes with their logFC values

3) the 110 downregulated genes with their logFC values

Should I run GSEA separately on the upregulated and downregulated gene lists, or on the total gene list?
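For reference, a ranked list of the kind described above can be sketched as follows. It assumes the usual two-column, tab-separated layout of a GSEA .rnk file (gene identifier, then score); the helper name `to_rnk`, the gene names, and the values are hypothetical.

```python
def to_rnk(log2fc):
    """Return .rnk-style text: one 'gene<TAB>score' line per gene,
    sorted by score in descending order."""
    ranked = sorted(log2fc.items(), key=lambda item: item[1], reverse=True)
    return "".join(f"{gene}\t{score}\n" for gene, score in ranked)


# Hypothetical logFC values standing in for the cuffdiff output.
all_genes = {"GeneA": -1.5, "GeneB": 2.0, "GeneC": -9.0, "GeneD": 1.02}

# The three lists described above: all genes, negative logFC only
# (upregulated in mutant under the wild type/mutant ratio), and
# positive logFC only (downregulated in mutant).
negative_only = {g: fc for g, fc in all_genes.items() if fc < 0}
positive_only = {g: fc for g, fc in all_genes.items() if fc > 0}

rnk_all = to_rnk(all_genes)
rnk_negative = to_rnk(negative_only)
rnk_positive = to_rnk(positive_only)
print(rnk_all)
```

Each string can be written to its own file and loaded as a separate preranked list.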


I would likely run all three lists, as you can make different statements about each. For the full list, you can say that enriched pathways are perturbed or deregulated. Maybe the genes are split between up/down regulated. It still provides you something to hypothesize about, though actual effects would have to be measured more directly.

The up/down lists yield more direct observations. For instance, maybe many genes involved in calcium signaling are upregulated in the mutant, which might allow you to speculate something about the mutant phenotype. Perhaps something that could be easily experimentally validated.

Either way, running an additional list is easy, so there's no reason not to do all 3 sets.


Thanks, Jared. I have run GSEA on all three lists, but I am not sure which one is more meaningful to interpret.

For instance, I ran GSEA on the upregulated gene list (570 genes) and selected the gene set database "Mouse_GOBP_AllPathways_no_GO_iea_October_01_2018_symbol.gmt". GSEA finished successfully. The GSEA report for the upregulated gene list shows:

100/648 gene sets are upregulated in phenotype na_pos

42 gene sets are significant at FDR < 25%

32 gene sets are significantly enriched at nominal pvalue < 1%

548/648 gene sets are upregulated in phenotype na_neg

35 gene sets are significantly enriched at FDR < 25%

24 gene sets are significantly enriched at nominal pvalue < 1%

What are na_pos and na_neg? Do they correspond to mutant and wild type? How do I know which is mutant and which is wild type?

How should I interpret these values?
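Some context that may help with the report above: in a preranked GSEA run there are no phenotype labels, so the report uses "na". The na_pos group holds gene sets with a positive enrichment score (their genes sit towards the top of the ranked list) and the na_neg group holds those with a negative score (towards the bottom). Which condition each end corresponds to depends on how the ranking values were computed. The block below is a simplified sketch of the running-sum enrichment score, not the GSEA software's own implementation: it skips the permutation test and the normalization, and the gene names and scores are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Simplified running-sum enrichment score.

    ranked is a list of (gene, score) pairs sorted by score, descending.
    Walking down the list, the running sum rises at genes in gene_set
    (weighted by |score| ** weight) and falls at every other gene.
    The value returned is the largest deviation from zero, sign kept.
    """
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene_set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, hit_weights):
        if gene in gene_set:
            running += w / total_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, sorted by score in descending order.
ranked = [("g1", 3.0), ("g2", 2.0), ("g3", 1.0),
          ("g4", -1.0), ("g5", -2.0), ("g6", -3.0)]

es_top = enrichment_score(ranked, {"g1", "g2"})     # genes at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # genes at the bottom
print(es_top, es_bottom)
```

With a ranking of log2(wild type/mutant), a positive score would point to genes higher in the wild type and a negative score to genes higher in the mutant.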

2.1 years ago
dz2353 • 80
@dz235350774

Hi, maybe you can try Metascape (web-based). For pathway analysis, I used IPA, but it is commercial software.