GSEA Pre-Ranked File Format
10.8 years ago

Can I use p-values in the pre-ranked file provided to GSEA? The format of the file would look like (Gene symbol -- p-value):

HGMGCS1     1.83E-13
TMEM97      2.14E-11
..
..
NDST4       0.546

The files you are providing are gene expressions, gene sets, and a grouping, aren't they? Where do you want to use the p-values? Maybe instead of expression levels?


I provide only the pre-ranked file, to see whether the gene sets in the database used by GSEA are enriched for the genes in my pre-ranked file.

7.8 years ago
siabadaba ▴ 120

The biggest problem with using straightforward p-values is that you lose information about up/down regulation. Personally, I've been satisfied with the results of my analysis using this (p-value-based) way of computing a rank-ordered gene list:

http://genomespot.blogspot.com.au/2016/04/how-to-generate-rank-file-from-gene.html
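
One common way to keep the direction of regulation in a p-value-based ranking is to multiply -log10(p) by the sign of the log fold change. The sketch below is only an illustration of that idea, not necessarily the exact recipe from the link above; the function name `signed_score`, the gene names, and the fold-change values are all made up for the example.

```python
import math

def signed_score(p_value, log_fold_change):
    """Hypothetical helper: sign(logFC) * -log10(p)."""
    # Clamp p so that a p-value of exactly 0 does not break log10.
    p = max(p_value, 1e-300)
    if log_fold_change > 0:
        sign = 1.0
    elif log_fold_change < 0:
        sign = -1.0
    else:
        sign = 0.0
    return sign * -math.log10(p)

# Made-up (gene, p-value, log fold change) triples, for illustration only.
genes = [
    ("GENE_A", 1.83e-13, 2.1),   # strongly up-regulated
    ("GENE_B", 2.14e-11, -1.7),  # strongly down-regulated
    ("GENE_C", 0.546, 0.1),      # not significant
]

scores = {name: signed_score(p, lfc) for name, p, lfc in genes}

# Highest score first: significant up-regulated genes end up at the top,
# significant down-regulated genes at the bottom.
ranked = sorted(scores, key=scores.get, reverse=True)
```

With a score like this, significant genes sit at both ends of the list and non-significant genes fall in the middle, which is what a ranked enrichment test can make use of.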

10.8 years ago

If you read the documentation, you'll notice a few sentences directly applicable to your case:

For instance, you might have used your favorite tTest-like statistic to produce a ranked ordered gene list from your dataset which you now want to test for enrichment. Order of lines does not matter. It is important, however, that the second column will have numeric values - they will be used to rank order genes by GSEA.
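
As a rough illustration of what the quoted passage asks for, here is a minimal sketch that turns a gene-to-score mapping into two tab-separated columns with a numeric second column. The helper name `to_rnk_text`, the output file name, and the example scores are assumptions made for this example, not part of GSEA.

```python
def to_rnk_text(scores):
    """Hypothetical helper: format {gene: score} as 'gene<TAB>score' lines."""
    lines = []
    for gene, score in scores.items():
        # float() raises ValueError if the score is not numeric,
        # which is exactly what the second column must not be.
        value = float(score)
        lines.append(f"{gene}\t{value:.6g}")
    return "\n".join(lines) + "\n"

# Made-up scores; the order of the lines does not matter.
example_scores = {"GENE_A": 12.74, "GENE_B": -10.67, "GENE_C": 0.26}
rnk_text = to_rnk_text(example_scores)

# To save it, something like the following would do (file name is arbitrary):
# with open("my_ranking.rnk", "w") as handle:
#     handle.write(rnk_text)
```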


I do not understand what they mean by tTest-like statistic. Does this include the Wilcoxon signed-rank test? (This is the test I am using to get my p-values.) In addition, if I give the gene list with p-values as described above, the genes will be ranked by p-value in decreasing order. This means that the most significant gene in my list (HGMGCS1) will be last.
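
The ordering concern can be checked on the three genes from the question. This is just a small sketch: sorting the raw p-values in decreasing order puts the most significant gene last, while sorting -log10(p) in decreasing order puts it first. The -log10 transform is one possible fix assumed here, and it still carries no up/down direction.

```python
import math

# The three genes and p-values shown in the question.
p_values = {"HGMGCS1": 1.83e-13, "TMEM97": 2.14e-11, "NDST4": 0.546}

# Decreasing order of the raw p-value: least significant gene comes first.
by_raw_p = sorted(p_values, key=p_values.get, reverse=True)

# Decreasing order of -log10(p): most significant gene comes first.
neg_log_p = {gene: -math.log10(p) for gene, p in p_values.items()}
by_neg_log_p = sorted(neg_log_p, key=neg_log_p.get, reverse=True)

print(by_raw_p)
print(by_neg_log_p)
```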


What they imply is that it does not matter HOW you came up with the numbers. What does matter is that the second column has numeric values, as they are used to rank the genes.

What it boils down to is that YOU have to judge whether the Wilcoxon signed-rank test is a sensible way to calculate a p-value used to rank your data.
