What's The Meaning Of "Random Walk" In The GSEA Paper Published By The Broad Institute?
5.9 years ago
rolyata47 • 40
United States

The Gene Set Enrichment Analysis (GSEA) algorithm, outlined in this paper, often refers to a "random walk" used to traverse the ranked list L of gene-to-phenotype correlations.

However, what they actually do in the paper does not look like a random walk at all. It seems to me that they traverse the ranked list L sequentially, from rank 1 (highest correlation) onwards.

Could anyone clear up what they mean by "random walk", and why they use the term when it looks like they are really doing a sequential walk, which is quite the opposite?

Also, as a follow-up question: how do they avoid favoring the top of the ranked list L over the bottom? If we assume for the moment that they are doing a sequential walk, which seems to be the case, then gene sets found at the bottom extreme will have a larger value of P_miss by the time the walk reaches them, since P_miss is proportional to i. As a consequence, they will have smaller enrichment scores.

Perhaps this is related to the question above, since a sequential walk does not seem to work here...

I appreciate any help... I suspect I am not understanding something correctly...

12 months ago
Leandro Lima • 920
San Francisco, CA


Maybe this article can help you.

Introduction to Statistical Methods for Analyzing Large Data Sets: Gene-Set Enrichment Analysis


Hey, thanks! I think this article made it clear. They are comparing the supremum of the running sum (the ES) with what it would be for a random walk. Gene sets concentrated at the top or the bottom of the list will have a more extreme ES, while gene sets that are randomly distributed through the list will produce a running sum that resembles a random walk. Thanks!
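
For anyone who finds this thread later, here is a minimal Python sketch of the running-sum statistic as I understand it from the paper. The list is still traversed in rank order; the "random walk" is only what the running sum looks like when the members of the gene set are scattered randomly through the list. The function name `enrichment_score`, the toy correlations, and the three example gene sets are made up for illustration, and this is not the Broad Institute's implementation.

```python
import random

def enrichment_score(ranked_genes, correlations, gene_set, p=1.0):
    """Walk the ranked list in order; step up on a hit, down on a miss.

    Returns the signed maximum deviation of the running sum from zero.
    Assumes gene_set contains some, but not all, of ranked_genes.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    # N_R: total weight |r|^p carried by the genes in the set
    n_r = sum(abs(r) ** p for r, h in zip(correlations, hits) if h)

    running, best = 0.0, 0.0
    for r, h in zip(correlations, hits):
        # P_hit grows by |r|^p / N_R on a hit; P_miss grows by 1/(N - N_H) on a miss
        running += abs(r) ** p / n_r if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Toy data (hypothetical): 1000 genes, correlations falling linearly from +1 to -1
n = 1000
genes = [f"g{i}" for i in range(n)]
corr = [1.0 - 2.0 * i / (n - 1) for i in range(n)]

random.seed(0)
top_set = set(genes[:20])                    # clustered at the top of the list
bottom_set = set(genes[-20:])                # clustered at the bottom of the list
random_set = set(random.sample(genes, 20))   # scattered through the list

es_top = enrichment_score(genes, corr, top_set)
es_bottom = enrichment_score(genes, corr, bottom_set)
es_random = enrichment_score(genes, corr, random_set)

print("top:", es_top, "bottom:", es_bottom, "random:", es_random)
```

In this toy setup the set at the bottom ends up with an ES near -1 rather than a small one, because the ES is the signed maximum deviation from zero. So the bottom of the list is not penalized; it just shows up with the opposite sign. The scattered set gives a running sum that wanders around zero, which is the random-walk behavior the observed ES is compared against.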

