How to Calculate FDR in permutation F test
7.0 years ago
hellocita ▴ 40

Hi all, I am a little confused about how to calculate FDR after permutation F test.

Assume there are 6000 genes in my data. For each gene, I run a permutation F test and get 1000 F values: 1 original F value and 999 permuted F values. I then calculate the p-value as sum(F > F-original)/1000.

But I am confused about how to calculate the FDR. I think it should be FDR = (number of false positive genes) / (number of genes with permutation p < 0.05).

Thank you in advance:)

R • 5.4k views

Hi! Did you find an answer to your question? To my understanding, for each gene you have to calculate: perm_p-value = (number of permutation p-values <= experimental p-value + 1) / (total number of permutations + 1). For an F test this is the same as counting the permuted F values that are >= the original F value. So your formula is not quite correct as written. To perform FDR correction, take your raw p-values and adjust them, e.g. with the base R function p.adjust(method='fdr').
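A minimal R sketch of the two steps in this comment, for illustration only. The toy data, the helper names f_stat() and perm_pvalue(), and the example p-values in p_raw are all hypothetical; only anova(), lm(), sample() and p.adjust() are base R functions.

```r
# Toy data (hypothetical): one gene, three groups of four samples each.
set.seed(1)
expr  <- c(rnorm(4, 0), rnorm(4, 0), rnorm(4, 2))
group <- factor(rep(c("A", "B", "C"), each = 4))

# F statistic from a one-way ANOVA (hypothetical helper).
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

# Permutation p-value with the +1 correction (hypothetical helper):
# shuffle the group labels, recompute F, and count permuted F >= observed F.
perm_pvalue <- function(y, g, n_perm = 999) {
  f_obs  <- f_stat(y, g)
  f_perm <- replicate(n_perm, f_stat(y, sample(g)))
  (sum(f_perm >= f_obs) + 1) / (n_perm + 1)
}

p <- perm_pvalue(expr, group)

# With one raw p-value per gene, adjust them all together.
# These five values are made up purely to show the call.
p_raw <- c(0.001, 0.004, 0.020, 0.300, 0.750)
p_fdr <- p.adjust(p_raw, method = "fdr")  # "fdr" is an alias for "BH"
```

Note that with 999 permutations the smallest possible p-value is 1/1000, so with thousands of genes more permutations may be needed before any adjusted value can fall below the chosen cutoff.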

7.0 years ago

The FDR is the expected proportion of false positives among the tests called significant at a given p-value threshold. It is estimated as E[false positives]/E[significant tests]. E[significant tests] is just the number of tests called significant at the chosen threshold. The problem is then to estimate the number of false positives. This is the number of true negatives times the probability of calling one of them significant, which is the chosen threshold. So we need to estimate the number of true negatives. For this we can assume that the distribution of p-values for true negatives is uniform, plot a histogram of the observed p-values and find the region where the distribution is flat. The height of this part gives an estimate of the proportion of true negatives. In practice, one finds a value lambda after which the p-value distribution is flat, and the proportion of true negatives is (number of p-values greater than lambda) / ((1 - lambda) × total number of tests). See Storey, J. D. and R. Tibshirani (2003). “Statistical significance for genome-wide studies.” Proceedings of the National Academy of Sciences 100(16): 9440-9445.
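The estimate described above can be sketched in a few lines of base R. Everything specific here is an assumption made for illustration: the p-values are simulated rather than real, lambda = 0.5 is simply picked as a point where the histogram is taken to be flat, and the variable names are arbitrary.

```r
# Simulated p-values (hypothetical): 5400 true negatives with uniform
# p-values and 600 true positives with p-values concentrated near 0.
set.seed(42)
p <- c(runif(5400), rbeta(600, 0.2, 5))

lambda    <- 0.5   # assumed point beyond which the histogram looks flat
threshold <- 0.05  # p-value cutoff for calling a test significant
m         <- length(p)

# Proportion of true negatives: p-values above lambda, rescaled by the
# width of that region and the total number of tests.
pi0 <- min(1, sum(p > lambda) / ((1 - lambda) * m))

n_sig     <- sum(p <= threshold)       # tests called significant
exp_false <- pi0 * m * threshold       # expected false positives
fdr_hat   <- exp_false / max(n_sig, 1) # estimated FDR at this threshold
```

Running hist(p) on real p-values before choosing lambda is a quick way to check whether the flat region the method relies on is actually there.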
This is related to the q-value, which is the minimum FDR at which a particular test would be called significant. This is probably what you want, and it is available through the qvalue() function in the qvalue R package.
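As a rough sketch, q-values for a fixed lambda can also be computed in base R, since with a fixed estimate of the proportion of true negatives they amount to that proportion times the Benjamini-Hochberg adjusted p-values. The simulated p-values and lambda = 0.5 are assumptions for illustration. The qvalue package chooses its own lambda by default, so its output will generally differ somewhat from this hand-rolled version.

```r
# Simulated p-values (hypothetical): 5400 true negatives, 600 true positives.
set.seed(42)
p <- c(runif(5400), rbeta(600, 0.2, 5))

lambda <- 0.5  # assumed flat region of the p-value histogram
pi0    <- min(1, mean(p > lambda) / (1 - lambda))

# q-value: the smallest estimated FDR at which each test is called significant.
q <- pi0 * p.adjust(p, method = "BH")

n_called <- sum(q <= 0.05)  # tests called significant at an FDR of 5%

# Optional: the qvalue package (Bioconductor), only if it is installed.
if (requireNamespace("qvalue", quietly = TRUE)) {
  q_pkg <- qvalue::qvalue(p)$qvalues
}
```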
