Question

How can I determine which statistical method is appropriate?

0

7.4 years ago

Reza ▴ 10

Hello everyone,

I applied several statistical tests (such as the F-test, …) to my data. Now I am trying to plot a ROC curve to determine which method is most appropriate. I estimated the FDR and P-value for each method, but I can't calculate the true positive rate (TPR) and false positive rate (FPR). Could you please explain how to calculate TPR and FPR, or share R code for it?

Reza

statistics R • 1.5k views

ADD COMMENT • link updated 12 months ago by Ram 43k • written 7.4 years ago by Reza ▴ 10

1

It is not very clear what you want to do. To evaluate your methods, you need a truth set that tells you which calls are truly positive and which are truly negative. Only then can you compare the performance of different methods.

ADD REPLY • link 7.4 years ago by Santosh Anand 5.7k

0

Please be as informative as possible when asking questions.

ADD REPLY • link 7.4 years ago by WouterDeCoster 47k

score 1 · Answer 1 · 2016-11-13

1

7.4 years ago

Devon Ryan 104k

You can't calculate TPR or FPR, since you have real data, where the true status of each gene is unknown. Why not post what sort of data you're trying to analyse and the methods you're working with? There might be an a priori best way to analyze things.

ADD COMMENT • link 7.4 years ago by Devon Ryan 104k

0

Thanks a lot for the reply. I have a set of microarray data and have normalized the raw expression data for each sample. I applied different statistical tests (F-test, t-test…) to compare groups for each gene. Now I want to determine the best test for this. Can I use the empirical cumulative distribution function (ECDF) of the P-values from each test? And how do I find an a priori best method?

ADD REPLY • link 7.4 years ago by Reza ▴ 10

2

Please just use limma.

ADD REPLY • link 7.4 years ago by Devon Ryan 104k

3

To try and make this clear: FDR, TPR, and ROC curves are used for evaluating methods under controlled conditions. That is, we use them when we have a dataset where the truth is known, or simulated data of some sort. When you are looking at an unknown/experimental dataset, you can't do this. To decide which statistical analyses to use, you should rely on the literature, where people have hopefully compared the tools you are considering head to head.
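As a rough illustration of the controlled-conditions case, here is a minimal sketch in Python (standard library only; it is not R and does not use limma). It simulates expression values where the truly changed genes are known by construction, scores each gene with the absolute Welch t statistic, and computes TPR and FPR at each threshold. All function names, the number of genes, the group size, and the effect size are assumptions made up for the example, not anything taken from the question.

```python
import random
import statistics


def welch_t(a, b):
    """Welch t statistic for two samples; |t| is used only as a ranking score."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate(n_genes=1000, n_de=100, n_per_group=5, effect=2.0, seed=1):
    """Simulate two groups per gene; the first n_de genes are truly changed."""
    rng = random.Random(seed)
    truth, scores = [], []
    for g in range(n_genes):
        is_de = g < n_de  # known truth, available only because we simulated it
        shift = effect if is_de else 0.0
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        truth.append(is_de)
        scores.append(abs(welch_t(a, b)))
    return truth, scores


def tpr_fpr(truth, scores, threshold):
    """TPR = TP / (all true positives), FPR = FP / (all true negatives)."""
    tp = sum(t and s >= threshold for t, s in zip(truth, scores))
    fp = sum((not t) and s >= threshold for t, s in zip(truth, scores))
    pos = sum(truth)
    neg = len(truth) - pos
    return tp / pos, fp / neg


def roc_points(truth, scores):
    """(FPR, TPR) pairs from the strictest threshold to the most lenient one."""
    points = [(0.0, 0.0)]
    for th in sorted(set(scores), reverse=True):
        tpr, fpr = tpr_fpr(truth, scores, th)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    truth, scores = simulate()
    print("AUC on simulated data:", round(auc(roc_points(truth, scores)), 3))
```

The `truth` list is the piece that does not exist for an experimental dataset, which is why the same calculation cannot be run on real microarray data.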

I'll also point out that, based on your response to Devon's answer, where you mention using the F-test, t-test, etc., there is a very good chance you are not analyzing this data correctly. There are many well-described software tools for analyzing microarray data. You should definitely use one of these tools, which a lot of very smart people with deep statistical knowledge have developed for just this purpose. Limma is a good choice. While there are tons of publications that have used things like t-tests on microarray data (particularly years ago), it is absolutely not an appropriate statistical test for expression data.

ADD REPLY • link 7.4 years ago by DG 7.3k