Question

Using FDR to gate values in NGS comparison t-tests

0

9.8 years ago

BarristanTheBold • 0

New to NGS analysis, but that's the task I've been assigned. I have received NGS data that I am trying to decipher.

I’m attempting to learn what exactly is meant by "unadjusted p-value" and "FDR" when looking at comparison t-tests of genes (the comparisons are between NGS of animals treated with drug or placebo). I understand the basic concepts, but not how to make practical use of them. Most of the values seem fairly large (well over 0.1 for p-values, in the 0.1 to 0.9 range for FDR) when looking at data sets of ~20,000 to 40,000 members. My goal here is to determine a value for each that would allow me to gate on the genes with meaningful expression differences. Is there a specific value I should use as the boundary, or some way to calculate it based on the sample size or something similar?

NGS FDR unadjusted p-value • 3.2k views

ADD COMMENT • link updated 2.5 years ago by Ram 43k • written 9.8 years ago by BarristanTheBold • 0

Ram · Answer 1 · 2014-06-26

2

9.8 years ago

Devon Ryan 104k

Ignore unadjusted p-values completely. Unadjusted p-values, also called "raw p-values" or simply p-values, don't have much relevance individually when you perform multiple testing (see this XKCD comic for a nice example of why multiple testing and fishing for changes increase false-positive rates). A common threshold for adjusted p-values (or FDR) is 0.1 (as with p-value thresholds in general, there's some wiggle room here). That's a bit higher than the typical 0.05 that you'd use with a raw p-value, but it turns out to be a convenient trade-off. After making a list of significant findings, sort them by fold-change to help prioritize results.
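As a rough illustration of that workflow, here is a minimal sketch in plain Python. It assumes the FDR column in your data is a Benjamini-Hochberg adjusted p-value (a common choice, but check how yours was computed). The gene names, p-values, and log2 fold changes are made up, and `benjamini_hochberg` and `gate_and_rank` are hypothetical helper names, not functions from any package.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def gate_and_rank(genes, fdr_cutoff=0.1):
    """genes: list of (name, raw_p, log2_fold_change) tuples.

    Keep genes whose adjusted p-value is at or below the cutoff,
    then sort the survivors by absolute fold change, largest first.
    """
    fdr = benjamini_hochberg([p for _, p, _ in genes])
    hits = [(name, q, lfc) for (name, _, lfc), q in zip(genes, fdr)
            if q <= fdr_cutoff]
    return sorted(hits, key=lambda h: abs(h[2]), reverse=True)


# Made-up example data: (gene, raw p-value, log2 fold change).
genes = [
    ("geneA", 0.0001, 0.15),
    ("geneB", 0.002, 2.1),
    ("geneC", 0.03, -1.4),
    ("geneD", 0.04, 0.8),
    ("geneE", 0.5, 3.0),
    ("geneF", 0.9, -0.1),
]

ranked = gate_and_rank(genes, fdr_cutoff=0.1)
for name, q, lfc in ranked:
    print(f"{name}\tFDR={q:.4f}\tlog2FC={lfc:+.2f}")
```

Note that geneE has the largest fold change but does not pass the gate, while geneA has the smallest adjusted p-value but ends up last after sorting. In practice you would normally use the adjusted values your analysis package already reports rather than recomputing them.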

ADD COMMENT • link 9.8 years ago by Devon Ryan 104k

0

This. I see so many people making the mistake of assuming a low p-value means a large effect size.

ADD REPLY • link updated 2.5 years ago by Ram 43k • written 9.8 years ago by David Westergaard ★ 1.5k

0

Isn't the way to combat that to just lower your threshold for calling something significant?

ADD REPLY • link 9.8 years ago by BarristanTheBold • 0

2

Never confuse statistical significance and biological relevance.

ADD REPLY • link 9.8 years ago by Devon Ryan 104k

1

No. The p-value is a measure of significance, and is therefore more related to variation and sample consistency. If all the drug-treated samples were at 102.1% expression, plus or minus 0.001, that would be a highly certain difference without much biological relevance, compared to another gene at 300%, plus or minus 50. As Devon said, use FDR to gate, then sort for high fold change. They will be correlated.
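To put rough numbers on that example, here is a small sketch using only the Python standard library. The expression values are invented to mimic the two genes described (percent of placebo mean, four animals per group), and `welch_t` is a hypothetical helper that computes a Welch t-statistic. A larger absolute t-statistic corresponds to a smaller p-value, so comparing the statistics is enough to make the point.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(var_a + var_b)


# Invented values: a tiny but extremely consistent change.
placebo_tight = [99.999, 100.000, 100.001, 100.000]
drug_tight = [102.099, 102.100, 102.101, 102.100]

# Invented values: a large but noisy change.
placebo_noisy = [60.0, 130.0, 90.0, 120.0]
drug_noisy = [250.0, 350.0, 280.0, 320.0]

t_tight = welch_t(drug_tight, placebo_tight)
t_noisy = welch_t(drug_noisy, placebo_noisy)

fc_tight = mean(drug_tight) / mean(placebo_tight)
fc_noisy = mean(drug_noisy) / mean(placebo_noisy)

print(f"tight gene: t = {t_tight:.1f}, fold change = {fc_tight:.3f}")
print(f"noisy gene: t = {t_noisy:.1f}, fold change = {fc_noisy:.3f}")
```

The tight gene gets a far larger t-statistic (and so a far smaller p-value) despite changing by only about 2%, while the noisy gene triples in expression with a much more modest statistic. Lowering the significance threshold would favor the first gene even more, which is why the fold-change sort is a separate step.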

ADD REPLY • link 9.8 years ago by karl.stamm 4.1k