Using FDR to gate values in NGS comparison t-tests
9.8 years ago

New to NGS analysis, but that's the task I've been assigned. I have received NGS data that I am trying to decipher.

I’m attempting to learn what exactly is meant by "unadjusted p-value" and "FDR" when looking at comparison t-tests of genes (the comparisons are between NGS of animals treated with drug or placebo). I understand the basic concepts, but not how to make practical use of them. Most of the values seem fairly large (well over 0.1 for p-values, in the 0.1 to 0.9 range for FDR) when looking at data sets of ~20,000 to 40,000 members. My goal here is to determine a cutoff for each that would allow me to gate on the genes with meaningful expression differences. Is there a specific value I should use as the boundary, or some way to calculate it based on the sample size or something similar?

NGS FDR unadjusted p-value • 3.2k views
9.8 years ago

Ignore unadjusted p-values completely. Unadjusted p-values, also called "raw p-values" or simply p-values, don't have much relevance individually when you perform multiple testing (see this XKCD comic for a nice example of why multiple testing and fishing for changes increase false-positive rates). A common threshold for adjusted p-values (or FDR) is 0.1 (as with p-value thresholds in general, there's some wiggle room here), which means you expect roughly 10% of the genes you call significant to be false positives. That's a bit higher than the typical 0.05 that you'd use with a raw p-value, but it turns out to be a convenient trade-off. After making a list of significant findings, sort them by fold-change to help prioritize results.
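To make the gate-then-sort workflow concrete, here is a minimal sketch in plain Python. It assumes the FDR values come from a Benjamini-Hochberg adjustment of the raw p-values, which is a common choice but may not be what your pipeline used. The gene names, p-values, and log2 fold changes are made up for illustration, and `benjamini_hochberg` is a hypothetical helper, not a library function.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene results: (gene, raw p-value, log2 fold change).
results = [
    ("geneA", 0.0001, 0.4),
    ("geneB", 0.004, -2.1),
    ("geneC", 0.03, 1.5),
    ("geneD", 0.20, 3.0),
    ("geneE", 0.75, 0.1),
]

fdr = benjamini_hochberg([p for _, p, _ in results])

FDR_CUTOFF = 0.1  # the common threshold mentioned above; adjust to taste

# Step 1: gate on FDR.
significant = [
    (gene, q, lfc)
    for (gene, _, lfc), q in zip(results, fdr)
    if q <= FDR_CUTOFF
]

# Step 2: sort the survivors by absolute fold change, largest first.
ranked = sorted(significant, key=lambda r: abs(r[2]), reverse=True)

for gene, q, lfc in ranked:
    print(f"{gene}\tFDR={q:.4f}\tlog2FC={lfc:+.1f}")
```

Note that geneD has the largest fold change in this toy table but does not pass the FDR gate, so it never reaches the sorting step.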


This. I see so many people making the mistake of assuming a low p-value means a large effect size.


Isn't the way to combat that to just lower your threshold for calling something significant?


Never confuse statistical significance and biological relevance.


No. A p-value is a measure of significance, and is therefore more related to variation and sample consistency. If all the drug-treated animals were at 102.1% expression plus or minus 0.001, that would give high certainty of a difference without much biological relevance, compared to another gene at 300% plus or minus 50. As Devon said, use FDR to gate, then sort for high fold change. They will be correlated.
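A rough numerical sketch of that comparison, using only the Python standard library. The assumptions here are mine, not from the data: the placebo group sits at 100% with the same spread as the treated group, there are 3 animals per group, and "plus or minus" is read as a standard deviation. `welch_t` is a hypothetical helper that computes a Welch-style t statistic from summary values.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic from group means, standard deviations, and sizes."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


n = 3  # assumed animals per group

# Gene 1: tiny change (102.1% vs 100%), but almost no variation.
t_small_effect = welch_t(102.1, 0.001, n, 100.0, 0.001, n)

# Gene 2: large change (300% vs 100%), with much more variation.
t_large_effect = welch_t(300.0, 50.0, n, 100.0, 50.0, n)

print(f"fold change 1.021: t = {t_small_effect:.1f}")
print(f"fold change 3.000: t = {t_large_effect:.2f}")
```

With equal group sizes and spreads, both tests have the same degrees of freedom, so the larger t statistic corresponds to the smaller p-value. Under these assumptions the gene with the 2.1% change comes out far more "significant" than the gene with the threefold change, which is why the fold-change sort is still needed after gating.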
