The same q-value is reported for different p-values
5.5 years ago

Hi guys,

I'm running methylKit to find differentially methylated CpGs. I have three replicates in the control group and three in the treatment group. When I run myDiff=calculateDiffMeth(meth,mc.cores=4,method="fdr") and then hyper=getMethylDiff(myDiff,difference=10,qvalue=0.01,type="hyper"), the output gives me qvalues lower than pvalues. Some of these pvalues are not even statistically significant. The result is the same when I change the adjustment method (SLIM/BH).

> hyper
methylDiff object with 12912 rows
--------------
chr     start   end     strand  pvalue      qvalue       meth.diff
chr1.1  226898  226898  +       0.02917285  0.001714451  20.02165
chr1.1  227164  227164  +       0.43722433  0.001714451  11.76471
chr1.1  227258  227258  +       0.07721846  0.001714451  30.04769
chr1.1  273666  273666  +       0.08245243  0.001714451  15.38462
chr1.1  293303  293303  +       0.04406780  0.001714451  15.38462
chr1.1  297572  297572  +       0.33056133  0.001714451  11.02757

How is it possible to have qvalues higher than pvalues with so many multiple comparisons (~8 million)? What should I do to overcome this problem and find reliable results (p-value lower than 0.05 and q-value lower than 0.01)?

next-gen r • 2.2k views
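You could try the standard p.adjust with method = "BH" (the Benjamini-Hochberg FDR correction) on the p-values. Apply it to the full myDiff result rather than to the filtered hyper subset, so that every test is counted.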

It may be that the parallel computing messed up the qvalues.
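
As a minimal sketch of that suggestion, the base R snippet below applies p.adjust with method = "BH" to the six p-values printed in the question. Treating these six rows as the whole test set is an assumption made purely for illustration; a real check would pass every p-value from myDiff. It illustrates two points: a BH-adjusted value is never smaller than its raw p-value, and several different p-values can legitimately share one adjusted value.

```r
# Toy input: the six p-values shown in the question's output.
# Assumption: we pretend these six tests are the full set of comparisons.
p <- c(0.02917285, 0.43722433, 0.07721846,
       0.08245243, 0.04406780, 0.33056133)

# Benjamini-Hochberg adjustment from base R (stats package).
q_bh <- p.adjust(p, method = "BH")

# Side-by-side view of raw and adjusted values.
print(data.frame(pvalue = p, q_bh = q_bh))

# A BH-adjusted value is never below its raw p-value.
never_lower <- all(q_bh >= p)
print(never_lower)

# BH enforces monotonicity across ranks, so neighbouring ranks can be tied.
n_distinct <- length(unique(q_bh))
print(n_distinct)
```

Tied adjusted values are therefore not a problem in themselves. Adjusted values below the raw p-values, as in the question, are what a BH adjustment cannot produce.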

Did you ever solve this identical q-value problem? I have the same issue.

5.5 years ago

To be honest your question is confusing, you mention

...the output gives me qvalues lower than pvalues ...

but then

... how is it possible to have qvalues higher than pvalues ...

Is your question that the values are higher or that they are lower? It is unclear.

That being said, there is no reason to select both by p-value and q-value at the same time. You would be messing up the statistical interpretation of the results.

Then, as it happens, q-values are less well defined than p-values. Different tools may compute different quantities that they all call q-values, so check the documentation. In the example you show, the q-value appears to be larger than 1, which also goes against what a q-value should be: a value in the range [0, 1].

Finally, q-values are not p-values. With a BH-type FDR adjustment the adjusted value is never smaller than the raw p-value, so q-values that fall below the p-values are unexpected, but the two are different concepts altogether.

Hi Istvan

Thank you for your answer, and sorry for my ambiguous question. The q-values are lower than the p-values. The q-values are in the column where every value equals 0.001714451. The values above 1 are from the meth.diff column.

The reason for using both p-values and q-values is that the q-values were not computed correctly. I checked the histogram of p-values, and most of the p-values are equal to or close to one. Maybe this lack of uniformity is affecting the FDR correction. Would you have any suggestions for correcting p-values for multiple comparisons?
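
One way to inspect the p-value distribution without plotting is to bin it. The sketch below uses simulated p-values, which are an assumption standing in for the real myDiff p-values (a hypothetical mixture with most values piled up near one), and reports the fraction that falls in each of ten equal-width bins.

```r
set.seed(1)  # reproducible toy data

# Simulated stand-in for the real p-values (hypothetical mixture):
# 80% piled up near one, 20% spread uniformly over [0, 1].
p <- c(rbeta(800, shape1 = 8, shape2 = 1), runif(200))

# Bin into ten equal-width intervals without drawing a plot.
h <- hist(p, breaks = seq(0, 1, length.out = 11), plot = FALSE)

# Fraction of p-values in each bin. A well-behaved test gives a roughly
# flat profile, possibly with an extra peak near zero.
bin_fraction <- h$counts / length(p)
print(round(bin_fraction, 3))

# Share of p-values in the last bin, (0.9, 1].
top_share <- bin_fraction[10]
print(top_share)
```

A large share in the last bin, as described above, suggests looking at how the test was set up before relying on any multiple-testing correction applied to its p-values.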

Frankly, if the q-values are computed incorrectly, the entire pipeline is suspect in my opinion. Are the p-values to be trusted then? Hard to say.

You can apply a Bonferroni correction instead, where you simply divide the significance threshold by the number of comparisons that you make.

So if you initially wanted to apply a 0.05 threshold and you had 10 comparisons (rows in the table), the corrected threshold is 0.005.
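
The arithmetic above can be sketched in base R. The p-values here are made up for illustration (an assumption, not data from this thread). The point is that comparing raw p-values to alpha / n gives the same calls as comparing Bonferroni-adjusted p-values to alpha.

```r
alpha <- 0.05   # threshold before correction
n     <- 10     # number of comparisons (rows in the table)

# Bonferroni: divide the threshold by the number of comparisons.
threshold <- alpha / n
print(threshold)

# Hypothetical p-values, one per comparison.
p <- c(0.0004, 0.003, 0.006, 0.02, 0.04, 0.2, 0.35, 0.5, 0.7, 0.9)

# Option 1: raw p-values against the corrected threshold.
sig_by_threshold <- p <= threshold

# Option 2: Bonferroni-adjusted p-values against the original threshold.
p_bonf <- p.adjust(p, method = "bonferroni")
sig_by_adjust <- p_bonf <= alpha

# Both routes flag the same comparisons.
print(which(sig_by_threshold))
print(identical(sig_by_threshold, sig_by_adjust))
```

With roughly 8 million comparisons, as in the question, the same division gives a per-test threshold of 0.05 / 8e6, which is far stricter than an FDR-based cutoff.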
