too many adj pvalues = 1
19 months ago
Illinu • 90
Belgium

Hi, while analyzing a set of DEGs from DESeq2, I noticed that for some genes where I 'see' a difference, the difference is not significant, while for another gene with similar behaviour it is. The two genes come from different comparisons, the former with 6 replicates and the latter with 3, so I guess the number of replicates in the comparison makes a difference. However, I decided to look at the p-value distribution and noticed that almost all p-values fall in the bin at 1. From what I have read, this could mean that the differential test assumes a distribution that the data does not follow. Now I am confused about whether I should consider the p-values or the adjusted p-values, or ignore DESeq2 and do another test altogether.

These are the genes: gene A is DE when testing genotype 1 vs genotype 2 (6 replicates, adj p-value = 0.006), but it is not DE in the genotype 2 T vs C comparison (adj p-value = 0.8, red and blue dots). It has the same profile/behaviour as gene B, which is DE in genotype 2 T vs C (adj p-value = 0.016) but not DE when testing genotype 1 vs genotype 2 (adj p-value = 0.77). If I had to interpret these two genes as a biologist, I would say they are not differentially expressed between genotypes but are both induced by treatment in genotype 2. I am wondering how I can support this in a report while justifying the statistical results. [1] http://hpics.li/66db722

This is the p-value histogram for one of the comparisons; the other one looks the same. [2] http://hpics.li/5d0bdf2

16 months ago
Freiburg, Germany

The only thing that matters is the adjusted p-value; ignore the unadjusted p-values.

Look at the error bars on gene B. That is why the difference isn't significant. The statistical results are correct, you have no basis upon which to disagree with them.
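For readers wondering what the adjusted p-value actually is: DESeq2's `padj` column uses the Benjamini-Hochberg procedure by default. Below is a minimal pure-Python sketch of that adjustment, not DESeq2's own code. The function name `bh_adjust` and the raw p-values are made up for illustration. It shows that adjusted values are never smaller than the raw ones and that large raw p-values get pushed to 1.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values, purely illustrative (not from the thread's data)
raw = [0.0001, 0.004, 0.03, 0.2, 0.6, 0.9, 0.95, 1.0]
padj = bh_adjust(raw)
for p, q in zip(raw, padj):
    print(f"p = {p:<7} padj = {q:.4f}")
```

Because the correction depends on how many tests are run and on the whole set of p-values, the same raw p-value can end up with a different adjusted value in two different comparisons.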


Hi Devon, it just seems weird that the test finds gene A not significant between T and C for genotype 2, when it goes from 2000 to 8000 counts with small error bars. To me there is a clear induction of this gene by the treatment compared to genotype 1, but the test considers this difference not significant. So I should interpret this as the gene not being induced at T in genotype 1, I guess? I also wanted to know whether a p-value histogram with so many p-values = 1 indicates that something is wrong with the data. Thanks


You might have an outlier sample, which would inflate the variance and decrease your power. The DESeq2 tutorial has some examples of creating dendrograms and PCA plots. Have a look at those and see if one of the samples is obviously weird (in which case you can exclude it). DESeq2 normally tries to handle outlier counts automatically, but by default it only replaces them when there are at least 7 replicates per group.
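To illustrate how a single odd replicate can mask a large shift, here is a rough sketch using a Welch t statistic on log2 counts. This is a deliberately simple stand-in and not the negative binomial Wald test that DESeq2 runs. The counts are invented to resemble the 2000 vs 8000 situation described above, and `welch_t` is a hypothetical helper name.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for the difference in means, b minus a."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Invented counts, 3 replicates per group (illustrative only)
control = [math.log2(x) for x in (1900, 2000, 2100)]
treated_clean = [math.log2(x) for x in (7800, 8000, 8200)]
# Same group, but one replicate behaves like a control sample
treated_outlier = [math.log2(x) for x in (7800, 8000, 2200)]

t_clean = welch_t(control, treated_clean)
t_outlier = welch_t(control, treated_outlier)

print(f"t without outlier: {t_clean:.1f}")
print(f"t with outlier:    {t_outlier:.1f}")
```

With only 3 replicates, one aberrant sample dominates the variance estimate, so the test statistic collapses even though two of the three treated samples still show the fourfold change.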
