Weird R behaviour with NAs in DESeq2 results
5.6 years ago

Hello!

I have a DESeq2 results object (a data-frame-like structure) in R with some NAs in the padj column. Strangely, those NAs seem to be handled inconsistently by two nearly identical functions that I use to test whether genes are down- or up-regulated. What is happening here? A quick fix would be to change those NAs to 1 (never significant), but I want to understand :) Here is my code:

significant_D <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) < -0.584)}

significant_O <- function(x){return(x$padj < 0.01 & (x$log2FoldChange) > 0.584)}

head(DESeq_results[which(is.na(DESeq_results$padj)),])
               baseMean log2FoldChange     lfcSE        stat    pvalue      padj
               <numeric>      <numeric> <numeric>   <numeric> <numeric> <numeric>
WBGene00021406 20.718704    -0.21384520 0.5063939 -0.42229021 0.6728132        NA
WBGene00021407  3.961096     0.66807041 1.1159760  0.59864226 0.5494115        NA
WBGene00021405  1.939649    -1.16416923 1.6007395 -0.72726966 0.4670608        NA
WBGene00021409  1.719862     1.91952086 2.2842210  0.84033940 0.4007181        NA
WBGene00235257  7.055687     0.23653150 0.8488041  0.27866442 0.7805024        NA
WBGene00015246 15.699319     0.06366663 0.6514634  0.09772863 0.9221478        NA

sum(is.na(significant_D(DESeq_results)))
[1] 2558

sum(is.na(significant_O(DESeq_results)))
[1] 1452

sum(is.na(significant_D(DESeq_results)) & is.na(significant_O(DESeq_results)))
[1] 0

sum(is.na(DESeq_results$padj))
[1] 8582
RNA-Seq R DESeq2 NA • 1.8k views
5.6 years ago

The confusion is due to the following:

> TRUE & NA
NA
> FALSE & NA
FALSE
> NA & NA
NA

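significant_D and significant_O each return a logical vector that contains some NA values. Any element that is TRUE or NA in one output is necessarily FALSE in the other: you can only get TRUE or NA when the fold-change condition is TRUE, and a fold change that passes one threshold fails the other. Where the fold-change condition is FALSE, FALSE & NA evaluates to FALSE, so no row is NA in both outputs and the penultimate sum is 0. The same logic explains why 2558 + 1452 is less than 8582: rows with an NA padj whose fold change lies between the two thresholds come out FALSE in both functions. As an aside, it makes sense that FALSE & NA is FALSE: NA can be read as "unknown", and FALSE & x is FALSE whatever x turns out to be.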
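To make this concrete, here is a minimal sketch on a toy data frame. The values in `res` are invented for illustration and are not taken from the real DESeq_results object; `down_safe` is a hypothetical name for one possible NA-safe variant, not something from the original code.

```r
# Toy stand-in for the DESeq2 results table (values invented for illustration).
res <- data.frame(
  padj           = c(0.001, NA,   NA,  NA,  0.5),
  log2FoldChange = c(-2.0,  -1.2, 1.9, 0.1, 0.3)
)

# Same filters as in the question.
significant_D <- function(x) x$padj < 0.01 & x$log2FoldChange < -0.584
significant_O <- function(x) x$padj < 0.01 & x$log2FoldChange > 0.584

down <- significant_D(res)
up   <- significant_O(res)

# NA padj + fold change below -0.584      -> NA in down, FALSE in up
# NA padj + fold change above  0.584      -> FALSE in down, NA in up
# NA padj + fold change between the two   -> FALSE in both
print(data.frame(res, down = down, up = up))

n_na_padj <- sum(is.na(res$padj))             # all rows with NA padj
n_na_down <- sum(is.na(down))                 # only a subset of them
n_na_up   <- sum(is.na(up))                   # a different subset
n_na_both <- sum(is.na(down) & is.na(up))     # never both at once

# One possible NA-safe variant: treat an NA padj as "not significant".
down_safe <- !is.na(res$padj) & down
```

With the last line, rows with an NA padj become FALSE instead of NA, which has the same effect as the "replace NA with 1" quick fix mentioned in the question without modifying the results table.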

Good catch! Thanks!
