I'm analyzing a large number of gene sets in RNA-seq data, hoping to find sets that can differentiate between two conditions. I'm using fry() from the limma package.
I understand that three different hypotheses are tested for each gene set: upregulated, downregulated, and mixed. As far as I understand, the mixed hypothesis is that the genes in the set are differentially expressed between the two conditions without a consistent direction, i.e. the set contains a mix of up- and downregulated genes. I hope my understanding is correct.
Out of thousands of gene sets, one has an FDR of 0.02 (with a rather high FDR.Mixed), and the rest all have an FDR above 0.25. About 10 have an FDR.Mixed below 0.01, and for some of those the two-sided (regular) FDR is really high (almost 1). To visualize the results, I tried creating heatmaps of the mean expression patterns of the significant sets, and the separation is not as good as I'd like.
My questions are:
- What is a sensible choice of FDR cutoff? Should I consider both FDR and FDR.Mixed?
- How should I interpret the two-sided FDR and the mixed FDR? Why is the two-sided FDR low while the mixed FDR is high for some sets, and the other way around for others?
- Does it make sense to take a closer look at sets with a significant mixed FDR, and split them further by direction of differential expression (up, down, not DE)?
Thank you for your help in advance!