Dear friends,
The meta-analysis was done separately for various cancer subtypes. I have the corresponding expression values and want to use WGCNA to find hub genes and for downstream analysis. I am considering two ways of running WGCNA: 1) treating the subtypes as different traits in a single analysis, or 2) running WGCNA separately on each subtype. As this is my first experience with it, I would appreciate your comments and suggestions. Which approach (1 or 2) would you recommend, and why?
My other question concerns the input data. I know WGCNA is an unsupervised method and can take all genes as input. However, I see that only differentially expressed genes (DEGs) were used for WGCNA in many papers, even in reputable journals. In your experience, do the results differ between using all genes and using only DEGs?
If subtype is the only variable, then I would go for 1. You will get back clusters of co-expressed genes across all subtypes.
Perhaps the reviewers were not familiar with WGCNA, but in my experience (I used to reanalyze already published RNA-seq data with WGCNA, just for fun), if the intra-group variability is very low and the experimental design is relatively simple (e.g. one factor, subtype, with multiple levels: sub1, sub2, sub3, sub4), then most hub genes will be DEGs. However, how do you select the DEGs if you have a more complex experimental design, with different factors, each with multiple levels, together with continuous variables?
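To make option 1 a bit more concrete, here is a minimal sketch of what "subtype as traits" can look like: each subtype level becomes a 0/1 indicator column, and a module eigengene is correlated with each indicator. It is written in plain Python purely for illustration and is not WGCNA's R API; the sample labels, the eigengene values, and the helper names `one_hot_traits` and `pearson` are all made up for this example.

```python
from math import sqrt


def one_hot_traits(labels):
    """Turn subtype labels into one 0/1 indicator column per subtype."""
    levels = sorted(set(labels))
    return {lvl: [1.0 if lab == lvl else 0.0 for lab in labels] for lvl in levels}


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one subtype label and one module eigengene value per sample.
subtypes = ["sub1", "sub1", "sub2", "sub2", "sub3", "sub3", "sub4", "sub4"]
module_eigengene = [0.9, 1.1, -0.2, 0.0, -0.4, -0.3, -0.6, -0.5]

# One indicator trait per subtype, then one correlation per (module, trait) pair.
traits = one_hot_traits(subtypes)
module_trait_cor = {
    lvl: pearson(module_eigengene, indicator) for lvl, indicator in traits.items()
}

for lvl, r in module_trait_cor.items():
    print(f"{lvl}: r = {r:+.2f}")
```

With a single network built across all samples, a table like this (one row per module, one column per subtype indicator) is one way to see which modules track which subtype. With option 2 you would instead get a separate set of modules per subtype and would have to compare them afterwards.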