6.4 years ago
mrobinso
Hi,
I was interested in comparing the level of enrichment of pathways across samples - specifically I'd like to use CCLE data to examine whether a particular gene correlates with enrichment of a pathway.
My naive approach was to rank genes by their log2 fold change relative to the geometric mean of all other conditions, then perform a Mann-Whitney test and scale the resulting statistic to obtain a Z-score. Something like this (in R):
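stat <- sapply(sampleNames(ccle), function(cell_id) {
  cell_idx <- which(sampleNames(ccle) == cell_id)
  # Geometric mean of each gene across all other cell lines
  gm <- apply(ccle_expr[, -cell_idx], 1, function(x) exp(mean(log(x))))
  test_data <- data.frame(gene = log2(ccle_expr[, cell_idx] / gm),
                          path = rownames(ccle_expr) %in% gene_set)
  # Mann-Whitney W; with the formula interface, W refers to the first
  # factor level (path == FALSE), i.e. the genes outside the gene set
  wilcox.test(gene ~ path, test_data)$statistic
})
# Standardize W across cell lines
z_score <- scale(stat)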
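For anyone wanting to try this without the CCLE object, here is a minimal self-contained sketch of the same idea on simulated data. The matrix `expr`, the gene names and the gene set are all made up for illustration. It differs from the snippet above in two ways, both deliberate: the leave-one-out geometric mean is computed on the log scale without an inner `apply`, and the statistic is computed for the genes inside the set, so a larger value means the pathway genes rank higher. It also shows a Z-score against the Mann-Whitney null (normal approximation, no tie correction), which is not the same thing as `scale()` across cell lines.

```r
set.seed(1)

# Simulated stand-in for the expression matrix (genes x cell lines).
# Names and sizes are hypothetical, not taken from CCLE.
n_genes <- 200
n_cells <- 6
expr <- matrix(rlnorm(n_genes * n_cells, meanlog = 5), nrow = n_genes,
               dimnames = list(paste0("g", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))
gene_set <- paste0("g", 1:20)
in_set <- rownames(expr) %in% gene_set

# Leave-one-out log2 fold change. On the log scale the geometric mean of
# the other samples is just their arithmetic mean.
log_expr <- log2(expr)
loo_mean <- (rowSums(log_expr) - log_expr) / (n_cells - 1)
lfc <- log_expr - loo_mean

# Mann-Whitney U for the genes in the set, one value per cell line.
n1 <- sum(in_set)
n0 <- sum(!in_set)
w_stat <- apply(lfc, 2, function(x) sum(rank(x)[in_set]) - n1 * (n1 + 1) / 2)

# Z-score against the null distribution of U (normal approximation).
z_null <- (w_stat - n1 * n0 / 2) / sqrt(n1 * n0 * (n1 + n0 + 1) / 12)

# Z-score across cell lines, mirroring scale(stat) in the snippet above.
z_across <- scale(w_stat)[, 1]
```

`z_null` says how far each cell line is from "no enrichment", whereas `z_across` only says how a cell line compares with the others in the panel, so the two can disagree when most cell lines are enriched in the same direction.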
Is there a better way to approach this? I was also wondering whether the NES from Broad's GSEA would be directly comparable?
Appreciate any advice!
Thanks, Mark
What do you mean by "level of enrichment of pathways across samples"? Are you trying to figure out which pathways are most often enriched across the samples? Or are you trying to compare samples, to say that pathway A is more enriched in sample x than in sample y? And what do you mean by "whether a particular gene correlates with enrichment of a pathway"? That makes it sound like you're trying to find genes that are predictive of enrichment in a given pathway.