GSVA Analysis Ouput Array gives lesser gene sets than original gene set array
0
0
Entering edit mode
5.1 years ago
amit15013 • 0

Hi

I am doing gsva analysis on pathways gene sets which consists of 186 gene set. When i run gsva analysis on my dataset it returns only 168 gene sets. How do i get to know which gene set has been returned as there is not naming in the final output array.

Thank you

GSVA • 1.5k views
ADD COMMENT
0
Entering edit mode

What software/package are you using?

ADD REPLY
0
Entering edit mode

Ciao Pietro. S/He is using GSVA: https://bioconductor.org/packages/release/bioc/html/GSVA.html

amit15013, please show all of the commands that you are using.

ADD REPLY
0
Entering edit mode

Hi

This is the code i am using

exprs <- as.matrix(read.table("~/velocyto/velocity_matrix_output.csv", header=TRUE, sep=",",row.names=1,as.is=TRUE))
gsva_set<- ExpressionSet(assayData=exprs)
our_genes<-GSA.read.gmt("~/velocyto/c2.cp.kegg.v6.2.symbols.gmt")
gsva_analysis <- gsva(gsva_set@assayData$exprs,our_genes$genesets,min.sz=0, max.sz=500, verbose=TRUE)

This returns only 168 gene sets and not 186. Also it doesn't return the names of the gene sets which are processed.

Thank you

ADD REPLY
0
Entering edit mode

our_genes should be a named list, right? - are the names assigned?

Even though you have min.sz=0, I think that it is possible that GSVA will not return gene sets that do not have any match genes - not 100% sure, though.

ADD REPLY
0
Entering edit mode

Yes, our_genes prints the names of the pathways as well as the genes present in it.

True, even though it has min.sz=0 it is returnig only 168 gene set and not 186.

Thank you

ADD REPLY
0
Entering edit mode

Best thing, then, is to infer the ones that are not returned and then investigate why. Not sure about the names issued - I recently used my own curated gene set with GSVA and it returned names.

ADD REPLY
0
Entering edit mode

can you please show your code. It would be really helpful

ADD REPLY
0
Entering edit mode

Hey, I had to wait until I got home. I just checked and, indeed, the names are being carried through. Take a look:

names(genesets)
 [1] "C1_mean_UMI"  "C2_mean_UMI"  "C3_mean_UMI"  "C4_mean_UMI"  "C5_mean_UMI" 
 [6] "C6_mean_UMI"  "C7_mean_UMI"  "C8_mean_UMI"  "C9_mean_UMI"  "C10_mean_UMI"
 ...

topMatrixGSVA <- gsva(vstMatrix, genesets, method="gsva", min.sz=5, max.sz=999999, abs.ranking=FALSE, verbose=TRUE)
Estimating GSVA scores for 29 gene sets.
Computing observed enrichment scores
Estimating ECDFs with Gaussian kernels
Using parallel with 4 cores
  |=============================================================         |  88%

rownames(topMatrixGSVA)
 [1] "C1_mean_UMI"  "C2_mean_UMI"  "C3_mean_UMI"  "C4_mean_UMI"  "C5_mean_UMI" 
 [6] "C6_mean_UMI"  "C7_mean_UMI"  "C8_mean_UMI"  "C9_mean_UMI"  "C10_mean_UMI"
 ...

Are you sure that your gene-sets are in a named list? Which version of GSVA are you using? - I have GSVA_1.30.0

ADD REPLY
0
Entering edit mode

Maybe some gene sets are larger than 500 genes (max.sz=500)?

ADD REPLY
0
Entering edit mode

No i checked that. Max size is 93.

ADD REPLY

Login before adding your answer.

Traffic: 3861 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6