Bioconductor getGEO(), how to convert a single eset of GSE matrix files to a list of esets?
5.5 years ago
Davide Chicco ▴ 120

I've been using the getGEO() function from Bioconductor's GEOquery package. I noticed that, instead of downloading the GEO archive every time the script runs, you can set the filename parameter so that getGEO() reads a local copy of the GEO file.

My original code:

gset <- getGEO("GSE59867", GSEMatrix = TRUE, getGPL = FALSE)

My new code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gsetFromFile <- getGEO("GSE59867", filename = GSE59867_filename)

I tried that, but the subsequent steps of my script no longer work. I checked the online documentation for getGEO(), which says:

Note that since a single file is being parsed, the return value is not a list of esets, but a single eset when GSE matrix files are parsed.

Okay, that's the diagnosis; now I need the cure. How can I convert my single eset of GSE matrix files to a list of esets?

Thanks!

EDIT: Sorry if I was unclear: the above piece of code works, but I have a problem later when I use the function getBM(). Here's the complete piece of code:

GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
gset <- getGEO("GSE59867", GSEMatrix = FALSE, filename = GSE59867_filename)

if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

mart <- useMart("ENSEMBL_MART_ENSEMBL")
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup <- getBM(
  mart = mart,
  attributes = c("affy_hugene_1_0_st_v1", "ensembl_gene_id", "gene_biotype", "external_gene_name"),
  filter = "affy_hugene_1_0_st_v1",
  values = rownames(exprs(gset))[1:50],
  uniqueRows = TRUE
)

And here's the error I get:

Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘exprs’ for signature ‘"character"’
Calls: getBM -> rownames -> exprs -> <anonymous>
Execution halted

Any idea on how to solve it? Thanks!

Bioconductor getGEO R • 2.6k views
5.5 years ago
Benn 8.3k

Your code works for me:

> GSE59867_filename <- "GSE59867_series_matrix.txt.gz"
> gsetFromFile <- getGEO("GSE59867",   filename=GSE59867_filename)

> gsetFromFile
ExpressionSet (storageMode: lockedEnvironment)
assayData: 33297 features, 436 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM1448335 GSM1448336 ... GSM1620804 (436 total)
  varLabels: title geo_accession ... samples collection:ch1 (34 total)
  varMetadata: labelDescription
featureData
  featureNames: 7892501 7892502 ... 8180418 (33297 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6244 

> dim(exprs(gsetFromFile))
[1] 33297   436

I get a single ExpressionSet with 436 samples. Why do you need a list of esets? Please explain what your subsequent steps are.
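Regarding the error in the EDIT: a likely cause is that `gset[[idx]]` is applied to a single ExpressionSet rather than to a list. As far as I know, `[[` on an eset returns a phenoData column, not a list element, so `exprs()` would then receive a character vector, which matches the error message. If the downstream code has to work with both return types, one option is to wrap the single eset in a list first. Below is a minimal sketch, assuming the series matrix file sits in the working directory; `as_eset_list()` is a hypothetical helper written for this example, not a GEOquery function.

```r
# Hypothetical helper (not part of GEOquery): wrap a single ExpressionSet
# in a list so code written for the list return value keeps working.
# Anything that is not an ExpressionSet is returned unchanged.
as_eset_list <- function(x) {
  if (inherits(x, "ExpressionSet")) list(x) else x
}

# Usage sketch; runs only if GEOquery is installed and the file is present.
series_file <- "GSE59867_series_matrix.txt.gz"
if (requireNamespace("GEOquery", quietly = TRUE) && file.exists(series_file)) {
  gset <- as_eset_list(GEOquery::getGEO(filename = series_file, getGPL = FALSE))
  # Same selection logic as in the question; with one element, idx is 1.
  idx <- if (length(gset) > 1) grep("GPL570", names(gset)) else 1
  gset <- gset[[idx]]  # an ExpressionSet again, so exprs(gset) is valid
}
```

Alternatively, since parsing a single file always gives a single eset, the `length()` check and the `gset[[idx]]` line can simply be dropped and `exprs(gset)` called directly on the returned object.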
