GEO datasets: Raw data is available on Series record
8.1 years ago
aharnishi02 ▴ 80

I am new to a lot of these genomics efforts and have some basic questions about expression datasets found in GEO. For some datasets, the raw files are indicated to be available on the Series record (e.g. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42133).

I could not find the raw files anywhere within the series record. Where do you suppose I could find these raw data?

NCBI Expression array • 9.9k views

Does not look like the raw data is available for this accession.


I emailed the authors asking for the raw files; they reiterated that the raw files have been submitted and are available online for download.

So what do you think I can do to get access to these files?


@mastal511 provided a link below that has the raw data files. Here is the link for raw (CEL/CHP) data.


I did not realise that the link I gave above leads to a completely different dataset.

I am looking for the data from GSE42133.


How about this page? There are no CHP/CEL files, though.


Hi, could you tell me what 'Raw data is available on Series record' means?


These are all the files that were submitted for this record. As I said before this does not appear to contain CHP/CEL files. Look in the "suppl" folder for the data labelled "raw". If you are after those then you will need to contact submitters.
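If it helps, the "suppl" folder can also be reached directly on the GEO FTP site. Below is a minimal Python sketch that builds that URL from a series accession. It assumes the directory layout as I understand it (the accession with its last three digits replaced by "nnn", then the accession itself, then `suppl/`); the function name `geo_suppl_url` is my own, and it is worth opening the resulting URL in a browser to check it before scripting anything around it.

```python
def geo_suppl_url(accession: str) -> str:
    """Build the URL of a GEO series' supplementary ("suppl") folder.

    Assumes the usual GEO FTP layout; check the result in a browser.
    """
    acc = accession.strip().upper()
    if not (acc.startswith("GSE") and acc[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = acc[3:]
    # Series are grouped into directories named after the accession with
    # its last three digits replaced by "nnn" (e.g. GSE42133 -> GSE42nnn).
    stub = "GSE" + digits[:-3] + "nnn"
    return f"https://ftp.ncbi.nlm.nih.gov/geo/series/{stub}/{acc}/suppl/"


print(geo_suppl_url("GSE42133"))
```

Whatever is listed in that folder is everything the submitters deposited as supplementary files, so if no raw files show up there, they were not uploaded.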


CEL files? If we are talking about the original dataset GSE42133, which the author of the post is interested in, it is an Illumina HumanHT-12 V4.0 expression beadchip, so the bead-level data should have a TIFF extension (it's an image), while the bead-summary-level data would be IDAT. And for some unknown reason people usually do not submit real raw data for Illumina.

"I had mailed the authors asking for the raw files, they reiterated that the raw files have been submitted and available online for download."

I do not want to offend the authors, but most probably they outsourced the analysis of the raw data and may be unaware of what exactly counts as Illumina raw data. Here is a rare case where people uploaded IDAT files: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE71625

So, I would suggest mailing the authors again and explaining the situation.

And I recommend this package for dealing with raw Illumina files: http://bioconductor.org/packages/release/bioc/html/beadarray.html

8.1 years ago
mastal511 ★ 2.1k

If you navigate to the series record page,

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51808

and click on one of the options, (http) or (custom), under the Download heading at the bottom, there are a couple of ways to download the raw data (CEL files).


Hi,

Thanks for responding. But raw files in GEO are stored/uploaded with one of the following notes:

a) "Raw data provided as supplementary file", where the data is directly available for download. This is what your link contains, and there are no issues with that.

b) "Raw data is available on Series record", and this is where I am facing a problem. I cannot access these files.

Could you help out with the Series record?


Hi,

In the meantime, did you get the raw data you were looking for? If so, could you please say how?


If you are referring to the same record that @mastal511 had posted then the link for the raw data is at the bottom of the page and reproduced here.


Not really, I was referring to his question (b): if the raw data is stated to be available, yet you can't find it, what should you do?

Raw data is available on Series record and this is where i am facing a problem. I cannot access these files.

4.1 years ago
chunhui.gu • 0

I have encountered the same problem with GSE111368: there is no raw data there. Previously, the person in charge of uploading it insisted that the raw data had been uploaded and was there. After I reported the issue to the editor of Nature Immunology, he finally admitted that the raw data is not available on the GEO website. However, he refused to share the raw data and claimed that, per GEO's requirements, they are not obliged to share it. I then emailed GEO to check, since I had seen that GEO requires raw data. GEO responded that they do accept non-normalized data in tabular format as an alternative to IDAT files.

For those who face the same problem, this should explain why people often don't provide the raw data from the beadarray.

This is the message from the uploader.

As per requirement from GEO, we do not have to upload .idats file anymore because it’s too large. The non-normalised data from GEO is good enough for you to start the analysis from beginning. It has raw intensity signals and detection P values.

L

This is the message from the GEO team.

Dear Chunhui,

To be clear, we do accept non-normalized data in tabular format as an alternative to IDAT files. That said, we have written to the submitters of GSE111368 to encourage them to add the IDAT files.

We will inform you upon receiving their response.

Best Wishes, The GEO Team

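For anyone starting from the non-normalised table rather than IDAT files, here is a rough Python sketch of reading such a tab-delimited file and keeping the probes that pass a detection p-value cut-off. Everything specific in it is an assumption for illustration: the inline example data and probe IDs are made up, the column names (`ID_REF`, `SAMPLE_1`, `Detection Pval`) will differ between submissions, the function name `detected_probes` is mine, and the 0.01 threshold is arbitrary.

```python
import csv
import io

# Made-up stand-in for a non-normalised GEO table. Real files use their own
# column names and usually carry one signal/p-value column pair per sample.
EXAMPLE = (
    "ID_REF\tSAMPLE_1\tDetection Pval\n"
    "ILMN_0000001\t152.3\t0.001\n"
    "ILMN_0000002\t88.1\t0.320\n"
    "ILMN_0000003\t1040.7\t0.000\n"
)


def detected_probes(handle, signal_col, pval_col, alpha=0.01):
    """Return {probe_id: raw_signal} for probes with detection p-value < alpha."""
    reader = csv.DictReader(handle, delimiter="\t")
    kept = {}
    for row in reader:
        if float(row[pval_col]) < alpha:
            kept[row["ID_REF"]] = float(row[signal_col])
    return kept


probes = detected_probes(io.StringIO(EXAMPLE), "SAMPLE_1", "Detection Pval")
print(probes)
```

To use it on a downloaded file, pass an open file handle instead of the `io.StringIO` object and adjust the column names to match the header of that file.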
