How to Process GEO Raw Data Downloaded from NCBI
12.4 years ago
Tao Zhao ▴ 20

Hi everyone!

Recently I downloaded a raw dataset (≈ 400 MB) from the GEO database at NCBI. Platform: NimbleGen GDR Malus domestica EST UnigeneV4 array. Overall design: “Using a single color labeling system, a total of 24 microarray slides were utilized, one for each cortex tissue sample, for transcriptome profiling analysis. 2 cultivars x 3 developmental stages x 4 biological replicates.” Each sample also comes with RMA-normalized data.

Here's my question: how should I process these raw data before clustering to find up-regulated or down-regulated genes? The values are all positive numbers, so how do I get a log ratio?

GEO url: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE24523

I have little experience and am quite confused. If you can recommend some materials for me to learn from, please do. Thanks for your help!

geo data • 8.2k views
12.4 years ago

I'd suggest that you find a bioinformatics collaborator to work with you on these data. While your questions have answers, an online forum may not be the best way for you to move forward.


I agree with Sean. While this response may not be the immediate, out-of-the-box solution you were looking for, it is the most practical. Processing GEO raw data through to log ratios and gene set/pathway enrichment is a multi-step process, which, in this forum, would ideally be presented as a series of single questions. Look for a patient, communicative bioinformatics collaborator.


I see that the last post was 5 years ago... has the situation changed at all since then? I'm still running into datasets that the curators have not gone through yet, such as this one: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42608 .

What is involved in parsing the GEO data into R data structures? I know a bit of statistics and I have statistician friends who can take over the statistics, but could you give me a 30,000-foot view of the steps involved (cleaning the sequences, pairing the ends, etc.) to get there? Thanks!
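To give a rough idea of the parsing step for array series, here is a minimal sketch in plain Python (not R) that reads the tab-delimited series matrix layout GEO provides for many series: metadata lines start with `!`, and the value table sits between `!series_matrix_table_begin` and `!series_matrix_table_end`. The sample IDs, probe IDs, and values below are made up for illustration, and `parse_series_matrix` is a hypothetical helper name. In R, the GEOquery package's `getGEO` is the usual way to get the same thing; this only shows what is inside the file. It does not cover sequencing reads (cleaning, pairing ends), which need a separate pipeline.

```python
import csv

# Made-up excerpt in the series matrix layout (IDs and values are hypothetical).
SERIES_MATRIX_TEXT = "\n".join([
    '!Series_title\t"Example series"',
    '!Sample_title\t"cultivarA_rep1"\t"cultivarA_rep2"\t"cultivarB_rep1"',
    '!series_matrix_table_begin',
    '"ID_REF"\t"GSM0001"\t"GSM0002"\t"GSM0003"',
    '"probe_1"\t8.12\t8.30\t10.05',
    '"probe_2"\t5.40\t5.10\t5.33',
    '!series_matrix_table_end',
])


def parse_series_matrix(text):
    """Split series-matrix text into metadata, sample IDs, and per-probe values."""
    metadata = {}
    table_lines = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
        elif line.startswith("!series_matrix_table_end"):
            in_table = False
        elif in_table:
            table_lines.append(line)
        elif line.startswith("!"):
            # Metadata line: "!Key<TAB>value<TAB>value..."
            key, *values = next(csv.reader([line], delimiter="\t"))
            metadata.setdefault(key.lstrip("!"), []).extend(values)
    rows = [row for row in csv.reader(table_lines, delimiter="\t") if row]
    samples = rows[0][1:]  # header row: ID_REF followed by sample IDs
    expression = {row[0]: [float(v) for v in row[1:]] for row in rows[1:]}
    return metadata, samples, expression


metadata, samples, expression = parse_series_matrix(SERIES_MATRIX_TEXT)
print(samples)
print(expression["probe_1"])
```

From there, the sample titles in the metadata are what you would use to assign each column to a group before any statistics.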


OK, thank you both. Still, if you are familiar with this data processing procedure, could you roughly outline the steps here? Just some keywords would be fine. I'm very curious about this. It may be a little hard for me to find a collaborator: I am a beginner, and many people here are experts, as far as I can see... a huge gap.


RMA data are not raw data; they are normalized already. You can use those data directly to cluster. You do not need to form log ratios to cluster, either. As for up/down regulated genes, clustering does not tell you that. You will need to do a statistical test to find those genes that are up/down regulated.
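As a rough illustration of "do a statistical test", here is a minimal sketch in plain Python. It assumes the normalized values are on a log2 scale (if the deposited values are on a linear scale, take log2 first), in which case the log ratio between two groups is simply the difference of the group means. The test shown is an exact permutation test, swapped in here only because it needs no extra libraries; it is not what GEO2R does. The gene names, values, and function names are all made up for illustration.

```python
from itertools import combinations
from statistics import mean


def log2_fold_change(group_a, group_b):
    """Difference of mean log2 values, i.e. the log2 ratio of B relative to A."""
    return mean(group_b) - mean(group_a)


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_b) - mean(group_a))
    hits = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs((total - sum_a) / n_b - sum_a / n_a)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


# Hypothetical log2 expression values, 4 biological replicates per cultivar.
gene_up_a, gene_up_b = [7.9, 8.1, 8.0, 8.2], [10.0, 10.2, 9.9, 10.1]
gene_flat_a, gene_flat_b = [5.4, 5.1, 5.3, 5.2], [5.3, 5.2, 5.4, 5.1]

lfc_up = log2_fold_change(gene_up_a, gene_up_b)
p_up = permutation_p_value(gene_up_a, gene_up_b)
p_flat = permutation_p_value(gene_flat_a, gene_flat_b)
print(round(lfc_up, 3), round(p_up, 4), p_flat)
```

Note that with 4 versus 4 replicates there are only 70 possible relabellings, so the smallest achievable p-value is 2/70. With thousands of probes you would also need a multiple-testing correction, which is one reason moderated tests such as those in limma are commonly used for this kind of data.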


Thank you so much Sean, you've enlightened me a lot.

12.4 years ago
Yogesh Pandit ▴ 520

GEO DataSet Cluster Analysis

Also as a starter, you can play with the R script generated by GEO2R to handle the dataset.
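To get a feel for what a cluster analysis does with expression profiles, here is a minimal sketch in plain Python of average-linkage hierarchical clustering using 1 minus the Pearson correlation as the distance between genes. This is just one common choice of method and distance, not necessarily what the GEO tool uses. The gene names and values are made up, and `cluster_genes` is a hypothetical helper. In practice you would more likely use an existing implementation such as `hclust` in R or `scipy.cluster.hierarchy` in Python.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (fails on zero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_genes(profiles, n_clusters):
    """Average-linkage agglomerative clustering with distance = 1 - Pearson r."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical log2 profiles across 6 samples: two rising genes, two falling genes.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "gene_b": [2.0, 3.0, 4.0, 5.0, 6.0, 7.5],
    "gene_c": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "gene_d": [7.0, 6.2, 5.0, 4.0, 3.0, 2.0],
}
groups = cluster_genes(profiles, n_clusters=2)
print(groups)
```

The clusters group genes with similar profile shapes; they do not by themselves say which genes are significantly up- or down-regulated.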


Thank you! I think I now understand the basic procedure. I am currently learning some statistical tests such as the t-test, F-test, and SAM. After the statistical test, I will cluster.


GEO2R should help you with that, provided you have more than one sample.
