Finding Gene Symbols for Probes in Raw Data
6.0 years ago
samet ▴ 20

Dear all,

I'm working on GSE23561 dataset and GPL10775 platform.

GEO provides raw data files and a normalized series matrix file for this series. By mapping the ID_REF values in the normalized series matrix to the IDs in the platform table, I can get the gene symbol (Symbol v12) for any probe.
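As a minimal sketch of the lookup described above, using only the Python standard library: the column names `ID`, `ID_REF`, and `Symbol v12` come from this post, and the pair ID 6 / HPRT1 is the one mentioned in this thread. Everything else (the other symbols, the sample column `GSM_1`, the signal values) is made-up toy data, not taken from GSE23561 or GPL10775.

```python
import csv
import io

# Toy stand-in for the platform table. Only ID 6 -> HPRT1 comes from
# this thread; GENE_A and GENE_B are invented placeholders.
platform_tsv = "ID\tSymbol v12\n5\tGENE_A\n6\tHPRT1\n7\tGENE_B\n"

# Toy stand-in for the normalized series matrix. The sample column
# name and the values are invented for illustration.
matrix_tsv = "ID_REF\tGSM_1\n7\t0.42\n6\t1.30\n"


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Build the lookup once: platform ID -> gene symbol.
id_to_symbol = {row["ID"]: row["Symbol v12"] for row in read_tsv(platform_tsv)}

# Annotate each matrix row by its ID_REF value, never by its row position.
annotated = [
    {**row, "symbol": id_to_symbol.get(row["ID_REF"], "")}
    for row in read_tsv(matrix_tsv)
]

for row in annotated:
    print(row["ID_REF"], row["symbol"], row["GSM_1"])
```

The lookup is keyed on the ID value, so it does not matter in which order the matrix rows appear.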

But I need to construct a non-normalized series matrix, and I don't know how to get the ID_REF values (and hence the gene symbols) for the probes in the raw data. The probes in the raw files are not ordered by ID_REF as they are in the normalized series matrix.

A piece of raw data: [image not shown]

Here, the ID_REF of the 6th row is not 6. So when I use row numbers directly as ID_REFs, the gene symbol comes out as MAR6, but it is actually HPRT1. I don't know what the IDs in the raw data stand for, or whether they can be used to get the ID_REFs.

Any suggestion is appreciated. Thanks!

gene ChIP-Seq gse gpl geo • 3.3k views
6.0 years ago
GenoMax 142k

Annotation information for this platform is available in this file. It comes from this platform page at NCBI; scroll down and click the "View Full Table" link.


Yes, I'm already using it to get gene symbols for the probes in the normalized series matrix (by mapping ID_REFs to IDs in the platform). But I need to get symbols for the probes in the non-normalized dataset, which has no ID_REF values.


The file above should be for the platform (Human 50K Exonic Evidence-Based Oligonucleotide array; technology type: spotted oligonucleotide) and should contain everything on the array. Does it not?


Yes, it does. The problem is that, although the raw data files and the platform table have the same number of rows (50400 each), the order is not identical. E.g. the 6th row contains the MAR6 gene in the raw data tables but HPRT1 in the platform. That means the gene symbol of the probe with ID_REF = 6 is HPRT1, since it corresponds to ID = 6 in the platform. So I need to find the ID_REF value for each row in the raw data to be able to use the platform info. Right?
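One way to frame the question above is as a keyed join instead of a positional one. The sketch below assumes, hypothetically, that the raw files and the platform table share some identifying column (called `probe_key` here). Whether such a column exists for GPL10775 is not established in this thread, and all values except ID 6 / HPRT1 are invented toy data.

```python
# Hypothetical toy tables. "probe_key" is an assumed shared column,
# not a confirmed field of the GSE23561 raw files or of GPL10775.
platform_rows = [
    {"ID": "5", "probe_key": "key_a", "symbol": "GENE_A"},
    {"ID": "6", "probe_key": "key_hprt1", "symbol": "HPRT1"},
    {"ID": "7", "probe_key": "key_b", "symbol": "GENE_B"},
]
raw_rows = [
    {"probe_key": "key_b", "signal": 812.0},
    {"probe_key": "key_hprt1", "signal": 1540.0},
    {"probe_key": "key_a", "signal": 97.0},
]


def same_order(raw, platform, key):
    """Return True only if both tables list the key in the same order."""
    return [r[key] for r in raw] == [p[key] for p in platform]


def attach_id_ref(raw, platform, key):
    """Add an ID_REF to each raw row by matching on the shared key."""
    key_to_id = {}
    for p in platform:
        if p[key] in key_to_id:
            # A non-unique key cannot identify a single platform row.
            raise ValueError(f"duplicate key in platform: {p[key]}")
        key_to_id[p[key]] = p["ID"]
    # Rows with no match get ID_REF = None instead of a guessed value.
    return [{**r, "ID_REF": key_to_id.get(r[key])} for r in raw]


print(same_order(raw_rows, platform_rows, "probe_key"))
for row in attach_id_ref(raw_rows, platform_rows, "probe_key"):
    print(row["ID_REF"], row["probe_key"], row["signal"])
```

If no shared column exists, this approach does not apply and the row-to-ID_REF correspondence would have to come from elsewhere, for example from the data submitters.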
