Converting A Set Of Protein Ids To Gene Ids
10.7 years ago
lin.barnum ▴ 230

I want to convert a set of protein IDs to gene IDs. I have a list of protein IDs such as

ADF1_DROME
A1Z8K2_DROME
A0AQF9_DROME
A0AQG1_DROME
B4J066_DROGR
A0AQG9_DROSI
B3NHM7_DROER
B3NB45_DROER
B3P0U4_DROER
B3NJP0_DROER

and my final aim is to identify the genes they originate from, since many of them come from different transcripts of the same gene. How could I go about it? I tried BioMart but could not figure out how to do it there. All of my proteins are from dipterans.

database gene genes • 8.8k views
10.6 years ago
Arnaud Ceol ▴ 860

You are converting from UniProt IDs, so the most straightforward way is to use UniProt's own mapping tool: go to http://www.uniprot.org/ and open the ID mapping tab. Paste your protein IDs there (or load them from a file), choose UniProtKB AC/ID as input and GeneID as output, and click MAP.
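The same mapping can probably be scripted. Below is a minimal Python sketch, assuming the current UniProt REST ID mapping service at `https://rest.uniprot.org/idmapping` (with `run`, `status` and `results` endpoints, and the database names `UniProtKB_AC-ID` and `GeneID`). Those endpoint and database names are my assumptions about the present-day service, not something described in this answer, and the helper names are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of the UniProt REST ID mapping service.
API = "https://rest.uniprot.org/idmapping"


def build_job_request(ids, from_db="UniProtKB_AC-ID", to_db="GeneID"):
    """Return the URL-encoded POST body for an ID mapping job."""
    fields = {"from": from_db, "to": to_db, "ids": ",".join(ids)}
    return urllib.parse.urlencode(fields).encode()


def parse_results(payload):
    """Turn a results payload into (protein_id, gene_id) pairs."""
    return [(row["from"], row["to"]) for row in payload.get("results", [])]


def map_ids(ids, poll_seconds=3):
    """Submit a job, wait for it, and return the pairs. Needs network access."""
    with urllib.request.urlopen(f"{API}/run", data=build_job_request(ids)) as resp:
        job_id = json.load(resp)["jobId"]
    while True:
        with urllib.request.urlopen(f"{API}/status/{job_id}") as resp:
            status = json.load(resp)
        # Assumption: unfinished jobs report jobStatus NEW or RUNNING.
        if status.get("jobStatus") not in ("NEW", "RUNNING"):
            break
        time.sleep(poll_seconds)
    # Assumption: results are paginated; only the first page is read here.
    with urllib.request.urlopen(f"{API}/results/{job_id}?size=500") as resp:
        return parse_results(json.load(resp))
```

Calling `map_ids(["A1Z8K2_DROME", "ADF1_DROME"])` would then return pairs of input ID and gene ID. IDs without a mapping are simply absent from the result, and a failed job is not handled in this sketch.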

10.7 years ago
brentp 24k

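You can start with this mapping from the UCSC uniProt database, assuming your IDs are in a file named names.txt (the awk step keeps the part before the underscore, which the query then treats as an accession):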

 awk -F_ '{ print $1 }' names.txt \
    | xargs -I{} mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A uniProt -N -e \
      "select * from gene where acc = '{}'"
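One caveat with splitting on the underscore: for names like A1Z8K2_DROME the prefix is a UniProt accession, but for names like ADF1_DROME it is a protein mnemonic, so a lookup on `acc` would likely find nothing for it. A small Python sketch to separate the two cases before querying, assuming the accession pattern given in the UniProt help pages (the function name is hypothetical):

```python
import re

# Accession number pattern as published in the UniProt help pages.
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_entry_names(names):
    """Split entry names into accession prefixes and names that need another lookup."""
    accessions, mnemonics = [], []
    for name in names:
        name = name.strip()
        prefix = name.split("_", 1)[0]
        if ACCESSION.fullmatch(prefix):
            accessions.append(prefix)   # e.g. A1Z8K2 from A1Z8K2_DROME
        else:
            mnemonics.append(name)      # e.g. ADF1_DROME, prefix is not an accession
    return accessions, mnemonics
```

The first list can go straight into the query above; the second list needs a mapping that accepts full entry names.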

Thanks, this worked beautifully.

10.7 years ago
Bill Pearson ★ 1.0k

As you noticed, the problem is that you are using UniProtKB IDs, and there is no guaranteed mapping between UniProtKB IDs and genes (even in UniProt). I suggest you first match your UniProtKB IDs to NCBI RefSeq protein IDs, which you can do on the UniProt web site using the ID mapping option. (Be aware that the mapped proteins are sometimes not identical.)

Once you have NCBI RefSeq protein IDs, it is easy to get NCBI Gene IDs and RefSeq mRNA IDs.
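With a protein-to-gene mapping in hand, collapsing proteins that share a gene is a simple grouping step. A minimal Python sketch, assuming the mapping is available as (protein ID, gene ID) pairs; the function name is hypothetical and the gene IDs in the usage lines are placeholders, not real identifiers:

```python
from collections import defaultdict


def group_by_gene(pairs):
    """Group protein IDs under the gene ID they map to."""
    genes = defaultdict(list)
    for protein_id, gene_id in pairs:
        genes[gene_id].append(protein_id)
    return dict(genes)


# Placeholder gene IDs, for illustration only.
example = [("A1Z8K2_DROME", "GENE_A"), ("A0AQF9_DROME", "GENE_B"), ("A0AQG1_DROME", "GENE_B")]
grouped = group_by_gene(example)
```

Genes with more than one protein in their list are the cases where several of your entries come from the same gene.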
