UniProt references to SwissProt references?
3.2 years ago
@mafernandez40311

Hello there

I have some identifiers coming from a search against the Swissprot database that have the following structure (example 1): SYDND_PSEFS

And I want them to be in the UniProt format, just like the following (example 2):

I4L7P1_9PSED

But I am not able to achieve it through the Retrieve/ID mapping tool, since I do not know the name assigned to the Swiss-Prot database. If doing it the other way (from UniProt to Swiss-Prot) is possible, I am also very interested in how to do it.

Thanks a lot

3.2 years ago
mobiusklein • 160
@mobiusklein8714

UniProt's HTTP API is very accommodating about translating identifiers.

Requesting http://www.uniprot.org/uniprot/SYDND_PSEFS will be redirected to http://www.uniprot.org/uniprot/C3JYT1. If you're comfortable with Python, you could use the following approach:

from lxml import etree

uri_template = "http://www.uniprot.org/uniprot/{0}.xml"
nsmap = {"up": "http://uniprot.org/uniprot"}

# your_ids is the list of entry names you want to look up
translated = []

for swiss_id in your_ids:
    # Fetch and parse the XML record for this identifier
    tree = etree.parse(uri_template.format(swiss_id))
    # Every full name recorded for the protein
    names = [el.text for el in tree.findall(
        ".//up:protein/*/up:fullName", nsmap)]
    recommended_name_tag = tree.find(
        ".//up:protein/*/up:recommendedName", nsmap)
    if recommended_name_tag is not None:
        # .text is None when the tag holds only child elements
        own_text = (recommended_name_tag.text or "").strip()
        if own_text:
            recommended_name = own_text
        else:
            recommended_name = ' '.join(
                (c.text or "").strip() for c in recommended_name_tag)
    else:
        # Fall back to the first full name, if there is one
        try:
            recommended_name = names[0]
        except IndexError:
            recommended_name = ""
    gene_name_tag = tree.find(".//up:entry/up:name", nsmap)
    if gene_name_tag is not None:
        gene_name = gene_name_tag.text
    else:
        gene_name = ""

    translated.append((names, recommended_name, gene_name))


This will collect all the names that UniProt has for that symbol and store them in the list translated. You can then iterate over your_ids and translated in parallel with zip and decide which identifier to retain.
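If all you need is the accession itself, the redirect mentioned above may be enough on its own. Here is a minimal sketch using only the standard library. It assumes the server still redirects an entry name to a URL that ends in the accession, as in the SYDND_PSEFS example. The helper names accession_from_url and resolve_entry_name are made up for illustration.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def accession_from_url(url):
    """Return the last path segment of a URL, without any file extension."""
    last_segment = urlparse(url).path.rstrip("/").split("/")[-1]
    return last_segment.split(".")[0]


def resolve_entry_name(entry_name,
                       template="http://www.uniprot.org/uniprot/{0}"):
    """Request the entry and read the accession off the final URL.

    urlopen follows HTTP redirects, so geturl() gives the URL we ended
    up at. This only works while the server redirects as described above.
    """
    with urlopen(template.format(entry_name)) as response:
        return accession_from_url(response.geturl())


# Example call (needs network access):
# accession = resolve_entry_name("SYDND_PSEFS")
```

Only accession_from_url can be checked offline. resolve_entry_name depends on how the server behaves at the time you run it.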

3.2 years ago
@Elisabeth Gasteiger2939

First of all, a short note on terminology.

The UniProt Knowledgebase (UniProtKB) consists of two sections: UniProtKB/Swiss-Prot for reviewed entries and UniProtKB/TrEMBL for unreviewed entries (see http://www.uniprot.org/help/uniprotkb_sections, http://www.uniprot.org/help/entry_status).

Since Swiss-Prot is part of UniProtKB, it does not make sense to map from Swiss-Prot to UniProtKB. If an entry is in UniProtKB/Swiss-Prot, it has been reviewed, while a UniProtKB/TrEMBL entry is not reviewed, but in both cases, entries have a UniProtKB identifier (accession number and entry name).

However, if your goal is to map from entry name to accession number, you can indeed use the ID mapping tool at http://www.uniprot.org/uploadlists, map from UniProtKB to UniProtKB, and then download the results in "List" format. Or you can use our REST API to map the identifiers one at a time, with a URL of the form

http://www.uniprot.org/uniprot/?query=SYDND_PSEFS&format=list
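
For more than a handful of identifiers, a URL of this form can be built and read in a small loop. The sketch below is a rough illustration, not an official client. It assumes the "list" format returns plain text with one accession per line. The helper names build_list_query, parse_list_response and map_entry_names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://www.uniprot.org/uniprot/"


def build_list_query(entry_name, base_url=BASE_URL):
    """Build the query URL for one entry name, asking for list format."""
    return base_url + "?" + urlencode({"query": entry_name, "format": "list"})


def parse_list_response(body):
    """Split a plain-text response into non-empty, stripped lines.

    Assumption: each non-empty line is one accession number.
    """
    return [line.strip() for line in body.splitlines() if line.strip()]


def map_entry_names(entry_names):
    """Map each entry name to the accessions returned for it (uses network)."""
    mapping = {}
    for name in entry_names:
        with urlopen(build_list_query(name)) as response:
            mapping[name] = parse_list_response(response.read().decode("utf-8"))
    return mapping


# Example call (needs network access):
# mapping = map_entry_names(["SYDND_PSEFS"])
```

Since a free-text query can match more than one entry, each name maps to a list here rather than a single accession, and it is worth checking the results before relying on them.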