Forum: UniProtKB Accession Number Format To Be Extended To 10 Characters
10.4 years ago

UniProtKB accession numbers currently consist of 6 alphanumerical characters. With our projected growth of UniProtKB, we expect to use up all accession numbers of this format in 2014. We will therefore extend the format to 10 alphanumerical characters.

Read more here: http://www.uniprot.org/changes and contact the UniProt helpdesk with any comments you might have.

uniprot web-service • 3.0k views
10.4 years ago
Michael 54k

Reminds me of the transition from IPv4 to IPv6 ;) That would give 2.611467e+13 new IDs following the new scheme. If we assume there are 10 million species on earth and each contributes on average 20,000 proteins (total 2e+11), then these numbers should be sufficient.
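For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the 10-character pattern is `[A-NR-Z][0-9]` followed by two groups of `[A-Z][A-Z0-9][A-Z0-9][0-9]`. That pattern is taken from UniProt's published description of the format, not from this thread, so treat it as an assumption.

```python
# Count the strings allowed by the assumed 10-character pattern
# [A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){2}, position by position.
FIRST_LETTER = 26 - 3        # A-N and R-Z: every letter except O, P, Q
DIGIT = 10                   # [0-9]
GROUP = 26 * 36 * 36 * 10    # [A-Z][A-Z0-9][A-Z0-9][0-9]

new_ids = FIRST_LETTER * DIGIT * GROUP ** 2

# Back-of-the-envelope demand: species on earth x proteins per species.
estimated_need = 10_000_000 * 20_000

print(f"{new_ids:.6e}")            # 2.611467e+13
print(f"{estimated_need:.0e}")     # 2e+11
print(new_ids // estimated_need)   # 130
```

Under these assumptions the new scheme leaves roughly a hundredfold margin over the estimate.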

I'd expect a question on BioStar like: "How can I map from old to new UniProtKB accession numbers?", but if I understand correctly, both short and long versions should co-exist, the already assigned ANs should not be changed, and new ANs will only be assigned to new proteins? Further, is it a problem that some new IDs can have valid or existing old ANs as prefixes according to your definition?
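To illustrate the prefix concern, here is a minimal sketch in Python. The two patterns are assumptions based on UniProt's published accession format rather than anything stated in this thread, and `A0A022YWF9` is used only as a string that fits the 10-character pattern. The helper name `is_accession` is made up for this example.

```python
import re

# Assumed patterns for 6- and 10-character accession numbers.
OLD_AC = r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9][A-Z][A-Z0-9]{2}[0-9]"
NEW_AC = r"[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){2}"
ANY_AC = re.compile(rf"(?:{OLD_AC}|{NEW_AC})")


def is_accession(text):
    """Return True only if the whole string is a well-formed accession."""
    return ANY_AC.fullmatch(text) is not None


long_ac = "A0A022YWF9"

print(is_accession(long_ac))            # True: fits the 10-character pattern
print(is_accession(long_ac[:6]))        # True: the prefix fits the 6-character pattern
print(ANY_AC.match(long_ac).group())    # A0A022: an unanchored match stops at the prefix
print(is_accession("A0A022YW"))         # False: 8 characters fit neither pattern
```

So a parser that matches only at the start of a string, or truncates fields to 6 characters, could silently turn a long accession into a different, validly formatted short one. Matching the whole token (as `fullmatch` does here) avoids that.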


Regarding the mapping: yes, you are correct, short and long versions will co-exist. New ACs are assigned to new entries, and already assigned ACs usually do not change. If they do need to change, this will be handled the same way as under the current AC scheme: http://www.uniprot.org/manual/accession_numbers .
