Cross-reference with PDB database
6.3 years ago
adirsommer ▴ 10

Hi,

I have a list of several thousand proteins and their UniProt IDs. I'm looking for an efficient way to cross-reference it against the PDB tertiary structure database, to get a list of those proteins that have a tertiary structure in PDB.

I've tried to BLASTP the list of UniProt IDs against the PDB database using the NCBI BLAST portal, but I ran into too many errors of the form "Error: Failed to read the Blast query: Sequence ID not found", which makes manual filtering inconvenient and inefficient.

Any ideas?

Thank you!

PDB UNIPROT PROTEIN sequencing • 2.2k views

Use the UniProt ID converter to map them to PDB IDs. That will tell you how many of them are present in PDB. From there you can start looking at the ones that have a tertiary structure.
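If you would rather script the ID mapping than paste the list into the web form, something along these lines might work. It is only a sketch: the `rest.uniprot.org/idmapping` endpoints, the `UniProtKB_AC-ID` and `PDB` database names, and the `from`/`to` shape of the JSON results are my assumptions about the current UniProt REST service, not something stated in this thread, so check them against the UniProt documentation first. The function names are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request
from collections import defaultdict

# Assumed base URL of the UniProt ID-mapping REST service (not from the thread).
API = "https://rest.uniprot.org/idmapping"


def submit_job(accessions):
    """Submit a UniProtKB -> PDB mapping job and return its job ID."""
    data = urllib.parse.urlencode(
        {"from": "UniProtKB_AC-ID", "to": "PDB", "ids": ",".join(accessions)}
    ).encode()
    with urllib.request.urlopen(f"{API}/run", data=data) as resp:
        return json.load(resp)["jobId"]


def fetch_pairs(job_id, poll_seconds=3):
    """Wait for the job to finish, then return the list of from/to records."""
    while True:
        with urllib.request.urlopen(f"{API}/status/{job_id}") as resp:
            status = json.load(resp)
        # While the job is queued or running, the service is assumed to
        # report a "jobStatus" field; anything else means it has finished.
        if status.get("jobStatus") in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)
            continue
        break
    with urllib.request.urlopen(f"{API}/stream/{job_id}") as resp:
        return json.load(resp).get("results", [])


def group_by_accession(pairs):
    """Collect the PDB IDs for each UniProt accession."""
    grouped = defaultdict(set)
    for pair in pairs:
        grouped[pair["from"]].add(pair["to"])
    return {acc: sorted(ids) for acc, ids in grouped.items()}
```

Usage would be roughly `group_by_accession(fetch_pairs(submit_job(my_ids)))`, where `my_ids` is your list of accessions. Any accession missing from the returned dictionary had no PDB cross-reference in the mapping.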


There are several similar posts on Biostars; see, for example, this one:

Protein PDB ID

6.3 years ago

Here is the list of UniProtKB entries cross-referenced to PDB: https://www.uniprot.org/uniprot/?query=database:(type:pdb)&format=tab&columns=id,entry%20name,reviewed,database(PDB)
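Once that table is saved locally, filtering your own list against it takes only a few lines. The sketch below assumes the file is tab-separated with a header row, that the columns come in the order requested in the URL (accession, entry name, reviewed status, PDB cross-references), and that the PDB column is a semicolon-separated list. The last point is my assumption about the export format, and the function name is just illustrative.

```python
import csv
import io


def accessions_with_pdb(table_text, my_accessions):
    """Return {accession: [PDB IDs]} for the accessions found in the table."""
    wanted = set(my_accessions)
    reader = csv.reader(
        io.StringIO(table_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    next(reader, None)  # skip the header row
    hits = {}
    for row in reader:
        # Expect four columns: accession, entry name, reviewed, PDB IDs.
        if len(row) < 4 or row[0] not in wanted:
            continue
        # Assumed format of the PDB column: "1ABC;2DEF;"
        pdb_ids = [p.strip() for p in row[3].split(";") if p.strip()]
        if pdb_ids:
            hits[row[0]] = pdb_ids
    return hits
```

With the download saved as, say, `uniprot_pdb.tab`, you would call `accessions_with_pdb(open("uniprot_pdb.tab").read(), my_ids)`. The accessions from your list that are absent from the result are the ones without a PDB entry.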

For reviewed entries only, you can also have a look at this precomputed file: ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/docs/pdbtosp.txt

The reason why some of your NCBI BLAST queries by UniProtKB identifiers fail is that NCBI_nr does not include UniProtKB/TrEMBL. I presume that it is failing on these identifiers?
