5.5 years ago
I understand that each protein accession may have one or more PDB IDs, and that each PDB ID can have one or more chains. What I do not understand is where to find the primary sequence for each PDB ID that matches the protein sequence of the accession. Could anyone explain or clarify this for me? I would really appreciate any help. Thanks.
What are you calling the primary sequence?
If a PDB ID has multiple chains, do you want all of them?
I mean the primary structure of the protein. Does the PDB ID have the same primary structure as the accession? Thanks for the reply.
"Primary structure" is a term for sequence, if I remember my lessons from undergrad correctly. Is that what you mean? Are you asking if the sequence of the protein as found in the PDB entry is the same as the sequence of the protein as found in, say, UniProtKB-SwissProt?
Thanks, RamRS, for your reply. Yes, I meant exactly what you described in your question! How can I find the primary structure in a PDB entry that matches the protein accession sequence?
You should be able to use the DBREF records to get to the corresponding UniProt entries. For example, in the DBREF records for 3HG2 (in PDB format), both chains point to P06280, which I know is the canonical entry for human GLA. There's also the SEQRES section if you're interested in the exact sequence in the PDB entry. 1A3N (human hemoglobin) is an example where the PDB entry has links to multiple UniProt entries.
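If you'd rather pull these records out with a script than read them by eye, here is a minimal Python sketch. It assumes the legacy fixed-column PDB format (chain ID, database name and accession at the column positions used below) and only handles plain DBREF records, not DBREF1/DBREF2. The sample text is made up for illustration: entry 0XXX, accession P00000 and DEMO_HUMAN are hypothetical placeholders, not the real 3HG2 or 1A3N records, and the function names are my own.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_dbref(pdb_text):
    """Return {chain_id: [(database, accession), ...]} from DBREF records."""
    refs = {}
    for line in pdb_text.splitlines():
        # The trailing space skips DBREF1/DBREF2 records on purpose.
        if line.startswith("DBREF "):
            chain = line[12]                   # column 13
            database = line[26:32].strip()     # columns 27-32
            accession = line[33:41].strip()    # columns 34-41
            refs.setdefault(chain, []).append((database, accession))
    return refs


def parse_seqres(pdb_text):
    """Return {chain_id: one_letter_sequence} from SEQRES records."""
    seqs = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            chain = line[11]                   # column 12
            residues = line[19:].split()       # residue names start at column 20
            # Anything outside the 20 standard residues is written as X.
            letters = "".join(THREE_TO_ONE.get(r, "X") for r in residues)
            seqs[chain] = seqs.get(chain, "") + letters
    return seqs


# Hypothetical sample records (NOT taken from a real PDB entry).
SAMPLE = """\
DBREF  0XXX A    1     5  UNP    P00000   DEMO_HUMAN      32     36
DBREF  0XXX B    1     3  UNP    P00000   DEMO_HUMAN      10     12
SEQRES   1 A    5  MET LYS THR ALA TYR
SEQRES   1 B    3  GLY SER VAL
"""

if __name__ == "__main__":
    print(parse_dbref(SAMPLE))
    print(parse_seqres(SAMPLE))
```

To use it on a real entry, read the downloaded PDB-format file into a string and pass that string to both functions. A chain with more than one DBREF record (a chimeric construct, for instance) ends up with several accessions in its list.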
Thank you so much for your reply; I really appreciate it. But I already know how to get the PDB code for a protein accession. What I meant is how to get the sequence of the PDB ID that matches the sequence in the accession. For example, take the FASTA file for accession YP_009208550.1 and suppose it has multiple PDB IDs. How can I find the corresponding primary structure sequence for these IDs?
You need to explain what you mean by the following terms, as your usage of them is pretty ambiguous:
Lastly, are you looking to go from a sequence input to a structure result? Are you blast-ing protein sequences against PDB and looking to pick the exact match results?
If it has multiple PDB IDs, they are all equivalent, so that sequence is THE sequence. There is nothing else to do.
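If you want to check that equivalence yourself, a rough Python sketch along these lines compares the accession's FASTA sequence with each chain sequence from a PDB entry. Everything in the sample is invented for illustration (the FASTA record, the chain sequences and the function names are hypothetical). In practice a chain often covers only part of the accession or carries extra residues such as an expression tag, so the sketch reports three outcomes rather than a plain yes/no.

```python
def read_fasta(text):
    """Return the first sequence in FASTA-formatted text as one string."""
    parts = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if seen_header:
                break  # stop at the second record
            seen_header = True
            continue
        if line:
            parts.append(line)
    return "".join(parts).upper()


def classify_match(accession_seq, chain_seq):
    """Classify a chain sequence relative to the accession sequence."""
    if not chain_seq:
        return "different"
    if chain_seq == accession_seq:
        return "identical"
    if chain_seq in accession_seq:
        return "fragment"  # chain covers only part of the accession
    return "different"     # mutations, tags, or an unrelated chain


# Hypothetical FASTA record and chain sequences (not real data).
FASTA = """>demo_accession hypothetical protein
MKTAYIAKQR
QISFVK
"""
CHAINS = {"A": "MKTAYIAKQRQISFVK", "B": "AKQRQIS", "C": "GSHMKTAY"}

if __name__ == "__main__":
    query = read_fasta(FASTA)
    for chain_id, chain_seq in CHAINS.items():
        print(chain_id, classify_match(query, chain_seq))
```

A plain substring test is deliberately strict. For chains that come back as "different", a pairwise alignment (or the BLAST hit itself) will show where the sequences diverge.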