Retrieving UniProtEntry data (ID and sequence) from UniProt entry names with UniProt JAPI (Java)
6.6 years ago

I am trying to use the UniProt JAPI for Java to get a protein sequence based on its entry name and, later, to identify a consensus sequence. I'm working with NetBeans 8.2 and UniProt JAPI 1.0.14.

I have a list of entry names along with the peptides found by mass spectrometry. A short example (not actual data) is:

MAOM_YEAST_R.LATYGGD.K

MAOM_YEAST is my key to the full sequence. After separating it from the partial sequence, I want to use it to get the UniProt ID and, from there, the corresponding full protein sequence. The full sequence is then used to extend the partial sequence, so that possible matches outside the range of the partial sequence can be found. To quickly sum up the steps:

MAOM_YEAST > find UniProt ID (P36013) > find sequence > locate partial sequence > extend the partial sequence by ~5 amino acids in both directions > search for a consensus sequence
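The string-handling steps (separate the entry name from the peptide, locate the peptide, extend it by ~5 residues) do not need the UniProt JAPI at all. Below is a minimal sketch in plain Java 8. The class and method names (PeptideWindow, parseLabel, extend) are hypothetical. It assumes that the entry name always contains exactly one underscore and that the peptide part is written either as "X.PEPTIDE.Y" (flanking residues around the peptide) or as a bare peptide; that reading of the label format is an assumption. The sequence in main is a toy string built around the window shown in the answer, not the real protein.

```java
/** Illustrative helper; class and method names are hypothetical, not UniProt JAPI. */
final class PeptideWindow {

    /** Entry name and bare peptide taken from one label. */
    static final class Label {
        final String entryName;
        final String peptide;

        Label(String entryName, String peptide) {
            this.entryName = entryName;
            this.peptide = peptide;
        }
    }

    /**
     * Splits "MAOM_YEAST_R.LATYGGD.K" into "MAOM_YEAST" and "LATYGGD".
     * Assumption: the entry name contains exactly one underscore, and the
     * peptide part is either "X.PEPTIDE.Y" or just "PEPTIDE".
     */
    static Label parseLabel(String label) {
        int first = label.indexOf('_');
        int second = label.indexOf('_', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("Unexpected label: " + label);
        }
        String[] parts = label.substring(second + 1).split("\\.");
        if (parts.length != 1 && parts.length != 3) {
            throw new IllegalArgumentException("Unexpected peptide part: " + label);
        }
        String peptide = parts.length == 3 ? parts[1] : parts[0];
        return new Label(label.substring(0, second), peptide);
    }

    /**
     * Returns the peptide plus up to `flank` residues on each side,
     * clamped to the ends of the sequence, or null if it is not found.
     */
    static String extend(String sequence, String peptide, int flank) {
        int i = sequence.indexOf(peptide);
        if (i < 0) {
            return null;
        }
        int start = Math.max(0, i - flank);
        int end = Math.min(sequence.length(), i + peptide.length() + flank);
        return sequence.substring(start, end);
    }

    public static void main(String[] args) {
        Label label = parseLabel("MAOM_YEAST_R.LATYGGD.K");
        // Toy sequence for illustration only, not the full MAOM_YEAST sequence.
        String toy = "XXX" + "SIECRLATYGGDKDVDY" + "XXX";
        System.out.println(label.entryName + " " + label.peptide
                + " " + extend(toy, label.peptide, 5));
    }
}
```

Only the first occurrence of the peptide is used here; if a peptide can occur more than once in a protein, the search would have to loop with indexOf(peptide, fromIndex).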

I have consulted the UniProt JAPI documentation (included in the download), but the page for uk.ac.ebi.kraken.interfaces.uniprot / UniProtId in particular is where the confusion starts. To cite from the documentation:

How to work with this Interface

The standard way of retrieving this data type

The standard way of setting this data type

UniProtEntry entry = getEntryFromParserOrAPI();

entry.setUniProtId(DefaultUniProtFactory.getInstance().buildUniProtId("CYC_HUMAN"));

UniProtId id = entry.getUniProtId();

However, I get errors because the UniProtEntry entry is not loaded properly: getEntryFromParserOrAPI() does not work (I don't have the exact error at the moment, will post it ASAP). This method seems like the ideal way to do what I want, replacing "CYC_HUMAN" with another name to get the proper entry. If I understand the documentation correctly, a UniProtEntry should be able to provide the UniProt ID for an entry name like "CYC_HUMAN" using getUniProtId(), and getSequence() could be used for the sequence.

My questions are:

1) Does anyone know how to go from an entry name like MAOM_YEAST or CYC_HUMAN to the corresponding UniProt ID and perhaps from there to the sequence?

2) Does anyone have a solution for the suggested code from the documentation to get it working?

Much thanks

java uniprot sequence • 1.5k views
6.6 years ago

1) Does anyone know how to go from an entry name like MAOM_YEAST or CYC_HUMAN to the corresponding UniProt ID and perhaps from there to the sequence?

$ curl -sL "http://www.uniprot.org/uniprot/MAOM_YEAST.xml" | \
     xmllint --xpath '//*[local-name()="sequence"]/text()' - | \
     tr -d '\n' |\
     awk '{S="LATYGGD";x=5;i=index($0,S);print substr($0,i-x,length(S)+2*x);}'

SIECRLATYGGDKDVDY
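For a Java translation, the xmllint step can be mirrored with the XML and XPath classes that ship with the JDK, so no extra library is needed. The sketch below is not the UniProt JAPI; the class and method names (UniProtXmlSketch, parse, firstText) are hypothetical. It reuses the same local-name() XPath as the command above and assumes that the entry XML carries the accession in an <accession> element and the sequence in a <sequence> element, with the primary accession listed first. The SAMPLE document is a trimmed, made-up stand-in holding only the 17-residue window shown above; downloading the real XML is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Illustrative sketch; names are hypothetical and not part of the UniProt JAPI. */
final class UniProtXmlSketch {

    // Trimmed, made-up sample for illustration; a real entry has many more
    // elements. The namespace value does not matter because of local-name().
    static final String SAMPLE =
            "<uniprot xmlns=\"http://uniprot.org/uniprot\">"
            + "<entry>"
            + "<accession>P36013</accession>"
            + "<name>MAOM_YEAST</name>"
            + "<sequence>SIECRLATYG\nGDKDVDY</sequence>"
            + "</entry>"
            + "</uniprot>";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /**
     * Returns the text of the first element with the given local name,
     * with all whitespace removed (the Java counterpart of tr -d '\n').
     */
    static String firstText(Document doc, String localName) {
        try {
            String expr = "//*[local-name()='" + localName + "']/text()";
            String value = XPathFactory.newInstance().newXPath().evaluate(expr, doc);
            return value.replaceAll("\\s+", "");
        } catch (Exception e) {
            throw new IllegalStateException("XPath failed for " + localName, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(firstText(doc, "accession"));
        System.out.println(firstText(doc, "sequence"));
    }
}
```

The returned sequence string can then be searched with String.indexOf and cut with substring, which is what the awk step does.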

This seems like an interesting method and I will definitely try to translate it to Java to get it working. Would you, however, also happen to know a solution using the UniProt JAPI?
