Retrieving Swissprot FT details for a particular amino acid position using Python
9.4 years ago
a.gardner ▴ 10

Hi,

I am new so my apologies if I phrase things incorrectly.

I am trying to extract information from a Swissprot file for a mutation. For example, I need to know which domain the mutated residue lies in and its secondary structure.

The input variables are: acc_number (the Swissprot accession code), wild_aa (the wild-type amino acid), position (the residue number of the amino acid in the protein), and mutant_aa (the replacement amino acid).

I have got as far as retrieving the features using

for feature in record.features:
    print(feature)

How do I extract the secondary structure information from this to determine whether my amino acid is in a strand, helix, turn or random coil? If no secondary structure is recorded, I would report none.

My full code to date is below...as you will guess I am a total beginner with Python and Biopython (in fact with programming!):

Thanks in advance

#!/usr/bin/env python

import time
import sys # this module provides access to the input variables
import os
from Bio import ExPASy # this will allow a Swiss-Prot file to be opened
                       # over the internet using the accession number.
from Bio import SwissProt #this will allow the file to be read.

# This section receives the parameters from user input via the website.
# It is commented out during development; temporary variables are used
# instead.

# acc_number = sys.argv[1]
# wild_aa = sys.argv[2]
# position = sys.argv[3]
# mutant_aa = sys.argv[4]

#Temp variables for developing:

acc_number = 'P01308'
wild_aa = 'L'
position = '43'
mutant_aa = 'P'

# next step is to retrieve the text file from swissprot to parse.
# this uses the acc_number variable:
handle = ExPASy.get_sprot_raw(acc_number)

# this reads the swissprot file:
record = SwissProt.read(handle)

# test to see if record has been retrieved:
print(record.description)

# next section will parse the sequence information using the position variable
# will determine the secondary structure location of the mutation

#obtaining sequence and placing it in a variable.
sequence = record.sequence
#print sequence

# accessing the secondary structure and domain information from FT lines
for feature in record.features:
    print(feature)

# Check that the wild amino acid is correct
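The last comment in the script leaves the wild-type check unwritten. A minimal sketch of that step, assuming the position is 1-based (as in Swissprot numbering) and may arrive as a string; the helper name `wild_type_matches` and the example sequence are made up for illustration:

```python
def wild_type_matches(sequence, position, wild_aa):
    """Return True if the residue at the 1-based position equals wild_aa."""
    index = int(position) - 1  # positions are 1-based; Python indexing is 0-based
    if index < 0 or index >= len(sequence):
        raise ValueError("position %d is outside the sequence" % (index + 1))
    return sequence[index] == wild_aa.upper()


# Made-up sequence for illustration only, not a real Swissprot entry.
example_sequence = "MKTLLPAG"

print(wild_type_matches(example_sequence, "4", "L"))  # True
print(wild_type_matches(example_sequence, "4", "P"))  # False
```

In the script above, the same call would take `record.sequence`, `position` and `wild_aa` as its arguments.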
Tags: Swissprot, Python, parsing, Tuples, Biopython
9.4 years ago
Ram 43k

The documentation describes each feature as a tuple of (key name, from, to, description). To see what a record looks like in plain text, check this out: http://www.uniprot.org/uniprot/P01308.txt

I've never used Bio.SwissProt.Record before, but you should be able to take feature[0] for every feature where feature[1] <= position and feature[2] >= position (with position converted to an integer first).

This is the logic; implementing it in Python should not be a problem.
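A minimal sketch of that logic, assuming features arrive as (key name, from, to, description) tuples as described above. The helper names and the example tuples are hypothetical, and newer Biopython releases may expose features as objects rather than tuples, in which case the field access would need adapting:

```python
# Feature keys used for secondary structure annotation in FT lines.
SECONDARY_STRUCTURE_KEYS = {"HELIX", "STRAND", "TURN"}


def features_at(features, position):
    """Return the feature tuples whose from..to range covers position."""
    position = int(position)  # the question stores the position as a string
    hits = []
    for feature in features:
        start, end = feature[1], feature[2]
        # Skip uncertain endpoints such as '?' that are not plain integers.
        if not (isinstance(start, int) and isinstance(end, int)):
            continue
        if start <= position <= end:
            hits.append(feature)
    return hits


def secondary_structure_at(features, position):
    """Return 'HELIX', 'STRAND', 'TURN', or 'none' for the position."""
    for feature in features_at(features, position):
        if feature[0] in SECONDARY_STRUCTURE_KEYS:
            return feature[0]
    return "none"


# Made-up tuples for illustration only, not real annotations for any entry.
example_features = [
    ("DOMAIN", 10, 60, "Hypothetical domain"),
    ("HELIX", 35, 45, ""),
    ("STRAND", 50, 55, ""),
    ("CHAIN", "?", 80, "Hypothetical chain"),
]

print(secondary_structure_at(example_features, "43"))
print([f[0] for f in features_at(example_features, 43)])
```

With a real record, `record.features` would be passed in place of `example_features`; filtering the result of `features_at` on the key DOMAIN would give the domain in the same way.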

References:

  1. http://www.uniprot.org/uniprot/P01308.txt
  2. http://biopython.org/DIST/docs/api/Bio.SwissProt.Record-class.html
  3. http://www.tutorialspoint.com/python/python_tuples.htm

Thank you, it works (though I feel a total numpty!)
