Double Restriction, Fragment Generation
11.9 years ago
capemaster • 0

Hi all, I'm trying to use the Restriction module of Biopython because I need to create a report of the fragments generated by an in silico double digestion. Briefly, if I digest a FASTA sequence with, let's say, EcoRI and PstI, I would like to know how many Eco-Pst, Pst-Eco, Eco-Eco and Pst-Pst fragments are generated. Does anybody know how to do this with Biopython?

EDIT

>Sequence
CTAGCGTAATAATAGGTACACTGAATAGTAGTACAGTACGTACAGCTTTTCCTGGGGATC
CTATCGCAATCGCGAATGCGACTTCACGTGAATAGATCTCATTCTGAGCTCCCTTATACG
TTATAGTTCGACTGTGCTTGATACAAAACGTTTTACTGACTATAACGTGGGGGCACGGGA
ATTCAACAGAACTCTCCAAGCTGTCGATTTCTGTATGTTTGAGATTAGATCAGACCTCAC
AAGACTCCCTAAACCATCCAGCCCACTTTATATCCCCTCTTCTCCGCCGGAGGTGAATTC
AATCCGGCACCAAGGGACTGACAATTTAGCGCAGATACGAGGCAGAACACCGGAAAGACC
AGCGGCACTCGCGGGGATCTGGCCCGGTGGGCCCCGGTCCGTGAGCCCGAAGACCCCCTC
CCCGAAGATTGGAGGTGCCAGGCAACTGAGGGAGGTGGCTGTCGACGCGCGCCCGGTGCC
CGGCCGAGATGTGGGGCCTCCCGGACGGGTCGACCAGCAGCCGGCCGGTGCCCCCTCCGT

In this sequence there are 2 EcoRI sites and 4 TaqI sites. I want to know how many Eco-Taq, Eco-Eco, Taq-Eco and Taq-Taq fragments are produced by the digestion.

The answer from ALchEmiXt seems reasonable, but I can't understand the subtraction step. There could be several contiguous sites of the same enzyme; how can I deal with that?

Thank you for your help.

biopython • 2.9k views

This all depends on what your data currently look like. Edit your question and post a small snippet of the data that you are trying to process.


Basically, you generate a list of cuts sorted by position. To get the fragment lengths, just subtract item1 from item2, item2 from item3, and so forth. Since you have different enzymes, you also need to track which enzyme generated each cut; hence the use of dictionaries (in Python, I believe) or hashes/arrays, whichever you prefer, to keep the cut position associated with the enzyme. Consecutive sites of the same enzyme are not a problem: two neighbouring Eco cuts simply give an Eco-Eco fragment.
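To make the bookkeeping concrete, here is a minimal Python sketch of the sorted-list idea. The function name `classify_fragments`, the `"end"` label for the two uncut ends of a linear molecule, and the cut positions in the example are all made up for illustration; they are not taken from the posted sequence or from any library.

```python
from collections import Counter

def classify_fragments(cuts_by_enzyme, seq_len):
    """Count fragments of a linear sequence by the enzymes flanking them."""
    # Flatten {enzyme: [positions]} into (position, enzyme) pairs sorted by position.
    cuts = sorted(
        (pos, enz)
        for enz, positions in cuts_by_enzyme.items()
        for pos in positions
    )
    # The hypothetical "end" label marks the uncut ends of the linear molecule.
    boundaries = [(0, "end")] + cuts + [(seq_len, "end")]
    counts = Counter()
    # Each pair of neighbouring boundaries delimits one fragment.
    for (left_pos, left_enz), (right_pos, right_enz) in zip(boundaries, boundaries[1:]):
        if right_pos > left_pos:  # skip zero-length pieces from coincident cuts
            counts[f"{left_enz}-{right_enz}"] += 1
    return counts

# Made-up cut positions, only to show the shape of the input and output.
example = {"Eco": [100, 400], "Taq": [150, 200, 700]}
for kind, n in sorted(classify_fragments(example, 1000).items()):
    print(kind, n)
```

Because the label of each fragment comes from its two neighbouring entries in the sorted list, runs of sites from a single enzyme need no special handling.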

11.8 years ago
ALchEmiXt ★ 1.9k

You can use BioPython; I have used BioPerl for this in the past. Either is good practice and avoids reinventing the wheel.

Alternatively, you could use a regular expression in Perl (or Python) to search for the EcoRI site and store each match position in a hash together with the enzyme name. Do the same for PstI and store those matches in the same hash. Sort the hash by position. Then read out the fragments in order and determine the length of each by subtracting the two adjacent positions.
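A rough Python version of the regular-expression route might look like the sketch below. It assumes a linear sequence, scans only the given strand (fine for palindromic sites such as GAATTC and TCGA), and uses the usual cut offsets G^AATTC and T^CGA. The names `ENZYMES`, `find_cuts` and `fragment_lengths` and the toy sequence are invented for this example.

```python
import re

# Recognition site and cut offset within the site: G^AATTC (EcoRI), T^CGA (TaqI).
ENZYMES = {"Eco": ("GAATTC", 1), "Taq": ("TCGA", 1)}

def find_cuts(seq, enzymes=ENZYMES):
    """Return sorted (cut_position, enzyme) pairs.

    cut_position is the number of bases to the left of the cut.
    """
    cuts = []
    for name, (site, offset) in enzymes.items():
        # A lookahead pattern also reports overlapping occurrences of the site.
        for match in re.finditer(f"(?={site})", seq.upper()):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)

def fragment_lengths(seq, cuts):
    """Subtract neighbouring cut positions, including both sequence ends."""
    positions = [0] + [pos for pos, _ in cuts] + [len(seq)]
    return [right - left for left, right in zip(positions, positions[1:])]

# Toy sequence with two EcoRI sites and one TaqI site (not the posted one).
toy = "AAGAATTCAAAATCGAAAGAATTCAA"
cuts = find_cuts(toy)
print(cuts)                          # [(3, 'Eco'), (13, 'Taq'), (19, 'Eco')]
print(fragment_lengths(toy, cuts))   # [3, 10, 6, 7]
```

The subtraction step is just the difference between each cut position and the one before it, so the lengths always add up to the length of the sequence.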

Have phun!
