Biostar Beta. Not for public use.
List of sequences in a fasta file. I need to generate the reverse complement for each of them.
0
Entering edit mode
12 months ago

Here is what the list of sequences looks like in the file brca.example.illumina.0.1.fasta

>Frag_1 chr17 (Strand + Offset 106709--107175) 467M 101M
AGGCAGGTCTCAAACTCCTGACCTCAGGTGATCCACCCACCTCAAGCCTCCCAAAGTGCT
GGGATTATAGGCATGAGCCACCATGTCCGGCAAGTTTCTTT
>Frag_2 chr17 (Strand + Offset 5449--5912) 464M 101M
ACTCCTGACAAGTGATCCACCTGCCTCGGCCTCCCAAAGTGCTGGGATTACAGACATGAG
CCACCATGCCCAGCCTCCAGCCCATCATTTCTTGATGATTT
>Frag_3 chr17 (Strand + Offset 82312--82850) 539M 101M
TTTGCCATAAAGTGCCTGCCCTCTAGCCTCTACTCTTCCAGTTGCGGCTTATTGCATCAC
AGTAATTGCTGTACGAAGGTCAGAATCGCTACCTATTGTCC
>Frag_4 chr17 (Strand - Offset 1052--1590) 539M 101M
TTGCAGTGAGCCAAGATCATACCACGGCACTCCAGCCTGGGTGACAGTGAGACTGTGGCT
CAACAAAAAAAAAAAAAAAAGGAAAATGAAACTAGAAGAGA

And so on. I would like to generate the reverse complement for each of these, so here is what I have written

for record in SeqIO.parse("brca.example.illumina.0.1.fasta", "fasta"):
    y = record.seq.reverse_complement()

rev_comp2 = open("python_output2.txt", "w")
rev_comp2.write(str(y))

Upon running the script and opening up the text file, it appears to have only generated the reverse complement for the first sequence. Here is the content of the text file

GGCTGGAGTGCAGTGGCATGATCTCGGCTCACTGCAAGCTCTGCCTCCCAGGCTCACACCATTCTCCTGCCTCAGCCTCCCAAGCAGCTGGGACTACAGGC

I thought that including "for record in SeqIO.parse("etc. etc.") would generate the reverse complement for each sequence in the file, but apparently not? What can I write in my script that will generate the reverse complement for each sequence?

biopython python • 247 views
ADD COMMENTlink
0
Entering edit mode
11 months ago
JC 7.9k
Mexico

The problem is you are only writing the last record, you need something like:

rev_comp2 = open("python_output2.txt", "w")

for record in SeqIO.parse("brca.example.illumina.0.1.fasta", "fasta"):
    y = record.seq.reverse_complement()
    rev_comp2.write(str(y))
ADD COMMENTlink
0
Entering edit mode
3 months ago
Joe 12k
United Kingdom

You need to include your write step inside the loop (indented).

You will probably also need to open the file in write+append mode, so that it doesn’t continually overwrite.

E.g.

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

with open('359663.outfile', 'wa') as handle:
    for rec in SeqIO.parse('359663.fasta', 'fasta'):
        handle.write(SeqRecord(id=rec.id, seq=rec.seq.reverse_complement()).format('fasta'))

The with statement is considered more ‘pythonic’ too, as it means you do not have to close() the file when you’re done with it. You could also use the SeqIO.write method here for added safety, but use of SeqRecord allows you to write in fasta format directly.

ADD COMMENTlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.1