Removing gap using array
6.4 years ago
skjobs1234 ▴ 40
#!/usr/bin/python

import sys
inp=sys.argv[1]
output=sys.argv[2]
count=0
tempcount=0
with open(inp, 'r') as file:
    for word in file:
        if word[0]=='>':
            if count<tempcount:
                count=tempcount
            tempcount=0
        elif set(word)=={'-','\n'}:
            tempcount=tempcount+1
if count<tempcount:
    count=tempcount
with open(inp, 'r') as file2:
    words=file2.read().split('>')[1:]
records=[ ]
skip=count+1
for fa in words:
    fasta=fa.split('\n')
    records=records+['>'+fasta[0]]+fasta[skip:]
records=[line+'\n' for line in records if line]
with open(output, 'w') as out:
    out.writelines(records)

I would like to remove gaps character by character, but this script removes them line by line. My goal is to remove any run of more than 20 gap characters (-) found in the template sequences.

software error alignment sequence • 1.5k views

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.


This happens because 'for word in file' treats the file as a sequence of lines, so the script counts whole lines that consist only of gaps rather than individual gap characters.

Maybe you could try this way:

  1. Remove all the '\n' characters within each FASTA sequence.
  2. Write a regex that matches a run of more than 20 '-', such as -{21,}
  3. Find all the matches and delete them.
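
The three steps above could be sketched roughly as follows. This is only a minimal illustration, not a drop-in replacement for the original script: the function name strip_long_gaps and the constant LONG_GAP are made up for this example, it assumes "more than 20 gaps" means 21 or more consecutive '-', and it treats every record the same way rather than singling out the template sequences.

```python
import re

# Assumption: "more than 20 gaps" means a run of 21 or more consecutive '-'.
LONG_GAP = re.compile(r"-{21,}")


def strip_long_gaps(fasta_text):
    """Return FASTA text with long gap runs removed from every record.

    Hypothetical helper for illustration; each sequence is written
    back on a single line.
    """
    out = []
    for record in fasta_text.split(">")[1:]:
        header, *seq_lines = record.split("\n")
        # Step 1: join the wrapped sequence lines into one string.
        sequence = "".join(line.strip() for line in seq_lines)
        # Steps 2 and 3: find every long gap run and delete it.
        out.append(">" + header + "\n" + LONG_GAP.sub("", sequence) + "\n")
    return "".join(out)
```

Joining the lines first matters because a gap run can be wrapped across two lines, and a per-line check would miss it. Note that deleting gaps from individual sequences changes their lengths, so the result is no longer a column-aligned alignment.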

Could you please modify this script?


I have tried, but I am not getting a solution.


Hello skjobs1234!

It appears that your post has been cross-posted to another site: https://stackoverflow.com/questions/47508097/

This is typically not recommended as it runs the risk of annoying people in both communities.
