ClustalW modifies the input sequence
7.1 years ago
BPors ▴ 60

Hi!

I am trying to calculate the accuracy of my program, which predicts amino acid-amino acid interactions. I use ClustalW to create the multiple sequence alignments, but it changes my input sequence by inserting gaps, so the aligned version is longer than the original. To compare the contact map from my prediction with the one derived from the PDB file, the two matrices have to be the same size. At the moment, the PDB matrix has dimension 70 and my predicted one has dimension 90. Is there a way to stop ClustalW from altering the first sequence, which is my input? This is the code I use to run ClustalW:

from Bio.Align.Applications import ClustalwCommandline

clustalw_cline = ClustalwCommandline("clustalw2", infile=name_to_align, outfile=filename, outorder="ALIGNED")

I also tried changing outorder to 'INPUT', but it made no difference.

Any help would be appreciated, thank you.

clustalw contactmap accuracy coevolution • 1.8k views

Forbidding gaps in your first sequence does not seem reasonable; it makes no sense from a biological or evolutionary perspective. I would let ClustalW (or MAFFT, MUSCLE, or another, better aligner) insert the gaps it considers appropriate, and then remove every alignment column that has a gap in your first sequence. The trimmed alignment then has exactly as many columns as your ungapped input sequence.
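The column-removal step could be sketched as below. This is only a minimal illustration in plain Python, assuming the alignment has already been read into a list of equal-length strings with the query as the first row; the function name `drop_columns_gapped_in_reference` and the toy sequences are made up for the example.

```python
def drop_columns_gapped_in_reference(aligned, ref_index=0, gap_chars="-."):
    """Keep only the alignment columns where the reference row has a residue.

    `aligned` is a list of equal-length aligned sequences (strings).
    The reference row defaults to the first sequence (hypothetical helper).
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")
    reference = aligned[ref_index]
    # Column indices at which the reference is not a gap character.
    keep = [i for i, ch in enumerate(reference) if ch not in gap_chars]
    return ["".join(row[i] for i in keep) for row in aligned]


# Toy alignment (made-up sequences); the first row is the query.
alignment = [
    "MK-TAY--IAK",
    "MKLTAYGGIAK",
    "MR-SAY-GI-K",
]
trimmed = drop_columns_gapped_in_reference(alignment)
for row in trimmed:
    print(row)
```

After trimming, the first row equals the original ungapped query, so column i of the alignment corresponds to residue i of the input sequence. Gaps may remain in the other rows; whether that is acceptable depends on how the prediction method handles them.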


If you don't want gaps, you could try setting the gapopen penalty to a high value. However, I don't really see the point: if one sequence is shorter than the other, an alignment without gaps is likely to be of poor quality. I am not entirely sure what you are trying to do, but since you have structural information, you may want to try a multiple alignment program that takes this information into account, such as 3D-Coffee.
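For what it's worth, raising the gap penalties might look something like the sketch below. It only builds the argument list for the ClustalW command line, using its `-GAPOPEN` and `-GAPEXT` options. The helper name `clustalw_args`, the file names, and the penalty values 100 and 10 are arbitrary assumptions for illustration, not recommended settings, and a high penalty discourages gaps without guaranteeing that none are inserted.

```python
def clustalw_args(infile, outfile, gapopen=100.0, gapext=10.0, exe="clustalw2"):
    """Build a ClustalW argument list with raised gap penalties.

    Hypothetical helper: the penalty values are placeholders, not tuned.
    """
    return [
        exe,
        f"-INFILE={infile}",
        f"-OUTFILE={outfile}",
        f"-GAPOPEN={gapopen}",   # gap opening penalty for the multiple alignment
        f"-GAPEXT={gapext}",     # gap extension penalty for the multiple alignment
        "-OUTORDER=INPUT",       # keep sequences in input order in the output
    ]


args = clustalw_args("sequences.fasta", "sequences.aln")
print(" ".join(args))
# To actually run it (requires clustalw2 on PATH):
#   import subprocess; subprocess.run(args, check=True)
```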
