I have a number of aligned sequences and their corresponding CIGAR strings, and I want to reconstruct the alignment from them. Basically, I want to do what MATLAB's cigar2align does:
http://www.mathworks.co.jp/help/toolbox/bioinfo/ref/cigar2align.html
only I do not have access to MATLAB.
Do you guys have any suggestions on how to accomplish this? Programs, packages etc.
A Perl script would be perfect, but writing one is beyond my capabilities.
Using a multiple alignment tool like MAFFT is not doable due to the nature of my sequences.
This works for me (with a little modification), but more complex CIGAR strings (containing multiple insertions and deletions) seem to cause it problems.
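For CIGAR strings with several insertions and deletions, one way to keep everything lined up is to walk each CIGAR in reference coordinates and reserve a gap column wherever any read has an insertion. Below is a minimal Python sketch of that idea, not a port of the MATLAB function. The names `parse_cigar`, `walk` and `cigar2align` are made up for illustration, start positions are assumed to be 0-based, and the operation semantics (M/=/X consume read and reference, I and S consume the read only, D and N consume the reference only, H and P consume neither) are assumed to follow the SAM convention.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string such as '3M2I1M' into [(3, 'M'), (2, 'I'), (1, 'M')]."""
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

def walk(seq, cigar, start=0):
    """Place one read on reference coordinates.

    Returns (cols, ins, end): cols maps reference position -> base or '-',
    ins maps reference position -> bases inserted just before that position,
    end is the reference position one past the last aligned base.
    """
    cols, ins = {}, {}
    q, r = 0, start                      # q: index in the read, r: index on the reference
    for n, op in parse_cigar(cigar):
        if op in "M=X":                  # consumes read and reference
            for k in range(n):
                cols[r + k] = seq[q + k]
            q += n
            r += n
        elif op == "I":                  # consumes read only
            ins[r] = ins.get(r, "") + seq[q:q + n]
            q += n
        elif op in "DN":                 # consumes reference only
            for k in range(n):
                cols[r + k] = "-"
            r += n
        elif op == "S":                  # soft clip: bases present in the read, not aligned
            q += n
        # H and P consume neither the read nor the reference
    if q != len(seq):
        raise ValueError("CIGAR string does not match the sequence length")
    return cols, ins, r

def cigar2align(seqs, cigars, starts=None):
    """Return one gapped string per read; all strings have the same length."""
    if starts is None:
        starts = [0] * len(seqs)
    walked = [walk(s, c, st) for s, c, st in zip(seqs, cigars, starts)]
    lo = min(starts)
    hi = max(end for _, _, end in walked)
    # Widest insertion seen before each reference position, across all reads.
    width = {}
    for _, ins, _ in walked:
        for pos, bases in ins.items():
            width[pos] = max(width.get(pos, 0), len(bases))
    rows = []
    for cols, ins, _ in walked:
        row = []
        for pos in range(lo, hi + 1):    # hi itself can only carry a trailing insertion
            w = width.get(pos, 0)
            if w:
                row.append(ins.get(pos, "").ljust(w, "-"))
            if pos < hi:
                row.append(cols.get(pos, "-"))
        rows.append("".join(row))
    return rows

for row in cigar2align(["ACGTACGT", "ACGGGTCGT"], ["8M", "3M2I1M1D3M"]):
    print(row)
# ACG--TACGT
# ACGGGT-CGT
```

Positions a read does not cover are padded with '-' here; if you need to tell unaligned flanks apart from real deletions, pad with a different character instead.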
I had a similar question about percent identity. In the end, for small numbers of sequences, I found it easier to recalculate the alignments with Smith-Waterman in R.
If you output the alignments instead of calculating the percent identity, that should work. I'm not saying it can't be done from the CIGAR strings; it was just simpler for me to code it that way.
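To illustrate the re-alignment route, here is a minimal Smith-Waterman sketch. It is written in Python rather than R, uses a simple linear gap penalty, and the scoring values and the names `smith_waterman` and `percent_identity` are arbitrary choices for the example, not taken from any particular package. It returns the two gapped strings of the best local alignment, which you can print directly or feed into an identity calculation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b; returns (gapped_a, gapped_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]      # scoring matrix, first row/column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,       # gap in b
                          H[i][j - 1] + gap)       # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), best

def percent_identity(gapped_a, gapped_b):
    """Identical columns as a percentage of all alignment columns."""
    if not gapped_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(gapped_a, gapped_b))
    return 100.0 * same / len(gapped_a)

aln_a, aln_b, score = smith_waterman("TTACGTTT", "GGACGTGG")
print(aln_a, aln_b, score, percent_identity(aln_a, aln_b))
```

This fills a full len(a) x len(b) matrix, so it is only practical for the small numbers of short sequences mentioned above.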