Multiple sequence alignment (MSA) editing
6.7 years ago
TEman ▴ 10

I want to remove all rare insertions (those that occur in fewer than 5% of the sequences) from a multiple sequence alignment file (Clustal .aln) with 699 sequences.

That is, I have an MSA with many columns in which only one or two sequences contain an insertion, while the rest of the sequences have a gap ("-"). There are far too many to remove manually.

Any suggestions on how to do this?

R alignment clustal • 1.8k views

Do you specifically want to do this in R?

If you use Biopython, you can create an ungapped consensus sequence with a threshold for inclusion of a particular residue in a column.
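
For what it's worth, the column filtering described in the question can also be sketched in plain Python, without any library. This is only a rough sketch and not a Biopython function: the name `drop_rare_insertion_columns`, the `min_fraction` parameter, and the dict-of-strings input are all assumptions made for illustration. It keeps a column only when at least `min_fraction` of the sequences have a residue (non-gap character) in it.

```python
def drop_rare_insertion_columns(seqs, min_fraction=0.05, gap_chars="-."):
    """Return the aligned sequences with sparsely occupied columns removed.

    seqs: dict mapping sequence name -> aligned sequence string
          (hypothetical input layout; all strings must have equal length).
    A column is kept when at least `min_fraction` of the sequences
    have a non-gap character in it.
    """
    if not seqs:
        return {}

    rows = list(seqs.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(rows)
    keep = []
    for col in range(length):
        # Count sequences that have a residue (not a gap) in this column.
        occupied = sum(1 for row in rows if row[col] not in gap_chars)
        if occupied / n_seqs >= min_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Toy alignment: column 3 holds a residue in only 1 of 4 sequences.
toy = {
    "s1": "AC-GT",
    "s2": "AC-GT",
    "s3": "ACTGT",
    "s4": "A--GT",
}
trimmed = drop_rare_insertion_columns(toy, min_fraction=0.5)
for name, seq in trimmed.items():
    print(name, seq)
```

With 699 sequences and the default `min_fraction=0.05`, this drops every column that has a residue in 34 or fewer sequences. Note that the inserted residues themselves are discarded, so the trimmed rows no longer spell out the full original sequences. To build the input dict from a Clustal file with Biopython, something like `{rec.id: str(rec.seq) for rec in AlignIO.read("input.aln", "clustal")}` should work, where `input.aln` is a placeholder file name.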
