Hi all,
I'm dealing with a protein that has both ordered and disordered regions (most of it is disordered). I need to build a multiple sequence alignment of this protein and 40 of its orthologs with very high accuracy (I don't care much about speed). I want both ordered and disordered regions to be aligned properly, so I do not want to depend on substitution models. What would you recommend? MAFFT? MUSCLE?
Thanks!
As an aside, is anyone aware of a substitution model for intrinsically disordered proteins? The only one I know of can be found here.
Hi Whetting,
By not depending on substitution models, I mean not having to specify a predefined model, and using, for instance, an HMM instead. In this case (where I have both ordered and disordered regions), I would feel more confident with that than with BLOSUM matrices or similar. Is this notion right?
In such a case, do you think ProbCons is a good option?
It seems to me that you may want to try a couple of algorithms. It is usually good practice to determine empirically which alignment method works best for your data. T-Coffee, ProbCons, and the others all have good and bad tendencies.
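One rough way to make the "try a couple of algorithms" advice concrete is to measure how much two alignments of the same sequences agree. The sketch below is only an illustration: the function names (`read_fasta`, `aligned_pairs`, `pair_agreement`) are made up for this example, the input is assumed to be aligned FASTA with `-` as the gap character, and the score is a simple Jaccard overlap of aligned residue pairs, not a standard benchmark metric. Low agreement does not say which alignment is right, only where the methods disagree and a closer look is needed.

```python
from itertools import combinations


def read_fasta(text):
    """Parse aligned FASTA text into {name: aligned_sequence}."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def aligned_pairs(msa):
    """Return the set of residue pairs that share an alignment column.

    Each pair is (name_a, index_a, name_b, index_b); the indices count
    residues in the ungapped sequences, so pairs from two different
    alignments of the same sequences are directly comparable.
    """
    names = sorted(msa)
    length = len(msa[names[0]])
    counters = {n: 0 for n in names}
    pairs = set()
    for col in range(length):
        present = []
        for n in names:
            if msa[n][col] != "-":
                present.append((n, counters[n]))
                counters[n] += 1
        for (a, i), (b, j) in combinations(present, 2):
            pairs.add((a, i, b, j))
    return pairs


def pair_agreement(msa_x, msa_y):
    """Jaccard overlap of the aligned residue pairs of two alignments."""
    px, py = aligned_pairs(msa_x), aligned_pairs(msa_y)
    union = px | py
    return len(px & py) / len(union) if union else 1.0


# Toy input: the same two sequences aligned in two slightly different ways.
toy_x = ">s1\nAC-GT\n>s2\nACAGT\n"
toy_y = ">s1\nACG-T\n>s2\nACAGT\n"
print(pair_agreement(read_fasta(toy_x), read_fasta(toy_y)))  # 0.6
```

In practice the two inputs would be the outputs of, say, MAFFT and ProbCons on the same 41 sequences, and the same pair sets could be split by region to see whether the disagreement is concentrated in the disordered part.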
It's also worth noting that, half of the time, we don't REALLY have a great idea yet of what even constitutes good performance from an alignment algorithm when it comes to indels, for instance: are the algorithms that produce more "gappy" alignments in loop regions better or worse? People tend not to like gappy alignments, but the evidence suggests they are probably modelling biological reality better in lots of ways. I would guess this would be a big concern for the disordered regions.
Certainly, it is a big concern...
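For what it's worth, the "gappy" question raised above can at least be quantified per region, so that different aligners can be compared on the disordered stretch specifically. This is a minimal sketch, not a quality measure: the helper names are invented for the example, `-` is assumed to be the gap character, and the column range standing in for the disordered region is hypothetical (in practice it would come from an annotation or a disorder predictor).

```python
def column_gap_fractions(rows):
    """Fraction of gap characters in each column of an alignment.

    `rows` is a list of aligned sequences of equal length.
    """
    rows = list(rows)
    if not rows:
        return []
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all aligned sequences must have the same length")
    return [sum(r[c] == "-" for r in rows) / len(rows) for c in range(length)]


def region_gappiness(rows, start, end):
    """Mean gap fraction over alignment columns [start, end)."""
    fracs = column_gap_fractions(rows)[start:end]
    return sum(fracs) / len(fracs) if fracs else 0.0


# Toy alignment; columns 2-4 play the role of a (hypothetical) disordered loop.
toy_rows = ["MKT--AL", "MKTPQAL", "MK-P-AL"]
print(column_gap_fractions(toy_rows))
print(region_gappiness(toy_rows, 2, 5))
```

Running this on the output of each aligner, with the same region boundaries mapped onto each alignment, shows how differently the methods treat the disordered part; whether more or fewer gaps is "better" remains the open question discussed above.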