multiple alignment of multiple genes to generate a phylogenetic tree
7.1 years ago
qwzhang0601 ▴ 80

After collecting hundreds of single-copy orthologs across tens of species, we want to generate a phylogenetic tree. However, we are not sure what the canonical steps are for aligning those sequences.

(1) Should we concatenate the protein or DNA sequences within each species and then do the multiple alignment? Or should we align each ortholog separately and then concatenate the per-ortholog alignments? (2) Should DNA or protein sequences be used? (3) What is the strategy for manually editing the alignments? Should any region that shows a gap in any species be removed?

Thanks

phylogenetic tree • 3.4k views

What kind of tree are you trying to build? A species tree?

Yes. I want to generate a species tree.

7.1 years ago
Joe 21k

You could concatenate the sequences but I'd be more inclined to calculate alignments and trees separately for each gene and then compute a consensus tree.

See ASTRAL-II, for instance.

If you do go the concatenation route, you'd save yourself some work by aligning each gene separately and then concatenating the alignments, instead of aligning pre-concatenated sequences.
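To make the "align first, concatenate afterwards" route concrete, here is a minimal Python sketch. It assumes each gene has already been aligned on its own and saved as FASTA text with one record per species, using the same species names in every file. The function names (`parse_fasta`, `concatenate_alignments`) and the `gene1`, `gene2` partition labels are made up for illustration. A species missing from a gene is padded with gaps, which is one common convention, not the only option.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the species name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate_alignments(alignments, missing="-"):
    """Join per-gene alignments (list of {species: aligned sequence}) end to end.

    Returns (supermatrix, partitions), where partitions holds 1-based,
    inclusive (label, start, end) column ranges for each gene.
    """
    species = sorted({sp for aln in alignments for sp in aln})
    parts = {sp: [] for sp in species}
    partitions = []
    start = 1
    for i, aln in enumerate(alignments, start=1):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"alignment {i} is empty or has unequal row lengths")
        length = lengths.pop()
        for sp in species:
            # Pad species that lack this gene with the missing-data character.
            parts[sp].append(aln.get(sp, missing * length))
        partitions.append((f"gene{i}", start, start + length - 1))
        start += length
    supermatrix = {sp: "".join(chunks) for sp, chunks in parts.items()}
    return supermatrix, partitions


# Toy example with two tiny "gene" alignments.
gene1 = parse_fasta(">A\nATG-\n>B\nATGC\n")
gene2 = parse_fasta(">A\nCC\n>C\nCT\n")
supermatrix, partitions = concatenate_alignments([gene1, gene2])
for sp, seq in supermatrix.items():
    print(f">{sp}\n{seq}")
```

The partition ranges are kept because many tree-building programs can fit a separate model to each gene if told where each one starts and ends in the concatenated alignment.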

Whether you should use protein or nucleotide sequences depends on the question and the genes involved. If the sequences are potentially quite divergent, use the protein sequences; if they are closely related, use the nucleotide sequences.

Thank you for your suggestions. You mentioned "You could concatenate the sequences but I'd be more inclined to calculate alignments and trees separately for each gene and then compute a consensus tree." Would you please explain how to compute a consensus tree from the trees of the separate genes?

Also, I am using MEGA7. Do you have suggestions on how to generate a phylogenetic tree of tens of species based on hundreds of single-copy orthologs among them?

Thanks

Computing the consensus tree is exactly what ASTRAL does, so check it out. The input is simply the trees for each of the genes.

You should be able to use any program capable of producing trees in a standard format. I know ASTRAL accepts Newick trees. Personally, I tend to trust the trees out of RAxML, but how good your trees are will depend on your sequences and alignments.
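As a rough sketch of the "input is simply the gene trees" step: the usual way to hand gene trees to ASTRAL is a single text file with one Newick tree per line. The Python helper below only assembles that text from Newick strings you have already produced with whatever tree builder you use. The function name `build_astral_input` and the file names are assumptions for illustration, and the toy trees are invented.

```python
def build_astral_input(newick_trees):
    """Return text with one Newick tree per line, each ending in ';'."""
    lines = []
    for tree in newick_trees:
        # Join trees that were wrapped over several lines into one line.
        tree = "".join(part.strip() for part in tree.splitlines())
        if not tree:
            continue  # skip empty entries, e.g. a gene whose tree search failed
        if not tree.endswith(";"):
            tree += ";"
        lines.append(tree)
    return "\n".join(lines) + "\n"


# Toy gene trees for three species (made-up topologies).
gene_trees = ["((A,B),C);", "((A,C),\nB)", ""]
astral_input = build_astral_input(gene_trees)
print(astral_input, end="")

# To use it, write the text to a file and pass that file to ASTRAL, e.g.:
#   with open("gene_trees.tre", "w") as handle:
#       handle.write(astral_input)
#   java -jar astral.jar -i gene_trees.tre -o species_tree.tre
# The jar file name depends on the ASTRAL version you downloaded (assumption here).
```

Species names need to be spelled identically across all gene trees, since that is how the tips are matched up between genes.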
