How Does Codeml Assign Internal Branch Numbers To Phylogenetic Trees?
13.1 years ago
John ▴ 790

Hi,

I'm struggling to understand how CodeML assigns internal branch numbers to phylogenetic trees. The tree below is shown in a tree viewing program (FigTree). The order of the tips is taken from the CodeML output (bottom image in the collage), and the branches are listed on the left-hand side. I have not pasted the code I used to make this tree, as it is just a random example. I've read through the CodeML docs (http://abacus.gene.ucl.ac.uk/software/paml.html), but I cannot figure out how the program numbers internal nodes (that is, how I can tell which branch of the tree is, e.g., branch 20..21). I thought this would be on p15-17 of the manual, but I don't think it is, and I can't find it elsewhere. I've tried a few labelling schemes, but none of them matches when I try to identify the listed branches.

I hope somebody can guide me!

Thanks, John


codeml paml phylogenetics tree • 8.8k views
13.1 years ago

Hi John,

  1. Save the tree that appears near the beginning of the rst file, under the heading:

    tree with node labels for Rod Page's TreeView

  2. Open it in FigTree, which will ask you to name these node labels (e.g. ancestral_nodes).

  3. Tick Node Labels. Then, in its Display menu, select ancestral_nodes.

  4. You will see both node and tip labels, so a branch such as 20..21 will be self-evident: it joins the nodes labelled 20 and 21.
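If you would rather not copy the line by hand, step 1 can be sketched in Python. This is only a rough sketch: `extract_labeled_tree` is a made-up helper name, the sample text is invented for illustration (it is not real codeml output), and the code assumes that the tree is the first line starting with "(" after the heading.

```python
HEADING = "tree with node labels for Rod Page's TreeView"

def extract_labeled_tree(rst_text):
    """Return the first Newick-looking line after the TreeView heading, or None."""
    seen_heading = False
    for line in rst_text.splitlines():
        if HEADING in line:
            seen_heading = True
        elif seen_heading and line.strip().startswith("("):
            # Assumption: the labelled tree sits on a single line.
            return line.strip()
    return None

# Invented stand-in for the top of an rst file (not real codeml output).
sample_rst = (
    "some header text\n"
    "\n"
    "tree with node labels for Rod Page's TreeView\n"
    "(5_chicken, ((1_mouse, 4_horse) 8 , (2_human, 3_chimp) 9 ) 7 ) 6 ;\n"
)

labeled_tree = extract_labeled_tree(sample_rst)
print(labeled_tree)
```

For a real run you would read the text of the rst file from the codeml working directory, pass it to the function, and write the returned line to a new file for FigTree.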

I hope this helps


Thanks Carlos, but I don't understand your first instruction. What is the rst file (the reStructuredText markup language? I guess not)? And in which program should I open that file in order to save it as a tree with node labels?


The rst file is one of the output files of Codeml (you will find it in the folder where you ran Codeml, once the analysis has finished). If you are interested in the tables, you can import it into Excel as a tab-delimited file; otherwise just open it in a text editor to do what Carlos said.

Cheers, Kartik


Hi Carlos. I was wondering if the labels are printed out in the rst file for every model in Codeml or only when we change the molecular clock? Because it never prints out the node-label tree in the rst file or in the main output file for me. I don't know if it has something to do with the models being used.

12.8 years ago
User 1940 ▴ 80
  • Codeml first numbers the sequences in the order in which they appear in the multiple alignment, starting at 1. So if you have the following alignment file (.aa):

    mouse SDASDASDASD
    human WEFNWEPFWNF
    chimp ASDADAAFAF
    horse WRNWEPRWWR
    chicken QEQRTWEGGF

Then codeml would assign node number 1 to mouse, number 2 to human, number 3 to chimp, number 4 to horse, and finally number 5 to chicken.

  • Codeml then numbers the internal nodes, continuing from where it stopped in the previous step, and begins with the MOST ancestral node, at the root of the tree.

  • So the most ancestral node will get the number 6!

  • Then it continues numbering the internal (or ancestral) nodes in increasing order while maintaining the ancestry relationships between the nodes (or species), so that an ancestral node is numbered before its descendants.

  • But how does Codeml know which node is more or less ancestral than another node? Well, it keeps track of this information by writing the tree so that the most ancestral lineage (and the species it includes) appears leftmost, at the beginning of the line that contains the tree (see note below), and then proceeds to the right in decreasing order of ancestry.

  • So let us assume that, out of the five species we have here, chicken is the most ancestral species; mouse and horse are related and follow the chicken; lastly, human and chimp are related and are the least ancestral, or most recent. Then the tree codeml reports would look something like this:

(5_chicken, ((1_mouse, 4_horse) 8 ,(2_human, 3_chimp) 9 ) 7) 6;

  • One thing you should notice immediately about this representation is that the label of the most ancestral node, which is 6 in this example, appears rightmost, whereas the species are ordered by ancestry from the left, as mentioned above. Be careful to take this into account when interpreting the tree file.
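The numbering described above can be sketched in a few lines of Python. This only illustrates the scheme as described in this answer; it is not codeml's actual source. `parse_newick` and `number_nodes` are made-up helper names, and the parser assumes a topology-only Newick string (no branch lengths, no node labels).

```python
def parse_newick(newick):
    """Parse a topology-only Newick string into nested lists of tip names."""
    s = "".join(newick.split()).rstrip(";")  # drop whitespace and the final ";"
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while s[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return children
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]               # a tip name

    return node()

def number_nodes(tree, tip_order):
    """Return (parent, child) branches; tips numbered by alignment order,
    internal nodes numbered root-first, continuing after the last tip."""
    tip_number = {name: i + 1 for i, name in enumerate(tip_order)}
    next_internal = len(tip_order) + 1
    branches = []

    def visit(node, parent):
        nonlocal next_internal
        if isinstance(node, str):
            number = tip_number[node]
        else:
            number = next_internal        # parent is numbered before its children
            next_internal += 1
        if parent is not None:
            branches.append((parent, number))
        if not isinstance(node, str):
            for child in node:
                visit(child, number)

    visit(tree, None)
    return branches

alignment_order = ["mouse", "human", "chimp", "horse", "chicken"]
tree = parse_newick("(chicken, ((mouse, horse), (human, chimp)));")
branches = number_nodes(tree, alignment_order)
print(" ".join(f"{p}..{c}" for p, c in branches))
```

For the five-species example this gives the branches 6..5, 6..7, 7..8, 8..1, 8..4, 7..9, 9..2 and 9..3, which agrees with the labelled tree shown above.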

Note: When you run codeml and ask for the ancestral reconstruction (RateAncestor = 1 in the control file), it will produce five output files, one of which is called 'rst'. This file holds the ancestral reconstruction.

At line number 15 (in my runs, always 15), you will find the TreeView representation of your tree with the paml labels included; if it is not there, look for the heading "tree with node labels for Rod Page's TreeView". To view this tree, I suggest you use TreeView, which you can get for free at http://taxonomy.zoology.gla.ac.uk/rod/treeview.html

Copy that one-line tree from line number 15 into a new text file, name it mytree.trees, and save it.

Now open this file in TreeView. Then go to the 'Trees' tab and choose 'Show internal node labels'. That's it.
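If you only want to know which clade an internal node number refers to, without opening a viewer, the labelled tree can also be read with a short Python sketch. `clades_by_label` is a made-up helper name, and the code assumes the tree looks like the example above: tip labels of the form number_name, an internal label written right after each ")", and no branch lengths.

```python
import re

def clades_by_label(newick):
    """Map each internal node label to the tip labels found beneath it."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", newick)
    stack = []          # one list of tip labels per currently open "("
    clades = {}
    last_closed = None  # tips of the clade whose ")" was just read
    for tok in tokens:
        if tok == "(":
            stack.append([])
            last_closed = None
        elif tok == ")":
            last_closed = stack.pop()
            if stack:
                stack[-1].extend(last_closed)  # tips also belong to the parent
        elif tok in ",;":
            last_closed = None
        elif last_closed is not None:
            clades[tok] = last_closed          # label written right after ")"
            last_closed = None
        else:
            stack[-1].append(tok)              # a tip label
    return clades

clades = clades_by_label(
    "(5_chicken, ((1_mouse, 4_horse) 8 ,(2_human, 3_chimp) 9 ) 7) 6;"
)
for label, tips in clades.items():
    print(label, tips)
```

With this mapping, a branch such as 7..8 is the branch leading from node 7 to the ancestor of 1_mouse and 4_horse.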

13.1 years ago

Labelling nodes should be straightforward, although I've never seen an easy way to label branches (I'd love to, so I'll follow this post in case anyone suggests a viable alternative). Have you tried other tree visualization tools? I have worked around this issue using Archaeopteryx, which accepts the phyloXML format; phyloXML lets you define "confidence" values, which appear as branch labels. The "confidence" value has to be a double, but if your labels can be converted to doubles (i.e. 20..21 to 20.21), then it is easy to show them on the branches. The tree nodes can be labelled with the "name" element of each "clade".
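A rough Python sketch of that workaround, using only the standard library: `branch_to_confidence` and `clade` are made-up helper names, the type value "codeml_branch" is an arbitrary placeholder, and the output is a bare clade fragment rather than a complete phyloXML document (I have not checked whether Archaeopteryx accepts it as is).

```python
import xml.etree.ElementTree as ET

def branch_to_confidence(branch):
    """Turn a codeml branch label such as '20..21' into the double 20.21."""
    parent, child = branch.split("..")
    return float(f"{int(parent)}.{int(child)}")

def clade(name=None, branch=None, children=()):
    """Build a phyloXML-style <clade> element (illustrative fragment only)."""
    element = ET.Element("clade")
    if name is not None:
        ET.SubElement(element, "name").text = name
    if branch is not None:
        # Assumption: any type string will do for display purposes.
        conf = ET.SubElement(element, "confidence", type="codeml_branch")
        conf.text = str(branch_to_confidence(branch))
    element.extend(children)
    return element

# Node 8 with its two tips, from the five-species example.
node8 = clade(
    name="8",
    branch="7..8",
    children=[
        clade(name="1_mouse", branch="8..1"),
        clade(name="4_horse", branch="8..4"),
    ],
)
xml_text = ET.tostring(node8, encoding="unicode")
print(xml_text)
```

Note that the conversion is lossy: 20..5 and 20..50 both become 20.5, so it only works when no two branch labels collide in this way.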


Thanks Jorge. My question is more about the internal representation of the nodes on the tree. Which branch is 20..21, for example? How are internal node numbers assigned? I don't actually need a method to write them onto the tree, although that would be useful too.


I see. Your question is more about understanding the information you already have than about presenting it in a different way. In that case I'd wait for someone more experienced to shed some light on this. Good luck.
