Hello everybody,
I have assembled the mitochondrial genomes of 8 sheep individuals and am now trying to construct a phylogenetic tree.
I have also downloaded 2 representatives of each haplogroup from NCBI (so I have 18 mitochondrial genomes in total).
I did the multiple alignment online with Clustal Omega and used RAxML to generate 500 ML trees.
I did bootstrapping (number of bootstrap replicates: 2000), but I'm not very confident in the bootstrap values.
The topology seems OK: all my samples belong to haplogroup B, which is obvious from the tree, but the bootstrap values between them are low (50, 48, even lower).
The bootstrap values on the major branches (the branches that separate the haplogroups) are very good (100, 98, etc.). The problem seems to be in the small branches among my individuals.
I tried building a tree with only my individuals, but again some values are very good (around 100) and some are weak.
It seems I'm doing something wrong in my methodology, but I cannot work out what it is.
Could you help me with the steps that I have to follow?
Thank you very much in advance,
Vasilis.
Hey,
The first thing I'd like to know is what you are aligning from these mitochondria. Is it the whole mitochondrial genomes, only some of the protein-coding genes, or something else?
The alignment step is the one that will have the biggest impact on your ML-generated tree, so I would suggest some manual QC of the alignment (and maybe trying some other alignment programs as well) before jumping to downstream processing with RAxML.
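As a rough idea of what a quick alignment check could look like, here is a minimal Python sketch. It assumes the alignment is available as aligned FASTA text; the function names `read_fasta` and `alignment_summary` and the toy sequences are made up for illustration. It counts how many columns are variable, how many are parsimony-informative, and how many contain gaps or ambiguous bases.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a dict {name: sequence}."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line.upper())
    return {n: "".join(parts) for n, parts in seqs.items()}


def alignment_summary(seqs):
    """Count variable, parsimony-informative and gapped/ambiguous columns."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; this is not an alignment")
    n_cols = lengths.pop()
    variable = informative = gapped = 0
    for i in range(n_cols):
        column = [s[i] for s in seqs.values()]
        # Anything that is not A, C, G or T is treated as a gap/ambiguity.
        if any(c not in "ACGT" for c in column):
            gapped += 1
        counts = Counter(c for c in column if c in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two bases, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return {
        "length": n_cols,
        "variable": variable,
        "informative": informative,
        "gapped_or_ambiguous": gapped,
    }


# Toy alignment, invented purely for illustration.
toy = """>s1
ACGTAC-T
>s2
ACGTACGT
>s3
ATGAACGT
>s4
ATGTACGT
"""
print(alignment_summary(read_fasta(toy)))
```

Running this on only your own 8 samples would show how many informative sites there actually are among them, which is worth knowing before reading too much into the bootstrap values on those branches.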
Yes, I tried with the whole mitochondrial genomes.
I did the alignments with ClustalW and MUSCLE and the results were pretty much the same. I used MEGA for this and manually inserted one or two single-base gaps to fix them. The alignment actually seems pretty good.
What do you mean by manual QC?
Do you think I should restrict it to smaller regions?
Thank you very much
You could check whether using only the coding regions gives you better resolution between samples, since those are more conserved and any variation in them is going to be more significant than if you compare them along with the intergenic regions.
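One possible way to cut the coding regions out of an existing whole-genome alignment is sketched below in Python. It assumes you have the gene coordinates on one annotated reference sequence that is part of the alignment (for example from its GenBank record); the names `ref`, `s1`, the function names and the coordinates are placeholders, not real sheep annotations. Reference positions are first mapped to alignment columns so that gaps in the reference do not shift the slices.

```python
def ungapped_to_column(ref_aligned):
    """Map 1-based ungapped reference positions to 0-based alignment columns."""
    mapping = {}
    pos = 0
    for col, base in enumerate(ref_aligned):
        if base != "-":
            pos += 1
            mapping[pos] = col
    return mapping


def extract_regions(seqs, ref_name, regions):
    """Concatenate the alignment columns covered by each (start, end) region.

    `regions` holds 1-based, inclusive coordinates on the ungapped reference.
    """
    pos_to_col = ungapped_to_column(seqs[ref_name])
    pieces = {name: [] for name in seqs}
    for start, end in regions:
        first, last = pos_to_col[start], pos_to_col[end]
        for name, seq in seqs.items():
            pieces[name].append(seq[first:last + 1])
    return {name: "".join(parts) for name, parts in pieces.items()}


# Toy alignment and made-up coordinates, for illustration only.
alignment = {
    "ref": "AC-GTTAGC",
    "s1": "ACAGTTAGT",
}
coding = extract_regions(alignment, "ref", [(2, 4), (7, 8)])
for name, seq in coding.items():
    print(">" + name)
    print(seq)
```

The output is still an alignment (all rows keep the same length), so it can be written to a file and passed to the tree-building step in place of the whole-genome alignment.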