2.4 years ago
@Lars Juhl Jensen504
I am not aware of an easy way to construct reliable species trees based on complete genomes. The general approach that you need to take is to pick one or more genes based on which to base your phylogeny. This could be either 16S rRNA, all ribosomal-protein-coding genes, or other highly conserved genes that are universally present and rarely subject to gene duplications or lateral gene transfer.
Once you have picked the genes, you need to make a multiple sequence alignment(s). You need to do this for each of the genes that you want to use for your phylogeny. For this I would tend to use either muscle or mafft. After that I would use Gblocks to extract the conserved blocks in the alignment(s) in order to not use potentially misaligned parts as the basis for tree building.
If you decided to use multiple genes as the basis for your phylogeny, you now have to make a big decision, namely whether to go for a concatenated alignment approach or a supertree approach. In the first case, you would concatenate all of the multiple alignments and use the resulting big alignment as input for a phylogenetic tree reconstruction program, for example PhyML. In the second case, you would use such a program to make a separate tree for each of the genes of interest, and subsequently use one of several supertree programs to derive a consensus tree based on these. If you went for just using a single gene as the basis for your tree, you obviously just build a tree for that one gene and you are done.
I hope this helps, although it is certainly very far from a "push of a button" solution.