Outgroup from command line
9.6 years ago
Lee Katz ★ 3.1k

Hi Biostars, is there a command-line way to guess the best outgroup? Command-line tree-builders such as raxml can accept an outgroup as an argument but sometimes it is not something known a priori.

raxml perl phylogeny outgroup command-line • 2.4k views

One thing that could help: I have a pairwise distance file with three tab-separated columns:

genome1  genome2  distance

And I have a Perl script that totals the distance for each genome, although it is probably not optimized:

perl -lane '
  $d{$F[0]} += $F[2]; $d{$F[1]} += $F[2];  # add each pairwise distance to both genomes
  END {
    # pick the genome with the largest total distance
    ($best) = sort { $d{$b} <=> $d{$a} } keys %d;
    print "$best\t$d{$best}";              # print the genome and its total distance
  }
' < pairwise.tsv

However, I think that because of a sampling bias, it is not giving me the correct outgroup.

9.6 years ago
David W 4.9k

For most software the tree search uses unrooted trees, so the outgroup you set has no effect on the search itself; it only roots the tree at the end (section 10 of the RAxML quickstart makes this clear, for instance).

You should just estimate your tree, then use midpoint rooting if it's important that you use rooted trees in downstream analyses.
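
To make the midpoint idea concrete, here is a minimal sketch in plain Python. It assumes the unrooted tree is already in memory as a nested dict of branch lengths (the `adj` layout, the node names and the function names are all made up for illustration, not taken from any library). It finds the two tips that are farthest apart and reports where the halfway point of the path between them falls. For real Newick files a phylogenetics library is probably the better route.

```python
def farthest(adj, start):
    """Return (node, distance, parent map) for the node farthest from start."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        for nbr, length in adj[node].items():
            if nbr not in dist:
                dist[nbr] = dist[node] + length
                parent[nbr] = node
                stack.append(nbr)
    far = max(dist, key=dist.get)
    return far, dist[far], parent


def midpoint(adj):
    """Locate the midpoint of the longest tip-to-tip path.

    Returns (a, b, offset): the root sits on edge a-b, `offset` away
    from a in the direction of b. Assumes a tree with at least two nodes
    and positive branch lengths.
    """
    any_node = next(iter(adj))
    tip1, _, _ = farthest(adj, any_node)         # one end of the longest path
    tip2, diameter, parent = farthest(adj, tip1)  # the other end
    remaining = diameter / 2
    node = tip2
    # walk from tip2 back toward tip1 until half the path length is used up
    while True:
        prev = parent[node]
        length = adj[node][prev]
        if remaining <= length:
            return node, prev, remaining
        remaining -= length
        node = prev


# Hypothetical unrooted tree: ((A:1,B:2)n1:1,(C:1,D:5)n2)
adj = {
    "A": {"n1": 1.0},
    "B": {"n1": 2.0},
    "C": {"n2": 1.0},
    "D": {"n2": 5.0},
    "n1": {"A": 1.0, "B": 2.0, "n2": 1.0},
    "n2": {"C": 1.0, "D": 5.0, "n1": 1.0},
}
result = midpoint(adj)
print(result)
```

In this toy tree the longest path runs between B and D (length 8), so the midpoint lands on the branch leading to D, 1.0 away from n2. Rooting there puts D on one side and everything else on the other, which is the same split you would get by naming D as the outgroup.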


Ok that makes sense, thank you!

