How can I simplify a Newick file with 10,000 entries into a tree of only 10 entries?
8.4 years ago

Hi, I wish to pick 10 species from a tree at random and rewrite the Newick file accordingly. I am writing code to do that.

How can I do it? Can someone clarify the logic?

genome newick • 1.9k views
8.4 years ago
jhc ★ 3.0k

If you just want to sample 10 random tips from a large tree, this would be the way to do it using ETE:

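from ete3 import Tree
import random

# Load the full tree from the Newick file
tree = Tree("myNewickFile.nw")
# Pick 10 leaves at random
sample_tips = random.sample(tree.get_leaves(), 10)
# Prune the tree so that only the sampled leaves remain
tree.prune(sample_tips)
# Print the pruned tree as Newick (print is a function in Python 3)
print(tree.write())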
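Since the question also asks about the logic, here is a minimal pure-Python sketch of what pruning to a random sample involves: parse the tree, pick k leaf names, drop every subtree with no sampled leaf, and merge internal nodes left with a single child into that child (summing branch lengths). This is not ETE's implementation. The names `Node`, `parse_newick`, `preorder`, `prune`, `to_newick` and `sample_subtree` are all made up for illustration, and the parser assumes plain Newick with unique leaf names, no quoted labels and no bracketed comments.

```python
import random
import re


class Node:
    """One tree node: a name, an optional branch length, and child nodes."""

    def __init__(self):
        self.name = ""
        self.length = None
        self.children = []


def parse_newick(text):
    """Parse simple Newick (no quoted labels, no [comments]) without recursion."""
    root = Node()
    current, stack = root, []
    for tok in re.findall(r"[(),;]|[^(),;\s]+", text):
        if tok == "(":            # open a clade: descend into its first child
            stack.append(current)
            current = Node()
            stack[-1].children.append(current)
        elif tok == ",":          # next sibling under the same parent
            current = Node()
            stack[-1].children.append(current)
        elif tok == ")":          # close the clade: back to its parent node
            current = stack.pop()
        elif tok == ";":
            break
        else:                     # label: "name", "name:len" or ":len"
            name, _, length = tok.partition(":")
            current.name = name
            current.length = float(length) if length else None
    return root


def preorder(root):
    """List all nodes, parents before their children."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(node.children)
    return order


def prune(root, keep):
    """Restrict the tree to the leaf names in `keep`; returns the new root."""
    result = {}                               # node -> surviving subtree or None
    for node in reversed(preorder(root)):     # children are seen before parents
        if not node.children:                 # a leaf survives only if sampled
            result[node] = node if node.name in keep else None
            continue
        kids = [result[c] for c in node.children if result[c] is not None]
        if len(kids) >= 2:                    # still a real branching point
            node.children = kids
            result[node] = node
        elif len(kids) == 1:                  # unary node: merge into its child
            if node.length is not None and kids[0].length is not None:
                kids[0].length += node.length
            result[node] = kids[0]
        else:                                 # nothing sampled below this node
            result[node] = None
    return result[root]


def to_newick(node):
    """Serialise a small pruned tree; recursion is fine at this size."""
    label = node.name
    if node.length is not None:
        label += f":{node.length:.10g}"
    if node.children:
        return "(" + ",".join(to_newick(c) for c in node.children) + ")" + label
    return label


def sample_subtree(newick, k, seed=None):
    """Keep k randomly chosen leaves of a Newick string."""
    root = parse_newick(newick)
    names = [n.name for n in preorder(root) if not n.children]
    keep = set(random.Random(seed).sample(names, k))
    return to_newick(prune(root, keep)) + ";"
```

For example, pruning `((A:1,B:2):3,(C:4,(D:5,E:6):7):8);` down to the leaves A and D leaves two tips whose branch lengths absorb the removed internal branches. For real data, a tested library such as ETE is the safer choice; the sketch is only meant to show the steps.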

You can edit the Newick tree file using sed, awk, or grep commands. As long as you extract the ten species together with their Newick-defined values, that would seem appropriate.

However, since you want to redo the tree, I would really suggest downloading the sequences for the accessions of interest (your ten), aligning them with a program such as MAFFT, and then building the tree with IQ-TREE. The IQ-TREE manual has nice examples of how to select the best model and settings. Between MAFFT and IQ-TREE, I would also use an alignment viewer to manually check the alignment for flanking regions with poor coverage and trim them to the consensus length of all ten sequences. It sounds more complicated than it is; in practice it is pretty easy.
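A rough sketch of that align-then-infer workflow, driven from Python, might look like the following. The file names, the helper names `build_commands` and `run_pipeline`, and the executable name `iqtree2` (older installs use `iqtree`) are assumptions. `mafft --auto` lets MAFFT choose an alignment strategy and writes the alignment to standard output; `-m MFP` asks IQ-TREE to run model selection before inferring the tree. The manual trimming step is not covered, and `dry_run=True` only prints the commands.

```python
import shlex
import subprocess


def build_commands(fasta, aligned="aligned.fasta", iqtree_bin="iqtree2"):
    """Return the MAFFT and IQ-TREE command lines as argument lists."""
    mafft_cmd = ["mafft", "--auto", fasta]               # alignment goes to stdout
    iqtree_cmd = [iqtree_bin, "-s", aligned, "-m", "MFP"]  # model selection + tree
    return mafft_cmd, iqtree_cmd


def run_pipeline(fasta, aligned="aligned.fasta", iqtree_bin="iqtree2", dry_run=True):
    """Align the sequences, then infer a tree; with dry_run, only print commands."""
    mafft_cmd, iqtree_cmd = build_commands(fasta, aligned, iqtree_bin)
    if dry_run:
        print(shlex.join(mafft_cmd), ">", shlex.quote(aligned))
        print(shlex.join(iqtree_cmd))
        return
    with open(aligned, "w") as out:
        subprocess.run(mafft_cmd, stdout=out, check=True)
    # Check and trim the alignment by hand here before building the tree.
    subprocess.run(iqtree_cmd, check=True)
```

Calling `run_pipeline("ten_species.fasta")` (a hypothetical input file) prints the two commands so they can be reviewed before anything is run.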
