Looking for an aligned multi-FASTA file to practice building a phylogenetic tree
9.4 years ago
l.roca ▴ 10

Dear All,

I am looking for an aligned multi-FASTA file to practice building a phylogenetic tree. Do you have any idea where I can find a file like that? It does not need to contain more than 10 species, and I would prefer data related to plants (e.g., rbcL).

Thanks

phylogenetic • 4.3k views
9.4 years ago
Siva ★ 1.9k

You can try TreeBASE, a repository of phylogenetic trees (12,817 trees from 104,593 distinct taxa) and the corresponding multiple sequence alignments (8,233 alignments) from publications. TreeBASE calls a multiple alignment file a "matrix". You can do a taxon search (e.g. Arabidopsis thaliana or NCBI taxonomy ID 3702) to find plant-related alignments. You mentioned that you want the alignments in FASTA format, but the website provides them only in NEXUS format. To use data from TreeBASE, you can convert the alignments from NEXUS to FASTA with readseq, which is available on the phylogeny.fr website or can be downloaded and installed locally.
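
If you would rather script the NEXUS-to-FASTA step than use readseq, here is a minimal sketch in plain Python. It is not how readseq works internally. It assumes a simple, non-interleaved NEXUS file with one taxon per line and no quoted taxon names containing spaces; real TreeBASE downloads may not meet those assumptions. The function name `nexus_matrix_to_fasta` and the example taxa are made up for illustration.

```python
import re


def nexus_matrix_to_fasta(nexus_text):
    """Convert the MATRIX block of a simple, non-interleaved NEXUS file to FASTA text.

    Assumptions (hypothetical, for illustration): one taxon per line,
    taxon names without spaces, a single MATRIX block.
    """
    # Drop NEXUS comments, which are enclosed in square brackets.
    text = re.sub(r"\[[^\]]*\]", "", nexus_text)
    # Take everything between the MATRIX keyword and the next semicolon.
    match = re.search(r"\bmatrix\b(.*?);", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no MATRIX block found")

    records = []
    for line in match.group(1).splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or incomplete lines
        # First token is the taxon name; the rest is sequence, possibly space-separated.
        records.append((parts[0], "".join(parts[1:])))

    # An alignment must have rows of equal length.
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("rows differ in length; the file may be interleaved")

    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Tiny made-up example, not real TreeBASE data.
example = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=3 NCHAR=8;
  FORMAT DATATYPE=DNA MISSING=? GAP=-;
  MATRIX
    taxon_A  ATGTCACC
    taxon_B  ATGTC-CC
    taxon_C  ATG-CACC
  ;
END;
"""

fasta_text = nexus_matrix_to_fasta(example)
print(fasta_text)
```

For interleaved files or quoted taxon names, a proper parser is the safer choice. If Biopython is installed, `Bio.AlignIO.convert("in.nex", "nexus", "out.fasta", "fasta")` should do the same job (the file names here are placeholders).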

9.4 years ago
Michael 54k

You can make such files easily with one of the many MSA online apps: https://www.ebi.ac.uk/Tools/msa/

Simply choose sequences of homologs from different species of interest and try the different tools. That way you can also compare the effect of different MSA algorithms and parameters on the resulting phylogenies.
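
One rough way to compare the output of different tools is to check that each downloaded file really is an alignment (all rows the same length) and to look at a couple of summary numbers. The sketch below does that with plain Python; the function names `read_fasta` and `summarize_alignment` and the demo sequences are made up for illustration, and gap fraction is only a crude indicator, not a measure of alignment quality.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping header -> sequence."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)  # sequences may be wrapped over several lines
    return {header: "".join(parts) for header, parts in chunks.items()}


def summarize_alignment(records):
    """Return sequence count, alignment length and gap fraction for aligned records."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length, so this is not an alignment")
    length = lengths.pop()
    gaps = sum(seq.count("-") for seq in records.values())
    return {
        "n_sequences": len(records),
        "length": length,
        "gap_fraction": gaps / (len(records) * length),
    }


# Made-up demo alignment; replace with the text of a file saved from an MSA tool.
demo = ">sp1\nATG-CA\n>sp2\nATGTCA\n>sp3\nA--TCA\n"
summary = summarize_alignment(read_fasta(demo))
print(summary)
```

Running this on the files produced by two different aligners from the same input sequences gives a quick first impression of how much they differ before you build trees from them.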

Another quick way is to use TreeFam, which saves you some work because you do not need to pick homologues yourself. Submit a single sequence of interest or press the Example button; to insert the sequence into the tree, TreeFam calculates a MAFFT alignment, which you can also download.

9.4 years ago
Brice Sarver ★ 3.8k

Datasets, including multiple sequence alignments, from papers with a phylogenetic component are frequently posted on Dryad. Alternatively, the source code for just about every program contains an example folder with a trial dataset or two.
