Orthology and Phylogenetic Analysis
4.9 years ago
GiV17 ▴ 50

Hi, I have a question. I have 18,330 FASTA sequences obtained from contig FASTA annotation (using Blast2GO). I would now like to reconstruct the phylogeny and investigate orthology. Has anyone ever done such an analysis? Which tools are good for a large dataset?


Yes, this is a pretty normal thing to do.

Before we advise you though, we need to know more about the data. What organisms are these sequences from? How closely related are they all?

Are you simply interested in generating a phylogeny to describe all the organisms, or do you wish to say specific things about the ortholog groups?


This doesn't really answer the question, as ClustalO would, at best, form only part of an orthology workflow.

EDIT: I have subsequently moved the answer to a comment (for the time being at least).

4.9 years ago
h.mon 35k

As jrj.healey said, yes, this is feasible, but we would need more information about your dataset to provide more detailed answers.

One pipeline for phylogenomics analyses is Agalma. From the paper:

Each phylogenomic study requires many steps, the vast majority of which concern matrix construction rather than phylogenetic analysis itself. These steps include raw data filtering, assembly, identification of ribosomal RNA, selection of transcript splice variants, translation, identification of homologous sequences, identification of orthologous sequences, sequence alignment, phylogenetic analysis, and summary of results.

It is geared towards transcriptomic raw data, but it has facilities to introduce pre-assembled transcripts / genes.
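To make the "identification of orthologous sequences" step a bit more concrete, here is a minimal sketch of reciprocal best hits (RBH), a simple and widely used heuristic for calling putative one-to-one orthologs between two species. This is not how Agalma does it; it is only an illustration. It assumes you have already run BLAST in both directions and saved the results in the standard 12-column tabular format (`-outfmt 6`), where column 1 is the query ID, column 2 is the subject ID, and column 12 is the bit score. The function names are made up for this example.

```python
def best_hits(blast_lines):
    """Map each query ID to its highest-bit-score subject ID.

    Expects lines in BLAST tabular format (-outfmt 6, 12 tab-separated columns).
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep only the top-scoring subject seen so far for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)
```

You would pass two open file handles (species A searched against species B, and the reverse). RBH misses many-to-many relationships and paralogs, so for real work a dedicated orthology tool is usually the better choice; this only shows the basic idea.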


I'm new to this field. To clarify, I have about 18,000 protein sequences obtained from de novo genome analysis. I would like to carry out a phylogenetic analysis: identify orthologous sequences and build a phylogenetic tree. How is this done? Are there tools for this? Can you help me? Thanks.

obtained from de novo genome analysis

This still doesn't really tell us anything. It's very important for phylogenetics that we know what kinds of proteins these are, how many genomes they represent, whether they are all chromosomal, and so on. Do you have genomes, or just genes?
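If it helps to answer these questions, a quick summary of the FASTA file is a reasonable first step. The sketch below uses only the Python standard library; the function names are invented for this example, and the protein check is a crude heuristic (any character outside A, C, G, T, U, N is taken to mean protein), so treat its output as a hint rather than a verdict.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def looks_like_protein(seq):
    """Crude heuristic: any character outside the nucleotide alphabet suggests protein."""
    nucleotide = set("ACGTUN")
    return any(ch not in nucleotide for ch in seq.upper())


def summarize(lines):
    """Count records, protein-like records, and the sequence length range."""
    records = list(read_fasta(lines))
    lengths = [len(seq) for _, seq in records]
    return {
        "n_sequences": len(records),
        "n_protein_like": sum(looks_like_protein(seq) for _, seq in records),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
    }
```

To use it, open your file and pass the handle, for example `summarize(open("proteins.fasta"))`, where `proteins.fasta` is a placeholder name. The headers returned by `read_fasta` are also worth inspecting, since annotation tools often write the source contig or best-hit species there.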


Sorry, but I don't understand. I have different species distribution, and I have several proteins obtained from Blast2GO after scaffold reconstruction. So I would like to perform a phylogenetic and orthology analysis. Is it possible?


Can you tell us what the original sample was? Was it a metagenomic sample, or did it come from one genome?


The original sample represents one genome.


I have different species distribution

Then what does this statement mean?

If I understand this right, you want to do an ortholog/paralog investigation in your sample genome first, followed by a phylogenetic analysis with data from other species?
