dN/dS analysis of genomes
0
0
Entering edit mode
5.9 years ago
Mehmet ▴ 820

Dear All,

I have done several analysis to find dN/dS ratio in six species.

  1. I found one-to-one orthologs (by orthofinder)

  2. I did codon-based alignments of the orthologs that produced .paml format alignments.

  3. I did protein alignment of the orthologs.

Now I would like to have your advice on:

How to concanate all paml files that are used in codeml.?

To be used in codeml (as tree file) do I need to perform phylogenetic of those .paml files? or it is fine to use all.paml files only in codeml?

genome sequence alignment • 4.0k views
ADD COMMENT
1
Entering edit mode

Hi Mehmet,

I think calculating dn/ds per alignment is good. You can create quick trees for the alignment as input of codeml, otherwise you could let the programme do it.

ADD REPLY
0
Entering edit mode

I performed dN/dS analysis using alignments of all orthologs (~5000) together.

ADD REPLY
0
Entering edit mode

Maybe by setting mgene = in the control file?

ADD REPLY
0
Entering edit mode

Please how can I get the alignment for all orthologs.? What steps do I have to follow?

PS: I have already run orthofinder and a lot of files were generated. Please advice.

Thanks

ADD REPLY
0
Entering edit mode

Steps that you need to follow are:

  1. Get CDS (coding sequences, DNA) of each single copy ortholog.
  2. Get aligment of protein sequences of each single copy ortholog. (You already have since you run Orthofinder).
  3. Get aligment of CDS. You can use a tool like Pal2aln to align protein sequnces (already aligned by Orthofinder) to CDS. Please read pal2aln tool documentation.
  4. Use aligned CDS in codeml.
ADD REPLY
0
Entering edit mode

I have performed steps 1 and 2. I also see that the alignment of protein sequences is in the MultipleSequenceAlignments folder.

But I am a bit confused about step 3. If you say I should get alignment of CDS, do you mean the alignment of CDS that were identified as orthologs ? I would be grateful if you can give a bit of details. Thanks

ADD REPLY
0
Entering edit mode

But I am a bit confused about step 3. If you say I should get alignment of CDS, do you mean the alignment of CDS that were identified as orthologs ? I would be grateful if you can give a bit of details.

Yes, you should use the same ortholog (single copy ones) CDS sequences. Please let me know if you need more help.

ADD REPLY
0
Entering edit mode

Hi Sishuo,

I would like to have your suggestion:

I have run codeml with the control file below;

eqfile = all.paml      * sequence data filename
      outfile = mlc           * main result file name

      runmode = 2  * 0: user tree;  1: semi-automatic;  2: automatic
                   * 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise

      seqtype = 1  * 1:codons; 2:AAs; 3:codons-->AAs
    CodonFreq = 2  * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table

        ndata = 5398 * number of gene alignments to be analysed
        clock = 0  * 0:no clock, 1:clock; 2:local clock; 3:CombinedAnalysis

        model = 0  * models for codons: 0:one, 1:b, 2:2 or more dN/dS ratios for branches

      NSsites = 0  * 0:one w;1:neutral;2:selection; 3:discrete;4:freqs;
                   * 5:gamma;6:2gamma;7:beta;8:beta&w;9:betaγ
                   * 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1;
                   * 13:3normal>0

        icode = 0  * 0:universal code; 1:mammalian mt; 2-10:see below

    fix_omega = 0  * 1: omega or omega_1 fixed, 0: estimate 
        omega = .4 * initial or fixed omega, for codons or codon-based AAs

    cleandata = 0  * remove sites with ambiguity data (1:yes, 0:no)?

* Genetic codes: 0:universal, 1:mammalian mt., 2:yeast mt., 3:mold mt.,
* 4: invertebrate mt., 5: ciliate nuclear, 6: echinoderm mt., 
* 7: euplotid mt., 8: alternative yeast nu. 9: ascidian mt., 
* 10: blepharisma nu.
* These codes correspond to transl_table 1 to 11 of GENEBANK.

What I need is how I can get dN/dS ratio

ADD REPLY
0
Entering edit mode

When I typed grep omega mlc > omega.txt

omega.txt file has nothing.

ADD REPLY
0
Entering edit mode

Hi Mehmet,

Sorry, not sure.

Have you got anything in the mlc file? Or was the analysis disrupted?

By the way, there was a typo in "eqfile = all.paml".

ADD REPLY
0
Entering edit mode

the analysis did not disrupt. Yes, I have star tree and best tree values for each gene. But I did not get omega values. Is it fine to use a best tree that has highest likelihood value as a tree file in next codeml run?

ADD REPLY
0
Entering edit mode

I think that should work.

ADD REPLY

Login before adding your answer.

Traffic: 2539 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6