How can I calculate the C:N ratio (or just number of carbons and nitrogens) of each amino acid sequence in a multifasta file?
14 months ago
kieft1bp • 0
United States

I have a multifasta file of amino acid sequences, around 1000 seqs total, like so:

  • > seq_id_1
  • MAWT........
  • > seq_id_2
  • MTRA.......
  • ....
  • > seq_id_1000
  • MIVE.......

I want to calculate the molar C:N ratio (the total number of carbon atoms in each sequence divided by the total number of nitrogen atoms in that sequence) for every seq ID and write the results to a TSV file, like so:

  • seq_id_1 \t 1.5
  • seq_id_2 \t 0.9
  • ...
  • seq_id_1000 \t 1.1

This C:N ratio is derived from the number of carbon and nitrogen atoms in each amino acid residue (e.g., there are 5 Cs and 1 N in methionine) and the number of times each amino acid occurs in the protein sequence. Is there a tool available that can do this, or do I have to write my own? I am fine with using a web server, a pre-written suite that runs on Unix (macOS, Linux), or a custom script from someone (Python, Perl, Ruby). Thanks!

Using awk:

  awk '/^>/ {if(S>0) {print N==0?"NA":C/N;} C=0;N=0;S++;printf("%s\t",$0);next;} {t=$0; gsub(/[^Cc]/,"",t);C+=length(t);t=$0;gsub(/[^Nn]/,"",t);N+=length(t);} END{print N==0?"NA":C/N;}' in.fasta

Thanks for the answer, Pierre, but the problem is a little more complicated than counting the instances of a string in each line. I've updated my question. My fasta sequences are just amino acids (with no information about carbon or nitrogen content), so what I actually need to do is reference a separate table that contains the number of carbon and nitrogen atoms per amino acid in order to calculate the C:N ratio for each sequence.

There are 20 amino acids, so it's fairly easy to build that table from the chemical formulas on Wikipedia, read it into a dictionary/hash, loop over your sequences, add up the Cs and Ns, and compute the ratio. That doesn't seem very complicated, or am I missing something?
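
A minimal Python sketch of that approach, in case it helps. The C and N counts are taken from the standard molecular formulas of the 20 amino acids; they carry over unchanged to residues, since forming a peptide bond only removes water. The names `ATOMS`, `read_fasta`, `cn_counts` and `cn_ratio` are made up for this example. It assumes that letters outside the 20 standard codes (X, B, Z, *, gaps) should simply be skipped, and that the seq ID is the first word of the header line.

```python
import sys

# (carbon, nitrogen) atoms per amino acid, keyed by one-letter code.
# Taken from the standard molecular formulas, e.g. Met = C5H11NO2S -> (5, 1).
ATOMS = {
    "A": (3, 1), "R": (6, 4), "N": (4, 2), "D": (4, 1), "C": (3, 1),
    "E": (5, 1), "Q": (5, 2), "G": (2, 1), "H": (6, 3), "I": (6, 1),
    "L": (6, 1), "K": (6, 2), "M": (5, 1), "F": (9, 1), "P": (5, 1),
    "S": (3, 1), "T": (4, 1), "W": (11, 2), "Y": (9, 1), "V": (5, 1),
}


def read_fasta(lines):
    """Yield (seq_id, sequence) pairs from an iterable of FASTA lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Assumption: the ID is the first word after '>'.
            header = line[1:].split()
            seq_id = header[0] if header else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def cn_counts(seq):
    """Return (total carbon, total nitrogen) for one protein sequence."""
    carbon = nitrogen = 0
    for aa in seq.upper():
        # Assumption: unknown letters (X, B, Z, *, -) contribute nothing.
        c, n = ATOMS.get(aa, (0, 0))
        carbon += c
        nitrogen += n
    return carbon, nitrogen


def cn_ratio(seq):
    """Return C/N as a float, or None if the sequence has no nitrogen."""
    carbon, nitrogen = cn_counts(seq)
    return carbon / nitrogen if nitrogen else None


def main(path):
    # Print one "seq_id<TAB>ratio" line per sequence.
    with open(path) as handle:
        for seq_id, seq in read_fasta(handle):
            ratio = cn_ratio(seq)
            print(seq_id, "NA" if ratio is None else f"{ratio:.3f}", sep="\t")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

Saved under a name of your choice (say `cn_ratio.py`), it would be run as `python cn_ratio.py in.fasta > cn_ratio.tsv`. To get the raw numbers of carbons and nitrogens instead of the ratio, print the result of `cn_counts(seq)` in `main`.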

Yes, you're right. I was just wondering whether a tool had already been written for this task. Just trying not to reinvent the wheel.

I'm not saying it doesn't exist, but if it takes you longer to search for a tool than to write it, then the choice is easy :-)
