Hi all,

I need to calculate the percentage of blastp hits between several genomes, using the formula below:

[(C1 + C2)/(T1 + T2)] * 100

C1 and C2 are the numbers of blastp hits of the two genomes against each other: for example, genome A against genome B gives C1, and genome B against genome A gives C2. T1 and T2 are the total numbers of proteins in the two genomes being compared.

Now, I have a .txt file containing the total number of proteins for each of my genomes of interest:

A:1234 B:1234 C:1234...

I also have another .txt file containing the blastp hit counts for each genome against each of the others:

A_B 123, B_A 123, A_C 123, B_C 123..

Is it possible to have a command that can calculate the percentages for A_B, A_C and B_C based on the formula above?

I apologise if my question is confusing; please tell me if any part of it is unclear.

A single (Linux) command line will be difficult, I guess. Are you familiar with any programming/scripting language? A small script would process this quickly.

Hi, I am not familiar with any programming/scripting language. I am actually fairly new to this field.
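
For what it's worth, here is a minimal Python sketch of the kind of small script suggested above. It is only an illustration under some assumptions that are not confirmed in the question: the totals file holds whitespace-separated NAME:COUNT entries, the hits file holds comma-separated "QUERY_SUBJECT COUNT" entries, and genome names contain only letters and digits (no underscores). The function names and the file names totals.txt and hits.txt are made up for the example.

```python
import re
import sys
from itertools import combinations


def parse_totals(text):
    """Parse 'A:1234 B:1234' into {'A': 1234, 'B': 1234}."""
    return {name: int(count) for name, count in re.findall(r"(\w+):(\d+)", text)}


def parse_hits(text):
    """Parse 'A_B 123, B_A 123' into {('A', 'B'): 123, ('B', 'A'): 123}.

    Assumes genome names have no underscore, so the single '_' separates
    the query genome from the subject genome.
    """
    pattern = r"([^\W_]+)_([^\W_]+)\s+(\d+)"
    return {(q, s): int(count) for q, s, count in re.findall(pattern, text)}


def pair_percentages(totals, hits):
    """Apply [(C1 + C2) / (T1 + T2)] * 100 to every genome pair.

    Pairs for which one blastp direction is missing are skipped.
    """
    result = {}
    for a, b in combinations(sorted(totals), 2):
        if (a, b) not in hits or (b, a) not in hits:
            continue
        c1, c2 = hits[(a, b)], hits[(b, a)]
        result[f"{a}_{b}"] = 100 * (c1 + c2) / (totals[a] + totals[b])
    return result


# Run as: python script.py totals.txt hits.txt  (hypothetical file names)
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as handle:
        totals = parse_totals(handle.read())
    with open(sys.argv[2]) as handle:
        hits = parse_hits(handle.read())
    for pair, pct in pair_percentages(totals, hits).items():
        print(f"{pair}\t{pct:.2f}")
```

As a check with made-up numbers: if A and B each have 1000 proteins, A_B is 100 and B_A is 300, the script prints 20.00 for A_B, since (100 + 300) / (1000 + 1000) * 100 = 20. If your real files are laid out differently (for example one entry per line, or names containing underscores), the two parse functions are the only parts that would need adjusting.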