Percentage of Conserved Protein
7.3 years ago
kkim46 • 0

Hi all, I would like some help with an issue I'm facing.

The objective is fairly self-explanatory: to determine the percentage of conserved proteins (POCP) between two related species (I'm working with Ralstonia). I obtained the relevant WGS data from the NCBI database and performed a pairwise alignment of the proteins of the two Ralstonia species with blastp, using the command: blastp -query file1.faa -subject file2.faa -evalue 1e-10 -outfmt 6 > output.txt

Once the output was generated, I referred to the POCP formula proposed by Qin et al. (2014), which states: "POCP = [(C1 + C2)/(T1 + T2)] × 100%, where C1 and C2 represent the conserved number of proteins in the two genomes being compared, respectively, and T1 and T2 represent the total number of proteins in the two genomes being compared, respectively." Here's the issue: I can't find the figures in the BLAST output that correspond to the variables C1 and C2.

Are there any steps that I've missed, or did I do something wrong?

Suggestions and advice would be much appreciated!

blast blastp • 3.1k views
7.3 years ago

The step you are missing is to identify orthologs. BLAST calculates sequence similarity and, at best, identifies homologs. To obtain the counts needed for the POCP formula, you need to know how many proteins in strain 1 have an ortholog in strain 2 (that is C1 in the formula) and vice versa (that is C2).
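Once those counts are in hand, the formula itself is simple arithmetic. As a minimal sketch in Python (the function names `count_proteins` and `pocp` are made up for illustration, and the numbers in the example are invented, not Ralstonia results), T1 and T2 can be taken as the number of FASTA records in each .faa file:

```python
def count_proteins(fasta_text: str) -> int:
    """Count FASTA records, i.e. header lines that start with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def pocp(c1: int, c2: int, t1: int, t2: int) -> float:
    """Return POCP = (C1 + C2) / (T1 + T2) * 100, as a percentage."""
    if t1 + t2 == 0:
        raise ValueError("T1 + T2 must be greater than zero")
    return (c1 + c2) / (t1 + t2) * 100.0


if __name__ == "__main__":
    # Toy FASTA content standing in for file1.faa / file2.faa (invented data).
    genome1 = ">p1\nMKT\n>p2\nMVL\n>p3\nMAA\n>p4\nMGG\n"
    genome2 = ">q1\nMKT\n>q2\nMVL\n>q3\nMSS\n>q4\nMTT\n"
    t1 = count_proteins(genome1)
    t2 = count_proteins(genome2)
    # C1 and C2 are assumed here; they must come from an ortholog search.
    c1, c2 = 3, 3
    print(f"T1={t1} T2={t2} POCP={pocp(c1, c2, t1, t2):.1f}%")
```

For real data you would read each .faa file from disk and pass its contents to `count_proteins`.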

To find orthologs between your two strains you could, for example, use the stand-alone version of InParanoid.
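Whichever tool you use, C1 and C2 come down to counting the distinct proteins on each side of the ortholog pairs. The sketch below assumes the pairs have already been exported to a hypothetical two-column, tab-separated table (strain 1 protein ID, strain 2 protein ID); this is not InParanoid's native output format, so some conversion would be needed first. The function name `count_conserved` is also just for illustration.

```python
import csv
import io


def count_conserved(pairs_tsv):
    """Return (C1, C2) from an iterable of tab-separated ortholog-pair lines.

    Column 1 is assumed to hold strain 1 protein IDs and column 2 strain 2
    protein IDs. Each protein is counted once, even if it appears in
    several pairs.
    """
    strain1_ids = set()
    strain2_ids = set()
    for row in csv.reader(pairs_tsv, delimiter="\t"):
        # Skip blank lines, malformed lines and '#' comment lines.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        strain1_ids.add(row[0])
        strain2_ids.add(row[1])
    return len(strain1_ids), len(strain2_ids)


if __name__ == "__main__":
    # Invented example table; with real data, pass an open file instead.
    table = io.StringIO("p1\tq1\np2\tq1\n# comment\tline\np3\tq2\n")
    c1, c2 = count_conserved(table)
    print(f"C1={c1} C2={c2}")
```

Using sets means a protein with several partners in the other strain still contributes only once to its count, which matches the "how many proteins have an ortholog" definition above.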


Thanks for the explanation!
