Where is the "% identity" column in blast-xml?
4.6 years ago
helenhvalask • 30
United States

I am trying to parse a blast-xml file from a blastp search using SearchIO in Biopython. However, I am not sure which attribute I should use to extract "% identity".

In a blast-tab file I can use hsp.pident. Does anyone know the equivalent attribute name for blast-xml? Or should I derive it myself as "hsp.ident_num/hsp.aln_span*100"? Thanks.

15 months ago
Peter 5.8k
Scotland, UK

The BLAST XML output format does not contain the percentage identity as an explicit field, so yes, you must calculate it from the number of identities and the alignment length.

See for example my BLAST XML to tabular conversion script:

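(Note: if you are using Python 2, beware of integer division if you do the calculation as currently written! Dividing one integer by another truncates there, so hsp.ident_num/hsp.aln_span gives 0 for anything short of a perfect match.)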
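As a rough sketch of the calculation described above, something like the following should work with SearchIO. The helper names percent_identity and iter_percent_identity are made up for illustration, and the guard against a zero alignment length is my own assumption; hsp.ident_num and hsp.aln_span are the attributes mentioned in the question. Multiplying by 100.0 first forces floating point division, which sidesteps the Python 2 issue.

```python
def percent_identity(ident_num, aln_span):
    """Return percent identity from the identity count and alignment length.

    Hypothetical helper, not part of Biopython.
    """
    if aln_span <= 0:
        # Assumption: treat a non-positive alignment length as an error.
        raise ValueError("alignment length must be positive")
    # 100.0 is a float, so the division is never truncated,
    # even on Python 2 where int / int would be integer division.
    return 100.0 * ident_num / aln_span


def iter_percent_identity(xml_path):
    """Yield (query id, hit id, percent identity) for each HSP in a BLAST XML file.

    Hypothetical helper; needs Biopython installed.
    """
    from Bio import SearchIO  # imported here so the helper above works without it

    for qresult in SearchIO.parse(xml_path, "blast-xml"):
        for hit in qresult:
            for hsp in hit:
                yield qresult.id, hit.id, percent_identity(hsp.ident_num, hsp.aln_span)
```

You would then loop over iter_percent_identity("my_blast.xml"), where the file name is just a placeholder for your own BLAST XML output.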


