Converting HMMER output to XML format
7.4 years ago
adi4bc ▴ 10

Hi,

I'm running a phmmer search and want to convert the output to XML so that I can get the percent identity, which as far as I know is not reported in any of the other output formats. I've read about the HmmerIO parsers in Biopython (Bio.SearchIO), but didn't see an option to convert the output to XML. I've also looked at Easel and the esl-reformat command, but didn't find the option I need there either. Does anyone have a solution? My main goal is to extract the percent identity and coverage from HMMER output, so if there is another way to get these values besides XML, I'd love to hear about it.

Thank you!

hmmer biopython easel • 2.3k views

and I want to convert the output to XML, in order to have the percentage of identity parameter

If the identity parameter is already in the output, why do you need to convert to XML?


I don't see the percent identity anywhere in my output file. Here's part of it (I know it's a bit messy, but I couldn't find a way to attach a file here):

# phmmer :: search a protein sequence against a protein database
# HMMER 3.1b2 (February 2015); http://hmmer.org/
# Copyright (C) 2015 Howard Hughes Medical Institute.
# Freely distributed under the GNU General Public License (GPLv3).
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query sequence file:             seq.fasta
# target sequence database:        /biodb/FASTA/uniprot/swissprot.fa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       gi|923093981|gb|ALB07772.1|  [L=470]
Description: neuraminidase [Influenza A virus (A/goose/Taiwan/TNO20/2015(H5N8))]
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------              -----------
          0 1043.3  15.2          0 1043.1  15.2    1.0  1  sp|Q07572|NRAM_I80A6   Neuraminidase OS=Influenza A virus (st
          0 1042.0  15.1          0 1041.9  15.1    1.0  1  sp|Q07570|NRAM_I88A1   Neuraminidase OS=Influenza A virus (st
          0 1039.1  15.3          0 1038.9  15.3    1.0  1  sp|Q07599|NRAM_I63A3   Neuraminidase OS=Influenza A virus (st
....
Domain annotation for each sequence (and alignments):
>> sp|Q07572|NRAM_I80A6  Neuraminidase OS=Influenza A virus (strain A/Duck/Hokkaido/8/1980 H3N8) GN=NA PE=3 SV=2
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1043.1  15.2         0         0       1     470 []       1     470 []       1     470 [] 1.00

  Alignments for each domain:
  == domain 1  score: 1043.1 bits;  conditional E-value: 0
  gi|923093981|gb|ALB07772.1|   1 mnpnqkivtigsislglvvfnvllhavsiiltvlalgksenngicngtvvrehnetvriekvtqwyntsvveyvphwnegty 82 
                                  mnpnqki+tigsislglvvfnvllh vsii+tvl lg+  nngicn tvvre+netvriekvtqw+ntsvveyvp+wnegty
         sp|Q07572|NRAM_I80A6   1 MNPNQKIITIGSISLGLVVFNVLLHVVSIIVTVLVLGRGGNNGICNETVVREYNETVRIEKVTQWHNTSVVEYVPYWNEGTY 82 
                                  8********************************************************************************* PP

So converting this to XML won't change anything, will it?


Okay, so do you know how to get the percent identity and coverage from HMMER output using Python?


Biopython's Bio.SearchIO module currently only reads in various formats like HMMER or BLAST - it does not handle output.
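
Since the text output already contains the aligned rows and the alignment coordinates, one possible way to get both numbers is to compute them directly from an alignment block. Below is a minimal sketch, not an official HMMER or Biopython method. The function name `identity_and_coverage` is made up for illustration, and it assumes identity is defined as identical residues divided by columns where neither row has a gap (other definitions use a different denominator). It also assumes you have already pulled the two aligned rows and the query coordinates out of the output, for example with Bio.SearchIO's `hmmer3-text` parser; that step is not shown here.

```python
def identity_and_coverage(query_aln, target_aln, query_start, query_end, query_len):
    """Return (percent identity, percent query coverage) for one aligned domain.

    query_aln, target_aln: the two aligned rows (same length, gaps as '-' or '.').
    query_start, query_end: 1-based inclusive coordinates on the query
        (the "hmmfrom" / "hmm to" columns in the phmmer domain table).
    query_len: full query length (the L= value on the "Query:" line).
    """
    if len(query_aln) != len(target_aln):
        raise ValueError("aligned rows must have the same length")

    gap_chars = "-."
    # Keep only columns where both rows hold a residue.
    pairs = [(q, t) for q, t in zip(query_aln, target_aln)
             if q not in gap_chars and t not in gap_chars]

    # Case-insensitive comparison: HMMER prints the query row in lower case.
    matches = sum(q.upper() == t.upper() for q, t in pairs)
    identity = 100.0 * matches / len(pairs) if pairs else 0.0

    # Fraction of the query spanned by the alignment.
    coverage = 100.0 * (query_end - query_start + 1) / query_len
    return identity, coverage


# First alignment block from the output above (query positions 1-82 of 470).
query_row = "mnpnqkivtigsislglvvfnvllhavsiiltvlalgksenngicngtvvrehnetvriekvtqwyntsvveyvphwnegty"
target_row = "MNPNQKIITIGSISLGLVVFNVLLHVVSIIVTVLVLGRGGNNGICNETVVREYNETVRIEKVTQWHNTSVVEYVPYWNEGTY"
ident, cov = identity_and_coverage(query_row, target_row, 1, 82, 470)
print(f"identity: {ident:.1f}%  coverage of this block: {cov:.1f}%")
```

Note that the example only uses the first 82 columns shown in the post, so the coverage it prints is for that block alone. For the whole domain you would concatenate all alignment blocks and use the full coordinates from the domain table (1 to 470 here).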
