6.6 years ago
yarmda
I want to retrieve the complete sequences of the hits in BLAST results that are stored in an XML file. I can see that Hsp_hseq holds only the aligned portion of the hit, not the complete sequence. Is there an easier way to get the full sequences than pulling each identifier and downloading it with Entrez?
<Hit>
<Hit_num>2</Hit_num>
<Hit_id>gi|755995789|gb|CP009605.1|</Hit_id>
<Hit_def>Bacillus cereus strain S2-8, complete genome</Hit_def>
<Hit_accession>CP009605</Hit_accession>
<Hit_len>5271178</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>366858</Hsp_bit-score>
<Hsp_score>198661</Hsp_score>
<Hsp_evalue>0</Hsp_evalue>
<Hsp_query-from>3859112</Hsp_query-from>
<Hsp_query-to>4061503</Hsp_query-to>
<Hsp_hit-from>311811</Hsp_hit-from>
<Hsp_hit-to>109360</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>-1</Hsp_hit-frame>
<Hsp_identity>201234</Hsp_identity>
<Hsp_positive>201234</Hsp_positive>
<Hsp_gaps>132</Hsp_gaps>
<Hsp_align-len>202488</Hsp_align-len>
<Hsp_qseq>AATAATAATAATTAAAATAAAAAAACTTTAGAATTTTCTTATTTCAAAACAGTAGACAAAATTCAAAAAATTGTGTTAGAATTTGTTTCAATATCATATCGCTTGTTAAATTCCTTTCAAAAGGAAAATAGGTACACGAACATTTCGTTTCGTGTTTAAAAGGGAAGCTTGGTGAAACTCCAACACGGTCCCGCCACTGTAAATGCTGAGATTTCTTTTTGATACCAC$
<Hsp_hseq>AATAAGAATAATAAAAATAAAAAAACTTTAGAATTTTCTTATTTCAAAACAGTAGACAAAATTCCAAAAATTGTGTTAGAATTTGTTTCAATATCATATCGCTTGTTAAATTCCTTTCAAAAGGAAAATAGGTACACGAACATTTCGTTTCGTGTTTAAAAGGGAAGCTTGGTGAAACTCCAACACGGTCCCGCCACTGTAAATGCTGAGATTTCTTTGTAGTGCCAC
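
For reference, the identifier-based route mentioned above could start from something like the following. This is only a rough sketch using Python's standard library, not a definitive answer: the function name `hit_regions` and the shortened `SAMPLE` document are made up for illustration, and it assumes that a hit on the minus strand is reported with Hsp_hit-from greater than Hsp_hit-to, as in the excerpt above. It pulls the accession and hit coordinates out of each Hsp and arranges them as the `id`, `seq_start`, `seq_stop` and `strand` parameters that NCBI's efetch accepts (strand 2 meaning minus).

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the excerpt above (illustrative only).
SAMPLE = """
<Hit>
  <Hit_accession>CP009605</Hit_accession>
  <Hit_hsps>
    <Hsp>
      <Hsp_hit-from>311811</Hsp_hit-from>
      <Hsp_hit-to>109360</Hsp_hit-to>
    </Hsp>
  </Hit_hsps>
</Hit>
"""

def hit_regions(xml_text):
    """Yield one dict of efetch-style parameters per HSP in BLAST XML."""
    root = ET.fromstring(xml_text)
    # iter() also matches the root element itself, so a bare <Hit> works too.
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        for hsp in hit.iter("Hsp"):
            start = int(hsp.findtext("Hsp_hit-from"))
            stop = int(hsp.findtext("Hsp_hit-to"))
            yield {
                "id": accession,
                "seq_start": min(start, stop),
                "seq_stop": max(start, stop),
                # Assumption: from > to means the hit is on the minus strand.
                "strand": 2 if start > stop else 1,
            }

for region in hit_regions(SAMPLE):
    print(region)
```

Each dict could then be passed to an efetch call against the nucleotide database (for example through Biopython's `Entrez.efetch` with `rettype="fasta"`). Dropping `seq_start`, `seq_stop` and `strand` would request the whole record rather than just the aligned region, which for this hit is the complete 5.27 Mb genome.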