Hi everyone,
I'm novice doing reciprocal blast and I have spent a lot of time trying it. I don't know how to do a reciprocal blast and how to parse the best hits and compare & find matches. So, an advice would be a big help to me.
I need to do a reciprocal Blast in Python (Jupyter) to identify same proteins.
For that, first I make a db with the respective FASTA with the coding sequences.
!makeblastdb -in $fastap_1 -dbtype 'prot'
!makeblastdb -in $fastap_2 -dbtype 'prot'
Then, I run blastp for both files:
blastp_ab = !blastp -query $fastap_1 -db $fastap_2 -out ab.txt
blastp_ba = !blastp -query $fastap_2 -db $fastap_1 -out ba.txt
Up to here everything works. Now, I have to parse the blast results to collect best hits. I have tried it, but not even print "hello??"
If I pass as argument the file "ab.txt", the function "collect_best_hits" returns me an empty dictionary. And if I pass the handle "blastp_ab" as argument, "collect_best_hits" functions return me and advice of "blastall" and the empty dictionary:
no BLAST output. Check that blastall is in your PATH
{ }
(Options)
filename = "ab.txt"
filename = blastp_ab
filename = open("ab.txt", "r")
collect_best_hits(filename)
The function contains:
import os import sys import csv import blastparser
def collect_best_hits(filename):
d = {}
try:
for n, record in enumerate(blastparser.parse_fp(filename)):
print "hola"
if n % 100 == 0:
print >>sys.stderr, 'loading 1 ...', n
print "!-", n, record
best_score = None
for hit in record.hits:
print "!- ", hit, len(record.hits)
for match in hit.matches:
print "!- ", match, len(hit.matches)
query = record.query_name
if query.startswith('gi'):
query = query.split('|', 2)[2]
subject = hit.subject_name
score = match.score
print "!- ", score
# only keep the best set of scores for any query
if best_score and best_score > score:
continue
best_score = score
x = d.get(query, [])
x.append((subject, score))
d[query] = x
if best_score and best_score != score:
break
except Exception as e:
print(e)
return d
What I'm doing wrong?
All help is welcome!