Multiple protein accession number query with command-line blastp
18 months ago
john • 30
@john18423

I've scoured the internet and couldn't find any documentation on how to format a text file containing multiple protein accession numbers for use as a blastp query. Right now my file is formatted like this:

AAF45826
AAF48069

The issue is that blastp only runs the first protein accession number. Does anyone know the proper format so that the program blasts every accession number in the file?

Thanks in advance

alignment blast • 1.1k views
18 months ago
vkkodali ♦ 1.1k
@vkkodali30494

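You can use Entrez Direct to download the sequences in FASTA format on the fly and pass them to blastp through process substitution, like this (replace your_db with the name of your BLAST database):

blastp -query <(epost -db protein -input acc_list.txt | efetch -format fasta) -db your_db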

The acc_list.txt file should contain valid protein accessions, one per line.
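
If the list was edited on Windows or pasted from a spreadsheet, stray carriage returns, spaces, or blank lines may keep it from being truly "one per line". Here is a minimal sketch of cleaning the list first. The function name clean_accessions and the file names acc_clean.txt, queries.fasta, and results.tsv are my own placeholders, not part of BLAST+ or Entrez Direct, and the commented-out commands assume both toolkits are on your PATH.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of BLAST+ or Entrez Direct):
# print the accessions in file "$1", one clean accession per line.
clean_accessions() {
    # 1. drop Windows carriage returns
    # 2. strip any whitespace within each line
    # 3. skip empty lines and repeated accessions, keeping first-seen order
    tr -d '\r' < "$1" | sed 's/[[:space:]]//g' | awk 'NF && !seen[$0]++'
}

# Usage sketch (assumes Entrez Direct and BLAST+ are installed and
# that your_db is the name of an existing BLAST protein database):
#
#   clean_accessions acc_list.txt > acc_clean.txt
#   epost -db protein -input acc_clean.txt | efetch -format fasta > queries.fasta
#   blastp -query queries.fasta -db your_db -outfmt 6 -out results.tsv
```

Writing the sequences to queries.fasta instead of using <( ... ) also works in shells without process substitution, and lets you check that every accession was actually fetched (for example with grep -c '>' queries.fasta) before running the search.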
