Biostar Beta. Not for public use.
NCBI Blast locally: filter by accession number and NOT by GI number
2.3 years ago
tlorin • 250
Switzerland

I have downloaded the NCBI nt database using the update_blastdb.pl Perl script, but I want to BLAST a query file against specific species rather than against the whole nt database. I know that when running BLAST locally it is possible to subset the nt/nr database using a list of GI identifiers, as explained here.

However, NCBI is phasing out GIs, and we should use accession.version identifiers instead. I have downloaded those for my species; part of the file mygi.txt is shown below.

When I run

blastdb_aliastool -gilist mygi.txt -db nt -out sthg.out -title sometitle

I obviously get

BLAST Database error: Specified file is not a valid GI/TI list.

since I am not providing a GI list.

I cannot find any command-line option in the manual to specify that I want to filter the nt database by accession number; any idea of how I can achieve that? I bet this option will have to be added by the BLAST team at some point :)


mygi.txt below

AF324813.1
AF324814.1
AF324815.1
AF324816.1
AF324817.1
AF324818.1
AF324819.1
AF324820.1
AF324821.1
AF324822.1
AF324823.1
AF324824.1
AF370451.1
AY198341.1
AY198342.1
ncbi • 1.9k views

An alternative (and dirtier ;) ) possibility could be using this, then using makeblastdb and blast on this newly created database.

4 weeks ago
genomax 68k
United States

This solution adds a step, but until NCBI updates blastdb_aliastool to accept accession numbers, this may be the only way.

You can use blastdbcmd from the BLAST+ package to retrieve the sequences from the nt database as a FASTA file, then run makeblastdb to build the BLAST indexes for that subset of sequences. my_acc.txt is the file of accession numbers (one per line).

blastdbcmd -db /path_to/nt -entry_batch my_acc.txt -out my_seq.fa
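
The remaining steps (makeblastdb, then the search itself) are not spelled out above, so here is a rough sketch of how the whole thing might be chained together. Everything except my_acc.txt, my_seq.fa and /path_to/nt is an assumption made for illustration: the database name my_subset, the query file query.fa, the output file results.tsv, the choice of blastn with tabular output, and the small clean_acc_list helper that tidies the accession list first. Adjust to your own setup.

```shell
#!/bin/sh
# Sketch only: names other than my_acc.txt, my_seq.fa and /path_to/nt are placeholders.

ACC_LIST=my_acc.txt    # accession.version identifiers, one per line
SUBSET_FA=my_seq.fa    # FASTA file written by blastdbcmd
SUBSET_DB=my_subset    # hypothetical name for the new database
QUERY=query.fa         # hypothetical query file

# Tidy an accession list: strip Windows line endings,
# drop blank lines, and remove duplicate entries.
clean_acc_list() {
    tr -d '\r' < "$1" | grep -v '^[[:space:]]*$' | sort -u
}

build_and_search() {
    clean_acc_list "$ACC_LIST" > "$ACC_LIST.clean" &&
    # 1. Extract the listed sequences from nt as FASTA.
    blastdbcmd -db /path_to/nt -entry_batch "$ACC_LIST.clean" -out "$SUBSET_FA" &&
    # 2. Index them as a nucleotide database; -parse_seqids keeps
    #    the accessions retrievable from the new database.
    makeblastdb -in "$SUBSET_FA" -dbtype nucl -parse_seqids \
        -out "$SUBSET_DB" -title sometitle &&
    # 3. Search the query against the subset only (tabular output).
    blastn -query "$QUERY" -db "$SUBSET_DB" -outfmt 6 -out results.tsv
}
```

Once the paths are set, calling build_and_search runs the three steps in order and stops at the first one that fails. Keep in mind that e-values are computed against the size of the small database, so they will differ from what a search of the full nt database would report.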