Finding upstream or downstream sequences of BLAST hits on Linux
5.0 years ago
yomirann • 0

Hi all,

I have a whole-genome (WG) assembly of an organism that is not available in the online BLAST, so I'm running BLAST locally on Linux. After creating a database from it, I searched for several genes and that works just fine. Now I want to see the sequences upstream and downstream of my hits, but I couldn't find out how to do it.

Please help! Thanks :-)

genome sequencing alignment gene • 1.5k views

You could do as @gb suggests: parse the -outfmt 6 output of blastn for sstart (column 9) and send (column 10), convert the hits to a BED file, adding however many bases upstream and downstream you want, and intersect the BED file with the annotations in GFF or GTF format using Bedtools.
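
A minimal sketch of the conversion step in Python, assuming the default -outfmt 6 column order (sseqid in column 2, sstart in column 9, send in column 10). The function name `hit_to_bed` and the default flank of 1000 bases are hypothetical choices for illustration. BED is 0-based and half-open while BLAST coordinates are 1-based, and on minus-strand hits sstart is larger than send, so both are handled explicitly:

```python
def hit_to_bed(line, flank=1000):
    """Convert one blastn -outfmt 6 line into a BED6 tuple widened by `flank` bases.

    Assumes the default 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    fields = line.rstrip("\n").split("\t")
    qseqid, sseqid = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])

    # A hit on the minus strand is reported with sstart > send.
    strand = "+" if sstart <= send else "-"
    lo, hi = min(sstart, send), max(sstart, send)

    # BLAST is 1-based inclusive; BED is 0-based, end-exclusive.
    bed_start = max(0, lo - 1 - flank)
    # Not clamped to the contig length, which is unknown here.
    bed_end = hi + flank

    # Score column is set to 0 as a placeholder.
    return (sseqid, bed_start, bed_end, qseqid, 0, strand)
```

Writing each tuple as a tab-separated line gives a file that could be passed to bedtools intersect together with the GFF/GTF annotation. If the contig lengths are known, the end coordinate should also be clamped to them.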


Hopefully you created your custom database with the -parse_seqids option of makeblastdb.

If you did, then you can use the blastdbcmd utility included in BLAST+ to retrieve any sequence range using the following options.

-range <String>
   Range of sequence to extract in 1-based offsets (Format: start-stop, for
   start to end of sequence use start - )
 -strand <String, `minus', `plus'>
   Strand of nucleotide sequence to extract
   Default = `plus'

Combine it with sstart and send as suggested by others.
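
As a rough sketch of how the two pieces fit together, the Python helper below builds a blastdbcmd argument list from the sstart and send of a hit. The function name `blastdbcmd_args`, the default flank of 1000 bases, and the optional `seq_len` parameter are hypothetical; only the -db, -entry, -range and -strand options of blastdbcmd are relied on. It assumes that -range is always given in plus-strand, 1-based coordinates and that -strand minus returns the reverse complement of that range:

```python
def blastdbcmd_args(db, entry, sstart, send, flank=1000, seq_len=None):
    """Build a blastdbcmd command that extracts a hit plus `flank` bases on each side."""
    # On minus-strand hits sstart > send, so sort the coordinates first.
    lo, hi = min(sstart, send), max(sstart, send)

    # -range uses 1-based offsets, so the smallest valid start is 1.
    start = max(1, lo - flank)
    stop = hi + flank
    if seq_len is not None:
        # Clamp to the subject length when it is known.
        stop = min(stop, seq_len)

    strand = "plus" if sstart <= send else "minus"
    return ["blastdbcmd", "-db", db, "-entry", entry,
            "-range", f"{start}-{stop}", "-strand", strand]
```

The resulting list can be handed to `subprocess.run(args, check=True)` on a machine where BLAST+ is installed. The database name and the entry ID have to match what was used with makeblastdb and what appears in the sseqid column.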


https://www.ncbi.nlm.nih.gov/books/NBK279684/

See the sstart and send fields available in -outfmt.
