How to specify Cut-off/parameters of sequence identity > 70% and query length > 40% in BLAST command line?
5.1 years ago
Kumar ▴ 120

I have been searching for homologous sequences of my target gene families in the genome sequences I am working on. I need to find homologues with > 70% identity and > 40% query length coverage relative to the target gene families. Is it possible to include these cut-offs as parameters on the BLAST command line? Thank you in advance.

alignment BLAST homologoues search • 6.1k views

Blastn v. 2.13.0 supports a percent query coverage cutoff; from blastn -help:

-qcov_hsp_perc <Real, 0..100> Percent query coverage per hsp (high scoring pair, aka a continuous alignment between query & subject)

This can be used in addition to the -perc_identity cutoff.
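As a rough illustration of combining the two cutoffs, here is a minimal Python sketch that assembles a blastn command line. The file names (genes.fasta, genome_db, hits.tsv) and the helper build_blastn_command are placeholders invented for this example. Whether the cutoffs are inclusive of the exact threshold value is not covered here, so check blastn -help if that matters.

```python
import shlex


def build_blastn_command(query, db, out, min_identity=70.0, min_query_cov=40.0):
    """Return a blastn argument list that applies both cutoffs at search time.

    Hypothetical helper for illustration; query, db and out are placeholders.
    """
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-perc_identity", str(min_identity),    # percent identity cutoff
        "-qcov_hsp_perc", str(min_query_cov),   # percent query coverage per HSP
        # Tabular output that also reports identity and per-HSP query coverage.
        "-outfmt", "6 qseqid sseqid pident length qlen slen qcovhsp",
        "-out", out,
    ]


cmd = build_blastn_command("genes.fasta", "genome_db", "hits.tsv")
print(shlex.join(cmd))  # shell-quoted form of the command (Python 3.8+)
# To actually run the search (requires BLAST+ to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

The command is built as a list rather than a single string so that paths containing spaces are passed to blastn unchanged.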

5.1 years ago
gb ★ 2.2k
For the identity cutoff, use -perc_identity.

I am not aware of a coverage cutoff parameter, so you need to filter on coverage yourself afterwards. (Coverage is called qcovs in the outfmt options.)

Here you can find all the parameters: https://www.ncbi.nlm.nih.gov/books/NBK279684/

Or use:

blastn -help

Thank you gb. If possible, could you please elaborate on qcovs in the outfmt options?


I believe there is no filter parameter for that, but you can include it in the output and filter on it afterwards. These are the relevant outfmt options:

qlen means Query sequence length
slen means Subject sequence length
length means Alignment length

EDIT: you changed the text of your comment:

qcovs is the percentage of base pairs from your query that is used in the alignment against the target. So with a coverage of 90%, the part of your query that aligns with the target makes up 90% of the total number of base pairs in your query.
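The "output it and filter afterwards" approach could look roughly like the Python sketch below. It assumes the search was run with -outfmt "6 qseqid sseqid pident length qlen slen qcovs", so the columns arrive in exactly that order. The function name filter_hits and the three example rows are made up for illustration and are not real BLAST results.

```python
import csv
import io

# Column order must match the -outfmt string assumed above.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "qlen", "slen", "qcovs"]


def filter_hits(handle, min_identity=70.0, min_query_cov=40.0):
    """Keep rows with pident > min_identity and qcovs > min_query_cov."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    kept = []
    for row in reader:
        if float(row["pident"]) > min_identity and float(row["qcovs"]) > min_query_cov:
            kept.append(row)
    return kept


# Invented example rows: only geneA passes both cutoffs.
example = io.StringIO(
    "geneA\tchr1\t85.2\t400\t500\t100000\t80\n"
    "geneB\tchr2\t65.0\t450\t500\t100000\t90\n"  # identity too low
    "geneC\tchr3\t92.1\t150\t500\t100000\t30\n"  # coverage too low
)
hits = filter_hits(example)
print([h["qseqid"] for h in hits])  # ['geneA']
```

For a real run, replace the StringIO object with open("hits.tsv", newline=""), where hits.tsv is whatever file name was passed to -out.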


Hello gb, can you help me out with this? Whenever I use perc_identity I get this error: Error: (CArgException::eInvalidArg) Unknown argument: "perc_identity". I'm using tblastn.


Are you sure you are using blastn? Also, don't forget the dash. You can double-check the parameter with the blastn -help command.

EDIT: The error is correct. tblastn does not have such an option. See https://www.ncbi.nlm.nih.gov/books/NBK279684/


Yeah, I just checked it, thank you. So, if I want the sequences with >= 50% identity, I have to check them manually, right?

And just to confirm: pident is the identity percentage, right?
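For the tblastn case, the manual check can be scripted. The Python sketch below assumes the search was run with plain -outfmt 6, whose default column order is qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore, so pident is the third column. The helper name keep_by_identity and the two example rows are invented for illustration.

```python
def keep_by_identity(lines, min_identity=50.0):
    """Return tabular BLAST lines whose pident (3rd column) is >= min_identity."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines (e.g. from -outfmt 7)
        fields = line.split("\t")
        if float(fields[2]) >= min_identity:
            kept.append(line)
    return kept


# Invented rows in the default 12-column layout; not real search results.
rows = [
    "protA\tcontig1\t62.5\t120\t45\t0\t1\t120\t300\t659\t1e-40\t150",
    "protB\tcontig2\t41.0\t200\t118\t0\t5\t204\t90\t689\t2e-20\t95.1",
]
for line in keep_by_identity(rows):
    print(line.split("\t")[0])  # protA
```

To filter a results file, pass an open file handle instead of the list, since iterating over a handle yields one line at a time.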
