How to get the number of hits from blastp output files
4.9 years ago
fec2 ▴ 50

Hi all,

I have multiple blastp output files (format 6) in a directory, and I want to count the number of hits with a sequence identity of 40% or more in each file. I tried:

for i in *.tsv; do awk '$3>=40' "$i" | wc -l; done

However, this command only prints a list of numbers in the terminal, without showing which blastp output file each number belongs to. How can I modify it so that each count is shown next to its file name? Thank you.

genome • 1.6k views
4.9 years ago
AK ★ 2.2k

Hi fec2,

Try:

for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done

It should return something like:

blast_out1.tsv  217
blast_out2.tsv  172
blast_out3.tsv  215
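
Note that the command above counts distinct subject IDs (column 2) among the rows with an identity of 40 or higher, not the rows themselves. If both numbers are of interest, a single awk pass can report them together. The sketch below is only an illustration: count_hits is a hypothetical helper name, and it assumes tab-separated format 6 files with the default column order (percent identity in column 3) and no header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer).
# Usage: count_hits MIN_IDENTITY FILE...
# Prints per file: name, rows at or above the cutoff, distinct subject IDs.
count_hits() {
    local min_id="$1"
    shift
    local f
    for f in "$@"; do
        printf '%s\t' "$f"
        awk -F'\t' -v min="$min_id" '
            $3 >= min {
                hits++                                   # every passing alignment row
                if (!($2 in seen)) { seen[$2] = 1; subjects++ }
            }
            END { printf "%d\t%d\n", hits, subjects }    # unset counters print as 0
        ' "$f"
    done
}
```

For example, count_hits 40 *.tsv would print one line per file with the file name, the number of passing rows, and the number of distinct subjects.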

Thank you very much! Is it possible to get the result in an output file?

You're welcome. Try this:

(for i in *.tsv; do echo -ne "${i}\t" && awk '$3>=40{print $2}' "${i}" | sort -u | wc -l; done) > output.txt
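
As a possible variation, the redirection can be attached to the loop itself (done > file), which avoids the extra subshell, and printf can replace echo -ne, whose options are not portable across shells. A minimal sketch, where write_counts is a hypothetical helper name and its first argument is the output file:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer).
# Usage: write_counts OUTPUT_FILE FILE...
# Writes "file name <TAB> number of distinct subject IDs with identity >= 40".
write_counts() {
    local out="$1"
    shift
    local f
    for f in "$@"; do
        printf '%s\t' "$f"
        awk -F'\t' '$3 >= 40 { print $2 }' "$f" | sort -u | wc -l
    done > "$out"    # one redirection covers the output of the whole loop
}
```

It could be called as write_counts output.txt *.tsv from the directory that holds the blastp results.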

The command works well, thank you again!

Hi fec2,

If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.

Hi, thanks for your comment. I have accepted the answer. Thank you.

Hi,

May I know where I can find a manual explaining what all these commands mean? I am new to this field and have no clue where to find this information. I really appreciate your help.

Hello fec2,

You can use man echo, man awk, man sort, and man wc. I'd recommend "Ch3. Remedial Unix Shell" and "Ch7. Unix Data Tools" in the book: Bioinformatics Data Skills by Vince Buffalo. :-)
