A question about BLAST
7.3 years ago
jinych2bgi ▴ 20

I BLASTed the proteomes of two species against each other (a against b, and b against a).

However, some proteins have hits in one direction but none in the other.

Why does this happen?


Example

$ grep ENSECAP00000001495 ../hs/blast.out | head -5
ENSECAP00000001495 Equas0016397 74.78 115 29 0 1 115 112 226 8e-37 148
ENSECAP00000001495 Equas0019271 77.42 93 21 0 23 115 44 136 1e-35 144
ENSECAP00000001495 Equas0021079 71.30 115 33 0 1 115 528 642 3e-35 142
ENSECAP00000001495 Equas0021802 74.19 93 24 0 23 115 7 99 4e-34 139
ENSECAP00000001495 Equas0005581 67.83 115 36 1 1 115 111 224 6e-32 132


$ grep ENSECAP00000001495 ../dk/blast.out | head -5
jinyuanchun 14:23:02 /ifs4/BC_COM_P6/F13FTSECKF1619/DIAifmR/DONimoM/2.hs2dk_gene/orth $

7.3 years ago

There can be many reasons for this. Here is a list that is probably not complete:

  • E-values depend on database size. The E-value of an alignment scales with the size of the search space, which is roughly the query length times the total length of the database. The same pair of proteins can therefore get different E-values in the two directions, so a hit may pass the E-value cutoff when BLAST is run one way but not the other (especially if the proteomes differ greatly in size).
  • Reporting only the top-N hits. Depending on your search settings, BLAST reports only the top-N hits for each query sequence. If a query has more than N better-scoring hits, a given hit is dropped from the report in that direction even though it may still appear in the other.
  • Asymmetric repeat masking. Again depending on settings, BLAST may mask repeats in your query sequences but not in the database. Reversing the search thus changes which sequence is subjected to repeat masking, which can lead to big differences.
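
To make the first point concrete and to find the affected proteins, here is a minimal Python sketch. It assumes both BLAST runs were written in the 12-column tabular format (which is what the output in the question looks like). The function names `evalue_from_bits`, `read_pairs` and `one_way_hits` and the toy sequence IDs are made up for illustration, and the E-value formula is the simplified textbook form E = m * n * 2^(-S'), which ignores the length corrections BLAST applies internally.

```python
def evalue_from_bits(bit_score, query_len, db_len):
    """Simplified E-value: E = m * n * 2**(-S').

    m = query length, n = total database length, S' = bit score.
    Real BLAST uses 'effective' lengths, so its numbers differ slightly.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def read_pairs(lines):
    """Collect (query, subject) ID pairs from tabular BLAST output lines."""
    pairs = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines, if any
        fields = line.split()
        if len(fields) >= 2:
            pairs.add((fields[0], fields[1]))
    return pairs


def one_way_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) found in the A-vs-B run whose reverse (b, a)
    is missing from the B-vs-A run."""
    reverse = {(subject, query) for query, subject in b_vs_a}
    return sorted(a_vs_b - reverse)


# Toy data (hypothetical IDs): a1 hits b1 and b2, but only b1 hits a1 back.
forward = read_pairs([
    "a1 b1 74.78 115 29 0 1 115 112 226 8e-37 148",
    "a1 b2 77.42 93 21 0 23 115 44 136 1e-35 144",
])
backward = read_pairs([
    "b1 a1 74.78 115 29 0 112 226 1 115 2e-36 148",
])
print(one_way_hits(forward, backward))  # [('a1', 'b2')]

# Same alignment score, database ten times larger: E-value ten times larger.
small = evalue_from_bits(50, 100, 1_000_000)
large = evalue_from_bits(50, 100, 10_000_000)
print(large / small)
```

On real data you would pass open file handles for the two `blast.out` files to `read_pairs` instead of the toy lists. The pairs returned by `one_way_hits` are the ones worth checking against your E-value cutoff, hit-count limit and masking settings.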