Blast scores...two annotations for the same piece of sequence
9.1 years ago
friasoler ▴ 50

Hello everybody!!!

I have a DNA sequence that matches two different proteins depending on whether I rank the BLAST hits by score or by sequence identity. Which criterion should I trust more? I have designed primers from this sequence to measure the expression of this gene, which is why it is critical for me to know exactly what the sequence matches. Here is the sequence:

GCCGCAGCCCCGCTGCAGACGCGCCGCGTCCCCGCCGGAGAAGGAGCGAGGCCGTTCCCTGCGCATCCTGCAGCAGCATGACTCTTCAGGCTGACTTTGATGGTGCTGCAGAAGATGTAAAAAAaTTAAAAaCaAGACCAACTGATGAAGAACTGAAGGAACTATATGGATTCTACAAACAGGCTACTGTTGGAGATATTAATATTGAATGTCCAGGAATGCTAGATTTGAAAGGCAAAGCCAAATGGGAGGCATGGAACCTGAAAAAAGGTTTATCAAAGGAGGATGCCATGAATGCCTATATCTCTAAAGCAAGAGCAATGGTAGAAAAATATGGAATCTAGAATATTCAAAATAATTCCCACTAATAATTAACTACTCTTCAGTAGCTGATGAACTAACTTGAGAAAAAcGCAGTACTAACTCCTTTTTGTGTAGTCTGACACTAATATCTTTTAAGCATCAGCTGTTTGACTTTAAAGGGTATTTACATATATAATCGATTTTTAGCTTGTATATTAATCTAAATAAATTTGAACTGAATAAATTAAGCTTTATTAAGAATTGTGGATTTTtGTGGGTATTAAATTATATTTAGCATTTTGACAGAAGAAGACAAACAGAAAAGCTCTAACAGTTAAATAACATAGACATGATTTTTTGCAAGCAAGGTTATGGAATAAAGTGAAGAGTTTGTGCATAAGGAAGAGAAGAAGGAAAAGATGAAACCTTTTTtAAGACCCAAAGCCAATGTTTGaTTTTTAAAAAaaTCAGGAAAaCTTCCCCTTATAAAGGATTACAGAGGAGGACCAGAACAACTTTTAGGCATAACTGCATGCAATGTAGAGAAaGAAGTGACTTATTATAAATTGCTGTGGACTAACCTACACATTCTGCCATTAAAaTTGaGGgAAaTaCTCAtAGACTGGCaTTTTcTATGCATGTTGtGATATGTTTTATCAAGAAacTTTCATTAGATGGTTTCAGcAGATAAAAGTGATCTCCAGGAAGgTCATAAAAGGAAACATCtCCaTTTGTtAGTtCTtGCcAaCCTAAAAAaGATATTtGAAGTGTCAGAGAAaC

Thanks in advance

Roberto

alignment • 2.3k views

I guess that depends on what the score and identity values are. If they fall in a gray zone, this is a tricky question, but if not, then:

In Score we trust.


Bitscore > Evalue > Identity

High identity means nothing by itself, because it can come from a very short alignment covering just a tiny proportion of the query sequence. Keep in mind what BLAST stands for: Basic Local Alignment Search Tool. It reports local alignments, not matches over the whole query.


As you said: "High identity means nothing by itself because it can be for a very short alignment covering just a tiny proportion of the query sequence". That is why, in the grey zone (E-value between 10^-2 and 10^-4), the two values can give different results and make interpretation challenging. In such cases I would always trust the score over the identity. So I don't quite get your reply to my post.

ps

Thumbs up


Well, it was really meant as a reply to the OP. Also, I generally wouldn't even consider hits with such a high E-value; 1 in 100, or even 1 in 10,000, isn't very good if your database has millions of sequences.
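
For reference, the usual relation between a bit score S' and the E-value is E = m * n * 2^(-S'), where m is the query length and n is the total database length. The sketch below shows how the same bit score gives a larger E-value as the database grows. The function name and all numbers are made up for illustration, and it uses raw lengths, whereas BLAST itself uses edge-corrected "effective" lengths, so the values will not match BLAST output exactly.

```python
def expected_hits(bit_score: float, query_len: float, db_len: float) -> float:
    """Approximate E-value: E = m * n * 2**(-S').

    Assumption: raw query and database lengths are used; BLAST applies
    effective (edge-corrected) lengths, so real E-values differ slightly.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical example: a 1,000-letter query with a bit score of 40,
# searched against databases of increasing total length.
for db_len in (1e6, 1e9):
    e_value = expected_hits(40, 1000, db_len)
    print(f"database length {db_len:.0e}: E = {e_value:.2e}")
```

Each extra bit of score halves the E-value, and each doubling of the database doubles it, which is why a borderline score is less convincing against a large database.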


Thanks for your answers .-)

I have these two extreme alternatives for the alignment:

PREDICTED: Ficedula albicollis acyl-CoA-binding protein-like (LOC101820061), mRNA
Max score   Total score   Query cover   E value   Ident
887         887           46%           0.0       98%

ref|XM_005040820.1| PREDICTED: Ficedula albicollis S-acyl fatty acid synthase thioesterase, medium chain-like (LOC101815966), mRNA
Max score   Total score   Query cover   E value   Ident
239         239           12%           8e-59     99%

If I follow your criteria, I should choose the acyl-CoA-binding protein-like hit?

Thanks
roberto
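
Applying the "Bitscore > Evalue > Identity" ordering suggested earlier in the thread to these two hits can be sketched as follows. The numbers are the ones reported above; the `Hit` type and `rank_key` function are hypothetical names for illustration, and "Max score" is assumed to be the bit score. Multiplying identity by query cover gives a rough estimate of how much of the whole query is matched identically.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    bit_score: float    # "Max score" column (assumed to be the bit score)
    query_cover: float  # fraction of the query covered by the alignment
    evalue: float
    identity: float     # fraction of identical positions within the alignment


# Values copied from the two hits reported in the thread.
hits = [
    Hit("S-acyl fatty acid synthase thioesterase, medium chain-like",
        239, 0.12, 8e-59, 0.99),
    Hit("acyl-CoA-binding protein-like", 887, 0.46, 0.0, 0.98),
]


def rank_key(hit: Hit) -> tuple:
    """Sort by bit score (high first), then E-value (low first), then identity."""
    return (-hit.bit_score, hit.evalue, -hit.identity)


ranked = sorted(hits, key=rank_key)
best = ranked[0]

for hit in ranked:
    # Rough share of the full query that is matched identically.
    matched = hit.identity * hit.query_cover
    print(f"{hit.name}: score={hit.bit_score}, "
          f"E={hit.evalue:g}, ~{matched:.0%} of query identical")
```

Under this ordering the acyl-CoA-binding protein-like hit ranks first, even though its identity is one percentage point lower, because the identical positions span a much larger part of the query.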


My guess is that these are multi-domain proteins: with the second hit you're getting a nice match to a single domain, whereas with the first hit you're getting a match that covers multiple domains.


I second that. So if you are looking for a gene and not a domain, the first hit should be your choice.


Thank you all very much .-)
Roberto
