blastx output cut-off for creating gff
7.4 years ago
Whoknows ▴ 960

Hi friends,

I need some help selecting the right target sequences for my GFF file. I'm working on a species without a reference genome. I have scaffolds from genome sequencing, and I have run blastx against the Swiss-Prot database.

I want to convert my blastx output file to GFF. Should I define a cut-off for the blastx output based on percent identity or on E-value?
I ask because many of the target sequences have a percent identity below 50%, but their E-values are fine.

Thanks

gff blastx ngs • 2.3k views

Since you are working with scaffolds, your sequences should be long enough that they are unlikely to align by chance, so I would use percent identity as the cut-off in this case. Here is a tool that will help you with the conversion


I think I have a similar problem. You are trying to select the best hit to include in the GFF file, right? I am currently analysing a metatranscriptome dataset. After 'blasting' the sequences against the nr DB, I was trying to find or develop a reasonable algorithm to select the "correct hit". I guess that when mapping against a large DB like Swiss-Prot or nr, the E-value is not a bad score to work with, but what about all the other scores, such as alignment length, mismatches, and bit score? I thought about combining them into a single factor, but simply using something like (length/mismatches)*bitscore/Evalue sounds over-simplified to me. I mean, should all scores receive the same weight? Are they all equally important? If anyone knows of a tool that is meant to pick the best hit from BLAST output (preferably the standard 12-column tabular format), I would be very happy to hear about it.
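For what it's worth, here is a minimal sketch of one simple way to pick a "best hit" per query from the standard 12-column tabular output (`-outfmt 6`). It does not combine the scores into a custom factor. It just ranks hits by bit score and breaks ties with the lower E-value, on the reasoning that the E-value is itself derived from the bit score and the search space. The function name `best_hits` and the example rows are made up for illustration, and ranking by bit score is an assumption about what "best" should mean, not an established rule.

```python
import csv
import io

# Column names of the standard 12-column BLAST tabular format (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query_id: row} keeping the hit with the highest bit score.

    Ties on bit score are broken by the lower E-value (hypothetical rule).
    """
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        key = (-row["bitscore"], row["evalue"])
        current = best.get(row["qseqid"])
        if current is None or key < (-current["bitscore"], current["evalue"]):
            best[row["qseqid"]] = row
    return best


# Made-up example rows, only to show the input shape.
example = (
    "read1\tP12345\t62.5\t120\t45\t0\t1\t360\t10\t129\t1e-40\t150\n"
    "read1\tQ99999\t48.0\t200\t104\t2\t1\t600\t5\t204\t1e-55\t210\n"
    "read2\tP67890\t90.0\t50\t5\t0\t4\t153\t1\t50\t2e-20\t95.5\n"
)
hits = best_hits(io.StringIO(example))
for query, row in hits.items():
    print(query, row["sseqid"], row["bitscore"], row["evalue"])
```

For a real file you would pass an open file handle instead of the `io.StringIO` example. Note that in this toy input the hit with lower identity wins for `read1` because it has the higher bit score, which is exactly the kind of trade-off being asked about.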

7.4 years ago
Whoknows ▴ 960

Hi

I found my answer !!!

In the paper below, the default blastx thresholds are given as percent identity = 0.5 and E-value = 1e-06:

Genome Annotation and Curation Using MAKER and MAKER-P, Current Protocols in Bioinformatics, 2014.
By Mark Yandell
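In case it helps someone else, here is a rough sketch of how these two cut-offs could be applied while converting blastx tabular output (`-outfmt 6`) to GFF3. Several things here are my own assumptions: the function name `blastx_to_gff3`, the `blastx` source label, the `protein_match` feature type, and reading the 0.5 identity threshold as 50% (the tabular `pident` column is on a 0-100 scale). The E-value goes in the score column and the bit score in the attributes. Strand is inferred from the query coordinates, since blastx reports reverse-frame hits with `qstart` greater than `qend`.

```python
def blastx_to_gff3(lines, max_evalue=1e-6, min_pident=50.0):
    """Filter 12-column blastx hits and return them as GFF3 lines.

    The scaffold (query) is the GFF seqid; the protein (subject) goes
    into the Target attribute. Thresholds are illustrative defaults.
    """
    out = ["##gff-version 3"]
    n = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        qseqid, sseqid = f[0], f[1]
        pident = float(f[2])
        qstart, qend = int(f[6]), int(f[7])
        sstart, send = int(f[8]), int(f[9])
        evalue, bitscore = float(f[10]), float(f[11])
        # Keep only hits that pass both cut-offs.
        if evalue > max_evalue or pident < min_pident:
            continue
        n += 1
        # Reverse-frame hits have qstart > qend; GFF needs start <= end.
        strand = "+" if qstart <= qend else "-"
        start, end = min(qstart, qend), max(qstart, qend)
        # Note: IDs containing ';' or '=' would need escaping (not handled).
        attrs = (f"ID=hit{n};Name={sseqid};"
                 f"Target={sseqid} {sstart} {send};"
                 f"pident={pident:g};bitscore={bitscore:g}")
        out.append("\t".join([qseqid, "blastx", "protein_match",
                              str(start), str(end), f"{evalue:g}",
                              strand, ".", attrs]))
    return out


# Made-up example hits, only to show the expected input shape.
example = [
    "scaffold1\tsp|P12345|X\t62.5\t120\t45\t0\t100\t459\t10\t129\t1e-40\t150",
    "scaffold1\tsp|Q99999|Y\t35.0\t200\t130\t2\t900\t301\t5\t204\t1e-30\t110",
    "scaffold2\tsp|P67890|Z\t80.0\t50\t10\t0\t600\t451\t1\t50\t1e-3\t40",
    "scaffold2\tsp|P11111|W\t55.0\t100\t45\t0\t900\t601\t1\t100\t2e-20\t95.5",
]
gff = blastx_to_gff3(example)
print("\n".join(gff))
```

With a real file you could call `blastx_to_gff3(open("hits.tsv"))`, where `hits.tsv` is a placeholder name. Loosening `min_pident` would keep the low-identity hits with good E-values that I asked about in the question.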
