Genscan Gene Prediction
13.4 years ago
Gvj ▴ 470

Dear all, I am trying to run Genscan on a multi-FASTA file, but I get this error:

scoring sequence... Error : calloc failed on CUM_SCORE array

With a single-sequence FASTA file it works fine. I found a link where someone asks about the same thing. Has anyone else run into this apparent 2 MB limit? How did you solve it?

genome annotation error • 14k views
13.4 years ago
Darked89 4.6k

Genscan does not work well with multi-FASTA input. There is a Perl script by Brian Osborne, run_genscan.pl, that helps. Let me know if you cannot get it from BioPerl or the Google cache.

I am not sure about Genscan's limits on individual single-FASTA entries.

for i in {2..x}; do awk -v a="$i" 'BEGIN{RS=">"; tem="tmp"} NR==a{print a"\n"; print ">"$0 > tem; exit}' genome.fas; genscan your.smat tmp >> genscan_sh.out; rm tmp; done

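A quick-and-dirty version of my script to run it.

Note: 'x' in the for loop is the number of FASTA entries + 1. With RS=">", awk's first record is the empty text before the first ">", so the actual entries are records 2 to x.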
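
For anyone who prefers not to count entries by hand, here is a rough Python sketch of the same idea as the shell loop above: split the multi-FASTA into records and call Genscan once per record. It assumes a `genscan` executable on PATH that is invoked as `genscan <parameter file> <sequence file>`, as in the loop above. The function names, the scratch file name `tmp_single.fa`, and the 60-column line wrapping are my own choices for illustration, not part of Genscan.

```python
import subprocess
from pathlib import Path


def split_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(header, seq, width=60):
    """Return one FASTA record as text, wrapping the sequence at `width`."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


def run_genscan_per_record(fasta_path, smat_path, out_path, genscan="genscan"):
    """Run Genscan once per FASTA entry, appending all output to out_path.

    Assumption: Genscan is called as `genscan <smat> <fasta>` and writes
    its predictions to standard output.
    """
    records = split_fasta(Path(fasta_path).read_text())
    tmp = Path("tmp_single.fa")  # hypothetical scratch file name
    with open(out_path, "a") as out:
        for header, seq in records:
            tmp.write_text(format_fasta(header, seq))
            subprocess.run([genscan, str(smat_path), str(tmp)],
                           stdout=out, check=True)
    tmp.unlink(missing_ok=True)
    return len(records)


# Example call, using the file names from the loop above:
# run_genscan_per_record("genome.fas", "your.smat", "genscan_sh.out")
```

Like the shell version, this appends every prediction to one output file, so the sequence names in the Genscan output are the only way to tell the entries apart afterwards.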

Did you manage to convert the Genscan output to GFF or GTF? I have found a few converters online, but none of them works. If you have a parser for Fgenesh output, please share that as well.

10.9 years ago
yfhuang ▴ 10

I ran into this problem too. GENSCAN definitely cannot predict genes from multi-FASTA input, so the multi-FASTA file has to be split into single-FASTA files. I also tried shortening the input sequence. So far, GENSCAN works well with the HumanIso.smat parameter file when the input FASTA sequence has a length of 5999940.
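
In case it helps, here is a minimal Python sketch of how a long sequence could be cut into shorter pieces before running GENSCAN. It is only an illustration: the function name and the `overlap` parameter are my own, and I have not established what the real length limit is beyond the value reported above. Each piece comes with its 0-based offset so that predicted coordinates can be shifted back to the original sequence. Genes that span a cut point will still be split or missed, so some overlap between pieces is probably wise.

```python
def chunk_sequence(seq, max_len, overlap=0):
    """Yield (offset, piece) pairs covering seq, each piece at most max_len long.

    `offset` is the 0-based start of the piece in the original sequence.
    Consecutive pieces share `overlap` characters.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    start = 0
    while start < len(seq):
        yield start, seq[start:start + max_len]
        # Stop once the current piece reaches the end of the sequence.
        if start + max_len >= len(seq):
            break
        start += step


# Example: pieces no longer than the length reported to work above,
# with a hypothetical overlap of 100000 between neighbours.
# for offset, piece in chunk_sequence(seq, max_len=5999940, overlap=100000):
#     ...write `piece` to its own FASTA file and run GENSCAN on it...
```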

Hope this information is useful!!
