Question

New to Glimmer: trouble with .icm file

6.2 years ago

oars ▴ 200

I'm brand new to Glimmer and to gene assembly and prediction. I've hit a snag at the "build the model" step. I ran the following command:

$ build-icm –r glm_ref.icm < glm_ref.long.seq

At this point I thought I was on track. My understanding is that the model is saved to a binary file named glm_ref.icm, and that with this model I can then use Glimmer to predict genes in the reference genome. For reference, here is the build-icm usage message:

USAGE:  build-icm [options] output_file < input-file

Read sequences from standard input and output to  output-file
the interpolated context model built from them.
Input also can be piped into the program, e.g.,
  cat abc.in | build-icm xyz.icm
If <output-file> is "-", then output goes to standard output

Options:
 -d <num>
    Set depth of model to <num>
 -F
    Ignore input strings with in-frame stop codons
 -h
    Print this message
 -p <num>
    Set period of model to <num>
 -r
    Use the reverse of input strings to build the model
 -t
    Output model as text (for debugging only)
 -v <num>
    Set verbose level; higher is more diagnostic printouts
 -w <num>
    Set length of model window to <num>
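
One thing worth ruling out: as rendered here, the dash in front of r in the build-icm command looks like an en dash (–) rather than an ASCII hyphen (-). That can happen when a command is copied from a formatted document, and build-icm would then not see it as the -r option. This is a guess based on how the command appears in this post, not something confirmed in the thread. A minimal shell sketch for checking a pasted command for non-ASCII characters (has_non_ascii is a hypothetical helper name, not part of Glimmer):

```shell
# Hypothetical helper: exit status 0 if the text contains any byte outside
# printable ASCII (space through tilde), such as the bytes of an en dash.
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

typed='build-icm –r glm_ref.icm'   # en dash, as the command appears above
fixed='build-icm -r glm_ref.icm'   # plain ASCII hyphen

for cmd in "$typed" "$fixed"; do
    if has_non_ascii "$cmd"; then
        echo "non-ASCII character found in: $cmd"
    else
        echo "ASCII only: $cmd"
    fi
done
```

If the check flags the command, retyping the option by hand instead of pasting it is the simplest fix.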

Next, I ran the following command:

$ glimmer3 EC.fasta glm_ref.icm glm_ref

But then I keep receiving the following error:

Starting at Tue Feb  6 19:09:00 2018

ERROR:  Could not open file  glm_ref.icm
  errno = 2

I've looked in my working directory and do not see a glm_ref.icm file (should I?). I've also looked in my Glimmer folder, and although I see a folder named ICM, there is no file named glm_ref.icm.
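
On Linux and macOS, errno = 2 is ENOENT ("No such file or directory"), so the error means glimmer3 could not find glm_ref.icm at the path it was given. That suggests the build-icm step never wrote the file. A minimal sketch of a guard that only runs glimmer3 when the model file exists and is non-empty (check_model is a hypothetical helper name; the file names are the ones used in this post):

```shell
# Hypothetical guard: succeed only if the model file exists and is non-empty.
check_model() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Run gene prediction only when the model from build-icm is actually there.
if check_model glm_ref.icm; then
    glimmer3 EC.fasta glm_ref.icm glm_ref
fi
```

Running ls -l glm_ref.icm right after build-icm gives the same information by hand.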

glimmer gene prediction • 2.4k views

I'm not sure what went wrong, but I repeated my steps on my Mac (I was using a Linux machine before) and it worked. If anyone else who is new to Glimmer has a similar issue: the build-icm step should produce the .icm file in your working directory, and the $ glimmer3 EC.fasta glm_ref.icm glm_ref command then reads that file. If successful, you'll get output similar to the following:

Sequence file = EC.fasta
Number of sequences = 1
ICM model file = glm_ref.icm
Excluded regions file = none
List of orfs file = none
Input is NOT separate orfs
Independent (non-coding) scores are used
Circular genome = true
Truncated orfs = false
Minimum gene length = 100 bp
Maximum overlap bases = 30
Threshold score = 30
Use first start codon = false
Start codons = atg,gtg,ttg
Start probs = 0.600,0.300,0.100
Stop codons = taa,tag,tga
GC percentage = 50.8%
Ignore score on orfs longer than 750
Analyzing Sequence #1
Start Find_Orfs
Start Score_Orfs
Start Process_Events
Start Trace_Back

6.2 years ago by oars ▴ 200