STAR problem: EXITING because of INPUT ERROR: could not open genomeFastaFile:
5.5 years ago
valizad2 ▴ 20

Hello,

I am trying to build the genome index for STAR alignment. However, STAR cannot open my genome FASTA file, even though I am sure the path I am giving is correct and the file is not corrupted: I opened the file and it looked fine. I also downloaded the genome from another source, and the same issue persists. The problem would seem to be the file path, but I have checked everything and the path seems to be correct. I am working on a cluster, in case that helps. Any ideas what's going on?

Thanks

RNA-Seq alignment • 13k views

No one can answer this with so little detail. You have to at least post your command line.


Sure. Here is the script I am using:

#!/bin/bash

#SBATCH --mem 200G
#SBATCH --job-name STAR-index
#SBATCH --mail-user email@illinois.edu
#SBATCH --mail-type END
#SBATCH --mail-type FAIL

cd /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey

module load STAR/2.6.1b-IGB-gcc-4.9.4

rm -r /scratch/valizad2

mkdir /scratch/valizad2

STAR --outTmpDir /scratch/valizad2/$SLURM_JOB_ID --limitGenomeGenerateRAM 206609344554 --runThreadN 12 --runMode genomeGenerate --genomeDir /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/Squirrelmonkey_GenomeDirectory --genomeFastaFiles /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/GCF_000235385_SaiBol_genomic.fna  --sjdbGTFfile GCF_000235385_1_SaiBol1_0_genomic.gff --sjdbOverhang 100

rm -fr /scratch/valizad2/$SLURM_JOB_ID

And this is the output I am receiving:

cat  slurm-1086823.out
Please use the outTmpDir parameter with STAR to point to the nodes local /scratch directory.  Without this parameter, the program causes
issues with our shared filesystem.
    We will kill your job if you do not use the --outTmpDir parameter.
    STAR --outTmpDir /scratch/username/$SLURM_JOB_ID ...
    Oct 20 23:21:23 ..... started STAR run
Oct 20 23:21:23 ... starting to generate Genome files

EXITING because of INPUT ERROR: could not open genomeFastaFile: /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/GCF_000235385_SaiBol_genomic.fna 

Oct 20 23:21:23 ...... FATAL ERROR, exiting
  

Hi caggtaagtat,

Thank you so much for your feedback. I removed the space and it fixed the problem! I had added that extra space because I kept getting errors: STAR seemed to be treating the word after the genome file name as part of the file name and path, so I added an extra space, which did not help. The first error was probably caused by the dots in my file name, which I later changed, but because I had already added the extra space, that change alone did not resolve the issue.

Thank you again for your help!


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

This comment should have gone under @caggtaagtat's answer.


Now I am having a different issue with another STAR run. I specified the annotation file, but for some reason STAR doesn't see it.

Here is my script:

#!/bin/bash

#SBATCH --mem 200G
#SBATCH --job-name STAR-index
#SBATCH --mail-user email@illinois.edu
#SBATCH --mail-type END
#SBATCH --mail-type FAIL

cd /home/n-z/valizad2/Sepsis_RNAseq/Ringtaillemur

module load STAR/2.6.1b-IGB-gcc-4.9.4

rm -r /scratch/valizad2

mkdir /scratch/valizad2

STAR --outTmpDir /scratch/valizad2/$SLURM_JOB_ID --limitGenomeGenerateRAM 206609344554 --runThreadN 12 --runMode genomeGenerate --genomeDir /home/n-z/valizad2/Sepsis_RNAseq/Ringtaillemur/RingtailedLemur_GenomeDirectory --genomeFastaFiles /home/n-z/valizad2/Sepsis_RNAseq/Ringtaillemur/GCF_000165445_2_Mmur_3_0_genomic.fna --sjdbGTFfile GCF_0001654452_Mmur_30_genomic.gff --sjdbOverhang 100

rm -fr /scratch/valizad2/$SLURM_JOB_ID

And here is the error:

EXITING because of FATAL INPUT PARAMETER ERROR: when generating genome without annotations (--sjdbFileChrStartEnd or --sjdbGTFfile options)
do not specify >0 --sjdbOverhang

It is not good practice to ask additional questions in an existing thread (even if they are related). Please create a new question in that case.

The error seems to be self-explanatory. Have you tried addressing the missing annotations or removing the --sjdbOverhang option?
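One way to rule out a missing annotation file is to check the inputs in the job script before STAR is called. The snippet below is only a sketch: `require_readable` is a hypothetical helper, not part of STAR or Slurm. Note that a relative path such as the one passed to --sjdbGTFfile above is resolved against the directory set by the preceding `cd`.

```shell
#!/bin/bash
# Hypothetical pre-flight helper (not part of STAR): stop early if any
# input file is missing or unreadable from the node running the job.
require_readable() {
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
}

# Example use inside a job script, before the STAR call
# (file names are placeholders):
# require_readable genome.fna annotation.gff || exit 1
```

If the check passes and STAR still reports missing annotations, the file is being found but not parsed, which points at the annotation format or the command line rather than the path.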

5.5 years ago
caggtaagtat ★ 1.9k

Hi, since it's still not working after deleting the space, you probably get the error because you use a GFF file as the annotation file with the argument --sjdbGTFfile, which expects a GTF file. That leads to the error saying that STAR is trying to build the genome without annotations. So the easiest fix would be to use a GTF file as the annotation file instead, if available.

However, if you do not have a GTF file and you want to run STAR with the GFF file, there is a possibility, described in this Google Groups thread with STAR developer Alex Dobin.

Basically, he says that for a "standard" GFF3 file you need to add the following option alongside --sjdbGTFfile:

--sjdbGTFtagExonParentTranscript Parent
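As a rough sketch of how that option fits into a genome-generation call: the file and directory names below are placeholders, not the poster's real paths, and the command is only printed so it can be reviewed first. The STAR manual describes the option as an addition to the usual annotation options, so --sjdbGTFfile still points at the GFF3 file.

```shell
#!/bin/bash
# Placeholder paths (assumptions for illustration only).
genome_dir="GenomeDirectory"
fasta="genome.fna"
gff="annotation.gff"

# Collect the arguments in an array so each one stays a separate word.
star_args=(
    --runMode genomeGenerate
    --runThreadN 12
    --genomeDir "$genome_dir"
    --genomeFastaFiles "$fasta"
    --sjdbGTFfile "$gff"
    --sjdbGTFtagExonParentTranscript Parent
    --sjdbOverhang 100
)

# Print the command for review; drop "echo" to actually run STAR.
echo STAR "${star_args[@]}"
```

Keeping the arguments in an array also makes stray characters between options easier to spot, since every option sits on its own line.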


Hi,

Actually, what you said worked for that run. I had trouble with another run, but I got it working now. Thanks!

5.5 years ago
caggtaagtat ★ 1.9k

Hi, I'm not sure if that is really the problem, but there seems to be an extra space character in your STAR command right after the FASTA file input.

So instead of:

STAR --outTmpDir /scratch/valizad2/$SLURM_JOB_ID --limitGenomeGenerateRAM 206609344554 --runThreadN 12 --runMode genomeGenerate --genomeDir /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/Squirrelmonkey_GenomeDirectory --genomeFastaFiles /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/GCF_000235385_SaiBol_genomic.fna  --sjdbGTFfile GCF_000235385_1_SaiBol1_0_genomic.gff --sjdbOverhang 100

it should be:

STAR --outTmpDir /scratch/valizad2/$SLURM_JOB_ID --limitGenomeGenerateRAM 206609344554 --runThreadN 12 --runMode genomeGenerate --genomeDir /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/Squirrelmonkey_GenomeDirectory --genomeFastaFiles /home/n-z/valizad2/Sepsis_RNAseq/Squirrelmonkey/GCF_000235385_SaiBol_genomic.fna --sjdbGTFfile GCF_000235385_1_SaiBol1_0_genomic.gff --sjdbOverhang 100

STAR does not handle extra spaces very well, so maybe this is all it takes :)
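Stray characters like this are hard to see in an editor. Bash normally collapses plain spaces between arguments, so one possibility, which this thread does not confirm, is that the extra character was not an ordinary space (for example a non-breaking space pasted from elsewhere). The sketch below uses a hypothetical helper, `find_odd_whitespace`, to list the lines of a job script that contain anything other than printable ASCII.

```shell
#!/bin/bash
# Hypothetical helper: print (with line numbers) every line of a script
# that contains a byte outside printable ASCII, such as a tab, a carriage
# return, or the bytes of a non-breaking space.
find_odd_whitespace() {
    local script="$1"
    # LC_ALL=C makes grep work on raw bytes; [^ -~] matches any byte
    # that is not between space and tilde.
    LC_ALL=C grep -n '[^ -~]' "$script"
}

# Example (the file name is a placeholder):
# find_odd_whitespace star_index.sh
```

The function returns a non-zero status when nothing suspicious is found, because that is how grep reports "no match".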
