I have been trying to implement an RNA-seq pipeline through Galaxy on an AWS cloud instance. My goal is to start from raw reads and get to gene counts. However, I cannot get STAR to run. My current error is:
Fatal error: Matched on fatal error
EXITING: fatal error trying to allocate genome arrays, exception thrown: std::bad_alloc
Possible cause 1: not enough RAM. Check if you have enough RAM 32002674832 bytes
Possible cause 2: not enough virtual memory allowed with ulimit. SOLUTION: run ulimit -v 32002674832
Jun 03 21:16:00 ...... FATAL ERROR, exiting
I don't believe this is a memory issue, as my current cloud instance is using a .xlarge image and the CloudMan console says I have used 43G of my total 100G of data. My input datasets are a pair of paired-end reads, loaded as fastq.gz. I have also loaded the UCSC GRCh38 gene annotation to match the built-in reference genome hg38 (which is from UCSC, I believe).
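For anyone comparing the number in the error message with an instance's memory, here is a minimal sketch that converts the byte count to GiB. The function names and the RAM sizes in the loop are hypothetical and only for illustration; they are not read from my instance.

```python
# Byte count copied from the STAR error message above.
REQUIRED_BYTES = 32_002_674_832


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 1024**3 bytes)."""
    return n_bytes / 1024**3


def fits_in_ram(required_bytes: int, ram_gib: float) -> bool:
    """Return True if ram_gib is at least the requested amount.

    This ignores memory used by the operating system and other jobs,
    so a True result is only a rough lower bound.
    """
    return bytes_to_gib(required_bytes) <= ram_gib


if __name__ == "__main__":
    print(f"STAR asked for about {bytes_to_gib(REQUIRED_BYTES):.1f} GiB")
    # Hypothetical RAM sizes, for illustration only.
    for ram_gib in (16, 32, 64):
        print(f"{ram_gib} GiB RAM -> fits: {fits_in_ram(REQUIRED_BYTES, ram_gib)}")
```

The request works out to roughly 29.8 GiB, so this is the figure to compare against the RAM of the instance type rather than against the data usage shown in the console.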
My STAR input is:
- paired-end as individual datasets
- use built in index
- use genome reference without builtin gene-model
- Human (Homo sapiens) (b38): hg38
- Gene model: UCSC_GR38_annotation.gtf
- length of genomic sequence around annotated junctions: 49
(read lengths were 50)
Any help would be appreciated.