unexpectedly low speed of STAR aligner
6.8 years ago

Hello everyone! I am trying to use STAR to align ~40 million reads (paired-end, stranded) to the mm10 genome using the following command:

/software/STAR-STAR_2.4.0i/bin/Linux_x86_64/STAR --genomeDir /databank/STAR/$genome --readFilesIn ${filename}_1.fastq ${filename}_2.fastq --runThreadN 10 --outSAMstrandField intronMotif --outReadsUnmapped Fastx --outFileNamePrefix $filename

My computer is relatively powerful: Intel Xeon(R) CPU E5-2650 0 @ 2.00 GHz x 18, 64-bit, 32 CPUs, 31.4 GiB RAM (and 32 GiB swap).

The mapping speed is low from the start (12 M reads/hr) and decreases over time (to 5 M reads/hr and lower). I can see swap usage gradually climbing, up to 50% (which, as I understand it, means the computer is running out of RAM and falling back on much slower swap). Main RAM usage reaches 90%.

I used to run the very same command regularly on a less powerful computer (16 CPUs instead of 32, same amount of memory) and it worked fine; the mapping speed there was usually 50-200 M reads/hr. I really don't understand what could be causing this problem, or why the same command takes up all the RAM on this computer but not on the other one. Does anyone have suggestions as to what the problem could be and how I could try to fix it? Any help will be very much appreciated!

STAR RNA-Seq • 6.0k views

Is it writing to a non-responsive network mount point?


Hm, it shouldn't. I am doing everything locally...
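
One quick way to rule this out is to look at the filesystem type behind the output directory. This is a minimal sketch, assuming a GNU or BusyBox `df` (the `-T` option is not part of POSIX); the function names and the list of network filesystem types are illustrative and not exhaustive:

```shell
# Print the filesystem type of the mount that holds the given path.
fs_type() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# Succeed if the given type looks like a network filesystem (illustrative list).
is_network_fstype() {
    case "$1" in
        nfs*|cifs|smb*|fuse.sshfs|lustre|gpfs|beegfs|ceph) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check the current working directory.
# if is_network_fstype "$(fs_type .)"; then echo "network mount"; else echo "local"; fi
```

Running the check on both the input FASTQ directory and the output prefix directory covers reads and writes.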


Does your library contain a significant fraction (≥10%) of poly(A)-containing reads? We've seen a few different aligners slow to a crawl under such conditions (although I don't recall if STAR was one of them).


Hm, interesting! My library should not. I am working with nuclear RNA with rRNA depletion, without a poly(A)-enrichment step.


Should not ≠ does not. Easy enough to check/eliminate this possibility.
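
A rough way to check is to count the reads that contain a long run of A's (or T's, for the reverse-complemented mate). This is only a sketch: it assumes an uncompressed FASTQ with exactly four lines per record, the 20-base threshold is an arbitrary choice, and `polya_fraction` is a made-up helper name:

```shell
# Print "hits total fraction" for reads containing a run of >= min_run A's or T's.
# Assumes uncompressed, four-line-per-record FASTQ; min_run defaults to 20 (arbitrary).
polya_fraction() {
    fastq="$1"
    min_run="${2:-20}"
    awk -v n="$min_run" '
        BEGIN {
            # Build the search strings: n A characters and n T characters.
            a = sprintf("%" n "s", ""); gsub(/ /, "A", a)
            t = a; gsub(/A/, "T", t)
        }
        # The sequence is the second line of every four-line record.
        NR % 4 == 2 { total++; if (index($0, a) || index($0, t)) hits++ }
        END {
            if (total) printf "%d %d %.4f\n", hits, total, hits / total
            else print "0 0 0.0000"
        }
    ' "$fastq"
}

# Example: polya_fraction "${filename}_1.fastq" 20
```

If the reported fraction is well under the 10% mentioned above, poly(A) content is unlikely to be the explanation.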


That is true. But I am sure that in my case the problem is not some incompatibility between STAR and my particular data. I have already aligned these data successfully, using exactly the same script (500% sure), on a different computer that is supposed to have less computing power, without any problems at all. I am still trying to figure out whether there is some hidden STAR configuration or memory-usage setting, or something like that, on my new computer that crashes STAR.

6.8 years ago

Alignment requires a certain additional amount of memory per thread; it's possible that with 16 threads you didn't swap but with 32 threads you do. So, you could tell it to use fewer threads (specifically, 16). You probably don't have 32 cores anyway; the E5-2650 has 8 cores and only supports dual-CPU systems at most, so there's not much point in running 32 threads (hyperthreading does not usually help alignment much). From my understanding STAR is pretty memory-hungry, so I'm surprised it even works with the mouse genome using 32 GB RAM.
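
To see how many physical cores the machine actually has, as opposed to logical CPUs, one option is to count the distinct (core, socket) pairs reported by `lscpu`. This sketch assumes the util-linux `lscpu` with its parseable `-p=CORE,SOCKET` output; `count_physical_cores` is a hypothetical helper that reads that output on stdin:

```shell
# Count distinct "core,socket" pairs from `lscpu -p=CORE,SOCKET` output on stdin.
# Lines starting with '#' are header comments and are skipped.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Example (Linux with util-linux):
# lscpu -p=CORE,SOCKET | count_physical_cores
```

If this prints 16 on a machine that lists 32 CPUs, the other 16 are hyperthreads.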


Actually, I am using only 10 threads. Initially I tried to run it with 15 threads, but then decided to reduce the number in case that was causing the memory crash. From my previous experience, STAR does work on the mouse genome with 32 GB RAM, but you can't do anything else at the same time.
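
Since the margin is that thin, it may be worth comparing the on-disk size of the genome index with the memory that is actually available before starting a run. The sketch below assumes Linux with a `MemAvailable` line in `/proc/meminfo` (kernel 3.14 or newer); the helper names are made up, and treating "index smaller than available memory" as sufficient is only a rough heuristic, because STAR needs per-thread buffers on top of the index:

```shell
# MemAvailable in KiB, read from a meminfo-style file (default: /proc/meminfo).
mem_available_kib() {
    awk '$1 == "MemAvailable:" { print $2 }' "${1:-/proc/meminfo}"
}

# Total on-disk size of a directory in KiB (here: the STAR genome directory).
dir_size_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Print "ok" if the index is smaller than available memory, otherwise "tight".
# Rough heuristic only: it ignores the extra memory STAR uses per thread.
check_index_fits() {
    index_kib=$(dir_size_kib "$1")
    avail_kib=$(mem_available_kib "$2")
    if [ "$index_kib" -lt "$avail_kib" ]; then echo ok; else echo tight; fi
}

# Example: check_index_fits "/databank/STAR/$genome"
```

Running this on both machines just before launching STAR would show whether something else is already holding memory on the new one.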
