CD-HIT uses up all RAM and then crashes
4.9 years ago
VDL ▴ 10

I'm trying to use cd-hit to cluster the BLAST NR database at a 0.9 sequence identity cutoff.

Here's what I'm running:

cd-hit -i nr -o nr90 -c 0.9 -M 1000

But even though I'm using the -M 1000 option, the command gradually uses up all the available RAM (8 GB) and then crashes. Any idea how to fix this?

cd-hit • 1.5k views

More RAM.

If you want to cluster all of NR, you’re going to need much more than 8 GB.

This strikes me as an XY problem, though. What are you trying to achieve?


I'm trying to replicate a result for a protein prediction problem that used this database. My understanding was that the -M flag was supposed to limit the amount of RAM that the program used. So it doesn't work?


The program needs at least a certain amount of RAM. You can’t make a program that needs X GB of RAM run on less than X.

It’s probably hitting your 1 GB limit and then crashing; RAM usage can be a somewhat complex thing to monitor. I’m not sure why it goes on to use all 8 GB when you specify 1, but regardless, NR is far, far too big to be clustered with even 8 GB, I’d pretty much guarantee.
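
For reference, here is a rough sketch of the memory-related options on a machine that does have enough RAM. It assumes the flags as described in the CD-HIT user guide (-M takes a value in MB, with 0 meaning no limit; -B 1 keeps sequences on disk; -T 0 uses all CPU threads). The helper name `cdhit_cmd` and its dry-run behavior are only illustrative, and nothing here will make NR fit in 8 GB.

```shell
#!/bin/sh
# Hypothetical helper: build a cd-hit command line and print it (dry run).
# Flag meanings are taken from the CD-HIT user guide; check `cd-hit -h`
# for the version you have installed.
cdhit_cmd() {
    input=$1      # input FASTA, e.g. nr
    output=$2     # output prefix, e.g. nr90
    mem_mb=$3     # value for -M in MB; 0 means "no limit"
    # -c 0.9 : 90% identity threshold (as in the original command)
    # -n 5   : word size suggested for thresholds of 0.7 and above
    # -B 1   : keep sequences on disk instead of in RAM
    # -T 0   : use all available CPU threads
    echo "cd-hit -i $input -o $output -c 0.9 -n 5 -M $mem_mb -B 1 -T 0"
}

# Print the command only; pipe the output to `sh` to actually run it.
cdhit_cmd nr nr90 0
```

Printing the command first makes it easy to check the flags before committing to a run that may take days.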


Maybe you want to use UniRef?
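
If a precomputed clustering is acceptable, UniRef90 is already clustered at 90% identity, so no local clustering is needed. Note that it is built from UniProt rather than NR, so it may not reproduce a result that depended on NR itself. A minimal sketch of fetching it follows; the URL is my assumption about the current UniProt download location and should be verified, and `uniref90_fetch_cmd` is a hypothetical helper.

```shell
#!/bin/sh
# Assumed location of the UniRef90 FASTA on the UniProt download server;
# verify it before use.
UNIREF90_URL="https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz"

# Hypothetical helper: print the download command (dry run) instead of
# running it, since the file is very large.
uniref90_fetch_cmd() {
    dest=${1:-uniref90.fasta.gz}
    # -C - resumes an interrupted download; -o sets the output file
    echo "curl -C - -o $dest $UNIREF90_URL"
}

# Print the command only; pipe the output to `sh` to start the download.
uniref90_fetch_cmd
```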


I'll probably use it if I can't get the other option to work, thanks for pointing that out.
