Picard: EstimateLibraryComplexity -> OutOfMemoryError
9.9 years ago

I want to run EstimateLibraryComplexity.jar on a 9.8 GB BAM file, but I always get an OutOfMemoryError. I have already tried raising -Xmx (up to 60 GB) and still get the error. Does anybody have an idea how to run EstimateLibraryComplexity on larger BAM files?

Here is my command and the error message:

$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity

[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity    MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO    2014-06-04 21:43:08     EstimateLibraryComplexity       Will store 15494157 read pairs in memory before sorting.
INFO    2014-06-04 21:43:13     EstimateLibraryComplexity       Read     1,000,000 records.  Elapsed time: 00:00:05s.  Time for last 1,000,000:    5s.  Last read position: chr10:38,239,480
....
INFO    2014-06-04 21:53:21     EstimateLibraryComplexity       Read    30,000,000 records.  Elapsed time: 00:10:13s.  Time for last 1,000,000:  183s.  Last read position: chr15:34,522,127
[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.String.substring(String.java:1913)
        at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
        at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
        at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
        at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
        at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)

And this is the Java version:

$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

EDIT: I also posted this question at SEQanswers!

java bam picard • 5.5k views

This smacks of a bug in the program. Especially since it happened after over an hour of runtime. What version of picard tools are you running?


Picard version: 1.114


Thanks! Now I see it was waaaay over to the right in your original post! >.<


Out of curiosity, did it actually max out the space allocated when you used -Xmx60g?


I don't know any more. But when I check the used memory for the run above, it looks like it only used ~5GB (Runtime.totalMemory()=5801312256), doesn't it?
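For reference, the byte count in the log can be converted as follows. This is only a small sketch: the helper name `bytes_to_gib` is made up for illustration. Note that `Runtime.totalMemory()` reports the heap the JVM currently has reserved, not the `-Xmx` ceiling, so the figure shows how much was held at exit rather than how much was allowed.

```shell
# Convert a byte count into GiB (1 GiB = 1024^3 bytes).
# bytes_to_gib is a hypothetical helper name, not part of Picard or Java.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Value copied from the Runtime.totalMemory() line in the question's log.
bytes_to_gib 5801312256   # prints 5.40
```

So roughly 5.4 GiB was held, well below the 10 GB allowed by `-Xmx10g` in that run.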


Indeed, this sounds like a bug. You might post a message to the samtools-help email list and see if one of the authors has run into this (if not, it looks like there's a bug report to be filed).


Here's another possibility: your tmp location is being filled up by the operation, so the error is actually triggered when you run out of swap disk. Do you mind checking the location of your /tmp/ folder, and the amount of free space on its host volume?

In the past I've resolved this by symlinking /tmp to a folder on a large volume.
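A quick way to do that check, as a rough sketch: the helper name `free_kib` is invented here, and `/tmp` is assumed to be the temporary location in use, which may not hold on every system.

```shell
# Print the free space (in KiB) on the volume that holds a directory.
# -P forces the portable one-line-per-filesystem layout, -k reports KiB.
# free_kib is a hypothetical helper name used only for this example.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Assumption: /tmp is where the temporary files are written.
echo "free KiB on the /tmp volume: $(free_kib /tmp)"
```

Running it a few times while the job is in progress shows whether the free space is shrinking towards zero.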


I tracked the free space of the volume and the size of the /tmp/ folder and both are far away from being filled up. But thanks for the idea... was worth a try.


Have you tried raising the MIN_IDENTICAL_BASES parameter to something like 10 or even 15? With a BAM file that size, it actually makes sense that you would run out of memory during the sort step.
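As a sketch of what that would look like, the snippet below only prints the command so it can be checked before a long run. The file names and `-Xmx10g` are taken from the question, `MIN_IDENTICAL_BASES` appears in the question's log with a default of 5, and the helper name `picard_cmd` is made up for this example.

```shell
# Dry run: build the command from the original post with a higher
# MIN_IDENTICAL_BASES and print it instead of executing it.
# picard_cmd is a hypothetical helper; file names come from the question.
picard_cmd() {
    min_bases="$1"
    printf 'java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity MIN_IDENTICAL_BASES=%s\n' "$min_bases"
}

# 10 is the lower of the two values suggested above.
picard_cmd 10
```

Once the printed line looks right, it can be pasted into the terminal or piped to `sh`.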


Hi David,

I know it's been a long time since you posted this thread.

I was curious how the error was resolved.

Could you please update the thread?

Thanks


Sorry, but there is no update. I just stopped using PicardTools.

21 months ago
Sumaya • 0

Try creating a temporary directory for Picard's working files and then pass the path of this directory with the option --TMP_DIR. I hope this helps, even so long after this thread was posted!
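A minimal sketch of that idea follows. The directory location is a placeholder to adjust, and the command is only printed, not executed. It uses the `KEY=value` argument style seen in the question's Picard 1.114 log, on the assumption that `TMP_DIR` is accepted there in the same way; the `--TMP_DIR` spelling belongs to the newer command-line syntax.

```shell
# Hypothetical location; point PICARD_TMP at a volume with plenty of free space.
PICARD_TMP="${PICARD_TMP:-$PWD/picard_tmp}"
mkdir -p "$PICARD_TMP"

# Dry run: print the command from the question with a TMP_DIR argument added.
# Assumption: Picard 1.114 takes TMP_DIR in the same KEY=value form as INPUT.
printf 'java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity TMP_DIR=%s\n' "$PICARD_TMP"
```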
