Picard MarkDuplicates Error: Requested Array Size Exceeds VM Limit
10.9 years ago
xiaoyanli82 ▴ 10

Hi, everyone,

I am working with wheat exome data, which are Illumina PE 100 reads. I analysed the data following the GATK pipeline. Reads were aligned to the reference using BWA and the BAM files were sorted with samtools. But when I tried to mark PCR duplicates using Picard, I got an error:

 java -Xms16g -Xmx256g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT

Error: Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit

I also tried another option:

java -Xmx160g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT

Error: java.lang.OutOfMemoryError: Java heap space

The BAM file is 4 GB, the Java version is 1.7.0_11, and Picard is version 1.92. What can I do to solve this problem?

Thank you !

Xiaoyan

10.9 years ago
Jordan ★ 1.3k

I'm just going to guess here. Did you try increasing your PermGen space? If your PermGen space is too small, increasing the heap will not help, no matter how large you make it.

Try this:

-XX:MaxPermSize=1g

Or however much you think is necessary.

It should be "-XX:MaxPermSize=1g", but I don't think this is the issue. Trying to allocate so much memory is the problem, IMO.

Oh yeah, sorry about that. Fixed it. And it could be like you said too. But I'm curious to know how much RAM his system actually has!

The Linux system has 1024 GB of RAM.

Sounds good then. What OS and JVM implementation are you running? And what is the output if you specify -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+PrintGCDetails -XX:+PrintGCTimeStamps?
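
A rough sketch of how that information could be gathered, in case it helps. The first part reports the OS and JVM; the second is a dry run that only prints the second command from the question with the GC flags added. `GC_FLAGS` and `gc_debug_cmd` are hypothetical names, and the flags are the Java 7 era HotSpot options quoted above (newer JDKs have dropped some of them).

```sh
# Report the OS and, if java is on the PATH, the JVM implementation.
uname -sr
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 || true   # the version banner goes to stderr
fi

# GC flags quoted from the comment above (Java 7 era HotSpot options).
GC_FLAGS='-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+PrintGCDetails -XX:+PrintGCTimeStamps'

# Dry run: print the second command from the question with the GC flags added.
gc_debug_cmd() {
    printf 'java -Xmx160g %s -jar MarkDuplicates.jar ' "$GC_FLAGS"
    printf 'I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup '
    printf 'REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 '
    printf 'VALIDATION_STRINGENCY=LENIENT\n'
}
gc_debug_cmd
```

JVM options have to come before `-jar`; anything after the jar name is passed to Picard as a tool argument.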

10.9 years ago
NB ▴ 960

I had the same problem with human genomes, and adding "MAX_RECORDS_IN_RAM=5000000" helped me. Maybe you can try that; here's the command that I use:

 java -Xmx16g -jar MarkDuplicates.jar    \
    I=IN.bam \
    O=OUT.bam \
    METRICS_FILE=dupmetrics.txt \
    REMOVE_DUPLICATES=true \
    MAX_RECORDS_IN_RAM=5000000 \
    ASSUME_SORTED=true \
    VALIDATION_STRINGENCY=SILENT \
    TMP_DIR=$TMPDIR \
    CREATE_INDEX=true \
    OPTICAL_DUPLICATE_PIXEL_DISTANCE=10
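
Applied to the command in the original question, the same idea might look like the sketch below. It is a dry run that only prints the command. The file names are the placeholders from the question, the 16g heap and MAX_RECORDS_IN_RAM=5000000 are taken from the command above, and `markdup_cmd` is a hypothetical helper name. Nothing here establishes whether the smaller heap or the record cap is what makes the difference, so both are left as parameters.

```sh
# Dry-run sketch: prints the MarkDuplicates command instead of running it.
# $1 = JVM heap size (e.g. 16g), $2 = MAX_RECORDS_IN_RAM value
markdup_cmd() {
    printf 'java -Xmx%s -jar MarkDuplicates.jar ' "$1"
    printf 'I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup '
    printf 'REMOVE_DUPLICATES=false MAX_RECORDS_IN_RAM=%s ' "$2"
    printf 'MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 '
    printf 'VALIDATION_STRINGENCY=LENIENT\n'
}

# Print the command with the values from the answer above.
markdup_cmd 16g 5000000
```

Once the printed line looks right, it can be pasted into the shell to run it for real.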

The error is due to Java heap space. The user tells the tool that it has 256 GB of RAM, which I don't think is normal, so I don't think "MAX_RECORDS_IN_RAM" is the issue.

I tried again following your suggestion and the problem was fixed. Thank you very much!

10.9 years ago

Does your computer have 256 GB of RAM? I ask because you specified that on your command line with -Xmx256g. If your BAM file is 4 GB, I wouldn't bother changing that parameter at all and would just go with the default.
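
One quick way to check this on a Linux box is to compare the requested heap against the physical memory reported in /proc/meminfo. This is only a sketch: `size_to_gib`, `total_ram_gib` and `heap_fits` are hypothetical helper names, only the `g` and `m` suffixes are handled, and it assumes /proc/meminfo is available.

```sh
# Convert a JVM-style size (e.g. 256g or 16384m) to whole GiB.
size_to_gib() {
    case "$1" in
        *[gG]) echo "${1%[gG]}" ;;
        *[mM]) echo $(( ${1%[mM]} / 1024 )) ;;
        *) echo "unsupported size: $1" >&2; return 1 ;;
    esac
}

# Total physical memory in whole GiB, read from /proc/meminfo (Linux only).
total_ram_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# heap_fits REQUESTED TOTAL_GIB: succeeds if the requested heap is <= total.
heap_fits() {
    [ "$(size_to_gib "$1")" -le "$2" ]
}

if [ -r /proc/meminfo ]; then
    if heap_fits 256g "$(total_ram_gib)"; then
        echo "-Xmx256g fits in physical RAM"
    else
        echo "-Xmx256g exceeds physical RAM"
    fi
fi
```

Fitting in physical RAM only rules out one cause; a heap that fits can still hit the "Requested array size exceeds VM limit" error, because a single Java array is capped in length regardless of heap size.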

Yes, the system has 1024g RAM.
