java.lang.OutOfMemoryError: Java heap space not solved with -Xmx
5.3 years ago
mb2subi ▴ 10

Hi, I am trying to re-synchronize a set of paired-end FASTQ reads using repair.sh, but the following error arises:

Set INTERLEAVED to false
Started output stream.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.regex.Pattern.compile(Pattern.java:1718)
    at java.util.regex.Pattern.<init>(Pattern.java:1351)
    at java.util.regex.Pattern.compile(Pattern.java:1028)
    at java.lang.String.split(String.java:2380)
    at java.lang.String.split(String.java:2422)
    at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:723)
    at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:549)
    at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:311)
    at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:237)
    at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:46)

The files are 40G each.

I tried to solve it by setting -Xmx to 2g, 10g, 20g, 80g and 300g, but none of these worked. With larger values the output files grew (from 1.5G to 5.9G), but the software still crashed in the end.

I used another set of data (1g each) and it worked correctly.

bbmap repair.sh paired-end OutOfMemoryError Xmx • 3.5k views

With files that large (I assume they are compressed) there is not much you can do, except perhaps split the files up and process them in several runs. repair.sh needs a large amount of memory since it has to keep a lot of information available.
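To illustrate why memory use can grow with file size, here is a minimal sketch of name-based re-pairing. It is not the BBTools implementation; `repair`, `pending` and the toy records are hypothetical names used only for illustration. A read whose mate has not been seen yet has to wait in memory, so the further apart the mates are, the more reads are held at once.

```python
def repair(records):
    """Pair up (name, record) tuples by name.

    Returns (pairs, singletons, peak), where peak is the largest number
    of reads that had to be held in memory while waiting for a mate.
    """
    pending = {}   # name -> record still waiting for its mate
    pairs = []
    peak = 0
    for name, record in records:
        mate = pending.pop(name, None)
        if mate is not None:
            pairs.append((mate, record))
        else:
            pending[name] = record
            peak = max(peak, len(pending))
    return pairs, list(pending.values()), peak


# Mates adjacent: only one read ever waits.
in_sync = [("a", "a1"), ("a", "a2"), ("b", "b1"), ("b", "b2")]
# Mates far apart, one mate missing: several reads wait at once.
out_of_sync = [("a", "a1"), ("b", "b1"), ("c", "c1"), ("c", "c2"), ("a", "a2")]

print(repair(in_sync)[2])      # 1
print(repair(out_of_sync)[2])  # 3
```

In this toy model the waiting set scales with how far out of order the input is, which would be consistent with larger -Xmx values getting further through the files before failing.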

BTW: How did the files get out of sync in the first place? If that happened because they were trimmed independently, then I would go back and redo the trimming with the files processed as a pair.


The sequencing facility sent the files already out of sync.


Then ask them to provide the original data files and do the trimming yourself (I assume trimming is what caused the out-of-sync issue?).


The point is that it won't be easy to ask them. For this reason I am first trying to resolve it myself; if that is not possible, asking them will be the last option.


If this is your data then you have every right to get a copy of the original. That said, if 300G is the maximum memory you have available (and it did not work), you could try splitting the original files into 2+ pieces to see what maximum size works. You will have to do some bookkeeping to make sure you end up with all reads (and no duplicates).
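One caveat when splitting: if the files are cut by position, the two mates of a pair may land in different pieces and never be re-paired. A possible way around that, sketched below under stated assumptions, is to assign each read to a piece by a hash of its name, so both mates always go to the same piece and every read is written exactly once. The helper names (`read_key`, `bucket_of`, `partition_fastq`) and the output naming scheme are hypothetical, and the sketch assumes well-formed 4-line FASTQ records whose mates share a name apart from an optional /1 or /2 suffix or a trailing comment field.

```python
import gzip
import zlib
from itertools import islice


def open_maybe_gzip(path, mode):
    """Open a plain or gzip-compressed text file, chosen by extension."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def read_key(header):
    """Normalise a FASTQ header so both mates of a pair give the same key."""
    name = header[1:].split()[0]       # drop '@' and any comment field
    if name.endswith(("/1", "/2")):    # old-style mate suffix
        name = name[:-2]
    return name


def bucket_of(header, n_buckets):
    """Deterministic bucket index; crc32 is stable across runs, unlike hash()."""
    return zlib.crc32(read_key(header).encode()) % n_buckets


def partition_fastq(path, prefix, n_buckets):
    """Write every 4-line record of `path` to exactly one of n bucket files."""
    outs = [open(f"{prefix}.{i}.fq", "w") for i in range(n_buckets)]
    try:
        with open_maybe_gzip(path, "r") as fh:
            while True:
                record = list(islice(fh, 4))
                if not record:
                    break
                if len(record) < 4:
                    raise ValueError(f"truncated FASTQ record in {path}")
                outs[bucket_of(record[0], n_buckets)].writelines(record)
    finally:
        for out in outs:
            out.close()
```

Running `partition_fastq` once on each input file with the same `n_buckets` should give matching bucket pairs (bucket 0 of the first file with bucket 0 of the second, and so on). Each pair could then be passed to repair.sh separately and the outputs concatenated. Because every record goes to exactly one bucket, comparing the summed read counts of the buckets against the original file is a simple check that nothing was lost or duplicated.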


Thanks. By the way, isn't it possible to do something about the heap space?


How can I properly split the fq.gz files for the resynchronization?
