Biostar Beta. Not for public use.
java.lang.OutOfMemoryError: Java heap space not solved with -Xmx
14 months ago
mb2subi • 0

Hi, I am trying to re-synchronize a set of paired-end FASTQ reads using repair.sh, but the following error arises:

Set INTERLEAVED to false
Started output stream.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.regex.Pattern.compile(Pattern.java:1718)
    at java.util.regex.Pattern.<init>(Pattern.java:1351)
    at java.util.regex.Pattern.compile(Pattern.java:1028)
    at java.lang.String.split(String.java:2380)
    at java.lang.String.split(String.java:2422)
    at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:723)
    at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:549)
    at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:311)
    at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:237)
    at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:46)

The files are 40G each.

I tried to solve it by setting -Xmx to 2g, 10g, 20g, 80g and 300g, but none of these worked. With larger values the output files grew (from 1.5G to 5.9G), but the software still crashed in the end.

I used another set of data (1G each) and it worked correctly.


With files that large (compressed, I assume) there is not much you can do except perhaps split the files up and run repair.sh on the pieces separately. repair.sh needs a large amount of memory because it has to keep a lot of read information available while it works.

BTW: How did the files get out of sync in the first place? If that happened because they were trimmed independently, I would go back and redo the trimming with the files paired.


The facility sent the files out of sync.


Then ask them to provide the original data files and do the trimming yourself (I assume trimming is what caused the out-of-sync issue?).


The point is that it won't be easy to ask them. For this reason I'm first trying to resolve it myself; if that's not possible, the last option will be to ask them.


If this is your data then you have every right to get a copy of the original. That said, if 300G is the maximum memory you have available (and it did not work), you could try splitting the original files into 2+ pieces and see what maximum size works. You will have to do some bookkeeping to make sure you end up with all reads (and no duplicates).
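
One way to do the bookkeeping mentioned above is to compare record counts before and after splitting. The following is only a minimal sketch: it assumes gzip-compressed FASTQ with exactly four lines per record, and the function names `count_records` and `pieces_account_for_input` are made up for illustration.

```python
import gzip


def count_records(path: str) -> int:
    """Count FASTQ records in a gzip file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as fh:
        lines = sum(1 for _ in fh)
    if lines % 4:
        raise ValueError(f"{path}: {lines} lines is not a multiple of 4")
    return lines // 4


def pieces_account_for_input(original: str, pieces: list) -> bool:
    """True if the pieces together hold as many records as the original."""
    return count_records(original) == sum(count_records(p) for p in pieces)
```

Matching totals only show that nothing was lost or duplicated in aggregate. A dropped read that happens to be offset by a duplicated one would go unnoticed, so treat this as a first sanity check rather than proof.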


Thanks. By the way, isn't it possible to do something about the heap space?


How can I properly split the fq.gz files for the resynchronization?
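
One possible approach, sketched here and not taken from the repair.sh documentation: cutting the files by position does not help when the mates are out of order, because a read and its mate can land in different pieces. Assigning each read to a bucket by a hash of its name sends both mates to the same bucket number. The sketch assumes four-line FASTQ records and that mate names differ only by a trailing /1 or /2 or by the comment after the first whitespace. The names `read_key` and `split_by_name` are hypothetical.

```python
import gzip
import zlib


def read_key(header: str) -> str:
    """Return the part of a FASTQ header line shared by both mates."""
    name = header[1:].split()[0]     # drop '@' and any comment after whitespace
    if name.endswith(("/1", "/2")):  # old-style mate suffix
        name = name[:-2]
    return name


def split_by_name(fastq_gz: str, prefix: str, n_buckets: int) -> list:
    """Write each record to <prefix>.<i>.fq.gz, i = crc32(key) % n_buckets.

    Returns the number of records written to each bucket.
    """
    outs = [gzip.open(f"{prefix}.{i}.fq.gz", "wt") for i in range(n_buckets)]
    counts = [0] * n_buckets
    try:
        with gzip.open(fastq_gz, "rt") as fh:
            while True:
                record = [fh.readline() for _ in range(4)]
                if not record[0]:    # end of file
                    break
                bucket = zlib.crc32(read_key(record[0]).encode()) % n_buckets
                outs[bucket].writelines(record)
                counts[bucket] += 1
    finally:
        for out in outs:
            out.close()
    return counts
```

Run it once on each of the two input files with the same `n_buckets`, then run repair.sh on bucket 0 of both files, bucket 1 of both files, and so on. `zlib.crc32` is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, so two separate runs would not agree on bucket numbers. The per-bucket counts double as bookkeeping: they should sum to the number of reads in the input.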
