Picard Tools MarkDuplicates 'No group 1' Error
10.8 years ago
bruce.moran ▴ 960

Hi all,

I am getting a funny error and can't figure out what the issue is:

[Mon Jun 24 17:29:31 IST 2013] Executing as bmoran@compute04 on Linux 2.6.32.59-0.7-default amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_02-b13; Picard version: 1.93(1476)
INFO    2013-06-24 17:29:31     MarkDuplicates  Start of doWork freeMemory: 1133547176; totalMemory: 1140457472; maxMemory: 16924540928
INFO    2013-06-24 17:29:31     MarkDuplicates  Reading input file and constructing read end information.
INFO    2013-06-24 17:29:31     MarkDuplicates  Will retain up to 67160876 data points before spilling to disk.
[Mon Jun 24 17:29:32 IST 2013] net.sf.picard.sam.MarkDuplicates done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=1140457472
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1
        at java.util.regex.Matcher.group(Matcher.java:487)
        at net.sf.picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:94)
        at net.sf.picard.sam.MarkDuplicates.buildReadEnds(MarkDuplicates.java:488)
        at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:413)
        at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:161)
        at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
        at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:145)

I don't know Java, so any help is appreciated. From Googling, this generally looks like a find/replace (regex) error, but I can't see where that happens in MarkDuplicates. I give it a regex for the read names (format: HWI-ST0988:234:C1MAUXR:3:1131:7638:2239), and as far as I know that regex is valid and correct. The input is output from the latest stable version of STAR (2.3.0), which Picard can apparently read.

java -jar ~/bin/picard-tools-1.93/MarkDuplicates.jar I=./Aligned.out.sam O=./Aligned.rmdup.sam M=./pic.met REMOVE_DUPLICATES=TRUE AS=TRUE READ_NAME_REGEX="[a-zA-Z0-9-]+:[0-9]+:[a-zA-Z0-9]+:[0-9]+:[0-9]+:[0-9]+:[0-9]+" VALIDATION_STRINGENCY=LENIENT

Thanks,
Bruce.

markduplicates picard-tools • 3.7k views
10.8 years ago

You need to include capture groups (i.e. parentheses) in the READ_NAME_REGEX parameter. Picard reads the tile, x coordinate and y coordinate from three groups, in that order.

Example:

"[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*"

Hi Pierre, thanks for the answer. I thought it was something horrendously complicated and had tried all combinations of the regex I gave above, but the parentheses caught me out. They are needed around the numeric fields ([0-9]+) but not the alphanumeric ones. I appreciate your response. Bruce.
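
For anyone who hits the same trace, here is a minimal Java sketch of why the original regex fails and a grouped version that works for the read-name format in the question. It is not Picard's code: the class name `ReadNameRegexDemo` and the helper `tileXY` are hypothetical, and using `matches()` rather than `find()` is an assumption. The only Picard-specific idea borrowed is that group 1, 2 and 3 are read as tile, x and y.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadNameRegexDemo {
    // Regex from the question: it matches the read name but has no capture groups.
    static final String UNGROUPED =
        "[a-zA-Z0-9-]+:[0-9]+:[a-zA-Z0-9]+:[0-9]+:[0-9]+:[0-9]+:[0-9]+";

    // Same pattern with parentheses around the last three fields (tile, x, y).
    static final String GROUPED =
        "[a-zA-Z0-9-]+:[0-9]+:[a-zA-Z0-9]+:[0-9]+:([0-9]+):([0-9]+):([0-9]+)";

    // Hypothetical helper: returns {tile, x, y} taken from groups 1..3.
    static int[] tileXY(String regex, String readName) {
        Matcher m = Pattern.compile(regex).matcher(readName);
        if (!m.matches()) {
            throw new IllegalArgumentException("read name does not match: " + readName);
        }
        // With no capture groups in the regex, group(1) throws
        // IndexOutOfBoundsException even though the match itself succeeded.
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        String readName = "HWI-ST0988:234:C1MAUXR:3:1131:7638:2239";
        try {
            tileXY(UNGROUPED, readName);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("ungrouped regex: " + e.getMessage());
        }
        int[] loc = tileXY(GROUPED, readName);
        System.out.println("tile=" + loc[0] + " x=" + loc[1] + " y=" + loc[2]);
    }
}
```

If the sketch reflects what MarkDuplicates does, passing the grouped pattern as READ_NAME_REGEX in the original command should avoid the exception; treat that as something to verify against your Picard version rather than a guarantee.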
