Hi!
I am having some problems working with GATK. First, my sequences are paired-end and I need to run them through cutadapt to trim the low-quality parts, but afterwards sequence_1 and sequence_2 ended up with different sizes.
After that I mapped them with bowtie2 and then used samtools to convert between .sam and .bam and to sort the reads.
Finally, when I run GATK, the first step (AddOrReplaceReadGroups) runs, and then I get an error saying that one or more arguments or inputs in my command are wrong. But I always use the same script to run the Java tools (whether end-to-end or paired-end) and it works.
The only thing I changed this time is that I used cutadapt, which produced two input files of different sizes (but I need to trim, because the sequences I am using are not good).
Does anyone know anything about this?
Thanks a lot
The error isn't caused by the difference in read sizes. Post the error message for more useful help.
Devon, sorry.
Now the error:
The error message says that one of the quality scores in an alignment is aberrantly high. This would normally only happen if (1) you incorrectly specified the quality encoding when performing alignments, (2) the SAM/BAM file is corrupt, or (3) one of the fastq files is corrupt or otherwise has an error. A combination of subsetting the file and grep/awk should allow you to find the problematic alignment.
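As an alternative to grep/awk, here is a minimal Python sketch for locating the problematic alignment in a SAM file. The function name, the Phred+33 offset, and the cutoff of 60 are my assumptions for illustration (the thread does not say which threshold GATK applied); it only relies on QUAL being the 11th tab-separated SAM field.

```python
def find_suspect_alignments(sam_lines, offset=33, max_phred=60):
    """Yield (line_number, read_name, highest_quality) for suspicious records.

    offset and max_phred are assumed defaults, not values taken from GATK.
    """
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # truncated or malformed record
        qual = fields[10]  # QUAL is the 11th mandatory SAM field
        if qual == "*" or not qual:
            continue  # qualities not stored for this record
        highest = max(ord(char) for char in qual) - offset
        if highest > max_phred:
            yield line_number, fields[0], highest
```

To use it, pass an open SAM file handle (for example the text output of samtools view -h) and print whatever it yields; the first reported line is the one to inspect.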
The error was thrown right at the beginning, as indicated by the ProgressMeter, so I would go with reason (1) as mentioned by Devon above. This is just my guess :-)
When you said the problem was in the quality scores, I realized that the sequences I'm working with might be in another format. That is the case: they were sequenced with Illumina 1.3+/1.5+, so the quality encoding is different (values go up to about 100). So, do you know of a program or script that converts them to the Illumina 1.8+ encoding? Thanks, guys
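A quick way to confirm which encoding a FASTQ file uses is to look at the ASCII range of its quality characters. The sketch below is a heuristic under my own assumptions (the function name and the thresholds 59, 64 and 74 are chosen for illustration): characters below ASCII 59 only occur in Phred+33 data, while a minimum of 64 or more together with characters above 'J' (ASCII 74) points to Phred+64. It returns None when the range is ambiguous.

```python
def guess_quality_offset(quality_strings):
    """Guess the Phred offset (33 or 64) from FASTQ quality strings.

    Heuristic only: returns None when there is no data or the range is ambiguous.
    """
    lowest, highest = None, None
    for quality in quality_strings:
        for char in quality:
            code = ord(char)
            if lowest is None or code < lowest:
                lowest = code
            if highest is None or code > highest:
                highest = code
    if lowest is None:
        return None  # no quality characters seen
    if lowest < 59:
        return 33  # such low characters are not valid in Phred+64 data
    if lowest >= 64 and highest > 74:
        return 64  # too high for typical Phred+33 data
    return None
```

Feeding it the quality lines (every fourth line) of the first few thousand reads is usually enough to get an answer.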
I feel like I wrote a script to do this at some point but don't know where I posted it online. In any case, the conversion is a simple decrement (subtract 31 from the ASCII value of each quality character, i.e. 64 - 33), so you should be able to write a converter easily enough. If not, you can always redo the alignments with the proper settings (bowtie2 has a --phred64 option for this).
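The decrement described above can be sketched in a few lines of Python. This is not the script mentioned in the reply, just a minimal illustration; the function names are hypothetical, and it assumes plain four-line FASTQ records (no wrapped sequence or quality lines).

```python
PHRED64_TO_PHRED33_SHIFT = 64 - 33  # 31


def convert_quality(quality):
    """Convert one Phred+64 quality string to Phred+33."""
    converted = []
    for char in quality:
        code = ord(char)
        if code < 64:
            # A character this low cannot be Phred+64, so refuse to guess.
            raise ValueError(f"{char!r} is not a valid Phred+64 quality character")
        converted.append(chr(code - PHRED64_TO_PHRED33_SHIFT))
    return "".join(converted)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the qualities) converted.

    Assumes strict four-line records; other lines pass through unchanged.
    """
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield convert_quality(line.rstrip("\r\n")) + "\n"
        else:
            yield line
```

For example, opening the input and output files and calling out.writelines(convert_fastq(inp)) converts a whole file; run it on both paired-end files so they stay consistent.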
http://www-huber.embl.de/users/anders/HTSeq/doc/sequences.html#sequences (check write_to_fastq_file)
https://github.com/lh3/seqtk