SortSam (Picard) error in identifying SAM file
5.9 years ago
shuksi1984 ▴ 60

What does SAMFormatException mean, and how do I troubleshoot it? I got the following error:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields; File SRR6876052_mem.sam; Line 1

My command is:

nohup java -jar path/to/picard.jar SortSam INPUT=SRR6876052_mem.sam OUTPUT=SRR6876052_sortsam.bam SORT_ORDER=coordinate &
software error next-gen • 2.9k views

Hello shuksi,

Could you please post the first lines of your SAM file? How did you produce this file?

fin swimmer


Here are the first few lines of the SAM file:

@SQ SN:chr1 LN:248956422
@SQ SN:chr2 LN:242193529
@SQ SN:chr3 LN:198295559
@SQ SN:chr4 LN:190214555
@SQ SN:chr5 LN:181538259
@SQ SN:chr6 LN:170805979
@SQ SN:chr7 LN:159345973
@SQ SN:chr8 LN:145138636
@SQ SN:chr9 LN:138394717

This file was produced by the following command:

 bwa mem -M -R '@RG\tID:SRR6876052\tLB:SRR6876052\tPL:ILLUMINA\tPM:HISEQ\tSM:SRR6876052' /path/to/Homo_sapiens_assembly38.fasta path/to/SRR6876052_1.fastq path/to/SRR6876052_2.fastq  > SRR6876052_mem.sam

Hello again,

Those lines starting with @ are the header lines. Besides these, we need the first few lines that don't start with @.

fin swimmer

BTW: It's better to use the code formatting button for file contents as well.


Besides these, we need the first few lines that don't start with @

How do we achieve that? Is my file corrupt?


To get the whole header:

grep "^@" SRR6876052_mem.sam

To get the first 10 lines that are not the header:

grep -m 10 -v "^@" SRR6876052_mem.sam
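The "Not enough fields" message refers to the 11 mandatory tab-separated fields that every SAM alignment line must have. As a minimal sketch (the function name `find_bad_sam_lines` and the file `example.sam` are made up for illustration), non-header lines with fewer than 11 fields can be listed like this:

```shell
# Hypothetical helper: print every line that is neither a header line (starts with @)
# nor a plausible alignment record (at least 11 tab-separated fields).
find_bad_sam_lines() {
    awk -F'\t' '!/^@/ && NF < 11 { print NR ": " $0 }' "$1"
}

# Tiny made-up file, only to show the behaviour.
printf '@SQ\tSN:chr1\tLN:248956422\n' > example.sam
printf '%s\n' '[M::process] read 66226 sequences (10000126 bp)...' >> example.sam
printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> example.sam

# Reports the offending line together with its line number.
find_bad_sam_lines example.sam
```

On the real file, `find_bad_sam_lines SRR6876052_mem.sam | head` would show the first suspicious lines.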

The command

grep -m 10 -v "^@" SRR6876052_mem.sam

gives the following output:

[M::bwa_idx_load_from_disk] read 3171 ALT contigs
[M::process] read 66226 sequences (10000126 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 27774, 2, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (309, 354, 405)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (117, 597)
[M::mem_pestat] mean and std.dev: (357.72, 74.61)
[M::mem_pestat] low and high boundaries for proper pairs: (21, 693)
[M::mem_pestat] skip orientation RF as there are not enough pairs

I don't think these should be in the SAM file.


You have redirected the bwa standard error stream (stderr) to the same location as the standard output stream (stdout). Check your bwa command, as redirects such as 2>&1 will do this.

Check your shell configuration as well, since some configurations perform this redirect automatically.


You're correct, these status messages shouldn't be in the SAM file. From the command you gave above, I cannot see why this happens.

Standard questions at this point: What version of bwa are you using? Which OS are you using?

fin swimmer


bwa version: 0.7.12-r1039; OS: Ubuntu 16.04.3 LTS (xenial)

In the command you gave above I cannot see why this happens

Is it because I used nohup?


Is it because I used nohup?

Yes, you are right. From nohup --help:

If standard error is a terminal, it is redirected to standard output.

I think you can repair your SAM file by removing all lines that start with [ (the bracket has to be escaped, otherwise grep treats it as the start of a bracket expression):

grep -v "^\[" SRR6876052_mem.sam > repaired.sam
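A quick sanity check of such a repair could look like the sketch below. The file names `contaminated.sam` and `repaired_example.sam` are invented for the demonstration, and the approach assumes that every bwa message occupies whole lines; if a message ended up in the middle of a record, re-running the alignment is probably the safer option.

```shell
# Made-up contaminated file: one header line, one leaked bwa message, one record.
printf '@SQ\tSN:chr1\tLN:248956422\n' > contaminated.sam
printf '%s\n' '[M::mem_pestat] skip orientation FF as there are not enough pairs' >> contaminated.sam
printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> contaminated.sam

# Drop every line that starts with a literal "[" (note the escaped bracket).
grep -v '^\[' contaminated.sam > repaired_example.sam

# Count the leaked lines before and after; grep -c exits non-zero on zero matches,
# hence the "|| true".
removed=$(grep -c '^\[' contaminated.sam || true)
remaining=$(grep -c '^\[' repaired_example.sam || true)
echo "removed=$removed remaining=$remaining"
```

If `remaining` is 0, the repaired file no longer contains lines beginning with `[`.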

Thank you. My command was not working because I ran it with nohup. To solve this, I put the command in a bash script (.sh file), ran that script with nohup, and redirected the error output there.
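For readers who hit the same problem, one way to keep the messages out of the SAM file even under nohup is to give stderr its own file, so that nohup has no terminal stream to re-route. The sketch below uses a stand-in script (`fake_aligner.sh`, hypothetical) instead of bwa, and the log file names are assumptions:

```shell
# Stand-in for the aligner (hypothetical): one SAM record on stdout,
# one progress message on stderr, which is how bwa mem splits its output.
cat > fake_aligner.sh <<'EOF'
#!/bin/sh
printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
printf '%s\n' '[M::process] read 1 sequences' >&2
EOF

# stdout and stderr each get their own file; stdin comes from /dev/null.
nohup sh fake_aligner.sh < /dev/null > demo.sam 2> demo.log

# With the real tool this would look like (log file name is an assumption):
# nohup bwa mem <same arguments as before> > SRR6876052_mem.sam 2> SRR6876052_mem.log &

# Number of log lines that leaked into the SAM file.
grep -c '^\[' demo.sam || true
```

The progress messages end up in `demo.log`, and `demo.sam` contains only the record.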
