Redundant @SQ Lines in BAM File
11.7 years ago

Hi all,

Does anybody have an idea why redundant @SQ lines are present in a BAM file header?

I created the bam file by the following procedure:

bowtie-build the genome
create sam file using samtools by aligning fastq files to the bowtie-build output
convert sam to bam using samtools

Those redundant lines cause the error "Cannot add sequence that already exists in SAMSequenceDictionary" when I try to add read groups using Picard AddOrReplaceReadGroups.

Deeps
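
For anyone hitting the same Picard error, a quick way to confirm that the header really contains repeated sequence names is to dump it (for example with `samtools view -H file.bam`) and count the SN values on the @SQ lines. The snippet below is only a minimal sketch: the function name `duplicate_sq_names` and the toy header are made up for illustration, and it assumes the header is already available as tab-separated text.

```python
from collections import Counter


def duplicate_sq_names(header_text):
    """Return the SN values that occur on more than one @SQ header line."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs; SN holds the name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical header text, not taken from the original poster's file.
example_header = "\n".join([
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chrY\tLN:500",
    "@SQ\tSN:chrY\tLN:500",
])

print(duplicate_sq_names(example_header))
```

Any name this returns is one that would be added to the sequence dictionary twice.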

bowtie picard • 4.1k views

What you wrote does not quite make sense: SAM files are not created by samtools, and bowtie-build does not align data. Edit your post and add the commands that you used, and perhaps a sample of what you call redundant @SQ lines.

11.7 years ago

Hi Albert,

The editor deleted the new lines between the steps. I did not notice that.

The steps are:

  1. Build the bowtie index for the genome using bowtie-build.

  2. Create the SAM file using bowtie (not samtools) by aligning the fastq files against the bowtie-build output.

  3. Convert the SAM file to BAM using samtools.

The generated BAM files contain duplicate @SQ lines in the header. I think I found the reason: one file used to build the bowtie index is a subset of another file. ChrY.fa is a part of ChrU.fa.

COMMAND: bowtie-build Chr1.fa,Chr2.fa.....ChrY.fa genomeindexbasename
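
A check like the following, run before bowtie-build, might catch this kind of overlap early. It is only a rough sketch under some assumptions: the function name `duplicated_fasta_names` is hypothetical, the FASTA contents are passed in as strings rather than read from disk, the sequence name is taken to be the first whitespace-delimited token after `>`, and the toy records are invented for the example.

```python
from collections import defaultdict


def duplicated_fasta_names(fasta_texts):
    """Map each sequence name seen more than once to the files containing it.

    fasta_texts: dict of file label -> FASTA content as a string.
    """
    seen = defaultdict(list)
    for label, text in fasta_texts.items():
        for line in text.splitlines():
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    # Assumption: the name is the first token of the header.
                    seen[fields[0]].append(label)
    return {name: files for name, files in seen.items() if len(files) > 1}


# Toy data (hypothetical): the chrY record also appears inside ChrU.fa.
toy = {
    "ChrY.fa": ">chrY\nACGT\n",
    "ChrU.fa": ">chrU_1\nGGCC\n>chrY\nACGT\n",
}

print(duplicated_fasta_names(toy))
```

If this reports any names, dropping the redundant file from the bowtie-build input list and rebuilding the index should give a header with one @SQ line per sequence.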


OK, good thing that you tracked that down. I think that would have been a bit difficult for us to troubleshoot.
