Samtools Header From A Bacterial Genome Ba File
2
0
Entering edit mode
12.8 years ago
Sakti ▴ 510

Hello dear Bio masters,

I'm analyzing some reads from a bacterial genome (E. coli), and I'm trying to get the pileup to calculate coverage for a specific region from a bam file. However, whenever I check the header of my file it has the form:

samtools view -H coli.sorted.bam
@SQ    SN:gi|49175990|ref|NC_000913.2|    LN:4639675
@PG    ID:bwa    PN:bwa    VN:0.5.9-r16

And when I try to retrieve the information from my region (1670-2550) of interest using the "chromosome name" after the SN

samtools view -u coli.sorted.bam gi|49175990|ref|NC_000913.2|:1670-2550 | samtools pileup - > pileup.txt

I get:

-bash: 49175990: command not found
-bash: ref: command not found
-bash: NC_000913.2: command not found
-bash: :190-255: command not found
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_pileup] fail to read the header: non-exisiting file or wrong format.
[main_samview] fail to get the reference name. Continue anyway

So it is not accepting the gi|49175990|ref|NC_000913.2| as a name for search. If I omit it I get:

samtools view -u coli.sorted.bam 1670-2550 | samtools pileup - > pileup.txt
[main_samview] fail to get the reference name. Continue anyway.

And my file is empty. What am I doing wrong????? Any help will be highly appreciated.

Thanks!!!

samtools pileup bacteria bam • 4.6k views
ADD COMMENT
1
Entering edit mode
12.8 years ago
Docroberson ▴ 310

I'm not sure if the problem is with samtools. The bash errors come from bash seeing the vertical pipes and trying to pipe the output of one command into another.

Not sure if this actually will fix it, but try quoting it instead:

samtools view -u coli.sorted.bam "gi|49175990|ref|NC_000913.2|:1,670-2,550" | samtools pileup - > pileup.txt

That should differentiate the name with pipes in it from the actual pipe to samtools pileup.

ADD COMMENT
0
Entering edit mode

Thank you, yes, that worked :)

ADD REPLY
1
Entering edit mode
12.8 years ago
Drio ▴ 920

In your first one, your shell is trying to expand the command prior to execute it. Also you are not passing the reference genome in the pileup. Try:

 samtools view -u -h coli.sorted.bam "gi|49175990|ref|NC_000913.2|:1670-2550" | samtools pileup -vcf ./ref.fa -

where ref.fa is the fasta file for your reference genome.

You are also missing the reference genome in your second one. Try:

 samtools view -u -h coli.sorted.bam 1670-2550 | samtools pileup -vcf ./ref.fa -
ADD COMMENT
0
Entering edit mode

Yes, that worked :)

ADD REPLY
0
Entering edit mode

If it worked, could you please mark my reply as question answered ?

ADD REPLY
0
Entering edit mode

Done :) Thanks!

ADD REPLY

Login before adding your answer.

Traffic: 1942 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6