Problems removing specific contig from .bam file
1
0
Entering edit mode
3.1 years ago
USA_225478 • 0

Hi there,

I have a contig that I’d like to remove from a bam file. I have tried removing it using picard and samtools but the darned thing is still in there.

This is what I’ve tried:

Step 1: create list of contigs I’d like to keep/remove

Keep:

samtools idxstats sampleID_markedup.bam | cut -f 1 | grep -v name_of_contig  > toKeep.txt

Remove:

cat > offendingContig.txt
name_of_contig

Step 2: Remove offending contig

Picard

Method 1: keep all but offending contig

java -jar $picard FilterSamReads I=sampleID_markedup.bam O=sampleID_trim.bam READ_LIST_FILE=toKeep.txt FILTER=includeReadList

Method 2: explicitly exclude contig

java -jar $picard FilterSamReads I=sampleID_markedup.bam O=sampleID_trim.bam READ_LIST_FILE=offendingContig.txt FILTER=excludeReadList

Samtools

samtools view -b -R toKeep.txt sampleID_markedup.bam > sampleID_trim.bam

For each method I don’t get any error messages (exit status zero and nothing written to error log), and a new .bam file is generated, but the contig is always still in there, which I check by using:

samtools view sampleID_trim.bam name_of_contig  | head -1 

Is the fact that the .bam file has already been marked for duplicates a possible issue?

Any help would be greatly appreciated.

Kind regards

samtools WGS bam genome picard • 1.7k views
ADD COMMENT
1
Entering edit mode
3.1 years ago
 samtools view in.bam `samtools idxstat in.bam | cut -f1 | grep -E -v '^(chr2|\*)$'`
ADD COMMENT
0
Entering edit mode

Hi Pierre, sorry for the late reply I've been off work for a few days.

Thank you so much, this works perfectly!

ADD REPLY
0
Entering edit mode

cool. Please, validate my answer (green tick on the left)

ADD REPLY
0
Entering edit mode

I tried to do that yesterday but the green tick appears to be missing!

ADD REPLY
0
Entering edit mode

enlarge your screen. https://imgur.com/8nvX23z

ADD REPLY
0
Entering edit mode

I've tried this using it, but I get an error saying that the command list is too long;

samtools view sample_4.sorted.bam `samtools idxstat sample_4.sorted.bam | cut -f1 | grep -E -v '^(prok_GCA_000243075.1_CP003171.1_from_11_to_747_total_737|\*)$'`
-bash: /usr/local/pace-apps/manual/packages/samtools/1.14/intel-19.0.5/bin/samtools: Argument list too long

sample_4.sorted.bam is the name of the bam file from which i need to remove the contig named "prok_GCA_000243075.1_CP003171.1_from_11_to_747_total_737".

What am I doing wrong?

ADD REPLY

Login before adding your answer.

Traffic: 2665 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6