Question: Bcftools Call method not detecting single nucleotide deletions?
0
Entering edit mode

Hi, I'm new at this and trying to construct a sequence validation pipeline.

I'm using this command to call variants in my script:

bcftools mpileup -f "$1" "$2" | bcftools call -mv -Ob -o temp_variants.bcf

When calling bam files with a double deletion, it is able to detect the variant. However, when there is only a single deletion in the bam file, the variant file is empty?

Any ideas/help would be much appreciated!

Entering edit mode
0

not clear, show us the VCF lines and the reads.

ADD REPLYlinkeditmoderate 10 months ago
Pierre Lindenbaum
120k
Entering edit mode
0

Sorry, here's the VCF lines from a file with a double deletion:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  M12336
JGI_009855  292 .   attttttttttt    attttttttt  92  .   INDEL;IDV=59;IMF=0.867647;DP=68;VDB=0.175903;SGB=-0.693145;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=11,16,24,17;MQ=60    GT:PL   1/1:119,4,0

Here's the VCF file from a bam file with a single deletion (empty):

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  M12276

And here's the read that I'm referencing:

JGI_009947  557 .   T   A,<*>   0   .   DP=5;I16=0,0,0,1,0,0,13,169,0,0,60,3600,0,0,1,1;QS=0,1,0;SGB=-0.379885;MQ0F=0   PL  13,3,0,13,3,13

It has a depth of 5, so I'm assuming there is some sort of variant call here?

ADD REPLYlinkeditmoderate 10 months ago
kennyworkman
• 0
• updated 10 months ago
Pierre Lindenbaum
120k
Entering edit mode
0

And here's the read that I'm referencing:

Still not clear. That's not a read, that another line of a vcf file.

ADD REPLYlinkeditmoderate 10 months ago
Pierre Lindenbaum
120k
Entering edit mode
0

What is considered a "read"?

ADD REPLYlinkeditmoderate 10 months ago
kennyworkman
• 0
Entering edit mode
0

Kenny, that entry that you pasted was a single nucleotide variant record - these are stored per row in the VCF/BCF file. The aligned reads are stored in your BAM file.

The fact that this deletion is in a homopolymer of T bases is worrying - these are usually false positive calls, particulary at read depths as low as 5. Are you sure that the call is genuine, i.e., was it confirmed by a companion in vitro method?

To interpret that <*>, take a look here: What does <*> mean in a vcf file?

ADD REPLYlinkeditmoderate 10 months ago
Kevin Blighe
43k
Entering edit mode
0

Hello kennyworkman ,

what are the values for "$1" and "$2"?

fin swimmer

ADD REPLYlinkeditmoderate 10 months ago
finswimmer
11k
Entering edit mode
0

Yeah, that's not clear. Just whatever .bam file and reference fasta file I pass into the script.

ADD REPLYlinkeditmoderate 10 months ago
kennyworkman
• 0

Login before adding your answer.

Powered by the version 2.0