samtools/bcftools gave me no variant, why!?
1
0
Entering edit mode
7.6 years ago
reza ▴ 300

I am trying to get variant using samtools/bcftools. I used this command (samtools mpileup –u – f ref.fa alingn.bam | bcftools call –cv - > output.bcf) but program gave me no variant.

Ref.fa is my own assembled genome and align.bam is reads to map file of my own reads to reference genome (downloaded from NCBI).

What is problem?

Ref.fa and align.bam have been created in correct way?

Running command is correct?

Are there other variant caller and variant annotation for novel assembled genome?

software error snp • 2.1k views
ADD COMMENT
2
Entering edit mode

Have you looked at the data in IGV to see where there should be a couple variants? Have you played around with thresholds to see if one of them needs to be tweaked for your data?

ADD REPLY
0
Entering edit mode
7.6 years ago
Brice Sarver ★ 3.8k

Ref.fa is my own assembled genome and align.bam is reads to map file of my own reads to reference genome (downloaded from NCBI).

This doesn't make sense, and I'm assuming the error lies here. It's unclear what you did. You made a reference genome, but then you took those same reads and mapped them back to the reference itself? You would expect an extremely low number of variants (even less if you've masked or included hets) because your sample is literally your reference and variants are only calculated relative to a reference. The -M argument (output sites where the reference is masked) argument would help alleviate this.

Alternatively, I don't understand what you downloaded from NCBI. If you or others used an additional reference to generate the BAM and then are trying to use your own reference at other points, there will be, without appropriate handling, a mismatch between the contig names in the header and the reference itself. I have a vague memory of trying something years ago using bcftools variant calling; I was provided a BAM and reference with different contig names and the outputs were empty. I can't imagine this would be the case now using htslib, but it's something to check.

As far as other approaches, the GATK or freebayes works well for non-model systems. You might need to tweak some filtering parameters depending on your dataset.

ADD COMMENT

Login before adding your answer.

Traffic: 2004 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6