Deleted: Filtering variants for generation of consensus sequence
2.9 years ago
lechu ▴ 20

Hello,

I used bcftools to generate a VCF file with a command like this:

bcftools mpileup -f reference.fa -d 8000 alignments.bam | bcftools call -mv -Ov -o calls.vcf

Now I'd like to use bcftools consensus with a genomic reference to generate a consensus sequence. I am not interested in calling rare variants. This will be done for the whole genome, and I can tolerate some level of false positives. My VCF file contains information from several samples (multiple BAM files were used), but I only want to calculate the consensus for all samples together. I expect the samples to be identical in terms of genotype, and I am not trying to find differences between them in this experiment (multiple samples are used only to increase depth).

My question is about suitable starting parameters for filtering the VCF file before generating the consensus with bcftools consensus.

I am using the command below from the bcftools manual as a starting point, but I don't fully understand the meaning of the expressions. For example, what are RPB and DV?

bcftools filter -sLowQual -g3 -G10 \
    -e'%QUAL<10 || (RPB<0.1 && %QUAL<15) || (AC<2 && %QUAL<15) || %MAX(DV)<=3 || %MAX(DV)/%MAX(DP)<=0.3' \
    calls.vcf.gz

My criteria for what I would like to include in my consensus are as follows:

  1. A depth of at least 10 (this refers to the cumulative depth from all samples, so if I have 10 samples, each contributing only DP=1 at a position, this criterion would be satisfied)
  2. A variant frequency of at least 0.7
  3. Some quality filtering (not sure what would be a good value to start).
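
To make the three criteria concrete, here is a minimal sketch of them as a plain awk filter over an uncompressed VCF. It is only an illustration, not a recommended pipeline. The function name `filter_for_consensus` is hypothetical, the QUAL threshold of 20 is an arbitrary starting value, and the sketch assumes that INFO/DP holds the depth summed over all samples and that INFO/DP4 holds the ref-forward, ref-reverse, alt-forward, alt-reverse read counts, as bcftools mpileup and call normally write them.

```shell
# Hypothetical helper: keep only VCF records that meet all three criteria.
# Assumptions (not from the original post): INFO/DP is the depth summed over
# all samples, INFO/DP4 is "ref-fwd,ref-rev,alt-fwd,alt-rev", and QUAL >= 20
# is an arbitrary starting threshold.
filter_for_consensus() {
    awk -F'\t' -v min_dp=10 -v min_af=0.7 -v min_qual=20 '
        /^#/ { print; next }                  # pass header lines through unchanged
        {
            dp = 0; ref = 0; alt = 0
            n = split($8, info, ";")          # column 8 is the INFO field
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^DP=/)  dp = substr(info[i], 4) + 0
                if (info[i] ~ /^DP4=/) {
                    split(substr(info[i], 5), d, ",")
                    ref = d[1] + d[2]         # reads supporting the reference allele
                    alt = d[3] + d[4]         # reads supporting the alternate allele
                }
            }
            # criterion 3 (quality), criterion 1 (depth), criterion 2 (frequency)
            if ($6 + 0 >= min_qual && dp >= min_dp &&
                ref + alt > 0 && alt / (ref + alt) >= min_af) print
        }'
}
```

It would be used as `filter_for_consensus < calls.vcf > filtered.vcf`. The same thresholds can probably be written directly as a bcftools expression, along the lines of `bcftools view -i 'QUAL>=20 && INFO/DP>=10 && (DP4[2]+DP4[3])/(DP4[0]+DP4[1]+DP4[2]+DP4[3])>=0.7' calls.vcf`, but that should be checked against the EXPRESSIONS section of the manual. Note that DP4 counts only high-quality bases, so its sum can be lower than INFO/DP.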

It would be great to have some tips from you on how to set up this filtering.

Cheers, Lech

vcf snp consensus variants bcftools • 561 views