Problems with how bcftools consensus and vcf-consensus handle missing characters
3.0 years ago

I want to build a species-level phylogeny from an exome dataset in which multiple individuals were sampled for each species. After the mpileup step I generated one VCF file per species. However, when I run the vcf2phylip.py script, it splits the sequences at the individual level again, so the phylogeny ends up with a separate tip for every sampled individual instead of one tip per species.
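To make the species-level output I am after concrete, here is a minimal sketch in Python. It is not part of vcf2phylip.py; the function name `species_consensus` and the majority-rule collapsing are my own assumptions for illustration. It takes already-aligned, equal-length sequences for the individuals of one species and returns a single sequence, ignoring "N" when counting bases.

```python
from collections import Counter


def species_consensus(sequences, missing="N"):
    """Collapse aligned per-individual sequences into one species-level sequence.

    Hypothetical helper (an assumption, not taken from any existing tool).
    Each column takes the most common non-missing base; a column where every
    individual is missing stays missing.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*sequences):
        # Count only observed bases so missing data never outvotes real calls.
        counts = Counter(base for base in column if base != missing)
        if not counts:
            consensus.append(missing)
            continue
        best = max(counts.values())
        # Break ties alphabetically so the result is deterministic.
        consensus.append(min(b for b, c in counts.items() if c == best))
    return "".join(consensus)


individuals = ["ACGT", "ACNT", "ATNT"]
species_sequence = species_consensus(individuals)
```

Majority rule is only one possible way to collapse individuals; it discards within-species polymorphism, which may or may not be acceptable for a given analysis.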

Where the sequencing has missed positions relative to the reference, these tools (bcftools consensus and vcf-consensus) replace what should be the missing character "N" with the corresponding reference bases when I create a FASTA file from the VCF. This alters the distance matrix and makes each sample appear more closely related to the reference than it actually is. How do I fix this?
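To show the behaviour I expect, here is a minimal Python sketch of a consensus builder that writes "N" at sites with a missing genotype instead of falling back to the reference base. It is a toy illustration, not how bcftools consensus or vcf-consensus work internally. The function name `consensus_with_missing` and the simplified record tuples are assumptions; it handles single-sample, biallelic SNVs only, and treats any genotype carrying the alternate allele as the alternate base.

```python
def consensus_with_missing(reference, records, missing_char="N"):
    """Apply one sample's SNV calls to a reference sequence.

    Hypothetical helper (an assumption for illustration).
    records: iterable of (pos, ref, alt, genotype) with 1-based positions and
    genotype strings such as "0/0", "0/1", "1/1" or "./.".
    """
    seq = list(reference)
    for pos, ref, alt, genotype in records:
        idx = pos - 1
        if reference[idx] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        alleles = genotype.replace("|", "/").split("/")
        if "." in alleles:
            # Missing genotype: mark as unknown rather than keeping the reference.
            seq[idx] = missing_char
        elif "1" in alleles:
            # Simplification: heterozygous calls also take the alternate base.
            seq[idx] = alt
        # Homozygous-reference calls leave the reference base in place.
    return "".join(seq)


reference = "ACGTACGT"
records = [
    (2, "C", "T", "1/1"),  # homozygous alternate
    (4, "T", "G", "./."),  # missing genotype
    (6, "C", "A", "0/0"),  # homozygous reference
]
fasta_sequence = consensus_with_missing(reference, records)
```

With this input, position 4 comes out as "N" rather than the reference "T", so it would no longer count as identical to the reference in a distance calculation. Positions with no VCF record at all still keep the reference base in this sketch, so uncovered regions would need separate masking.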

missingcharacters bcftools vcf-consensus • 856 views