5.4 years ago
dorarinyo88
Hello, I ran two types of variant calling, Freebayes and Mpileup, on several individuals. With Freebayes, one of the individuals was successfully converted into a VCF file, but its file size is larger than that of the others. How can I validate whether this is acceptable or not? Can I compare it against the Mpileup VCF files? Thanks.
This is not very specific. You do not "convert an individual to a vcf file". This sounds like you have a big machine in the lab into which you put a complete human, and a variant file is printed out on the other side. Rather, you should write that you "have performed alignment (using bwa? be specific), and have performed variant calling using this and that command" (be as complete as possible!).
File size is a bad metric to compare the content. You could count lines, to get a better idea of the number of variants. And if you insist on using file size (bad idea, again) you should at least tell us the file sizes. We are very bad at reading your mind or what's on your screen.
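To make the line-counting suggestion concrete, here is a minimal Python sketch that counts the variant records (non-header lines) in a VCF file. The function name `count_variants` and the file names in the usage comment are made up for illustration, and it assumes a plain-text or gzip-compressed VCF.

```python
import gzip


def count_variants(path):
    """Count variant records in a VCF: every non-empty line that is not a header."""
    # Assumption: files ending in ".gz" are gzip/bgzip-compressed, everything else is plain text.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )


# Hypothetical usage, one file per individual:
# for name in ["sample1.vcf", "sample2.vcf.gz"]:
#     print(name, count_variants(name))
```

Comparing these counts across individuals says much more than comparing file sizes, since header length and annotation fields also affect the size.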
Please make things as easy as possible for us to help you as fast as possible :-)
Hello dorarinyo88,
by "Mpileup", do you mean the output of samtools mpileup or bcftools mpileup? This would not be your final variant list. For that, a bcftools call step is necessary.
With the default parameters, freebayes outputs a lot of clear false positive variants. You can safely remove all variants with a QUAL value below 1 before comparing.
For validation you always need something to compare to that you trust. If you have good experience with another variant caller, you can of course take its result as the comparison. A great tool for comparison that I found a few weeks ago is hap.py.
fin swimmer
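As a rough illustration of the QUAL filter mentioned above, the following Python sketch keeps header lines and drops every record whose QUAL (the sixth VCF column) is below a threshold. The name `filter_by_qual` is invented for this example, and dropping records with a missing QUAL (".") is an assumption of the sketch, not something the thread prescribes.

```python
def filter_by_qual(lines, min_qual=1.0):
    """Yield header lines unchanged, plus records with QUAL >= min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            # Blank or malformed line: skip it.
            continue
        qual = fields[5]  # QUAL is the 6th column of a VCF record
        # Assumption: a missing QUAL (".") is treated as failing the filter.
        if qual != "." and float(qual) >= min_qual:
            yield line


# Hypothetical usage:
# with open("freebayes.vcf") as src, open("freebayes.q1.vcf", "w") as dst:
#     dst.writelines(filter_by_qual(src, min_qual=1.0))
```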
Hello there,
Thanks for the reply.
By "Mpileup" I mean the output of samtools mpileup. I'm confident in the result produced by samtools mpileup. I just wanted to know whether the VCF file produced by Freebayes is likely to be similar to the samtools mpileup VCF file. Is there any way to validate it? I will try the tools you have proposed. Thank you so much.
Hello again,
It doesn't make sense to compare the output of samtools mpileup and freebayes. freebayes gives you the final list of variants, whereas mpileup is a collection of information about every covered position, which is used by bcftools call to decide which positions should be investigated for a variant.
fin swimmer
Terminology and difficulties aside, I like to use
to assess the number of calls and quality of my VCF files.
Also have a look at
to compare call sets.
SnpSift is also great.
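For a quick first look before reaching for a dedicated tool, comparing two call sets can be sketched in a few lines of Python. This is only a naive comparison on exact (CHROM, POS, REF, ALT) matches; the function names are made up for illustration, and it does no variant normalization, so the same indel written in two different ways will be counted as a mismatch.

```python
def load_calls(lines):
    """Collect (chrom, pos, ref, alt) tuples from the lines of a VCF."""
    calls = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list several ALT alleles separated by commas.
        for allele in alt.split(","):
            calls.add((chrom, int(pos), ref, allele))
    return calls


def compare_calls(calls_a, calls_b):
    """Count calls shared by both sets and calls private to each set."""
    return {
        "shared": len(calls_a & calls_b),
        "only_a": len(calls_a - calls_b),
        "only_b": len(calls_b - calls_a),
    }


# Hypothetical usage:
# with open("freebayes.vcf") as a, open("bcftools_call.vcf") as b:
#     print(compare_calls(load_calls(a), load_calls(b)))
```

A large "only_a" or "only_b" count is a hint to look closer, not a verdict; genotypes and differing representations of the same variant are ignored here.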