Is there a way to filter a VCF (input stream) on variants where at least a specific subset of the samples in the VCF have a heterozygous or homozygous genotype containing the minor allele?
Or the same filter but then for genotypes containing a non-major allele?
I looked at the bcftools and SnpSift filter documentation and I could not find how to do that in one of these tools. Maybe I overlooked the option or combination of options that I should use?
The closest options that I found in SnpSift are:
isHom( GEN[0] )
isHet( GEN[0] )
isVariant( GEN[0] )
isRef( GEN[0] )
But SnpSift does not have an isMinorAlleleGenotype( GEN[0] ) function.
Is there another tool that can do this?
Or am I best off implementing this myself using a VCF library?
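To make that last option concrete, here is a minimal sketch of the kind of streaming filter I have in mind, written in plain Python without any VCF library. Everything in it is my own assumption rather than an existing tool: the function names `parse_gt`, `carries_non_major` and `filter_stream` are hypothetical, the major allele is taken to be the most frequent allele among the called genotypes of all samples at that site (ties go to the lower allele index), and GT is assumed to be the first FORMAT field. A variant is kept only if every required sample has a genotype containing at least one non-major allele.

```python
from collections import Counter


def parse_gt(sample_field):
    """Return the allele indices of a sample column such as '0/1:35' or '1|2'.

    Assumes GT is the first FORMAT field; missing alleles ('.') are skipped.
    """
    gt = sample_field.split(":")[0]
    return [int(a) for a in gt.replace("|", "/").split("/") if a not in (".", "")]


def carries_non_major(record_line, sample_idx):
    """True if every sample in sample_idx has a non-major allele in its genotype."""
    fields = record_line.rstrip("\n").split("\t")
    genotypes = [parse_gt(f) for f in fields[9:]]
    counts = Counter(a for g in genotypes for a in g)
    if not counts:
        return False  # no called genotypes at this site
    # Major allele = most frequent in this VCF; ties go to the lower index.
    major = max(counts, key=lambda a: (counts[a], -a))
    return all(any(a != major for a in genotypes[i]) for i in sample_idx)


def filter_stream(lines, required_samples):
    """Yield header lines unchanged and only the records that pass the filter."""
    sample_idx = []
    for line in lines:
        if line.startswith("##"):
            yield line
        elif line.startswith("#CHROM"):
            names = line.rstrip("\n").split("\t")[9:]
            sample_idx = [names.index(s) for s in required_samples]
            yield line
        elif carries_non_major(line, sample_idx):
            yield line
```

On a stream this could be used as `sys.stdout.writelines(filter_stream(sys.stdin, ["S1", "S3"]))`, where `S1` and `S3` are placeholder sample names. For the strict minor-allele variant of the question, the `a != major` test would be replaced by a comparison against the least frequent allele instead.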
Thank you Pierre, but I was hoping to do this filter on the fly; pre-computing the "listOfMajorAleleles.vcf" is not possible in my situation. Also, this does not take any specific set of samples into account.
Ah, I see. Please clarify what you mean by: "where at least a certain set of samples (in the VCF?) have a genotype (what kind?) containing the minor allele?"
I have updated the question description; I hope it is clearer now.