Hi everyone,
Using GATK, I've generated a VCF file of SNPs called across a population of ~200 samples. I wish to phase these variants so as to better study identity-by-descent (IBD) amongst the individuals. However, I cannot find any clear guidance on whether I should phase all the variants and then filter the result by minor allele frequency (MAF), or filter by MAF first and then phase only the remaining SNPs. I'll be using BEAGLE for phasing and IBD detection, and vcftools for the MAF filter.
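To make clear what I mean by the MAF filter, here is a minimal Python sketch of the per-site calculation for a biallelic SNP. The function names and the 0.05 threshold are purely illustrative assumptions on my part, and this is not how vcftools does it internally. It treats "/" and "|" separators the same way, so it would apply to unphased or phased genotypes alike.

```python
# Illustrative only: minor allele frequency (MAF) for one biallelic site,
# computed from VCF-style GT strings such as "0/1" or "1|1".
import re


def minor_allele_frequency(genotypes):
    """Return the MAF for a biallelic site, or None if no alleles were called."""
    ref = alt = 0
    for gt in genotypes:
        # Split on either the unphased ("/") or phased ("|") separator.
        for allele in re.split(r"[/|]", gt):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
            # Anything else (e.g. "." for a missing call) is skipped.
    total = ref + alt
    if total == 0:
        return None
    return min(ref, alt) / total


def passes_maf(genotypes, min_maf=0.05):
    """True if the site's MAF is at least min_maf (0.05 is an assumed threshold)."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf >= min_maf
```

For example, a site with 19 samples at "0/0" and one at "0/1" has 1 alternate allele out of 40, so it would be dropped at a 0.05 threshold.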
Is anyone able to advise me on the best order in which to carry out these stages, please?
Many thanks,
Ian