Filtering a VCF based on known variants
2.4 years ago
idedios • 30
USA/Irvine/NeoGenomics Laboratories

I have a table of known variants with hg19 genomic coordinates that I want to use to filter a VCF. What is a good way to do this, assuming I only want to keep the VCF lines whose genomic coordinates match an entry in my known-variants table?

24 months ago
Alice • 280
USA

What are your file formats? You can use bedtools to intersect variants with coordinates; it works with both BED and VCF files.

As per Alice, you can use BEDTools; however, by default its output will not be a complete VCF, because the header lines are dropped unless you pass -header to bedtools intersect.

If you want the output to remain in VCF format after filtering, use bcftools filter --regions-file FILE, which requires the input VCF to be bgzip-compressed and indexed (see the bcftools manual for further information).

In both cases, the regions to filter on should ideally be in BED format. Keep in mind that BED coordinates are 0-based and half-open, whereas VCF positions are 1-based.
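If the known-variants table is not already in BED format, it has to be converted first. Below is a minimal Python sketch of that conversion, plus a pure-Python exact-position filter for small files. It is an illustration only, not a replacement for bedtools or bcftools. The column names `chrom` and `pos`, the tab-delimited layout with 1-based positions, and the function names are all assumptions made for this example. Chromosome names must use the same convention in both files (e.g. `chr1` vs `1`).

```python
import csv
import io


def table_to_bed(table_text, chrom_col="chrom", pos_col="pos"):
    """Convert a tab-delimited variant table (1-based positions) to BED lines.

    The column names are hypothetical; adjust them to match your table.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    bed_lines = []
    for row in reader:
        pos = int(row[pos_col])
        # BED is 0-based, half-open: 1-based position p becomes [p-1, p).
        bed_lines.append(f"{row[chrom_col]}\t{pos - 1}\t{pos}")
    return bed_lines


def filter_vcf(vcf_lines, bed_lines):
    """Keep VCF header lines and records whose (CHROM, POS) is covered by the BED.

    Intervals are expanded into individual positions, so this is only
    sensible for short intervals such as single-base variants.
    """
    wanted = set()
    for line in bed_lines:
        chrom, start, end = line.split("\t")[:3]
        for p in range(int(start) + 1, int(end) + 1):  # back to 1-based
            wanted.add((chrom, p))

    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # preserve the header so the output stays valid VCF
            continue
        fields = line.split("\t")
        if (fields[0], int(fields[1])) in wanted:
            kept.append(line)
    return kept


# Toy data, made up for illustration.
table = "chrom\tpos\tgene\nchr1\t12345\tGENE_A\nchr2\t500\tGENE_B\n"
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\t.\tA\tG\t50\tPASS\t.",
    "chr1\t12346\t.\tC\tT\t50\tPASS\t.",
    "chr2\t500\t.\tG\tA\t50\tPASS\t.",
]

bed = table_to_bed(table)
filtered = filter_vcf(vcf, bed)
print("\n".join(bed))
print("\n".join(filtered))
```

The BED lines returned by `table_to_bed` can be written to a file and passed to bedtools or to bcftools via --regions-file. Note that this sketch matches on position only; it does not compare REF/ALT alleles.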