Question: Extract SNPs located in genes from a VCF file based on GFF file information

I have a VCF file with SNPs and a subset of a GFF file that contains only gene features. How can I extract, in VCF format, the SNPs that are located in genes?

Asked 18 months ago by Denis; updated 18 months ago by finswimmer

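Use bedtools (a SNP that overlaps more than one gene feature is reported once per overlap; add -u to report it only once):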

$ bedtools intersect -a input.vcf -b genes.gff -header -wa > output.vcf
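
To make clear what the intersection keeps, here is a minimal awk-only sketch of the same idea on toy data. It is not how bedtools works internally and is only practical for small files. The file contents are made up for illustration, and it assumes single-base SNP records, a tab-delimited GFF with 1-based inclusive coordinates, and that every non-comment GFF line is a gene.

```shell
#!/bin/sh
# Toy inputs (hypothetical data, for illustration only).
workdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$workdir/input.vcf"
printf 'chr1\t150\t.\tA\tG\t.\tPASS\t.\n' >> "$workdir/input.vcf"
printf 'chr1\t500\t.\tC\tT\t.\tPASS\t.\n' >> "$workdir/input.vcf"
printf 'chr2\t300\t.\tG\tA\t.\tPASS\t.\n' >> "$workdir/input.vcf"
printf 'chr1\t.\tgene\t100\t200\t.\t+\t.\tID=geneA\n' > "$workdir/genes.gff"
printf 'chr2\t.\tgene\t250\t350\t.\t-\t.\tID=geneB\n' >> "$workdir/genes.gff"

# First file (NR == FNR): store gene start/end per chromosome from the GFF.
# Second file: print VCF header lines, plus each record whose POS lies
# inside at least one stored gene interval (printed once, then break).
awk -F'\t' '
  NR == FNR && /^#/ { next }
  NR == FNR { n[$1]++; s[$1, n[$1]] = $4 + 0; e[$1, n[$1]] = $5 + 0; next }
  /^#/      { print; next }
  {
    pos = $2 + 0
    for (i = 1; i <= n[$1]; i++)
      if (pos >= s[$1, i] && pos <= e[$1, i]) { print; break }
  }
' "$workdir/genes.gff" "$workdir/input.vcf" > "$workdir/output.vcf"

cat "$workdir/output.vcf"
```

On this toy input the two header lines and the records at chr1:150 and chr2:300 are kept, while chr1:500 is dropped because it lies outside every gene. For real data bedtools remains the better choice, since it also handles multi-base variants and large files efficiently.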

EDIT:

For (very) large VCF files it might be more efficient to compress the VCF file with bgzip and index it with tabix, convert your GFF to BED, and then use tabix to query the gene regions:

1. bgzip and index

$ bgzip -c input.vcf > input.vcf.gz
$ tabix -p vcf input.vcf.gz

2. gff to bed

E.g. with BEDOPS:

$ gff2bed < genes.gff > genes.bed
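
If BEDOPS is not available, the coordinate conversion itself can be sketched with awk. GFF intervals are 1-based and inclusive, while BED intervals are 0-based and half-open, so the start is decreased by one and the end is kept. This is only a rough stand-in for gff2bed: it writes just three columns, does not sort, and the toy GFF below is made up for illustration.

```shell
#!/bin/sh
# Toy GFF (hypothetical data, for illustration only).
workdir=$(mktemp -d)
printf '##gff-version 3\n' > "$workdir/genes.gff"
printf 'chr1\t.\tgene\t100\t200\t.\t+\t.\tID=geneA\n' >> "$workdir/genes.gff"
printf 'chr2\t.\tgene\t250\t350\t.\t-\t.\tID=geneB\n' >> "$workdir/genes.gff"

# Skip comment lines; print chrom, start - 1, end as tab-separated BED3.
awk -F'\t' -v OFS='\t' '!/^#/ { print $1, $4 - 1, $5 }' \
  "$workdir/genes.gff" > "$workdir/genes.bed"

cat "$workdir/genes.bed"
```

The gene at chr1:100-200 becomes the BED interval chr1 99 200, which covers the same bases.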

3. Query the regions

$ tabix -R genes.bed -h input.vcf.gz > output.vcf

fin swimmer

Answered 18 months ago by finswimmer
