Question: Find SNPs in the exome from WGS data


I have called variants in WGS data with GATK. Now I would like to extract all heterozygous SNPs that fall in the exome and report which genes they are in.

The output should be structured:

chrom | position | GT (nucleotides) | gene

1 | 100 | A/T | geneA

1 | 200 | G/T | geneA

2 | 100 | A/C | geneB

Thanks for any help!


3.6 years ago jh • 20

Thanks a lot for the advice, VEP works great!

I can't figure out how to print the reference allele as a separate column.

Does anyone know?
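
In case it helps, here is a minimal Python sketch for getting the reference allele as its own column straight from the VCF. It relies only on the standard VCF layout (REF is column 4, ALT is column 5, and the GT indices point into REF plus the ALT list). The function name `ref_and_genotype` and the example line are made up for illustration, and it assumes a tab-delimited VCF with a FORMAT column and at least one sample.

```python
def ref_and_genotype(vcf_line, sample_index=0):
    """Return (chrom, pos, ref, genotype-as-nucleotides) for one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
    # Allele index 0 is REF; 1, 2, ... are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")
    # Locate GT inside the FORMAT column, then read it from the chosen sample.
    format_keys = fields[8].split(":")
    sample_values = fields[9 + sample_index].split(":")
    gt = sample_values[format_keys.index("GT")]
    indices = gt.replace("|", "/").split("/")  # treat phased and unphased alike
    if "." in indices:
        genotype = "./."  # missing call
    else:
        genotype = "/".join(alleles[int(i)] for i in indices)
    return chrom, pos, ref, genotype


# Hypothetical example line, not real data.
line = "1\t100\t.\tA\tT\t50\tPASS\t.\tGT:DP\t0/1:30"
print(" | ".join(ref_and_genotype(line)))
```

The reference allele then sits in its own column next to the decoded genotype, and the gene column can be joined on afterwards.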


3.6 years ago • 20

You could use Ensembl's VEP to annotate all your filtered variants, then filter based on the annotation. Alternatively, you could try GEMINI, which loads the variants into an SQLite database and lets you run complex, flexible queries over large data sets.
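
As a rough sketch of the "annotate, then filter" route, the Python below reads a VEP-annotated VCF and keeps heterozygous SNPs that have an exonic consequence, printing them in the chrom | position | GT | gene layout from the question. Several things here are assumptions, not part of VEP itself: the function names, the `EXONIC` set (an illustrative, incomplete list of consequence terms standing in for "in the exome"), a single-sample VCF, and that VEP was run with VCF output and gene symbols enabled so the CSQ field carries a `SYMBOL` entry. The demo lines are invented.

```python
import re

# Illustrative, incomplete set of consequence terms treated as "exonic" (assumption).
EXONIC = {"missense_variant", "synonymous_variant", "stop_gained", "stop_lost",
          "start_lost", "5_prime_UTR_variant", "3_prime_UTR_variant"}
BASES = {"A", "C", "G", "T"}


def csq_fields(lines):
    """Read the CSQ sub-field names from the '##INFO=<ID=CSQ ...>' header line."""
    for line in lines:
        if line.startswith("##INFO=<ID=CSQ"):
            return re.search(r'Format: ([^"]+)"', line).group(1).split("|")
    raise ValueError("no CSQ header line found")


def het_exonic_snps(lines):
    """Yield (chrom, pos, genotype, gene) for heterozygous exonic SNPs."""
    lines = list(lines)
    keys = csq_fields(lines)
    for line in lines:
        if line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = f[0], f[1], f[3], f[4], f[7]
        alleles = [ref] + alt.split(",")
        if any(a not in BASES for a in alleles):
            continue  # not a SNP (indel, symbolic allele, ...)
        gt = f[9].split(":")[f[8].split(":").index("GT")]
        idx = gt.replace("|", "/").split("/")
        if "." in idx or len(set(idx)) < 2:
            continue  # missing or homozygous call
        csq = next((x[4:] for x in info.split(";") if x.startswith("CSQ=")), "")
        genes = set()
        for entry in csq.split(","):  # one entry per transcript/allele
            rec = dict(zip(keys, entry.split("|")))
            terms = set(rec.get("Consequence", "").split("&"))
            if terms & EXONIC and rec.get("SYMBOL"):
                genes.add(rec["SYMBOL"])
        for gene in sorted(genes):
            yield chrom, pos, "/".join(alleles[int(i)] for i in idx), gene


# Invented demo input, not real data.
demo = [
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|SYMBOL">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "1\t100\t.\tA\tT\t50\tPASS\tCSQ=T|missense_variant|geneA\tGT\t0/1",
    "1\t150\t.\tG\tC\t50\tPASS\tCSQ=C|intron_variant|geneA\tGT\t0/1",
    "2\t100\t.\tA\tC\t50\tPASS\tCSQ=C|synonymous_variant|geneB\tGT\t1/1",
]
for row in het_exonic_snps(demo):
    print(" | ".join(row))
```

On a real file you would pass an open file handle instead of `demo`. A variant that overlaps several genes produces one output row per gene.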

3.6 years ago andrew.j.skelton73 5.7k

I would highly recommend GEMINI (which you can use on VEP- or SnpEff-annotated VCF files). It lets you run family-wise queries and slice the data in lots of different ways. Even for the simple case you describe, I think you'll find the additional capabilities appealing.

3.6 years ago Dan Gaston
