Find Intersections Between ClinVar Track and VCF In IGV
2.0 years ago
otwtgin2010 • 150

I'm looking for a way to find the locations where both a VCF and a ClinVar track have entries, and to filter those locations by the nature of either entry. For example, finding where the VCF has a SNP or indel at the same position as a ClinVar entry whose clinical significance is pathogenic.

Does anyone know of a way to see this? I can easily jump from one entry to the next in either the ClinVar or the VCF track, but I don't see anything that lets you filter or jump to track intersections meeting specific criteria. I'm just looking to understand the area surrounding these intersections to see what alternative mappings there might be.

I reviewed IGV's scripting options, but as far as I can tell they don't offer this.

Any thoughts would be much appreciated.

Thanks very much!


Instead of looking for intersections visually in IGV, try intersecting or annotating the sample VCF with the ClinVar VCF using bcftools, vcftools, or bedtools. Annotating appends the metadata from the ClinVar VCF to the sample VCF, which you can then filter by the clinical significance of your choice.

16 months ago
rbagnall ♦ 1.4k
Australia

You can download ClinVar as a VCF, aligned to GRCh37 or GRCh38, from here

The VCF INFO column contains the ClinVar allele ID (ALLELEID) and the clinical significance (e.g. CLNSIG=Benign, CLNSIG=Pathogenic), so you could load both files in IGV, or compare your VCF with the ClinVar VCF for matching alleles using, e.g., bcftools, vcftools, or bedtools.
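As a rough illustration of that comparison, here is a small awk sketch that does not use bcftools, vcftools, or bedtools at all. It matches records on CHROM, POS, REF, and ALT and keeps the sample variants whose ClinVar CLNSIG value contains a given string. The function name intersect_clinvar, the file names, and every record are invented for the example. It assumes both files are uncompressed, use the same genome build and chromosome naming, and it compares ALT as a literal string (no normalization, no splitting of multi-allelic records).

```shell
set -eu
workdir=$(mktemp -d)

# Made-up stand-in for a ClinVar VCF (tab-separated, 8 columns).
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '1\t1000\t101\tA\tG\t.\t.\tALLELEID=11;CLNSIG=Pathogenic\n'
    printf '1\t2000\t102\tC\tT\t.\t.\tALLELEID=12;CLNSIG=Benign\n'
    printf '2\t3000\t103\tG\tA\t.\t.\tALLELEID=13;CLNSIG=Likely_pathogenic\n'
} > "$workdir/clinvar.vcf"

# Made-up stand-in for the sample VCF.
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '1\t1000\t.\tA\tG\t50\tPASS\tDP=30\n'
    printf '1\t2000\t.\tC\tT\t60\tPASS\tDP=25\n'
    printf '3\t4000\t.\tT\tC\t70\tPASS\tDP=40\n'
} > "$workdir/sample.vcf"

intersect_clinvar() {
    # $1 = ClinVar VCF, $2 = sample VCF, $3 = text that CLNSIG must contain
    awk -F'\t' -v OFS='\t' -v want="$3" '
        /^#/ { next }                      # skip header lines in both files
        NR == FNR {                        # first file: remember matching ClinVar alleles
            n = split($8, kv, ";")
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^CLNSIG=/) {
                    val = substr(kv[i], 8) # text after "CLNSIG="
                    if (index(val, want))
                        sig[$1 SUBSEP $2 SUBSEP $4 SUBSEP $5] = val
                }
            next
        }
        {                                  # second file: report sample variants seen above
            key = $1 SUBSEP $2 SUBSEP $4 SUBSEP $5
            if (key in sig) print $1, $2, $4, $5, sig[key]
        }
    ' "$1" "$2"
}

intersect_clinvar "$workdir/clinvar.vcf" "$workdir/sample.vcf" Pathogenic
```

Each output line gives chromosome, position, REF, ALT, and the ClinVar significance, so the positions can be used as loci to jump to in IGV.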


Thanks! Sorry for the delay; I'm very new to all this. I'm banging my head against this one. As far as I can tell, this should work:

bcftools view -i 'INFO/CLNSIG ~ "Pathogenic"' grch37_clinvar.vcf.gz

I see CLNSIG in the INFO column, and many records have a value of Pathogenic, but the filter just doesn't work. It doesn't throw an error; nothing is returned. Am I doing something wrong here?

Thanks so much.


The ClinVar VCF needs to be bgzip-compressed and tabix-indexed first.

Uncompress to stdout, then bgzip:

gunzip -c grch37_clinvar.vcf.gz | bgzip > grch37_clinvar.bgzip.vcf.gz

Index with tabix:

tabix -p vcf grch37_clinvar.bgzip.vcf.gz

View alleles in grch37_clinvar.bgzip.vcf.gz that have Pathogenic in the CLNSIG INFO field:

bcftools view -i 'INFO/CLNSIG ~ "Pathogenic"' grch37_clinvar.bgzip.vcf.gz
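One detail of the decompression step is worth checking on your own system: gunzip only writes to standard output when called with -c. Without it, the file is decompressed in place and whatever sits downstream in the pipe (bgzip in the steps above) receives no data, which would leave an empty output file. A small sketch with a throwaway file, using wc -c as a stand-in for bgzip; the file names are made up:

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Two identical gzip files holding a two-line toy VCF fragment.
printf '#CHROM\tPOS\n1\t1000\n' | gzip -c > demo.vcf.gz
cp demo.vcf.gz copy.vcf.gz

# Without -c: copy.vcf.gz is replaced by copy.vcf on disk, nothing goes down the pipe.
plain_bytes=$(gunzip copy.vcf.gz | wc -c | tr -d ' ')

# With -c: decompressed data is written to stdout, demo.vcf.gz stays untouched.
piped_bytes=$(gunzip -c demo.vcf.gz | wc -c | tr -d ' ')

echo "without -c: $plain_bytes bytes reached the pipe"
echo "with -c:    $piped_bytes bytes reached the pipe"
```

Running gunzip and bgzip as two separate commands avoids the pipe altogether and should give the same result as the -c form.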


Very strange. No change at all for me. It reads the file, but the filters just do not work. I had to run gunzip and bgzip as separate steps, or else I ended up with an empty bgzip.vcf.gz file. Other than that, I followed those steps exactly.

gunzip clinvar_20180701.vcf.gz
bgzip clinvar_20180701.vcf
tabix -p vcf clinvar_20180701.vcf.gz
bcftools view -i 'INFO/CLNSIG ~ "Pathogenic"' clinvar_20180701.vcf.gz

It prints the header up to this line, but nothing is displayed beyond it:

#CHROM POS ID REF ALT QUAL FILTER INFO

The odd thing is that this filter seems to work fine:

bcftools view -i 'ALT="A"' clinvar_20180701.vcf.gz

It's as if anything in the INFO column makes it fall apart.

And yet I know it sees CLNSIG, because if I change CLNSIG to anything else, it complains:

[filter.c:1298 filters_init1] Error: the tag "INFO/CLNSIG2" is not defined in the VCF header


hmm, try

= "Pathogenic"

rather than

~ "Pathogenic"

Thanks. Yes, I tried that also; no difference. What I did find is that I can filter on the first key in the INFO column, which in this case is ALLELEID. I also saw that my bcftools version is 1.2 (this is what apt-get install pulled), and I think the latest is 1.8. I will try installing 1.8 and see if anything changes.
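Until a newer bcftools is installed, the same kind of filter can be approximated without bcftools. The sketch below is only a stand-in, not a replacement for bcftools expressions: it decompresses with gzip -dc (which also reads bgzip output, since bgzip files are gzip-compatible) and uses awk to keep header lines plus records whose CLNSIG value matches a regular expression. The function name filter_clnsig, the file name, and the records are made up for the demonstration.

```shell
set -eu
workdir=$(mktemp -d)

# Made-up, gzip-compressed stand-in for the ClinVar VCF.
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '1\t1000\t101\tA\tG\t.\t.\tALLELEID=11;CLNSIG=Pathogenic\n'
    printf '1\t2000\t102\tC\tT\t.\t.\tALLELEID=12;CLNSIG=Benign\n'
    printf '2\t3000\t103\tG\tA\t.\t.\tALLELEID=13;CLNSIG=Pathogenic/Likely_pathogenic\n'
} | gzip -c > "$workdir/clinvar_demo.vcf.gz"

filter_clnsig() {
    # $1 = gzip- or bgzip-compressed VCF, $2 = regular expression tested against the CLNSIG value
    gzip -dc "$1" | awk -F'\t' -v re="$2" '
        /^#/ { print; next }               # pass header lines through unchanged
        {
            n = split($8, kv, ";")         # INFO is the 8th column, keys separated by ";"
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^CLNSIG=/ && substr(kv[i], 8) ~ re) { print; break }
        }
    '
}

# Keep records whose CLNSIG value starts with "Pathogenic".
filter_clnsig "$workdir/clinvar_demo.vcf.gz" '^Pathogenic'
```

Because the match runs on the CLNSIG value only, a pattern such as '^Pathogenic$' restricts the result to exact matches, while '^Pathogenic' also keeps combined values like Pathogenic/Likely_pathogenic.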