VEP for cancer annotation using COSMIC
3.2 years ago
yussab ▴ 90

Dear all,

I'm struggling to find COSMIC database information in my output after annotating a VCF with VEP. The command I used is:

vep -i Mutect2_unfiltered_10643_vs_2434.vcf.gz \
     -o Mutect2_unfiltered_10643_vs_2434_VEP.ann.vcf \
     --assembly GRCh38 \
     --species homo_sapiens \
     --offline  --cache  --cache_version 99 \
     --dir_cache /.vep  --everything  --filter_common \
     --fork 4  --format vcf  --per_gene  --stats_file Mutect2_unfiltered_10643_vs_2434_VEP.summary.html \
     --total_length  --vcf

I get back an annotated VCF file, but I can't find any COSMIC information in it. Is there a way to filter for COSMIC variants only?

Thank you in advance,
Youssef

COSMIC Cancer VEP • 4.4k views
3.2 years ago
Emily 23k

The COSMIC variants should appear listed as SOMATIC under the co-located variants. You may find that your --filter_common option is excluding some COSMIC variants. I would recommend running without this then using the VEP filter tool with --filter "SOMATIC" to find COSMIC variants.
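To illustrate how the SOMATIC flag relates to the co-located variants: as far as I understand the VEP output format, SOMATIC holds one flag per co-located variant, in the same order as Existing_variation, with values joined by `&` in VCF output. A minimal Python sketch under that assumption (`somatic_ids` is a hypothetical helper, not part of VEP, and the IDs are made up):

```python
def somatic_ids(existing_variation, somatic, sep="&"):
    """Return the co-located variant IDs whose SOMATIC flag is 1."""
    if not existing_variation or not somatic:
        return []
    ids = existing_variation.split(sep)
    flags = somatic.split(sep)
    # zip() pairs each ID with its flag and stops at the shorter list
    return [vid for vid, flag in zip(ids, flags) if flag == "1"]


# Made-up IDs: one dbSNP-style entry followed by two COSMIC-style entries
print(somatic_ids("rs0000001&COSM0000001&COSM0000002", "0&1&1"))
```

An empty SOMATIC field (no somatic co-located variant) gives an empty list. For VEP's default tab-delimited output, pass `sep=","`.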


Thank you Emily, I'll try and let you know how it worked! :)


Dear Emily, I filtered my vcf file using this command:

filter_vep -i Mutect2_filtered_VEP.ann.vcf.gz -o out_filtered_SOMATIC.vcf --filter "SOMATIC"

I obtained the result below; however, I'm still confused about how to interpret this VCF file.

**filtering_status=These calls have been filtered by FilterMutectCalls to label false positives with a list of failed filters and true positives with PASS.

**normal_sample=2682

**source=FilterMutectCalls

**source=Mutect2

**tumor_sample=10643

**VEP="v99" time="2021-02-10 15:29:22" cache="/hpcshare/genomics/sarek_analyses/work/b6/54109e400c87375cc7a5d169ec2bc2/vep_cache/homo_sapiens/99_GRCh38" ensembl-funcgen=99.0832337 ensembl=99.d3e7d31 ensembl-variation=99.642e1cd ensembl-io=99.441b05b 1000genomes="phase3" COSMIC="90" ClinVar="201909" ESP="V2-SSA137" HGMD-PUBLIC="20184" assembly="GRCh38.p13" dbSNP="153" gencode="GENCODE 33" genebuild="2014-07" gnomAD="r2.1" polyphen="2.2.2" regbuild="1.0" sift="sift5.2.2"

**INFO=-ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|VARIANT_CLASS|SYMBOL_SOURCE|HGNC_ID|CANONICAL|MANE|TSL|APPRIS|CCDS|ENSP|SWISSPROT|TREMBL|UNIPARC|GENE_PHENO|SIFT|PolyPhen|DOMAINS|miRNA|HGVS_OFFSET|AF|AFR_AF|AMR_AF|EAS_AF|EUR_AF|SAS_AF|AA_AF|EA_AF|gnomAD_AF|gnomAD_AFR_AF|gnomAD_AMR_AF|gnomAD_ASJ_AF|gnomAD_EAS_AF|gnomAD_FIN_AF|gnomAD_NFE_AF|gnomAD_OTH_AF|gnomAD_SAS_AF|MAX_AF|MAX_AF_POPS|CLIN_SIG|SOMATIC|PHENO|PUBMED|MOTIF_NAME|MOTIF_POS|HIGH_INF_POS|MOTIF_SCORE_CHANGE"-

*CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10643 2682

chr1 16534 . C T . map_qual;normal_artifact;panel_of_normals AS_FilterStatus=map_qual;AS_SB_TABLE=38,71|1,12;DP=126;ECNT=2;GERMQ=29;MBQ=32,30;MFRL=305,438;MMQ=23,23;MPOS=29;NALOD=-9.266e+00;NLOD=7.26;PON;POPAF=0.305;TLOD=9.60;CSQ=T|downstream_gene_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000450305|transcribed_unprocessed_pseudogene||||||||||rs15642|2864|1||SNV|HGNC|HGNC:37102||||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000456328|processed_transcript||||||||||rs15642|2125|1||SNV|HGNC|HGNC:37102|YES||1|||||||||||||||||||||||||||||||||||||||,T|intron_variant&non_coding_transcript_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000488147|unprocessed_pseudogene||8/10|ENST00000488147.1:n.1067+73G>A|||||||rs15642||-1||SNV|HGNC|HGNC:38034|YES|||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|MIR6859-1|ENSG00000278267|Transcript|ENST00000619216|miRNA||||||||||rs15642|835|1||SNV|HGNC|HGNC:50039|YES|||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000344266|CTCF_binding_site||||||||||rs15642||||SNV|||||||||||||||||||||||||||||||||||||||||||| GT:AD:AF:DP:F1R2:F2R1:SB 0/1:8,4:0.416:12:5,3:3,1:3,5,0,4 0/0:101,9:0.090:110:53,5:46,3:35,66,1,8

(I replaced # with * and < > with - because of formatting issues.)
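One way to read a record like the one above is to split the CSQ value into one entry per transcript or feature and label each `|`-separated value with the field names from the `Format:` list in the CSQ header line. The following Python sketch assumes the unmodified header syntax (`##INFO=<...>`); the header and INFO strings are shortened, made-up examples, and both function names are hypothetical:

```python
def parse_csq_format(header_line):
    """Extract the field names from the CSQ header line ('... Format: A|B|C">')."""
    fmt = header_line.split("Format: ", 1)[1]
    # Drop the closing quote and anything after it
    fmt = fmt.split('"', 1)[0]
    return fmt.split("|")


def parse_csq(info, fields):
    """Return one dict per CSQ entry found in a VCF INFO column."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            entries = item[len("CSQ="):].split(",")
            return [dict(zip(fields, entry.split("|"))) for entry in entries]
    return []


# Shortened, made-up header and INFO column for illustration only
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
          'from Ensembl VEP. Format: Allele|Consequence|SYMBOL|Existing_variation|SOMATIC">')
info = ("DP=126;CSQ=T|missense_variant|GENE1|rs0000001&COSM0000001|0&1,"
        "T|intron_variant|GENE2|rs0000001|")

fields = parse_csq_format(header)
for entry in parse_csq(info, fields):
    print(entry["SYMBOL"], entry["Existing_variation"], entry["SOMATIC"] or "(no somatic flag)")
```

With the fields labeled this way, an entry whose SOMATIC value is empty has no co-located variant flagged as somatic, while a value such as `0&1` marks individual co-located IDs.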


I don't know whether this has changed, but for COSMIC licensing (IP) reasons VEP has not been able to check alleles: it was a pure locus match, without allele matching. If allele matching is what you want, it is probably better to write your own annotator.
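The difference between a locus match and an allele match can be sketched in a few lines of Python. This is only an illustration: `KNOWN` is a hypothetical in-memory table with made-up coordinates and IDs, and loading a real COSMIC export (whose file layout is not described in this thread) is left out.

```python
# Hypothetical table of known somatic variants: (chrom, pos, ref, alt) -> ID.
# All coordinates and IDs are made up.
KNOWN = {
    ("chr1", 1000, "C", "T"): "COSM0000001",
    ("chr1", 1000, "C", "A"): "COSM0000002",
}


def match_by_locus(chrom, pos, known=KNOWN):
    """IDs of every known variant at this position, whatever its alleles."""
    return sorted(vid for (c, p, _ref, _alt), vid in known.items()
                  if (c, p) == (chrom, pos))


def match_by_allele(chrom, pos, ref, alt, known=KNOWN):
    """ID of the known variant with exactly these alleles, or None."""
    return known.get((chrom, pos, ref, alt))


print(match_by_locus("chr1", 1000))             # every ID at the locus
print(match_by_allele("chr1", 1000, "C", "T"))  # only the C>T entry
print(match_by_allele("chr1", 1000, "C", "G"))  # no entry for C>G
```

A locus match reports both IDs for a C>G call at that position even though neither describes that change. An allele-aware annotator would also need both sides to use the same allele representation (for example, trimmed and left-aligned indels) before comparing.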

2.5 years ago

If you want COSMIC annotation, you should use the --custom option:

http://asia.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gfftypes
2.5 years ago

Run VEP annotation

vep -i in.vcf -o OUT.TXT --fasta /hg19.fasta --assembly GRCh37 --offline --cache --dir_cache $HOME/.vep --everything --fork 4 --format vcf --per_gene --stats_file Mutect2_unfiltered_10643_vs_2434_VEP.summary.html --total_length --vcf

Output will look like this

IMPACT=MODERATE;STRAND=-1;VARIANT_CLASS=substitution;SYMBOL=KRAS;SYMBOL_SOURCE=HGNC;HGNC_ID=6407;BIOTYPE=protein_coding;CANONICAL=YES;CCDS=CCDS8703.1;ENSP=ENSP00000256078;SWISSPROT=RASK_HUMAN;TREMBL=Q9UM97_HUMAN,Q71SP6_HUMAN,P78460_HUMAN,L7RSL8_HUMAN,I1SRC5_HUMAN;UNIPARC=UPI0000133132;GENE_PHENO=1;SIFT=deleterious(0);PolyPhen=probably_damaging(0.987);EXON=2/6;DOMAINS=Gene3D:3.40.50.300,Pfam:PF00071,Prints:PR00449,PROSITE_profiles:PS51421,PANTHER:PTHR24070,PANTHER:PTHR24070:SF186,SMART:SM00173,SMART:SM00174,SMART:SM00175,SMART:SM00176,Superfamily:SSF52540,TIGRFAM:TIGR00231,Low_complexity_(Seg):seg;***SOMATIC=1,1,1,1,1,1,1***;PHENO=1,1,1,1,1,1,1

Then filter

filter_vep -i OUT.TXT -o out_filtered.txt --filter "SOMATIC"

I hope that works
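For checking the SOMATIC flag by hand, output like the line shown above can be read as `key=value` pairs joined by `;`, with multiple values for one key joined by `,`. A minimal Python sketch under that assumption (`parse_extra` is a hypothetical helper, and the string is shortened from the output above):

```python
def parse_extra(extra):
    """Split a string of key=value pairs joined by ';' into a dict."""
    pairs = (item.split("=", 1) for item in extra.split(";") if "=" in item)
    return {key: value for key, value in pairs}


# Shortened from the output shown above
extra = "IMPACT=MODERATE;STRAND=-1;SYMBOL=KRAS;SOMATIC=1,1,1,1,1,1,1;PHENO=1,1,1,1,1,1,1"

fields = parse_extra(extra)
# True if at least one co-located variant is flagged as somatic
is_somatic = "1" in fields.get("SOMATIC", "").split(",")
print(fields["SYMBOL"], is_somatic)
```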
