VEP vs webserver Polyphen scores
1
2
7.8 years ago
regrant ▴ 20

Hello, I have recently noted a discrepancy between the scores given by running PolyPhen on its website and the PolyPhen score from Ensembl's Variant Effect Predictor (VEP, hg19). Here is one example, with the PolyPhen scores bolded:

POLYPHEN

VEP

  • Pasted data: AKT2:c.136G>A
  • 1000 Genomes continental allele frequencies: Enabled
  • 1000 Genomes global minor allele frequency: Enabled
  • APPRIS: Enabled
  • BLOSUM62(p): Enabled
  • Condel(p): Enabled
  • Condel Score/prediction(p): Prediction and score
  • CSN(p): Enabled
  • ExAC allele frequencies: Enabled
  • Exon and intron numbers: Enabled
  • Filter by frequency: Exclude variants with MAF greater than 0.05 in 1000 genomes (1KG) combined population
  • Find co-located known variants: Enabled
  • Gene symbol: Enabled
  • HGVS: Enabled
  • LoFtool(p): Enabled
  • MaxEntScan(p): Enabled
  • PolyPhen: Prediction and score
  • Protein: Enabled
  • PubMed IDs for citations of co-located variants: Enabled
  • Get regulatory region consequences: Yes
  • Restrict results: Disabled
  • SIFT: Prediction and score
  • Transcript biotype: Enabled
  • Transcript database to use: Ensembl transcripts
  • Transcript support level: Enabled
  • Polyphen Score for transcript ENST00000392038: 0.805

Which score is more trustworthy?

-edited, as my initial question (why they are different) has been answered (differing protein databases)

polyphen VEP annotation missense • 3.9k views
ADD COMMENT
4
7.8 years ago
Emily 23k

VEP PolyPhen scores are pre-computed rather than run on the fly. This means that the scores may have used an older version of PolyPhen, and that the protein database that the proteins were compared to will be older. If there has been a massive expansion of a particular protein family, this would result in a change in the PolyPhen score. You can see which version of PolyPhen and when the protein database snapshot was taken in the Ensembl documentation.

ADD COMMENT
0

Sorry,

I uploaded one of my files, pheno.RNA.passed.somatic.indel.vcf, to VEP and tried to filter by the pathogenicity predictions from SIFT (Prediction and score) and PolyPhen (Prediction and score). But after the analysis finished, the results show only empty SIFT and PolyPhen columns with no scores. I am just wondering why?

Thank you

ADD REPLY
1

Were any of the variants missense?

ADD REPLY
0

Sorry, I am not sure, but most of what I am seeing is frameshift_variation and frameshift_deletion. This is the VEP link for one of my .vcf files. Could you please have a look?

I tried VEP on Linux but I am getting an error, so it would be nice if I could get results from the web tool. However, with the web tool I would need to filter by SIFT or PolyPhen.

http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Ticket?tl=2H9vfkjI8DGQReZk

Thanks a lot

ADD REPLY
1

No, there weren't any. I used the filter at the top of the table and filtered by consequence is missense_variant. You only get SIFT and PolyPhen scores for missense variants, and the columns only display if they have data in them.
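For anyone who wants to apply the same filter to a downloaded file, here is a minimal Python sketch. It assumes the default VEP tab-delimited layout, where the header line starts with `#Uploaded_variation` and the predictions sit in the `Extra` column as `SIFT=prediction(score)` and `PolyPhen=prediction(score)`. The function names are made up for illustration, and a file with separate SIFT and PolyPhen columns would need adapting.

```python
import re

# Matches values such as "probably_damaging(0.998)".
PRED_RE = re.compile(r"^([A-Za-z_]+)\(([0-9.]+)\)$")


def parse_extra(extra):
    """Turn 'KEY=value;KEY=value' into a dict."""
    fields = {}
    for item in extra.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def split_prediction(value):
    """Return (prediction, score), or (None, None) if absent or malformed."""
    match = PRED_RE.match(value or "")
    if not match:
        return None, None
    return match.group(1), float(match.group(2))


def missense_scores(lines):
    """Collect SIFT and PolyPhen values for rows annotated as missense_variant."""
    header = None
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank lines and meta lines
        if line.startswith("#"):
            header = line.lstrip("#").split("\t")  # column names
            continue
        if header is None:
            continue  # data before a header cannot be interpreted
        row = dict(zip(header, line.split("\t")))
        if "missense_variant" not in row.get("Consequence", "").split(","):
            continue
        extra = parse_extra(row.get("Extra", ""))
        results.append((
            row["Uploaded_variation"],
            split_prediction(extra.get("SIFT")),
            split_prediction(extra.get("PolyPhen")),
        ))
    return results
```

A file containing only frameshift or inframe variants gives an empty list here, which matches the empty SIFT and PolyPhen columns in the web results.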

ADD REPLY
0

Thanks a lot

Most tools that visualize mutation results, such as lollipop plots, need these columns:

Chromosome  Start_Position  End_Position    Reference_Allele    Variant_Allele

In the VEP output I am not able to find a protein change column. Is there a protein change column in the VEP results, please?

ADD REPLY
1

Yes, there is a column for amino acid and codon change, which you should be able to see data in for your inframe and frameshift deletions. The columns are only shown when there's data in them, so while you're just viewing the first few variants, which don't have these data associated, you can't see them. Filter by consequences to see the relevant variants and the columns should appear.

You could also get protein domains by ticking the box on the input form. Coming in the next release will also be the option to see the variants on the protein structure, if there is one available.

ADD REPLY
0

Yeah, thanks, I saw that for a few genes. Does VEP work on GRCh37 (hg19)? Because when I load some columns of my VEP results into cBioPortal for visualization in a mutation plot, it says:

Critical annotation error. Annotation services might be temporarily down, or your input format might be invalid.

These are some of my inputs, adapted from the VEP output:

Chromosome  Start_Position  End_Position    Reference_Allele    Variant_Allele
1   56987   56988   A   -
1   56987   56988   A   -
1   56987   56988   -   -
1   4133415 4133416 -   -
1   4611030 4611030 -   -
1   4872820 4872820 -   -
1   10442174    10442175    A   -
1   10442174    10442175    A   -
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
1   10442174    10442175    A   T
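
A quick sanity check on rows like these can be sketched in Python. This is not a diagnosis of the cBioPortal error. It only flags rows that look internally inconsistent, on the assumption (mine, not stated in the thread) that coordinates are 1-based and inclusive, so that a non-insertion reference allele should be as long as the span from start to end. `check_rows` is a hypothetical helper name.

```python
def check_rows(text):
    """Return (row_number, message) pairs for rows that look inconsistent."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    problems = []
    for number, line in enumerate(lines[1:], start=1):
        row = dict(zip(header, line.split()))
        start = int(row["Start_Position"])
        end = int(row["End_Position"])
        ref = row["Reference_Allele"]
        alt = row["Variant_Allele"]
        if ref == alt:
            # Identical alleles (including "-" and "-") describe no change.
            problems.append((number, "reference and variant alleles are identical"))
        if start > end:
            problems.append((number, "start is after end"))
        # Assumption: 1-based inclusive coordinates, so the reference allele
        # of a substitution or deletion should cover the whole span.
        if ref != "-" and len(ref) != end - start + 1:
            problems.append((number, "reference length does not match span"))
    return problems
```

Run on the table above, the rows with `-` in both allele columns are reported as identical alleles, and the single-base `A` rows spanning two positions are reported as a length mismatch under the stated coordinate assumption.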
ADD REPLY
1

VEP is working just fine. I have no idea about cBioPortal.

ADD REPLY
0

Really thank you so much

Sorry, since I don't have any missense variants, do I not need any filtering? I mean, can I carry on with all of my variants, as I don't have any SIFT or PolyPhen scores?

ADD REPLY
1

It depends what you're trying to do. Some people filter by the consequences themselves, or the known variant frequencies, or by lists of genes known to be linked to their phenotype of interest.

ADD REPLY
0

Sorry,

I have uploaded a VCF file containing SNPs to VEP, but I am not able to find the reference allele when I download the results as txt.

http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Ticket?tl=oICvFAZfvx7Og6CX

Could you please have a look, because I need both the reference and variant alleles?

Thank you for any help

ADD REPLY
1

If you export as VCF you'll get the reference in that.
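
As a rough illustration, the reference and variant alleles can be read from the exported VCF with a few lines of Python. This sketch relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT as the first five tab-separated fields). `read_ref_alt` is a hypothetical name, not part of VEP.

```python
def read_ref_alt(lines):
    """Return (chrom, pos, ref, alt) tuples, one per alternate allele."""
    records = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the header line and blank lines
        chrom, pos, _variant_id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # ALT may list several alleles separated by commas.
        for allele in alt.split(","):
            records.append((chrom, int(pos), ref, allele))
    return records
```

For a file on disk this could be called as `read_ref_alt(open("export.vcf"))`, where the file name is a placeholder.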

ADD REPLY
0

Sorry,

I have uploaded my SNVs and indels to VEP as separate VCF files. I want to score the functional impact of each mutation with SIFT and PolyPhen: each missense (non-synonymous) mutation gets a score from 0 (non-impactful) to 1 (highly impactful), each synonymous mutation a score of 0, and each truncating mutation (nonsense or frameshift) a score of 1. Finally, I want to remove every gene with fewer than 7 mutations. Is it possible to apply all of these filters at once?
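
The scheme described above could be sketched in Python roughly as follows. This is only one reading of the question, not a VEP feature. It assumes the variants have already been reduced to (gene, consequence terms, PolyPhen score) tuples, uses the PolyPhen score directly for missense variants, and counts only variants that receive a score toward the 7-mutation threshold. All names are hypothetical. SIFT scores run the other way (low values are deleterious), so they would need inverting, for example as `1 - sift_score`, before being used like this.

```python
from collections import defaultdict

# Consequence terms treated as truncating in this sketch.
TRUNCATING = {"stop_gained", "frameshift_variant"}


def impact_score(consequences, polyphen_score=None):
    """Score one variant from its comma-separated consequence terms."""
    terms = set(consequences.split(","))
    if terms & TRUNCATING:
        return 1.0
    if "missense_variant" in terms:
        return polyphen_score  # None when no PolyPhen score is available
    if "synonymous_variant" in terms:
        return 0.0
    return None  # other consequences are left unscored


def score_genes(variants, min_mutations=7):
    """Group scores by gene and drop genes with too few scored mutations."""
    per_gene = defaultdict(list)
    for gene, consequences, polyphen_score in variants:
        score = impact_score(consequences, polyphen_score)
        if score is not None:
            per_gene[gene].append(score)
    return {
        gene: scores
        for gene, scores in per_gene.items()
        if len(scores) >= min_mutations
    }
```

Whether unscored variants should count toward the threshold is a design choice. Here they do not.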

ADD REPLY
0

Sorry to be such a bother,

I have annotated my .vcf files (SNVs) with VEP using just the default settings. But the results say

Variants processed 4996 
Variants filtered out 61

Based on which criteria were these 61 SNVs filtered out, given that I ran with default parameters?

http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=10EHmTMtwHwuCD3A-5006118

ADD REPLY
0

Your job says that you've filtered by frequency:

Filter by frequency: Exclude variants with MAF greater than 0.01 in 1000 genomes (1KG) combined population

Also, please do not hijack old posts to ask continued questions. regrant opened this post over 2.5 years ago: I'm sure they do not want to be receiving notification after notification on your particular problems that are mostly unrelated to their original post. You should have started a new post with your first question, and I should have told you about this back then too.

ADD REPLY
0

I am sorry

You're right

ADD REPLY
