VEP coordinates do not give the same sequence as the ref sequence in a GSvar file
3.3 years ago

Dear all,

For an analysis, I have GSvar files of whole-genome variants called against GRCh37. My approach is to combine the VEP coordinates with transcript.coding_sequence from pyensembl.
Take one particular deletion:

chr     start       end         ref     obs

chr1    248616705   248616711   TGCTGCG -

The VEP column for the same variant row reads:

OR2T2:ENST00000342927:frameshift_variant:HIGH:exon1/1:c.612_618del:p.Cys204Ter:PF13853 [Olfactory receptor]

Using pyensembl, the coding sequence at the VEP coordinates is:

>>>seq = "ATGGGCATGGAGGGTCTTCTCCAGAACTCCACTAACTTCGTCCTCACAGGCCTCATCACCCATCCTGCCTTCCCCGGGCTTCTCTTTGCAATAGTCTTCTCCATCTTTGTGGTGGCTATAACAGCCAACTTGGTCATGATTCTGCTCATCCACATGGACTCCCGCCTCCACACACCCATGTACTTCTTGCTCAGCCAGCTCTCCATCATGGATACCATCTACATCTGTATCACTGTCCCCAAGATGCTCCAGGACCTCCTGTCCAAGGACAAGACCATTTCCTTCCTGGGCTGTGCAGTTCAGATCTTCCTCTACCTGACCCTGATTGGAGGGGAATTCTTCCTGCTGGGTCTCATGGCCTATGACCGCTATGTGGCTGTGTGCAACCCTCTACGGTACCCTCTCCTCATGAACCGCAGGGTTTGCTTATTCATGGTGGTCGGCTCCTGGGTTGGTGGTTCCTTGGATGGGTTCATGCTGACTCCTGTCACTATGAGTTTCCCCTTCTGTAGATCCCGAGAGATCAATCACTTTTTCTGTGAGATCCCAGCCGTGCTGAAGTTGTCTTGCACAGACACGTCACTCTATGAGACCCTGATGTATGCCTGCTGCGTGCTGATGCTGCTTATCCCTCTATCTGTCATCTCTGTCTCCTACACGCACATCCTCCTGACTGTCCACAGGATGAACTCTGCTGAGGGCCGGCGCAAAGCCTTTGCTACGTGTTCCTCCCACATTATGGTGGTGAGCGTTTTCTACGGGGCAGCCTTCTACACCAACGTGCTGCCCCACTCCTACCACACTCCAGAGAAAGATAAAGTGGTGTCTGCCTTCTACACCATCCTCACCCCCATGCTCAACCCACTCATCTACAGCTTGAGGAATAAAGATGTGGCTGCAGCTCTGAGGAAAGTACTAGGGAGATGTGGTTCCTCCCAGAGCATCAGGGTGGCGACTGTGATCAGGAAGGGCTAG"
>>>seq[612-1:618]
'CGTGCTG'

As you can see, this sequence is not the same as the ref column in the GSvar file (TGCTGCG).

Has anyone encountered the same case?

I would be grateful for any ideas on how to resolve such cases.


Thank you and keep safe!

Damianos

VEP pyensembl deletion GRCh37 • 1.0k views

Thank you very much for the thorough explanation!

I now understand the phenomenon :)

3.3 years ago
Emily 23k

This is down to a particular quirk of HGVS: when a deletion or insertion can be placed at several positions with the same outcome, it is shifted to the right (3') end. The location in the CDS of your input deletion is 607-613, whereas the HGVS shows it as 612-618. The sequence of this region is:

607 - TGCTGCGTGCTGA - 619
       C  C  V  L

If you delete either 607-613 or 612-618, you get:

607 - TGCTGA - 619
       C  *
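
A quick way to check this is to apply both deletions to the 13-base window shown above. This is only a sketch: the `delete` helper and the `WINDOW_START` offset are names made up for illustration, not part of VEP or pyensembl.

```python
# CDS positions 607-619 of ENST00000342927, copied from the window above.
WINDOW = "TGCTGCGTGCTGA"
WINDOW_START = 607  # CDS position of the first base in WINDOW


def delete(seq, start, end, offset=WINDOW_START):
    """Remove 1-based inclusive CDS positions start..end from seq.

    `offset` is the CDS position of seq[0] (hypothetical helper).
    """
    i = start - offset        # 0-based index of first deleted base
    j = end - offset + 1      # 0-based index just past the last deleted base
    return seq[:i] + seq[j:]


gsvar_result = delete(WINDOW, 607, 613)  # deletion as placed in the GSvar file
hgvs_result = delete(WINDOW, 612, 618)   # deletion as reported in c.612_618del

print(gsvar_result, hgvs_result, gsvar_result == hgvs_result)
```

Both calls leave the same six bases, so the two descriptions refer to the same variant.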

Since the outcome is the same whichever bases you delete, the HGVS is shifted to the right-most location where this is true.
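
If you want to reproduce the shift yourself, the idea can be sketched as below. This is a minimal illustration of right-shifting a deletion on a plain string, assuming a forward-strand transcript and 1-based inclusive coordinates; `shift_deletion_right` is a hypothetical helper, not a VEP or pyensembl function, and the window is only the 13 bases quoted above rather than the full CDS.

```python
def shift_deletion_right(seq, start, end):
    """Shift a deletion (1-based, inclusive) as far right as possible.

    A deletion can move one base to the right whenever the first deleted
    base equals the base immediately after the deleted span, because the
    resulting sequence is unchanged.
    """
    while end < len(seq) and seq[start - 1] == seq[end]:
        start += 1
        end += 1
    return start, end


window = "TGCTGCGTGCTGA"  # CDS 607-619, from the answer above
offset = 607              # CDS position of window[0]

# GSvar placement 607-613 expressed in window coordinates (1-7).
s, e = shift_deletion_right(window, 607 - offset + 1, 613 - offset + 1)
cds_start, cds_end = s + offset - 1, e + offset - 1

print(cds_start, cds_end)   # shifted placement in CDS coordinates
print(window[s - 1:e])      # bases removed at the shifted placement
```

On this window the shift stops at 612-618, where the next base (A) no longer matches, and the deleted bases there are CGTGCTG, the same string the pyensembl slice in the question returned.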
