Why is this SNV (chr5:161300148-161300148) G in the BioMuta database rather than T or A as in Ensembl?
8.0 years ago
winter_li ▴ 60

Hi, I found that the base for this SNV (chr5:161300148-161300148 281 G A) is G in the BioMuta2 Complete Dataset from https://hive.biochemistry.gwu.edu/tools/biomuta/index_download.php.

But when I look up the same position, chr5:161300148-161300148, in Ensembl (http://asia.ensembl.org/Homo_sapiens/Location/View?r=5:161300148-161300148;db=core), the base shown there is A or T.

sequencing SNP Assembly gene ensembl • 2.0k views
8.0 years ago
Denise CS ★ 5.2k

If the coordinates 5:161300148-161300148 are relative to GRCh37, then the base is indeed G in Ensembl GRCh37. However, there is no SNV (single nucleotide variant) in that region. If the coordinates are relative to GRCh38, there is a CNV (a structural variant or SV, not an SNV) in the region. Coordinates for SNPs in Ensembl are reported on the base rather than between bases, so 5:161300148-161300148 is correct for a SNP in Ensembl: the start coordinate should be the same as the end coordinate.
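
To illustrate the "on the base" convention, here is a minimal Python sketch. It assumes a made-up toy sequence (not real chromosome 5 data), and the helper names are hypothetical. In a 1-based, inclusive system a single base has start == end, whereas the same base in a 0-based, half-open system (as used by BED files) has end == start + 1.

```python
def one_based_to_zero_based(start, end):
    """Convert a 1-based inclusive interval to a 0-based half-open one."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def fetch(seq, start, end):
    """Return the bases of seq covered by the 1-based inclusive interval."""
    zero_start, zero_end = one_based_to_zero_based(start, end)
    return seq[zero_start:zero_end]


# Toy sequence for illustration only; it is NOT taken from any assembly.
toy = "ACGTG"

# A single base is addressed with start == end in 1-based inclusive coordinates.
snv_base = fetch(toy, 3, 3)
print(snv_base)

# The position from the question, expressed in 0-based half-open form.
print(one_based_to_zero_based(161300148, 161300148))
```

Mixing the two conventions shifts the interval by one base, which is a separate issue from comparing coordinates across different assemblies.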


Hi, how do I know that 5:161300148-161300148 is G on GRCh37 in Ensembl? When I search http://www.ensembl.org/Homo_sapiens/Location/View?r=5:161300148-161300148;db=core, this position is A/T. So what happened?

I also checked https://hive.biochemistry.gwu.edu/tools/biomuta/index_download.php (BioMuta 2.0): the reference base at that position is G, which differs from Ensembl.


The clue is in the URL, www.ensembl.org. That is the main site, which shows the latest assembly of the human genome, GRCh38. For the previous assembly, GRCh37, the URL is grch37.ensembl.org. If you search for those coordinates on the latter site, you will see G.
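
As a small sketch of the point about hosts, the snippet below builds the Location/View link for the same coordinates on each assembly. The two host names come from this thread. The function name and the mapping are hypothetical helpers, and the use of `https` is an assumption. Nothing is fetched; it only constructs the URLs.

```python
# Host per assembly, as described above: the main site serves GRCh38,
# and the dedicated archive site serves GRCh37.
ENSEMBL_HOSTS = {
    "GRCh38": "www.ensembl.org",
    "GRCh37": "grch37.ensembl.org",
}


def location_url(assembly, chrom, start, end):
    """Build an Ensembl Location/View URL for a human region (hypothetical helper)."""
    if assembly not in ENSEMBL_HOSTS:
        raise ValueError(f"unknown assembly: {assembly}")
    host = ENSEMBL_HOSTS[assembly]
    return f"https://{host}/Homo_sapiens/Location/View?r={chrom}:{start}-{end}"


# Same coordinates, two different assemblies, two different sites.
for assembly in ("GRCh37", "GRCh38"):
    print(assembly, location_url(assembly, "5", 161300148, 161300148))
```

Opening both links side by side makes it clear that the same numeric position refers to different bases on the two assemblies.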

8.0 years ago

I can't immediately find it in BioMuta2, but my initial assumption is that its coordinates are on genome build GRCh37 and you're looking at Ensembl on GRCh38. Additionally, your SNP notation contains just a single coordinate (begin == end), which is not how a SNP is represented in Ensembl, because the coordinates are between the bases and not on the bases, e.g. your position would be chr5:161300148-161300149. That for sure is a difference between your two datasets. I can't find which genome build BioMuta is based on.

Edit: I was confused; please see Denise's reply!
