Updating dbSNP rs ids from old SNP data
8.7 years ago
devenvyas ▴ 740

I have downloaded some SNP data sets published in 2012 (http://www.biologiaevolutiva.org/dcomas/north-african-affy-6-0-data-henn-et-al-submitted/, http://mega.bioanth.cam.ac.uk/data/Ethiopia/).

I am trying to merge these data with recent data, but the 2012 data sets appear to use an older genome build (I am assuming hg18), and their rs numbers don't match my newer, existing data as well as I would expect.

For example, here is a site that appears in both of those 2012 map files:

1    rs7519837    1.06103    1500664
1    rs7519837    0    1500664

but when you search for it in dbSNP, the coordinates are different.

I was wondering, how can I individually update the coordinates and rsIDs of these map files? Thanks!

snp dbSNP • 5.8k views

I have a question related to this, so I thought I might just stay on the same thread:

I have a VCF file of variants (what else?) from hg18-aligned sequences. I need to convert these variants to hg19.

My question is: do I need to be concerned about the difference in sequence between the two genome versions?

For example:

One of my variants is at chr10:8365. The hg18 reference base is a T, but our BAM file has a C at that position.

In this case, hg19 also has a T at that position, but could the reference base differ between builds? And if so, are there any tools that account for such sequence differences?

Thanks in advance!

Wyatt


Detailed post on tools for converting coordinates between genome builds: Converting Genome Coordinates From One Genome Version To Another (Ucsc Liftover, Ncbi Remap, Ensembl Api)

8.7 years ago
h.mon 35k

You can use liftOver, or CrossMap.


@h.mon, liftOver doesn't reflect the rs number changes, right?

Those do not support Plink format files.

You can easily change the plink bim/map file into the bed format required by liftOver, then you can convert it back to the plink format.


Any suggestion on how I would do that?


Perl (my choice) or awk (a general favorite around Biostars). If the file is not too large relative to your computer's memory, you can even do this in Excel / LibreOffice Calc.
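
For anyone who would rather not write Perl or awk, here is a minimal Python sketch of the MAP-to-BED step. It assumes a standard four-column PLINK .map file (chromosome, SNP ID, genetic distance, 1-based position) and writes a four-column BED with the rs ID in the name column. The function names and the chromosome-code table are my own illustration, not part of PLINK or liftOver.

```python
# PLINK chromosome codes that UCSC names differently (illustrative table).
PLINK_TO_UCSC = {"23": "chrX", "24": "chrY", "25": "chrX", "26": "chrM",
                 "X": "chrX", "Y": "chrY", "XY": "chrX", "MT": "chrM"}


def map_line_to_bed(line):
    """Turn one PLINK .map line into a 4-column BED line."""
    chrom, snp_id, _cm, pos = line.split()
    pos = int(pos)
    ucsc_chrom = PLINK_TO_UCSC.get(chrom, "chr" + chrom)
    # BED is 0-based and half-open; PLINK positions are 1-based.
    return f"{ucsc_chrom}\t{pos - 1}\t{pos}\t{snp_id}"


def map_to_bed(map_path, bed_path):
    """Convert a whole .map file, skipping blank lines."""
    with open(map_path) as src, open(bed_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(map_line_to_bed(line) + "\n")
```

The resulting file can then be passed to the command-line tool in the usual form, `liftOver in.bed hg18ToHg19.over.chain.gz out.bed unmapped.bed`, assuming the data really are on hg18.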


I found a script that converts Plink MAP to UCSC BED.

However, there is still a big problem with LiftOver. It omits a whole bunch of results, which means I can't just convert the output back to MAP, because there are inconsistent numbers of rows.

Successfully converted 274294 records: View Conversions
Conversion failed on 5237 records.    Display failure file    Explain failure messages
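
One way around the row mismatch is to join the liftOver output back to the old MAP by SNP ID instead of by row number. The sketch below assumes the BED given to liftOver carried the rs ID in its fourth column and that IDs are unique. `rebuild_map` is a hypothetical helper name. It returns the new MAP lines plus the IDs that failed to convert (or landed on an alternate contig, which PLINK would not recognise as a chromosome).

```python
# UCSC names that map to PLINK numeric codes (illustrative table).
UCSC_TO_PLINK = {"chrX": "23", "chrY": "24", "chrM": "26"}


def rebuild_map(old_map_lines, lifted_bed_lines):
    """Return (new_map_lines, dropped_ids), matching rows by SNP ID."""
    lifted = {}
    for line in lifted_bed_lines:
        if not line.strip():
            continue
        chrom, _start, end, snp_id = line.split()[:4]
        if "_" in chrom:
            # e.g. chr6_cox_hap1: treat as unmapped rather than guess.
            continue
        lifted[snp_id] = (chrom, end)

    new_map, dropped = [], []
    for line in old_map_lines:
        if not line.strip():
            continue
        _chrom, snp_id, cm, _pos = line.split()
        if snp_id in lifted:
            chrom, pos = lifted[snp_id]
            plink_chrom = UCSC_TO_PLINK.get(chrom, chrom[3:])
            # BED end equals the 1-based position for a single-base interval.
            new_map.append(f"{plink_chrom}\t{snp_id}\t{cm}\t{pos}")
        else:
            dropped.append(snp_id)
    return new_map, dropped
```

The dropped IDs can be written one per line and removed from the genotype data with PLINK's `--exclude` before the new MAP is swapped in, so the MAP and PED stay in step.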
8.7 years ago
devenvyas ▴ 740

I found a script that does the coordinate update for you. I still need to find out how to do the dbSNP rs id update.

http://genome.sph.umich.edu/wiki/LiftMap.py

It's an ID. What can you change?

dbSNP rs IDs get updated over time. For example, when two SNPs later turn out to be the same SNP, dbSNP retires one of the rs IDs because it is a synonym of the other. Old Plink data, however, is stuck with the old rs ID.
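
If you can get hold of a merge-history table (dbSNP has distributed one, RsMergeArch as far as I know), the update itself is a simple lookup. The sketch below assumes you have already loaded that table into a dictionary mapping each retired rs ID to the ID it was merged into. Both function names are hypothetical, and how you build the dictionary depends on the file layout you download.

```python
def resolve_rs(rs_id, merged_into):
    """Follow old -> new merge links until reaching an ID that was not merged."""
    seen = set()
    # An ID may have been merged more than once; 'seen' guards against loops.
    while rs_id in merged_into and rs_id not in seen:
        seen.add(rs_id)
        rs_id = merged_into[rs_id]
    return rs_id


def update_map_ids(map_lines, merged_into):
    """Rewrite the SNP ID column of PLINK .map lines with current rs IDs."""
    updated = []
    for line in map_lines:
        chrom, snp_id, cm, pos = line.split()
        updated.append("\t".join([chrom, resolve_rs(snp_id, merged_into), cm, pos]))
    return updated
```

After the update, two rows can end up with the same rs ID (that is what a merge means), so it is worth checking for duplicates before merging data sets.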


That's true. Did you find a way to solve this?
