Biostar Beta. Not for public use.
Multiple rsIDs at chromosomal location?
13 months ago
rrbutleriii • 60
US, Chicago

In the VCF format, the ID field may hold multiple semicolon-separated values. In theory, a single line could carry two dbSNP rsIDs (e.g. two indels at the same chr:pos), but for programming purposes that should not happen, correct? Has dbSNP merged all variants for a given position into a common rsID?

14 months ago
France/Nantes/Institut du Thorax - INSE…

dbSNP has merged all variants for a given position to a common rsID?

I'm afraid not. Many positions appear on more than one line:

$ wget -q -O - "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/All_20180418.vcf.gz" | gunzip -c | grep -v "^#" | cut -f 1,2 | uniq -d | head
1   10051
1   10055
1   10108
1   10109
1   10128
1   10132
1   10177
1   10228
1   10229
1   10235

For example, position 1:10051 has two separate records, each with its own rsID:

$ wget -q -O - "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/All_20180418.vcf.gz" | gunzip -c | grep -v "^#" | cut -f 1,2,3,4,5 | grep -w 10051 -m2
1   10051   rs1052373574    A   G
1   10051   rs1326880612    A   AC
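
The two commands above can be folded into a single pass that lists the rsIDs sharing each position. This is only a minimal sketch: the function name `ids_per_position` is made up for illustration, and it assumes tab-delimited input in which records for the same position are adjacent (the same assumption `uniq -d` makes).

```shell
# List CHROM, POS and the comma-joined IDs for every position that
# occurs on more than one consecutive record of a VCF read from stdin.
# ids_per_position is a hypothetical helper name, not part of any tool.
ids_per_position() {
  awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
      key = $1 "\t" $2                     # CHROM and POS identify a position
      if (key == prev) {
        ids = ids "," $3                   # same position: collect another ID
        n++
      } else {
        if (n > 1) print prev "\t" ids     # flush the previous position if duplicated
        prev = key; ids = $3; n = 1
      }
    }
    END { if (n > 1) print prev "\t" ids } # flush the last position
  '
}

# Usage, assuming the file was downloaded locally beforehand:
# gunzip -c All_20180418.vcf.gz | ids_per_position | head
```

Reading from stdin keeps it usable with the same `wget ... | gunzip -c` pipeline as above.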

Follow-up: so when parsing a VCF, should I anticipate some variant callers giving me a line like this?

1   10051   rs1052373574;rs1326880612   A   G,AC

I haven't ever seen that before, but I don't see anything to prohibit it.


Correct, nothing prohibits it. However, it can cause problems for downstream analysis tools: most will not support multi-allelic calls like this.
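
On the parsing side, a defensive approach is to treat ID as a semicolon-separated list whenever you read it. The sketch below emits one line per ID and leaves REF and ALT untouched. It deliberately does not try to pair each rsID with an ALT allele because, as far as I know, the record itself does not say which ID belongs to which allele. The function name `explode_ids` is hypothetical.

```shell
# Emit one line per ID for each VCF record read from stdin.
# Output columns: CHROM, POS, single ID, REF, ALT (ALT is passed through unchanged).
# explode_ids is a hypothetical helper name, not part of any tool.
explode_ids() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                     # skip header lines
    {
      n = split($3, ids, ";")         # ID may hold several semicolon-separated values
      for (i = 1; i <= n; i++)
        print $1, $2, ids[i], $4, $5
    }
  '
}

# Usage (input.vcf is a placeholder file name):
# explode_ids < input.vcf
```

If the real goal is to feed tools that expect biallelic records, `bcftools norm -m -any` splits multi-allelic sites, though I have not checked how it handles the ID column in that case.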
