4.9 years ago
rrbutleriii
In the VCF format, the ID field may hold multiple semicolon-separated values. In theory, there could be two dbSNP rsIDs on a single line (e.g. two indels at the same chr:pos), but for programming purposes that should not happen, correct? Has dbSNP merged all variants at a given position into a common rsID?
Follow-up: so when parsing a VCF, would I then have to anticipate some variant callers giving me:
1 10051 rs1052373574;rs1326880612 A G,AC
I haven't ever seen that before, but I don't see anything to prohibit it.
Correct, nothing prohibits it. However, it can cause issues for downstream analysis tools: most will not support multi-allelic calls like this.
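If you are writing your own parser, it is probably safest to treat both ID and ALT as lists from the start. Here is a minimal sketch in plain Python, assuming tab-delimited data lines with at least the first five VCF columns (CHROM, POS, ID, REF, ALT); the function name `parse_variant_line` is made up for illustration and is not part of any library.

```python
def parse_variant_line(line):
    """Parse the first five columns of one VCF data line.

    Hypothetical helper: ID is split on ';' and ALT on ',' so that
    records with several identifiers or several alleles are handled.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least 5 tab-separated columns")

    chrom, pos, id_field, ref, alt_field = fields[:5]

    # '.' is the VCF missing-value marker, so map it to an empty list.
    ids = [] if id_field == "." else id_field.split(";")
    alts = [] if alt_field == "." else alt_field.split(",")

    return {"chrom": chrom, "pos": int(pos), "ids": ids, "ref": ref, "alts": alts}


# The line from the question, written with explicit tabs.
record = parse_variant_line("1\t10051\trs1052373574;rs1326880612\tA\tG,AC")
print(record["ids"])   # ['rs1052373574', 'rs1326880612']
print(record["alts"])  # ['G', 'AC']
```

Note that this keeps the IDs and ALT alleles as two independent lists. As far as I know, the spec does not say that the n-th ID belongs to the n-th ALT allele, so I would not pair them up by position.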