Can Base Pair Positions Change Between Patches Of Builds? (Say Between Grch37.P5 And Grch37.P10)
10.1 years ago
Abdel ▴ 410

Can base pair positions change between patches of builds? Are there for example SNPs or genes that can have different base pair positions between GRCh37.p5 and GRCh37.p10?

ncbi snp genes position • 3.6k views
10.1 years ago
deanna.church ★ 1.1k

I'm not sure where you are getting your data, but here are some things to think about.

  1. Assembly: The only difference between the patch releases (GRCh37.p*) and GRCh37 is the patch sequences themselves. There are a few examples of patches that have been updated between patch releases, but this is relatively rare, and the data that was released as part of GRCh37 never changes. Here is some more information about the assembly model: http://www.ncbi.nlm.nih.gov/assembly/model/. For more information about the GRC you can check here: http://genomereference.org

  2. Annotation runs: Annotation runs tend to be independent of assembly builds. Ensembl re-annotates on a regular basis, as does NCBI. Because the patch releases happen quarterly, their annotation releases will typically be based on different patch versions of the assembly. So annotation locations can change between patch releases, but this can be due to annotation run issues (new evidence, algorithms, etc.). Including different sequences in the assembly can also affect the annotation. For example, if a gene isn't well represented in the Primary assembly (say GRCh37), it may align to a related location and get annotated there. If the GRC releases a fix patch containing the real location for this gene, then the annotation at the location in the Primary assembly would likely (hopefully!) change.
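To see whether a particular annotation moved between two annotation runs, you can compare the reported locations directly. Here is a minimal sketch in Python, assuming you have already loaded each run into a dictionary keyed by feature ID. The function name, the feature IDs, the coordinates and the patch name `HYPOTHETICAL_FIX_PATCH` are all made up for illustration and do not come from any real release.

```python
from typing import Dict, Tuple

# (sequence name, start, end); toy representation of an annotated location
Location = Tuple[str, int, int]


def changed_locations(
    old: Dict[str, Location], new: Dict[str, Location]
) -> Dict[str, Tuple[Location, Location]]:
    """Return features present in both runs whose location differs."""
    changed = {}
    for feature_id, old_loc in old.items():
        new_loc = new.get(feature_id)
        if new_loc is not None and new_loc != old_loc:
            changed[feature_id] = (old_loc, new_loc)
    return changed


# Toy data: GENE_B is re-annotated onto a (hypothetical) fix patch in run_b.
run_a = {"GENE_A": ("chr8", 1000, 5000), "GENE_B": ("chr8", 9000, 12000)}
run_b = {
    "GENE_A": ("chr8", 1000, 5000),
    "GENE_B": ("HYPOTHETICAL_FIX_PATCH", 200, 3200),
}

moved = changed_locations(run_a, run_b)
for feature_id, (before, after) in moved.items():
    print(feature_id, before, "->", after)
```

A difference reported this way only tells you that the annotation changed. It does not tell you whether the cause was a new patch or simply a new annotation run.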

Does that help?

Deanna

2.4 years ago
DavidStreid ▴ 90

No (I believe), assuming that the gene/SNP still maps to the primary assembly.

Patches "add information to the assembly without disrupting the chromosome coordinates" and, like Deanna said, these patches (i.e. the >patch_id ... entries in the reference FASTA file) are the only differences. So assuming that the dbSNP entry, sequencing read, etc. for that SNP/gene still maps to the primary assembly in both GRCh37.p5 and GRCh37.p10, and not to a patched scaffold unique to GRCh37.p10, the position is the same.

For instance, take an example SNP mapping to the primary assembly on chromosome 8: its record says it is mapped to chromosome 8 (>chr8 ... in the reference file), and that position will stay constant as long as anything with that SNP still maps there.

However, take another SNP mapping to a patched scaffold, NT_187515.1: if a read with this SNP were mapped to a reference without this patch/scaffold, it would map to a different scaffold and position.
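If you want to check which case you are in, one option is to compare the sequence names in the two reference FASTA files. Below is a rough sketch in Python, assuming each sequence starts with a `>name ...` header line. The helper names are my own, and the two "releases" are tiny in-memory stand-ins rather than real GRCh37.p5 / GRCh37.p10 content.

```python
from typing import Iterable, Set


def fasta_sequence_names(lines: Iterable[str]) -> Set[str]:
    """Collect the first word after '>' from every FASTA header line."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def position_is_comparable(seq_name: str, old: Set[str], new: Set[str]) -> bool:
    """True if the sequence exists under the same name in both references."""
    return seq_name in old and seq_name in new


# Toy stand-ins for two patch releases (not real sequence data).
old_release = [">chr8 toy primary sequence", "ACGTACGT"]
new_release = [
    ">chr8 toy primary sequence",
    "ACGTACGT",
    ">NT_187515.1 toy patch scaffold",
    "GGCCTTAA",
]

names_old = fasta_sequence_names(old_release)
names_new = fasta_sequence_names(new_release)

# Sequences that exist only in the newer release, i.e. candidate patch scaffolds.
only_in_new = names_new - names_old
print(sorted(only_in_new))
```

With real files you would pass an open file handle instead of a list. A coordinate on a name that appears in both sets can be compared directly, while a coordinate on a name in `only_in_new` has no counterpart in the older reference.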
