VT normalize error with inconsistent fasta file for clinvar20181028 GRCh37 build
5.4 years ago

I keep running into trouble while trying to normalize a ClinVar VCF file (hg19 build) with vt. I have tried every recent hg19/v37 reference FASTA but still get the same error:

[variant_manip.cpp:96 is_not_ref_consistent] reference bases not consistent: Y:555381-555381 A(REF) vs N(FASTA)
[normalize.cpp:209 normalize] Normalization not performed due to inconsistent reference sequences. (use -n or -m option to relax this)

Do you know where to find the reference file ClinVar used to build their latest GRCh37 VCF, or any other way to solve this problem? Much appreciated!

5.4 years ago

Hello vnttung.iseartclub,

This position is located in the pseudoautosomal region (PAR). In the reference files used for alignment, this region is usually masked with N on the Y chromosome, which is why vt sees N in the FASTA where the VCF has A as REF. The reasons for the masking are described in more detail in this tutorial.
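To confirm that a given FASTA really is masked at the reported position, a small lookup like the one below can help. It is only a minimal sketch in plain Python: `base_at` is a hypothetical helper name, it scans the file line by line (slow on a whole genome, but it needs no index), and it assumes the contig name is the first word of each header line.

```python
def base_at(fasta_lines, chrom, pos):
    """Return the base at 1-based position `pos` on contig `chrom`.

    `fasta_lines` is any iterable of FASTA lines (an open file works).
    Returns None if the contig or the position is not present.
    """
    current = None   # name of the contig currently being read
    offset = 0       # number of bases of that contig consumed so far
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if current == chrom:
                return None  # contig ended before reaching pos
            current = line[1:].split()[0]
            offset = 0
        elif current == chrom:
            if pos <= offset + len(line):
                return line[pos - offset - 1]
            offset += len(line)
    return None


# Toy data standing in for a real reference; not actual genome sequence.
toy_fasta = [">Y toy contig", "NNNN", "ACGT", ">X", "ACGT"]
print(base_at(toy_fasta, "Y", 2))  # a masked position in the toy data
```

With a real reference one would pass an open file handle instead of the toy list, using whatever contig name that file uses for the Y chromosome (for example `Y` or `chrY`). If the result is `N`, the reference is masked at that position.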

I wonder how ClinVar can be sure that this variant is located on Y and not on X. Nevertheless, you have two options:

  1. Ignore variants that are located in the PAR region of the Y chromosome for normalization.
  2. Find a reference sequence where this region isn't masked. One way is described in "Which human reference genome should I use?".
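Option 1 could be sketched roughly as follows, splitting a VCF into records to normalize and chrY PAR records to set aside. This is a minimal sketch, not a tested pipeline: the function names are hypothetical, and the PAR intervals are the GRCh37 chrY coordinates as commonly cited, so please verify them against the Genome Reference Consortium before relying on them.

```python
# Assumed GRCh37 chrY PAR intervals (1-based, inclusive). Verify before use.
PAR_Y_GRCH37 = [(10001, 2649520), (59034050, 59363566)]


def in_par_y(chrom, pos, intervals=PAR_Y_GRCH37):
    """True if (chrom, pos) falls inside one of the chrY PAR intervals."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    return name == "Y" and any(start <= pos <= end for start, end in intervals)


def split_vcf_lines(lines):
    """Split VCF lines into (kept, skipped).

    Header lines and records outside the chrY PAR go to `kept`;
    records inside the chrY PAR go to `skipped`.
    """
    kept, skipped = [], []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        (skipped if in_par_y(chrom, pos) else kept).append(line)
    return kept, skipped


# Made-up records for illustration only; IDs and alleles are placeholders.
demo = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "X\t555381\t.\tA\tG\t.\t.\t.",
    "Y\t555381\t.\tA\tG\t.\t.\t.",
    "Y\t5000000\t.\tC\tT\t.\t.\t.",
]
kept, skipped = split_vcf_lines(demo)
print(len(kept), "lines kept,", len(skipped), "chrY PAR records skipped")
```

The `kept` lines can then be written to a new VCF and passed to vt normalize, while the `skipped` records are handled separately or merged back in unnormalized.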

fin swimmer

PS: @ Bastien Hervé This time I added the link to wiki again ;)
