How to choose correct VCF for Human reference genome?
5.6 years ago

Dear all,

Based on this post, it is clear that the human reference sequence provided by NCBI is the best, as I also found computationally (see this post). My question now concerns the downstream application. Which VCF is the correct one to use with the GRCh38 file GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz?

Is clinvar.vcf.gz the right one?

And what about Homo_sapiens.GRCh38.dna.toplevel.fa.gz? Would it use the same VCF, or is there a specific one for it?

Finally, how can I check beforehand whether the sequence headers of the reference FASTA file match the chromosome names in the VCF file? This is to avoid problems such as this or this.

Thank you.

alignment genome VCF • 1.5k views

Hello marongiu.luigi ,

The only things you have to take care of are:

  1. The VCF is based on hg38 if you aligned to one of the hg38 reference genomes, or on hg19 if you aligned to hg19.
  2. The naming convention for the chromosomes is the same as in the reference you aligned to.
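
To check the second point beforehand, something like the following could work. It is only a minimal sketch using the Python standard library: it collects the sequence names from the FASTA header lines and the names in the first (CHROM) column of the VCF, then reports VCF names that are absent from the reference, for example `1` in one file versus `chr1` in the other. The function names are made up for illustration, and it assumes both files are plain text or gzip/bgzip compressed.

```python
import gzip


def open_text(path):
    """Open a plain or gzip/bgzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_names(path):
    """Return the set of sequence names (first word after '>') in a FASTA file."""
    names = set()
    with open_text(path) as fh:
        for line in fh:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def vcf_names(path):
    """Return the set of CHROM values used by the records of a VCF file."""
    names = set()
    with open_text(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines and all header lines (## meta lines and #CHROM).
            if not line or line.startswith("#"):
                continue
            names.add(line.split("\t", 1)[0])
    return names


def missing_from_reference(fasta_path, vcf_path):
    """Return the sorted VCF chromosome names that the FASTA does not contain."""
    return sorted(vcf_names(vcf_path) - fasta_names(fasta_path))
```

Calling `missing_from_reference("GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz", "clinvar.vcf.gz")` should return an empty list when every chromosome name used in the VCF exists in the reference. Note that this reads both files completely, so it can take a while on a whole genome; if a `.fai` index exists, reading the names from its first column would be faster.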

fin swimmer


Is clinvar.vcf.gz the right one?

This depends entirely on the question you want to answer. Any VCF that is based on the same reference genome is "technically correct".
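
One way to see which reference a VCF claims to be based on is to look at its header. The sketch below is an assumption-laden illustration, not a definitive check: it collects the optional `##reference=` line and any `##contig=<...>` lines, which may carry `length` and `assembly` fields. Not every VCF includes these lines, and the function name is hypothetical.

```python
import gzip
import re


def vcf_reference_hints(path):
    """Collect the ##reference value and ##contig fields from a VCF header."""
    opener = gzip.open if str(path).endswith(".gz") else open
    hints = {"reference": None, "contigs": {}}
    with opener(path, "rt") as fh:
        for line in fh:
            if not line.startswith("##"):
                break  # meta-information lines always come first
            line = line.rstrip()
            if line.startswith("##reference="):
                hints["reference"] = line[len("##reference="):]
            elif line.startswith("##contig=<") and line.endswith(">"):
                body = line[len("##contig=<"):-1]
                # Split key=value pairs; values may be quoted strings.
                fields = dict(re.findall(r'(\w+)=("[^"]*"|[^,]+)', body))
                if "ID" in fields:
                    hints["contigs"][fields["ID"]] = fields
    return hints
```

If the header lists contig lengths, comparing them with the sequence lengths of the reference FASTA gives a further hint that both files describe the same assembly.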
