Download latest reference genome assembly for exome sequencing alignment and variant calling
5.8 years ago
svlachavas ▴ 790

Dear Community,

I would like to download the latest human reference genome assembly (hg38/GRCh38) to use for both read alignment and variant calling in an exome sequencing project. However, I'm a bit confused about the available options and the different sources, such as UCSC and NCBI. In detail:

1) If I want to download the latest available human reference assembly, would this be the right location: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.38_GRCh38.p12 ?

And specifically the file GRCh38.p12_genomic.fna.gz?

2) Moreover, the "roughly equivalent" alternative from UCSC is at the following link:

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/

Does this contain only the original hg38 assembly from 2013, rather than the latest patch release that NCBI offers above, or does it also include the subsequent updates?

Thank you in advance,

Efstathios-Iason

reference genome dna sequence alignment • 2.5k views

See this blog post from Heng Li: Which human reference genome to use?


Thank you very much for the link.


Take a look at GENCODE, which distributes the GRCh38 genome sequence together with the reference gene annotation for human.


Dear genomax,

Thank you for the alternative suggestion. So you would recommend the GENCODE reference assembly for my purpose? Or does each source have particular strengths that I should take into account?


The GRCh38 reference assembly is identical everywhere; the original release was in December 2013. Patch releases have been issued since then, but they don't affect chromosomal coordinates. Annotations, however, may differ slightly depending on where you get them. Is this targeted or whole exome sequencing?
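
To see why patch releases leave chromosomal coordinates alone, it can help to list what a given FASTA file actually contains: patches arrive as extra scaffolds next to the primary chromosomes rather than as edits to them. The following is a minimal sketch that tallies sequence names by type. It assumes UCSC-style names (`chr1`, `chr1_..._alt`, `chr1_..._fix`, `chr1_..._random`, `chrUn_...`); the helper names and the file path in the usage note are hypothetical, and NCBI files use accession-style names that this sketch would report as "other".

```python
import re

# Assumption: UCSC-style sequence names. Primary chromosomes are chr1..chr22,
# chrX, chrY and chrM; everything else carries a suffix or the chrUn_ prefix.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def classify_sequence(name: str) -> str:
    """Return a coarse category for one sequence name (hypothetical helper)."""
    if PRIMARY.match(name):
        return "primary"
    if name.endswith("_alt"):
        return "alt"
    if name.endswith("_fix"):
        return "fix_patch"
    if name.endswith("_random"):
        return "unlocalized"
    if name.startswith("chrUn_"):
        return "unplaced"
    return "other"


def fasta_names(lines):
    """Yield the first word of every FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def summarize(lines):
    """Count sequences per category for an iterable of FASTA lines."""
    counts = {}
    for name in fasta_names(lines):
        kind = classify_sequence(name)
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

For a gzipped download this could be run as `summarize(gzip.open("hg38.fa.gz", "rt"))`, where the path is a placeholder. If the aligner should only see the primary assembly, the same classification can be used to decide which records to keep.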


Dear genomax,

Thank you for the information and comments. Whole exome sequencing was performed (genomic DNA captured with Agilent in-solution enrichment, paired-end 75-base massively parallel sequencing on an Illumina HiSeq 4000), and I already have the FASTQ files. My next step is to align the reads and then call variants, as mentioned.
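
The alignment step described here could look roughly like the sketch below. The thread does not name any tools, so BWA-MEM and samtools are assumptions, as are the function name, the read-group fields, and all file names. The function only builds the command lines as strings for inspection and does not run them.

```python
import shlex


def alignment_commands(reference, fastq_r1, fastq_r2, sample, threads=4):
    """Build shell commands for paired-end alignment (hypothetical helper).

    Assumes BWA-MEM for alignment and samtools for sorting and indexing.
    """
    # bwa expects the two-character sequence "\t" in the read-group string
    # and converts it to real tabs itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    bam = f"{sample}.sorted.bam"

    index_cmd = ["bwa", "index", reference]
    align_cmd = ["bwa", "mem", "-t", str(threads), "-R", read_group,
                 reference, fastq_r1, fastq_r2]
    # "-" tells samtools sort to read the alignments from standard input.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", bam, "-"]
    bam_index_cmd = ["samtools", "index", bam]

    return [
        shlex.join(index_cmd),                            # one-time reference index
        shlex.join(align_cmd) + " | " + shlex.join(sort_cmd),  # align, then sort
        shlex.join(bam_index_cmd),                        # index the sorted BAM
    ]
```

Printing the result of `alignment_commands("hg38.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample")` gives three lines that can be reviewed before they are pasted into a shell. Whichever reference file is chosen, the same file should be used for both alignment and variant calling so that the coordinates match.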
