Differences Between Reference Human Genome Assemblies From Different Sources
10.5 years ago
alpha2zee ▴ 120

I am relatively new to analysis of whole transcriptome RNA sequencing data. I am planning to map human RNA sequencing reads against the reference human genome/transcriptome (i.e., generate BAM files from fastq files).

I notice that reference genome assemblies are available from a number of sources: UCSC (currently hg19), Ensembl (currently GRCh37.73), the 1000 Genomes Project (currently v37), etc. All of these releases seem to be based on the Genome Reference Consortium's GRCh37 release.

(1) What are the differences between such different genome assemblies?

(2) What are the differences between the different releases from Ensembl (e.g., GRCh37.70 vs .71)?

(3) For my purpose (aligning raw reads to obtain gene expression data for differential expression analysis), does it matter if I use one GRCh37-based reference assembly for one group of samples and, in the future, a different GRCh37-based assembly (from a different source, or from the same source but a different release) for another group of samples?

(4) Finally, can I use the reference genome assembly from one source or release and a gene annotation file from another source or release, as long as both are based on GRCh37?

Thank you.

rna-seq • 13k views
10.5 years ago

1) What are the differences between such different genome assemblies?

see http://plindenbaum.blogspot.fr/2013/07/g1kv37-vs-hg19.html

2) What are the differences between the different releases from Ensembl

see "What's the difference between two versions of the same assembly?"

3) For human, I would say you'd be better off using the data from the GATK bundle, to stay close to their pipeline.

4) Yes, but you'll have to verify that they use the same names for the chromosomes (e.g., the "chr" prefix).
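One way to do that check is to compare the sequence names in the FASTA headers against the first column of the GTF. This is only a minimal sketch: the function names are made up for illustration, and it assumes an uncompressed FASTA and a tab-separated GTF whose comment lines start with "#".

```python
import io


def fasta_contigs(handle):
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_contigs(handle):
    """Collect sequence names from the first column of a GTF, skipping comments and blank lines."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_contigs(fasta_names, gtf_names):
    """Report which names are shared and which appear in only one of the two files."""
    return {
        "shared": fasta_names & gtf_names,
        "only_in_gtf": gtf_names - fasta_names,
        "only_in_fasta": fasta_names - gtf_names,
    }


# Toy in-memory data (not real annotation): a "chr"-prefixed FASTA
# against a GTF that uses unprefixed names.
demo_fasta = io.StringIO(">chr1 toy sequence\nACGT\n>chrM\nACGT\n")
demo_gtf = io.StringIO(
    "#!genome-build GRCh37\n"
    '1\tsource\tgene\t1\t4\t.\t+\t.\tgene_id "g1";\n'
    'MT\tsource\tgene\t1\t4\t.\t+\t.\tgene_id "g2";\n'
)
report = compare_contigs(fasta_contigs(demo_fasta), gtf_contigs(demo_gtf))
for key in ("shared", "only_in_gtf", "only_in_fasta"):
    print(key, sorted(report[key]))
```

For real files, pass open file handles instead of the StringIO objects. If "shared" comes back empty, or "only_in_gtf" lists chromosomes you care about, the two files use different naming and one of them needs to be renamed before alignment and counting.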
