the latest human reference genome fasta file
3.5 years ago
hana • 170
Malaysia

Hi all

I would like to download the latest human reference genome (GRCh38) in FASTA and GTF format for my RNA-seq analysis. Which database is the best: GenBank version 21 or Ensembl?

Where can I get the whole-genome FASTA file for the Ensembl version?

Does the link below contain this file?

ftp://ftp.ensembl.org/pub/release-77/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz

Is there any difference between the alignment and annotation results obtained with GenBank and with Ensembl?

Thanks in advance.

RNA-Seq • 16k views

You can download it from the UCSC database: http://hgdownload.cse.ucsc.edu/downloads.html#human

2.1 years ago
Manvendra Singh ♦ 2.1k
Berlin, Germany

"Which database is the best?"

Yes, it's the one from Ensembl.

You can download it from here, the same way you previously downloaded hg19 from UCSC.

Whole-genome FASTA:

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/

GTFs from the Table Browser:

http://genome.ucsc.edu/cgi-bin/hgTables?


Which file contains the whole genome?

[hg38.2bit](http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit)?

Thank you.


Just directly download the fasta file. There's no need to deal with 2bit.
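
For illustration, a direct download could look like the sketch below. It assumes the gzipped FASTA in the bigZips directory linked above is named hg38.fa.gz and that curl and gunzip are installed; the helper name fetch_and_unpack is made up for this example. The function is only defined here because the file is large; uncomment the last line to actually run it.

```shell
# Assumption: the gzipped FASTA is named hg38.fa.gz in the bigZips directory.
HG38_URL="http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz"

# Hypothetical helper: download a gzipped FASTA into the current
# directory and decompress it in place.
fetch_and_unpack() {
    url=$1
    name=${url##*/}                          # file name part of the URL
    curl -fL -o "$name" "$url" || return 1   # -f: fail on HTTP errors, -L: follow redirects
    gunzip -f "$name"                        # replaces name.fa.gz with name.fa
}

# fetch_and_unpack "$HG38_URL"
```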


Exactly. hana was asking about the .2bit file, so I wrote that he can convert it as well.


Sorry @Devon, where can I find the FASTA files for the individual chromosomes of hs37d5.fa?


https://www.gencodegenes.org/human/release_5.html assuming you are referring to release 5. All releases can be found on this page at GENCODE.


Yes, you need to convert it to FASTA.

You can get the utility program twoBitToFa from here:

http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

Once you have downloaded it, you must first change its permissions so that it can be executed as a program.

Then run it from a terminal. Running it without arguments shows the options:

$ /path/to/twoBitToFa

twoBitToFa - Convert all or part of .2bit file to fasta
usage:
twoBitToFa input.2bit output.fa
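
The steps above can be sketched as a small shell helper. The function name convert_2bit and the file names in the example call are placeholders; the sketch assumes twoBitToFa has already been downloaded from the directory linked above.

```shell
# Hypothetical helper: make the downloaded twoBitToFa executable,
# convert a .2bit file to FASTA, and print the number of sequences written.
# Arguments: path to twoBitToFa, input .2bit file, output .fa file
convert_2bit() {
    tool=$1
    input=$2
    output=$3
    chmod +x "$tool" || return 1         # allow the binary to be executed
    "$tool" "$input" "$output" || return 1
    grep -c '^>' "$output"               # each FASTA record starts with '>'
}

# Example call (paths are placeholders):
# convert_2bit ./twoBitToFa hg38.2bit hg38.fa
```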

thank you very much


Hi

I have already downloaded the FASTA and GTF files for hg38 from the UCSC database and run TopHat.

But I have a problem running Cufflinks. I got the error below:

cufflinks --GTF genome.gtf -o /home/ra/cufflinks /home/ra/accepted_hits.bam

[20:55:32] Loading reference annotation.
Error parsing strand (1) from GFF line:
uc001aaa.3 chr1 + 11873 14409 11873 11873 3 11873,12612,13220, 12227,12721,14409, uc001aaa.3

Can you please tell me what it means and how I can solve it?

Thank you.


That's not a GTF file. You have to explicitly set the output format to GTF, otherwise you'll get all of the table columns as is.
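
A GTF line has nine tab-separated columns with the strand (+, - or .) in column 7. In the line from the error message the strand sits in column 3, which is what a raw table dump looks like. A quick sanity check before running Cufflinks might look like the sketch below; the function name looks_like_gtf is made up for this example, and it only checks the column count and the strand field, not the full format.

```shell
# Hypothetical helper: exit status 0 if every non-comment, non-empty line
# has 9 tab-separated fields and a valid strand in field 7, 1 otherwise.
looks_like_gtf() {
    awk -F'\t' '
        /^#/ || NF == 0 { next }                      # skip comments and blank lines
        NF != 9 { bad = 1; exit }                     # GTF needs exactly 9 columns
        $7 != "+" && $7 != "-" && $7 != "." { bad = 1; exit }
        END { exit bad }
    ' "$1"
}

# Example call (file name taken from the cufflinks command above):
# looks_like_gtf genome.gtf && echo "looks like GTF" || echo "not a GTF file"
```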
