Where To Download RefSeq mRNA Sequence Data?
10.8 years ago
newDNASeqer ▴ 760

I'm following this tutorial to create indices of the genome and transcripts: http://icb.med.cornell.edu/wiki/index.php/Elementolab/BWA_tutorial. However, I cannot download the RefSeq file it uses, http://physiology.med.cornell.edu/faculty/elemento/lab/files/refGene.txt.07Jun2010.fa (the file is no longer there; the URL returns a 403 error).

I searched for it and found this NCBI FTP directory: ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/. However, I'm not sure which file I should use for that tutorial. Can someone help me out? Thanks.

P.S. I am trying to set everything up for exome-sequencing analysis (variant calling).

10.8 years ago

FYI, I think you only need the genome reference sequence.

There are some tools that require you to provide gene coordinates, but it isn't absolutely necessary. For example, you can use ANNOVAR to annotate SNPs without having to download an independent gene table. My own variant calling pipeline is BWA --> samtools (mpileup) --> VarScan --> ANNOVAR, and the gene location information is included as part of the ANNOVAR databases.
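As a rough illustration of that pipeline, here is a minimal Python sketch that only builds and prints the shell commands for each stage; it does not run anything. Several things here are assumptions and not part of the original answer: the file names, the `VarScan.jar` and `humandb/` paths, the `hg19` build, the use of `bwa mem` (older setups may use `bwa aln`/`sampe`), and the specific VarScan and ANNOVAR options. Check each tool's documentation for your installed version before using any of these.

```python
def build_pipeline(ref, fq1, fq2, sample,
                   varscan_jar="VarScan.jar",   # hypothetical path to the VarScan jar
                   annovar_db="humandb/",       # hypothetical ANNOVAR database directory
                   build="hg19"):               # assumed genome build
    """Return the shell commands for BWA -> samtools mpileup -> VarScan -> ANNOVAR."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    pileup = f"{sample}.mpileup"
    vcf = f"{sample}.vcf"
    return [
        # 1. Align paired-end reads to the (already indexed) genome reference.
        f"bwa mem {ref} {fq1} {fq2} > {sam}",
        # 2. Sort and index the alignments.
        f"samtools sort -o {bam} {sam}",
        f"samtools index {bam}",
        # 3. Produce a pileup against the same reference.
        f"samtools mpileup -f {ref} {bam} > {pileup}",
        # 4. Call SNPs with VarScan and write VCF to stdout.
        f"java -jar {varscan_jar} mpileup2snp {pileup} --output-vcf 1 > {vcf}",
        # 5. Annotate with ANNOVAR; gene locations come from its refGene database.
        f"table_annovar.pl {vcf} {annovar_db} -buildver {build} "
        f"-out {sample} -protocol refGene -operation g -vcfinput",
    ]


# Print the commands for a hypothetical sample so they can be reviewed first.
for cmd in build_pipeline("hg19.fa", "sample_1.fq", "sample_2.fq", "sample"):
    print(cmd)
```

Note that no separate RefSeq mRNA FASTA appears anywhere in these steps: the only sequence input is the genome reference, and the gene annotation enters at the ANNOVAR step.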

The aligner website will probably provide you with a link to the indexed genome files (although the BWA website isn't working for me right now). Illumina should also provide a download with several versions of indexed files for commonly used genomes (mouse, human, etc.), but you might need an iCOM account:

https://icom.illumina.com/

You can also index your own sequence (from NCBI, UCSC, etc.), if you need to.
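If you do index a genome FASTA yourself, the commands are short. The sketch below just assembles and prints them; the file name `hg19.fa` is a placeholder, and pairing `bwa index` with `samtools faidx` is my assumption about what a BWA plus samtools workflow would need, not something stated above.

```python
import shlex


def index_commands(fasta):
    """Return the indexing commands for a genome FASTA as argument lists."""
    return [
        ["bwa", "index", fasta],       # builds the BWA index files next to the FASTA
        ["samtools", "faidx", fasta],  # writes <fasta>.fai for samtools
    ]


# Print the commands for a hypothetical reference file.
for args in index_commands("hg19.fa"):
    print(shlex.join(args))
```

Each argument list could be passed to `subprocess.run(args, check=True)` once the tools are installed and on your `PATH`.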


Oh, I just re-read your title.

To clarify, you probably don't need the RefSeq information for DNA-Seq variant calling (using the pipeline that I mentioned above).

For RNA-Seq, you probably will need a table of mRNA locations. However, I probably wouldn't recommend using BWA for RNA-Seq. You should use a splice-aware aligner such as TopHat or STAR to handle the exon junctions.

Either way, you probably really want the coordinates rather than the mRNA sequences themselves.
