Quickest Way To Get Human Gene Symbols From Refseq Build 37
2
3.4 years ago
Paris, France

Hello,

I was wondering what the quickest way is to get a list of the human gene symbols from RefSeq build 37. Thanks in advance for your suggestions.

Fred

7
11 months ago
France/Nantes/Institut du Thorax - INSE…
 # Column 13 of refGene.txt (name2) holds the gene symbol; cut splits on tabs by default.
 curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz" |\
   gunzip -c | cut -f 13 |\
   sort -u
3

Why not "sort -u" instead of "uniq | sort | uniq"? http://unixhelp.ed.ac.uk/CGI/man-cgi?sort

0

You're right!

0

Obviously, a single command is much quicker than a few clicks in a web browser.

0

Thanks a lot, Pierre. In the meantime I was looking through the NCBI FTP directory without finding a nice tab-delimited file that would fit my needs. Very sincerely, Fred

2
10 months ago
deanna.church ♦ 1.1k
Bethesda, MD

RefSeq and Gene work with HGNC to get correct gene nomenclature on the NCBI annotation. NCBI is now making GFF files for each annotation run (current run is annotation run 104). You can find the files here: ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/GFF/

The name attribute on the 'gene' lines is the HGNC name, if one exists. If not, it will typically be a 'LOC' designator that is used as a placeholder until HGNC can name it.
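A minimal sketch of how the gene names could be pulled out of one of those GFF files. It assumes the attribute is spelled `Name=` in column 9 and that gene features carry `gene` in column 3; the function name `gff_gene_names` and the file name in the usage comment are hypothetical placeholders, not anything NCBI provides.

```shell
# Hypothetical helper: read GFF3 on stdin, print the unique Name= values
# found on 'gene' lines (assumed layout: type in column 3, attributes in column 9).
gff_gene_names() {
  awk -F '\t' '$3 == "gene" {
    n = split($9, attrs, ";")
    for (i = 1; i <= n; i++)
      if (attrs[i] ~ /^Name=/) print substr(attrs[i], 6)
  }' | sort -u
}

# Usage (annotation.gff3.gz is a placeholder for a file fetched from the FTP directory):
# gunzip -c annotation.gff3.gz | gff_gene_names > gene_symbols.txt
```

Names starting with `LOC` in the output would be the placeholders mentioned above, so they can be filtered out with `grep -v '^LOC'` if only HGNC-approved symbols are wanted.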

1
11 months ago
Santiago de Compostela, Spain

When it comes to gene nomenclature, the source I trust most is the HUGO Gene Nomenclature Committee (HGNC), which maintains an always up-to-date gene list, although you may find more specific information in their downloads section.

But anyway, if I had to look for a plain list of all current gene symbols, I would go to BioMart, select the latest gene database available (currently Ensembl Genes 69), apply no filters, and select only "associated gene name" in the attributes section.

0
10 months ago
Canada

You can get this from the UCSC Table Browser. Select the genome version and RefSeq Genes as the track. This will give you a table with RefSeq IDs and gene names.
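The Table Browser is a web form, but a similar two-column table can probably be approximated on the command line from the refGene dump linked earlier in this thread. This is only a sketch: it assumes the refGene.txt layout with the RefSeq accession in column 2 and the gene symbol in column 13, and `refseq_to_symbol` is a made-up function name.

```shell
# Hypothetical helper: read refGene-format rows on stdin and print unique
# "accession<TAB>symbol" pairs (assumed to be columns 2 and 13).
refseq_to_symbol() {
  cut -f 2,13 | sort -u
}

# Usage with the hg19 refGene dump (the output file name is a placeholder):
# curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz" |
#   gunzip -c | refseq_to_symbol > refseq_symbols.tsv
```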
