7.6 years ago
bioguy24
Is there a source, or a way, to get current transcript versions from a list of NCBI gene IDs or gene symbols? I have tried LRG_RefSeqGene, and that seems to work, but only for ~3600 of my 4700 genes.
For example, my data is tab-delimited, and using awk I can search LRG for a match, but not all of the genes are found. Maybe it is better to use the NCBI ID, but I am not sure. Thank you :).
NCBI ID Symbol
2 A2M
53947 A4GALT
51146 A4GNT
8086 AAAS
13 AADAC
Can the UCSC MySQL server be used to return the NCBI ID as well as the transcript? For example:
mysql --user=genomep --password=password --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select distinct refGene.name,gbCdnaInfo.version from refGene,gbCdnaInfo WHERE refGene.name=gbCdnaInfo.acc' > refseq_version.txt
refseq_version.txt
name version
NM_000014 4
NM_000015 2
NM_000016 5
Can NCBI ID be added to this list?
You should be able to find this in the gene2accession file found here. UCSC also appears to work, as you have shown above.
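As a rough illustration of the gene2accession route, here is a minimal Python sketch. It assumes the file is tab-delimited with a header line naming the columns `tax_id`, `GeneID`, and `RNA_nucleotide_accession.version` (check this against the header of the copy you download), and that missing values are written as `-`. The function name, the human taxonomy filter (9606), and the restriction to `NM_`/`NR_` accessions are choices made for this example, not part of the file itself.

```python
def transcripts_by_gene(lines, gene_ids, tax_id="9606", prefixes=("NM_", "NR_")):
    """Map each requested GeneID to a sorted list of transcript accession.versions.

    `lines` is any iterable of text lines from gene2accession, header first.
    Column names are looked up in the header (assumed layout, verify locally).
    """
    lines = iter(lines)
    header = next(lines).rstrip("\n").lstrip("#").split("\t")
    col = {name: i for i, name in enumerate(header)}
    tax_col = col["tax_id"]
    gene_col = col["GeneID"]
    rna_col = col["RNA_nucleotide_accession.version"]

    wanted = {str(g) for g in gene_ids}
    found = {}
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if row[tax_col] != tax_id or row[gene_col] not in wanted:
            continue
        acc = row[rna_col]
        # Skip rows with no transcript ("-") or with other accession types.
        if acc == "-" or not acc.startswith(prefixes):
            continue
        found.setdefault(row[gene_col], set()).add(acc)
    return {gene: sorted(accs) for gene, accs in found.items()}

# Hypothetical usage, assuming gene2accession.gz has been downloaded locally:
#   import gzip
#   with gzip.open("gene2accession.gz", "rt") as fh:
#       result = transcripts_by_gene(fh, [2, 53947, 51146, 8086, 13])
#   for gene, accs in result.items():
#       print(gene, ",".join(accs), sep="\t")
```

Matching on the NCBI ID rather than the symbol avoids misses caused by symbol aliases or renamed genes. Genes absent from the returned dictionary have no matching transcript rows in the file.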
Thank you very much :).