Biostar Beta. Not for public use.
Question: Extracting chromosomal locations by ENTREZID

I feel like I'm missing something trivial, but how can I automatically extract the most complete information on the chromosomal locations of genes from their ENTREZID (preferably in R)? I mainly have problems with uncharacterized loci, and occasionally with some genes as well. The typical approaches in R (e.g. extracting the information from org.Hs.eg.db, v. 3.7.0) return NA for those loci. One example is this gene: https://www.ncbi.nlm.nih.gov/gene/?term=LOC105370787.

14 months ago aln • 290 • updated 9 months ago Biostar 20

Not a solution in R, but you can use the NCBI Entrez Direct command-line utilities to get this information.

$ efetch -db gene -id LOC105370787 

1. LOC105370787
uncharacterized LOC105370787 [Homo sapiens (human)]
Chromosome: 15; Location: 15q15.1
Annotation: Chromosome 15 NC_000015.10 (40075943..40083225, complement)
ID: 105370787
14 months ago genomax 68k

Thanks, at least that sounds much more automatic than googling manually :) But why is this information missing from the org.db package when the identifiers for the genes in question are all present? Is it because the package has not been updated yet?

14 months ago aln • 290

If you want tab-delimited output that can be easily imported into R, you can use xtract, another tool from the Entrez Direct package, as follows:

esearch -db gene -query LOC105370787 | esummary | xtract -pattern DocumentSummary -element Id,Name -group GenomicInfoType -element ChrAccVer,ChrStart,ChrStop
105370787       LOC105370787    NC_000015.10    40083224        40075942

There's additional information in the XML output of esummary that may be of interest to you.
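
Note that the ChrStart and ChrStop values above are each one less than the coordinates in the efetch annotation quoted earlier in this thread (40075943..40083225, complement), and that the start is larger than the stop. A minimal sketch of a post-processing step, assuming these values are 0-based and that start > stop marks the complement strand (both inferred from this single example, not from documentation); `to_one_based` is a hypothetical helper name:

```shell
# Convert esummary-style coordinates to 1-based and infer the strand.
# Assumed input columns (tab-separated): Id, Name, ChrAccVer, ChrStart, ChrStop
# Assumption: ChrStart/ChrStop are 0-based; start > stop means complement strand.
to_one_based() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
  {
    start = $4 + 1; stop = $5 + 1      # shift 0-based values to 1-based
    if (start > stop) {                # reversed order: complement strand
      strand = "-"; lo = stop; hi = start
    } else {
      strand = "+"; lo = start; hi = stop
    }
    print $1, $2, $3, lo, hi, strand
  }'
}

# Sample row copied from the xtract output above
printf '105370787\tLOC105370787\tNC_000015.10\t40083224\t40075942\n' | to_one_based
```

In practice the function would sit at the end of the esearch | esummary | xtract pipeline, and the resulting table can be read into R with read.delim.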

14 months ago vkkodali ♦ 1.1k

Thanks a lot! Very convenient.

14 months ago aln • 290

I've also found that if you need only the chromosome and band, you can extract them from the NCBI Homo_sapiens.gene_info.gz file, which is tab-separated.
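
A rough sketch of that lookup with awk, assuming the usual gene_info column order (tax_id, GeneID, Symbol, LocusTag, Synonyms, dbXrefs, chromosome, map_location, ...) and a header line starting with `#`. The function name `lookup_gene_info` is made up for illustration, and the demo row is a shortened excerpt built from the values quoted in this thread, not a verbatim line from the file:

```shell
# Print GeneID, Symbol, chromosome and map_location for one GeneID.
# Assumption: columns 2, 3, 7 and 8 hold GeneID, Symbol, chromosome
# and map_location, as in the standard gene_info layout.
lookup_gene_info() {
  gene_id="$1"
  awk -F'\t' -v id="$gene_id" 'BEGIN { OFS = "\t" }
    /^#/ { next }                      # skip the header line
    $2 == id { print $2, $3, $7, $8 }'
}

# Typical use on the real file:
#   gzip -dc Homo_sapiens.gene_info.gz | lookup_gene_info 105370787

# Self-contained demo on a shortened, illustrative excerpt (first 8 columns only)
printf '#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\tchromosome\tmap_location\n9606\t105370787\tLOC105370787\t-\t-\t-\t15\t15q15.1\n' | lookup_gene_info 105370787
```

The same file can also be read directly in R and filtered on the GeneID column, which avoids the shell step entirely.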

14 months ago aln • 290
