get chr, chrStart, chrEnd, GeneSymbol, strand from UCSC using mysql
1
0
Entering edit mode
9.1 years ago
jfertaj ▴ 110

Hi all,

I am trying to retrieve chrom, chromStart, chromEnd, HUGO symbol, and strand from UCSC genome browser using a mysql query. However I am still a newbie using mysql and end up with an error about syntax. I have been using an answer posted by Pierre Lindenbaum here

My sql query looks like that:

`select distinct
C.chrom,
C.chromStart,
C.chromEnd,
X.geneSymbol
K.strand
from
knownCanonical as C,
knownGene as K,
kgXref as X

where

C.chrom=C.chrom and
C.chromStart=C.chromStart and
C.chromEnd = C.chromEnd and
X.geneSymbol=X.geneSymbol and
K.strand=K.strand

Many thanks in advance

ucsc mysql genome • 2.3k views
ADD COMMENT
2
Entering edit mode
9.1 years ago

X.geneSymbol should be X.geneSymbol,

ADD COMMENT
0
Entering edit mode

Thanks Devon, but still get the same error:

ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '`select distinct
C.chrom,
C.chromStart,
C.chromEnd,
X.geneSymbol,
K.strand
from
' at line 1
ADD REPLY
0
Entering edit mode

Oh, you're literally using the `s? I had assumed that those were just to mark the beginning/end of the query in your post. Get rid of those and put a semicolon (;) at the end of the command.

ADD REPLY
0
Entering edit mode

Just to make this explicit, the command should be:

select distinct C.chrom,C.chromStart,C.chromEnd,X.geneSymbol,K.strand from knownCanonical as C, knownGene as K, kgXref as X where C.chrom=C.chrom and C.chromStart=C.chromStart and C.chromEnd = C.chromEnd and X.geneSymbol=X.geneSymbol and K.strand=K.strand;

for mm10, you get results like:

+-------+------------+----------+---------------+--------+
| chrom | chromStart | chromEnd | geneSymbol    | strand |
+-------+------------+----------+---------------+--------+
| chr1  |    3214481 |  3671498 | 0610005C13Rik | -      |
| chr1  |    3648310 |  3658904 | 0610005C13Rik | -      |
| chr1  |    4343506 |  4360314 | 0610005C13Rik | -      |
| chr1  |    4490927 |  4497354 | 0610005C13Rik | -      |
| chr1  |    4773199 |  4785726 | 0610005C13Rik | -      |
| chr1  |    4807892 |  4846735 | 0610005C13Rik | -      |
| chr1  |    4857693 |  4897909 | 0610005C13Rik | -      |
ADD REPLY
1
Entering edit mode

And if the OP wants the output without the vertical lines, |, all he/she has to do is to add a -B to the query. Like this:

mysql -h  genome-mysql.cse.ucsc.edu -A -u genome -D hg19 -e 'select distinct G.chrom,G.txStart,G.txEnd,G.strand,X.geneSymbol from knownGene as G, kgXref as X where X.geneSymbol in ("LPAR1", "HIPK3", "ASXL1", "KIAA0182", "FOXN2", "ZFY", "MYO9B", "ANRIL 14-5") and X.kgId=G.name' -B
ADD REPLY

Login before adding your answer.

Traffic: 2643 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6