Doc/tutorial UCSC MySQL
14 months ago
user31888 • 60
United States

I am trying to get the chromosomal position of a query gene in the hg19 genome by querying the UCSC MySQL server.

I cannot find any documentation or tutorial on the command syntax to use.
Any ideas?

15 months ago
Seattle, WA USA

Example:

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N -e "SELECT k.chrom, kg.txStart, kg.txEnd, x.geneSymbol FROM knownCanonical k, knownGene kg, kgXref x WHERE k.transcript = x.kgID AND k.transcript = kg.name AND x.geneSymbol LIKE 'CTCF';" > CTCF.bed

The chromosome, start and end positions of CTCF in hg19 are in the first three columns of the resulting (unsorted) BED file; the fourth column holds the gene symbol.
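For anyone unsure what the short options in the command above mean, here is a sketch of the same call written with the mysql client's long option names. The helper names `build_query` and `gene_position` are made up for illustration, and the gene symbol is pasted into the SQL without any escaping, so treat this as a starting point rather than a finished tool.

```shell
#!/usr/bin/env bash
# Sketch only: build_query and gene_position are hypothetical helper names.

# Print the SQL for one gene symbol, using the same three tables and joins
# as the one-liner above. The symbol is NOT escaped; pass trusted input only.
build_query() {
  local symbol="$1"
  printf "SELECT k.chrom, kg.txStart, kg.txEnd, x.geneSymbol FROM knownCanonical k, knownGene kg, kgXref x WHERE k.transcript = x.kgID AND k.transcript = kg.name AND x.geneSymbol LIKE '%s';" "$symbol"
}

# Run the query on the UCSC public server. Long options and their short forms:
#   --no-auto-rehash     (-A)  skip loading table/column names for tab completion
#   --database=hg19      (-D)  database (genome assembly) to use
#   --skip-column-names  (-N)  do not print the header row
#   --execute="..."      (-e)  run this statement, print the result, then exit
gene_position() {
  mysql \
    --user=genome \
    --host=genome-mysql.cse.ucsc.edu \
    --no-auto-rehash \
    --database=hg19 \
    --skip-column-names \
    --execute="$(build_query "$1")"
}
```

After sourcing the file, `gene_position CTCF > CTCF.bed` should behave like the original command. The options themselves are described in `mysql --help` and `man mysql`.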

A gene can have multiple transcripts, so you can get more than one record for a given HGNC gene name.

This result relies on three tables in the hg19 database of the UCSC Genome Browser: knownCanonical, knownGene and kgXref.

The schema of knownCanonical is located here: http://genome.ucsc.edu/goldenpath/gbdDescriptionsOld.html#KnownCanonical

The schema of knownGene is located here: http://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=genes&hgta_track=knownGene&hgta_table=knownGene&hgta_doSchema=describe+table+schema

Likewise, the schema of kgXref is located here: http://genome.ucsc.edu/goldenpath/gbdDescriptionsOld.html#KgXref

The rest is just a SQL query based on the schemas of the three tables, along with database and host parameters that are specific to UCSC.

Part of the "magic" is knowing which tables and fields to use. This comes from experience with the Genome Browser and from exploring the links to schemas that are usually available from the table description pages on the Genome Browser site. When that information is hard to find or seems to be unavailable, it also helps to search discussion threads and to ask on the UCSC mailing lists directly.
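Besides the schema pages, the server itself can be asked which tables exist and what their columns are, using the standard MySQL statements `SHOW TABLES` and `DESCRIBE`. A minimal sketch follows; the helper names `list_tables_sql` and `describe_sql` are hypothetical and only print SQL, which you then pipe to the mysql client.

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are made up for this example.

# Print a statement listing tables whose names match a SQL LIKE pattern.
list_tables_sql() {
  printf "SHOW TABLES LIKE '%s';\n" "$1"
}

# Print a statement showing the columns and types of one table.
describe_sql() {
  printf "DESCRIBE %s;\n" "$1"
}

# Usage (needs network access to the UCSC public server):
#   list_tables_sql 'known%' | mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19
#   describe_sql knownGene   | mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19
```

The column names printed by `DESCRIBE` are the ones you would use in the SELECT and WHERE clauses of a query like the CTCF example.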


Thanks Alex !

Do you know where I could find documentation about the syntax of the command you used (especially the '-e' argument)?


Please see the edit.


Thanks Alex for the links!

But the command doesn't work for me (it runs indefinitely).


Works for me. Maybe their server is slow at the moment? You might check with the UCSC Genome Browser mailing list.
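If the command appears to hang, one way to tell a slow server from a blocked connection is to put a time limit on it. The sketch below assumes GNU coreutils `timeout` is installed; `run_with_limit` is a hypothetical wrapper name, and the guess about a blocked port (3306 is MySQL's default) is only one possible cause.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_limit is a made-up name; requires coreutils `timeout`.

# Run a command, but give up after the given number of seconds.
# Returns the command's own exit status, or 124 if the time limit was hit.
run_with_limit() {
  local limit="$1"
  shift
  local status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "gave up after ${limit}s; the server may be slow, or outbound port 3306 may be blocked" >&2
  fi
  return "$status"
}

# Usage (needs network access): a trivial query with a 10 s connection
# timeout inside mysql and a 60 s overall limit around it.
#   run_with_limit 60 mysql --user=genome --host=genome-mysql.cse.ucsc.edu \
#     --connect-timeout=10 -A -D hg19 -N -e "SELECT 1;"
```

If even `SELECT 1;` times out, the problem is likely the connection rather than the gene query itself.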
