Chromosome Position From UCSC Genome Browser
17 months ago
Gjain 5.3k
Göttingen, Germany

Hi all,

I am looking for the coordinates of all the chromosomes of a particular species from the UCSC Genome Browser.

For example, in hg19:

chr1:1-249,250,621
chr2:1-243,199,373
.
.
.
chrX:1-155,270,560

Is there any way to get this list, say for human (hg19 or hg18) or mouse (mm9 or mm8)?

Thanks for your help.


adzpka azdopi, azdazd azdpkpok azdl azd zefpi,ẑepofioif, zeofpzoa,efoi,pẑop zefopi;oi,zoefo azd

azd


$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,RAND(),RAND() from refFlat limit 4'
+-------+--------------------+------------------+
| chrom | RAND()             | RAND()           |
+-------+--------------------+------------------+
| chr1  |   0.81548497994941 | 0.25082845264192 |
| chr1  |   0.80768735409499 | 0.28595144328284 |
| chr1  | 0.0066951848628886 | 0.17562162519956 |
| chr1  |   0.85802260214279 |  0.7632481658489 |
+-------+--------------------+------------------+

hahah now I understand those random strings :)


Thanks Pierre, but I was just looking to find the chromosome start and end.


Do you know what kind of features it describes (genes, ...)?


Just a BED file of chromosome coordinates. Basically: chrom, start, end.

16 months ago
France/Nantes/Institut du Thorax - INSE…

The sizes of the chromosomes are stored in a table named "chromInfo" for each build/organism. Is this the information you are looking for?

E.g. for chromosome "chr1":

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A  -e 'select "hg18" as build,size from hg18.chromInfo where chrom="chr1" union select "hg19",size from hg19.chromInfo where chrom="chr1"  union select "mm9",size from mm9.chromInfo where chrom="chr1"'
+-------+-----------+
| build | size      |
+-------+-----------+
| hg18  | 247249719 |
| hg19  | 249250621 |
| mm9   | 197195432 |
+-------+-----------+

Not exactly, but using your answer I got mine. Thanks

15 months ago
Deutschland

Just another possible way of getting the chromosome sizes:

fetchChromSizes hg18 | perl -ane 'print "$F[0]:1-$F[1]\n";' > hg18.chromSizes

You can download the fetchChromSizes tool at UCSC.


Thanks. I did not know about this tool.

17 months ago
Gjain 5.3k
Göttingen, Germany

Thanks Pierre, I used your solution and tweaked it a bit to find what I was looking for.

 mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from hg19.chromInfo order by chrom"
+-----------------------+-----------+
| chrom                 | size      |
+-----------------------+-----------+
| chr1                  | 249250621 |
| chr10                 | 135534747 |
| chr11                 | 135006516 |

Then I can just convert them to chrom, start, end, where start is always 0 and end is the size.

So in the end I have:

chr1:0-249250621
chr10:0-135534747
chr11:0-135006516
.
.
.
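
The conversion step described above can be sketched with awk. This is only a minimal sketch: it assumes the chrom/size pairs arrive as tab-separated lines (for example from the mysql client run with -B --skip-column-names), and the function name sizes_to_coords is made up for illustration. The start value is passed as an argument, so the same helper can print 0 as in the list above or 1 for browser-style positions.

```shell
# Hypothetical helper: read "chrom<TAB>size" lines on stdin and print
# chrom:start-end strings. $1 is the start value to print (0 or 1).
sizes_to_coords() {
    awk -v start="$1" 'BEGIN { FS = "\t" } NF >= 2 { printf "%s:%s-%s\n", $1, start, $2 }'
}

# Sample input copied from the chromInfo output above.
printf 'chr1\t249250621\nchr10\t135534747\nchr11\t135006516\n' | sizes_to_coords 0
```

Piping the live query output into sizes_to_coords instead of the printf sample should give the full list for a build.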

Just an update:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size, CONCAT(chrom,':',0,'-',size) as coords from hg19.chromInfo order by chrom"
+-----------------------+-----------+--------------------------------+
| chrom                 | size      | coords                         |
+-----------------------+-----------+--------------------------------+
| chr1                  | 249250621 | chr1:0-249250621               |
| chr10                 | 135534747 | chr10:0-135534747              |
| chr11                 | 135006516 | chr11:0-135006516              |

Or, if you only need the main chromosomes and want to save them in a tab-separated file:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -B --skip-column-names -e "select chrom, 0, size as coords  from hg19.chromInfo where chrom NOT LIKE 'chr___%' and chrom NOT LIKE 'chrUn_%';" > hg19.genome
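
If the chrom/size list is already on disk (for example from fetchChromSizes), roughly the same filtering can be done offline. The sketch below assumes tab-separated chrom/size input. The function name main_chroms_bed is hypothetical, and the regular expression is only an approximation of the chr___% filter in the query above: it keeps names made of "chr" plus one or two characters.

```shell
# Hypothetical helper: keep only short names (chr + 1 or 2 characters,
# e.g. chr1, chr10, chrX, chrM) and print BED-style lines:
# chrom, 0-based start, end (= chromosome size).
main_chroms_bed() {
    awk 'BEGIN { FS = OFS = "\t" } $1 ~ /^chr..?$/ { print $1, 0, $2 }'
}

# Sample input; the size on the chrUn_ line is a placeholder, not a real value.
printf 'chr1\t249250621\nchrUn_example\t12345\nchrX\t155270560\n' | main_chroms_bed
```

One caveat, as far as I know: bedtools options that take a genome file (-g) expect the plain two-column chrom/size layout, so for those the unfiltered or filtered chrom/size list is the thing to pass, while the three-column output here is an ordinary BED file.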

Hello everyone! I was just wondering: supposing you want to use this file later with, for example, bedtools, do the start positions need to be listed as 0 instead of 1? Thanks!


do the start positions need to be listed as 0 instead of 1?

yes


Yes, you are correct. UCSC stores coordinates internally with a 0-based start; only the browser display uses a 1-based start.

For more details, see the UCSC Genome Browser FAQ entry "Database/browser start coordinates differ by 1 base":

I am confused about the start coordinates for items in the refGene table. It looks like you need to add "1" to the starting point in order to get the same start coordinate as is shown by the Genome Browser. Why is this the case?
Our internal database representations of coordinates always have a zero-based start and a one-based end. We add 1 to the start before displaying coordinates in the Genome Browser. Therefore, they appear as one-based start, one-based end in the graphical display. The refGene.txt file is a database file, and consequently is based on the internal representation.

We use this particular internal representation because it simplifies coordinate arithmetic, i.e. it eliminates the need to add or subtract 1 at every step. If you use a database dump file but would prefer to see the one-based start coordinates, you will always need to add 1 to each start coordinate.

If you submit data to the browser in position format (chr#:##-##), the browser assumes this information is 1-based. If you submit data in any other format (BED (chr# ## ##) or otherwise), the browser will assume it is 0-based. You can see this both in our liftOver utility and in our search bar, by entering the same numbers in position or BED format and observing the results. Similarly, any data returned by the browser in position format is 1-based, while data returned in BED format is 0-based.

For a detailed explanation, please see our blog entry for the UCSC Genome Browser coordinate counting systems.
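
To make the off-by-one rule in the FAQ text concrete, here is a small sketch of converting between position format (1-based start) and BED (0-based start). The function names pos_to_bed and bed_to_pos are made up for illustration, and the sketch assumes chromosome names contain neither ":" nor "-". Only the start changes; the end stays the same in both systems.

```shell
# Hypothetical helper: chr1:1-249,250,621  ->  chr1<TAB>0<TAB>249250621
# Thousands separators are stripped, then 1 is subtracted from the start.
pos_to_bed() {
    printf '%s\n' "$1" | tr -d ',' |
        awk -F'[-:]' '{ printf "%s\t%d\t%d\n", $1, $2 - 1, $3 }'
}

# Hypothetical helper: BED lines on stdin  ->  chrom:start-end with 1 added to the start.
bed_to_pos() {
    awk 'BEGIN { FS = "\t" } { printf "%s:%d-%d\n", $1, $2 + 1, $3 }'
}

pos_to_bed 'chr1:1-249,250,621'
printf 'chrX\t0\t155270560\n' | bed_to_pos
```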