The 'Bin' Column Used By SAM, UCSC...
14.0 years ago

Hi all,

Some MySQL tables at UCSC use a special column named 'bin'. For example, in http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp130.sql:

CREATE TABLE `snp130` (
  `bin` smallint(5) unsigned NOT NULL default '0',
  (...)

It is not a primary key, and it seems that this 'bin' column is also used by samtools (e.g. http://samtools.sourceforge.net/tabix.shtml).

What is that column? How is it used?

Pierre

14.0 years ago

Sorry, I found an answer to my question in http://samtools.sourcearchive.com/documentation/0.1.6~dfsg/bam__index_8c-source.html

The UCSC binning scheme was suggested by Richard Durbin and Lincoln Stein and is explained by Kent et al. (2002). In this scheme, each bin represents a contiguous genomic region which can be fully contained in another bin; each alignment is associated with the bin that represents the smallest region containing the entire alignment. The binning scheme is essentially another representation of an R-tree. A distinct bin uniquely corresponds to a distinct internal node in an R-tree. Bin A is a child of Bin B if region A is contained in B.
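
To make the "smallest region containing the entire alignment" rule concrete, here is a minimal Python sketch. It assumes the six-level BAM layout (16Kbp leaf bins, a factor of 8 between levels) and 0-based, half-open coordinates. The function name reg2bin follows the samtools source, but this port is only an illustration, not the samtools code itself.

```python
def reg2bin(beg: int, end: int) -> int:
    """Return the smallest bin that fully contains [beg, end).

    Assumes 0-based, half-open coordinates and the BAM layout:
    leaf bins of 2^14 bp, each level 8 times wider than the one below.
    """
    end -= 1  # switch to the last covered base
    # (shift, first bin number of that level), from finest to coarsest
    levels = ((14, 4681), (17, 585), (20, 73), (23, 9), (26, 1))
    for shift, offset in levels:
        # same window index for both ends means the interval fits in one bin
        if beg >> shift == end >> shift:
            return offset + (beg >> shift)
    return 0  # only the top-level 512Mbp bin contains the interval
```

For example, an interval that stays inside the first 16Kbp window lands in bin 4681, while one that crosses the 16Kbp boundary at position 16384 is pushed up to the 128Kbp level (bin 585).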

In BAM, each bin may span 2^29, 2^26, 2^23, 2^20, 2^17 or 2^14 bp. Bin 0 spans a 512Mbp region, bins 1-8 span 64Mbp, 9-72 8Mbp, 73-584 1Mbp, 585-4680 128Kbp and bins 4681-37448 span 16Kbp regions. To find the alignments that overlap a region [rbeg,rend), we calculate the list of bins that may overlap the region and test the alignments in those bins to confirm the overlaps. If the specified region is short, typically only a few alignments in six bins need to be retrieved, so the overlapping alignments can be fetched quickly.
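
The query side can be sketched the same way. The Python function below lists every bin that may hold an alignment overlapping [rbeg,rend), again assuming the BAM layout and 0-based, half-open coordinates. The name reg2bins mirrors the samtools source; the code is an illustrative port, not the original.

```python
def reg2bins(rbeg: int, rend: int) -> list[int]:
    """List all bins that may contain alignments overlapping [rbeg, rend).

    Assumes 0-based, half-open coordinates and the six-level BAM layout.
    """
    rend -= 1  # switch to the last covered base
    bins = [0]  # the 512Mbp top-level bin overlaps every query
    # (shift, first bin number of that level), from coarsest to finest
    levels = ((26, 1), (23, 9), (20, 73), (17, 585), (14, 4681))
    for shift, offset in levels:
        first = offset + (rbeg >> shift)
        last = offset + (rend >> shift)
        bins.extend(range(first, last + 1))
    return bins
```

A one-base query at position 0 returns [0, 1, 9, 73, 585, 4681], one bin per level, which matches the "six bins" mentioned above. Alignments found in those bins still have to be checked individually, since a bin only guarantees containment, not overlap with the query.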

14.0 years ago

Hello Pierre,

[?]

[?]

https://lists.soe.ucsc.edu/pipermail/genome/2010-April/021993.html

Hope this helps

thanks but it doesn't say how it works :-)

11.9 years ago

Hi Pierre, I've run into the 'bin' field in a UCSC table. I've read your blog post http://plindenbaum.blogspot.it/2010/05/binning-genome.html about it. I usually work with Perl and I'm not familiar with Java at all, so it is nearly impossible for me to translate your Java code. The only articles that explain how to handle this field give few details. Do you know of a Perl script that does the same thing? Otherwise, can you suggest a more detailed article? Thanks in advance.
