Is there a standard way to index a BAM Index (BAI)?
9.5 years ago
danvdk ▴ 80

A BAM index (BAI) file lets you map loci on the genome to a range of byte offsets in a BAM file. It's essential for browsing large pileups in interactive visualizations like IGV and BioDalliance.
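To make the locus-to-offset mapping concrete: the BAI uses a binning scheme in which every genomic interval is assigned to the smallest bin that fully contains it, and the index stores chunks of BAM offsets per bin. Below is a minimal Python sketch of that bin calculation, transcribed from the `reg2bin` routine in the SAM/BAM specification. It assumes the standard BAI parameters (16 kbp smallest bins, 512 Mbp maximum reference length) and zero-based, half-open coordinates.

```python
def reg2bin(beg, end):
    """Return the BAI bin number for the zero-based, half-open interval [beg, end)."""
    end -= 1  # work with the last base actually covered
    if beg >> 14 == end >> 14:
        return 4681 + (beg >> 14)   # 16 kbp bins
    if beg >> 17 == end >> 17:
        return 585 + (beg >> 17)    # 128 kbp bins
    if beg >> 20 == end >> 20:
        return 73 + (beg >> 20)     # 1 Mbp bins
    if beg >> 23 == end >> 23:
        return 9 + (beg >> 23)      # 8 Mbp bins
    if beg >> 26 == end >> 26:
        return 1 + (beg >> 26)      # 64 Mbp bins
    return 0                        # whole-reference bin
```

A viewer looks up the bins overlapping the visible region, then reads only the BAM byte ranges listed for those bins.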

But at some point, the BAM file grows so large that its corresponding BAI file also gets unwieldy. For example, I have an 80 GB BAM file with a 9 MB BAI file. Loading this file over even a relatively fast network takes many seconds, far longer than users of modern web pages are accustomed to waiting.

One solution to this problem would be to only load portions of the BAI file. For example, if I'm looking at chr20, there's no need to download the portions of the BAI file that deal with the other chromosomes. The BAI format doesn't lend itself well to random seeking, however, so this would require some kind of index.
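To illustrate what such an index could look like, here is a minimal Python sketch that walks a BAI file once and records the byte range occupied by each reference sequence. It assumes the BAI layout described in the SAM/BAM specification (magic `BAI\1`, then per reference a list of bins with chunks, followed by a linear index). The function name `index_bai` is hypothetical and this is not presented as any existing tool's implementation.

```python
import struct

def index_bai(data):
    """Return a list of (start, end) byte offsets into the BAI, one per reference."""
    if data[:4] != b"BAI\x01":
        raise ValueError("not a BAI file")
    (n_ref,) = struct.unpack_from("<i", data, 4)
    pos = 8
    ranges = []
    for _ in range(n_ref):
        start = pos
        (n_bin,) = struct.unpack_from("<i", data, pos)
        pos += 4
        for _ in range(n_bin):
            # Each bin: uint32 bin number, int32 chunk count, then 16 bytes per chunk.
            _bin, n_chunk = struct.unpack_from("<Ii", data, pos)
            pos += 8 + 16 * n_chunk
        # Linear index: int32 count, then one uint64 offset per 16 kbp window.
        (n_intv,) = struct.unpack_from("<i", data, pos)
        pos += 4 + 8 * n_intv
        ranges.append((start, pos))
    return ranges
```

With these ranges stored in a small side file, a client could fetch only the slice for the chromosome being viewed (for example with an HTTP Range request) instead of the whole BAI.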

Is there a standard way to index a BAM Index file?

bai alignment bam • 4.6k views

I wound up implementing a BAI indexer in Python (bai-indexer) and added support for this to BioDalliance.


There's not, though that's probably not a bad idea. You might propose something on the samtools devel email list.


Just making the observation that when we need to index the index file, something is evolving the wrong way.

like the meme goes ... I've indexed your index file so you can be indexing while indexing ...


Perhaps the answer is the CRAM file format that was added to samtools:

CRAM goes mainline
