Block break from gVCF files
2.1 years ago
win • 810

Hi all.

I am generating a gVCF file with Isaac Variant Caller. It reports all the non-variant sites as blocks, and I want to break those blocks into one line per site.

There is a way to do this using a BED file, so my question is: where can I find a BED file covering the entire genome, or can this be done without a BED file?

Any help will be highly appreciated.

gVCF • 1.5k views

Is this for the whole human genome? Breaking a whole-genome gVCF into a per-nucleotide VCF would result in a very large file.

16 months ago
rbagnall ♦ 1.4k

A BED file for the entire human genome (hg19) looks like this:

chr1 1 249250621
chr2 1 243199373
chr3 1 198022430
chr4 1 191154276
chr5 1 180915260
chr6 1 171115067
chr7 1 159138663
chr8 1 146364022
chr9 1 141213431
chr10 1 135534747
chr11 1 135006516
chr12 1 133851895
chr13 1 115169878
chr14 1 107349540
chr15 1 102531392
chr16 1 90354753
chr17 1 81195210
chr18 1 78077248
chr19 1 59128983
chr20 1 63025520
chr21 1 48129895
chr22 1 51304566
chrX 1 155270560
chrY 1 59373566
chrM 1 16571

You can break up the blocks of a gVCF file using the break_blocks utility of gvcftools:
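To illustrate what breaking a block means, here is a minimal Python sketch. It is not the gvcftools implementation and does not reproduce its options; it only assumes that a non-variant block is a VCF record whose INFO column carries an `END=` entry. The function name `break_block`, the `ref_base_at` lookup, and the example record are hypothetical. Without a reference lookup the sketch writes `N` as the REF base for every position after the first, since the block itself only stores the first base.

```python
def break_block(vcf_line, ref_base_at=None):
    """Expand one gVCF record into one record per covered position.

    ref_base_at is an optional, hypothetical callable (chrom, pos) -> base.
    Records without an END= entry in INFO are returned unchanged.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    info_entries = fields[7].split(";")
    end_values = [e[4:] for e in info_entries if e.startswith("END=")]
    if not end_values:
        return ["\t".join(fields)]
    end = int(end_values[0])
    # Drop END= because each output record covers a single position.
    other = [e for e in info_entries if not e.startswith("END=")]
    fields[7] = ";".join(other) or "."
    records = []
    for p in range(pos, end + 1):
        rec = list(fields)
        rec[1] = str(p)
        if p > pos:
            # The block only stores the REF base of its first position.
            rec[3] = ref_base_at(chrom, p) if ref_base_at else "N"
        records.append("\t".join(rec))
    return records


# Made-up block covering chr1:100-102.
example_block = "chr1\t100\t.\tA\t.\t.\tPASS\tEND=102\tGT:DP\t0/0:10"
for record in break_block(example_block):
    print(record)
```

For real data a dedicated tool is the better choice, since it also takes care of the header and of reference bases.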

Thank you. Could you please share how this was generated?

I use BWA to align NGS data to the human genome. The reference genome has to be indexed before alignment, and the FASTA index (.fai file, as produced by samtools faidx) gives the nucleotide length of each chromosome in its second column.

The nucleotide chromosome lengths are also found here:

Select the Assembly Statistics tab and then the Primary Assembly tab. The Assembled molecule value for each chromosome gives its nucleotide length.
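
The .fai route described above can be sketched in a few lines of Python. This is only an illustration: the function name `fai_to_bed_lines` and the example index lines are made up, and the offset columns in the example are placeholders. It assumes the usual tab-separated .fai layout with the sequence name in column 1 and its length in column 2. The start column defaults to 1 to match the listing above; BED intervals are conventionally 0-based, so pass `start=0` if your tool expects that.

```python
def fai_to_bed_lines(fai_lines, start=1):
    """Yield one 'chrom<TAB>start<TAB>length' line per .fai record."""
    for line in fai_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        yield f"{name}\t{start}\t{length}"


# Hypothetical .fai content; only the first two columns matter here.
example_fai = [
    "chr1\t249250621\t6\t50\t51\n",
    "chrM\t16571\t100\t50\t51\n",
]
for bed_line in fai_to_bed_lines(example_fai):
    print(bed_line)
```

To use it on a real index, open the .fai file and pass the file object in place of `example_fai`.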

