Update 'ID' column in VCF file using BCFtools annotate
7.4 years ago
mrxcm3 ▴ 80

I have a very large VCF file where the 'ID' column is a unique ID of the form 'chr:bp'. I would like to replace the values in the 'ID' column with dbSNP IDs.

I have downloaded a BED file [chr, from, to, rsid], which I have sorted and tabix indexed. The BED file is for hg19, which is correct for my data. Chromosomes are formatted with 'chr' and there is no header.

It seems that the BCFtools annotate function does allow the 'ID' column to be updated, but I am not clear how. I have tried:

i) bcftools annotate -a dbsnp.bed.gz -c 'CHROM,POS,-,ID' my.vcf.gz

ii) bcftools annotate -a dbsnp.bed.gz -c 'CHROM,FROM,TO,ID' my.vcf.gz

Neither of these updated the ID column. I also tried removing the 'ID' from the VCF first (-R) and piping the VCF into the two commands above. Perhaps this is not the right tool? Any advice appreciated.

bcftools vcftools • 20k views
7.4 years ago
William ★ 5.3k
bcftools annotate -c CHROM,FROM,TO,ID -a my_ids.bed.gz -o output.vcf input.vcf.gz

works for me.

It also took me some time to get it working, and it is hard to debug where the mistake is:

Some things to try / check:

  1. VCF is 1-based, BED is 0-based. POS 10 in VCF is start 9, end 10 in BED. https://genome.ucsc.edu/FAQ/FAQformat#format1

  2. Make sure your chromosome names match exactly. Chr1 and chr_1 are not the same for bcftools.

  3. Remove the quotes around the column list: 'CHROM,FROM,TO,ID' -> CHROM,FROM,TO,ID

  4. Test with a very small subset of your VCF and BED files that should produce an annotated VCF file. This makes it faster to debug and to try different options and formats until you get it right.
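
To illustrate point 1, here is a minimal sketch of the coordinate conversion, assuming you start from a tab-separated list of chromosome, 1-based position and rsID. The helper name pos_to_bed and the ID rs0000001 are made up for illustration and are not part of bcftools.

```shell
# Hypothetical helper: read "chrom<TAB>pos<TAB>rsid" lines (1-based pos, as in VCF)
# and write BED lines "chrom<TAB>start<TAB>end<TAB>rsid" (0-based start, exclusive end).
pos_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" } { print $1, $2 - 1, $2, $3 }'
}

# A SNP at VCF POS 10 becomes BED start 9, end 10 (rs0000001 is a made-up ID).
printf 'chr1\t10\trs0000001\n' | pos_to_bed
```

If a single-base record in your BED file has start equal to POS rather than POS minus 1, that off-by-one is a likely reason the IDs are not transferred.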

This worked. It was my chromosome names that were the issue. I thought the BED file had to use the 'chr1' format in order to be tabix indexed.

@mrxcm3,

How do you fix chromosome names?

Use bcftools annotate with the --rename-chrs parameter.
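
A minimal sketch of how that could look, assuming the VCF uses names like 1, 2, ..., X, Y, MT and the BED file uses chr1, ..., chrX, chrY, chrM (swap the two columns if yours is the other way round). --rename-chrs takes a file with one "old_name new_name" pair per line. The helper make_chr_map and all file names here are placeholders, not something from this thread.

```shell
# Hypothetical helper: print "old_name new_name" pairs, one per line,
# mapping 1..22, X, Y to chr1..chr22, chrX, chrY, and MT to chrM.
# The naming direction is an assumption; check it against your own files.
make_chr_map() {
    for c in $(seq 1 22) X Y; do
        printf '%s chr%s\n' "$c" "$c"
    done
    printf 'MT chrM\n'
}

make_chr_map > chr_map.txt

# With bcftools and tabix installed, the rename and re-index would then be
# (input.vcf.gz and renamed.vcf.gz are assumed file names):
# bcftools annotate --rename-chrs chr_map.txt -Oz -o renamed.vcf.gz input.vcf.gz
# tabix -p vcf renamed.vcf.gz
```

Contigs that are not listed in the map file should keep their original names, so it is worth comparing the contig names in the renamed VCF against the BED file before annotating.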

7.4 years ago

It might be possible with bcftools annotate, but I use SnpSift annotate for the same job. It takes a VCF file from dbSNP as the annotation source.
