bcftools merge deleting all genotypes from second file
6.1 years ago

Hello All,

I'm trying to merge two vcf.gz files and I'm running into strange behavior with bcftools merge. All of the positions in my second file are being set to missing (./.) during the merge. Does anyone have any tips for how I might fix this problem?

Here is my command:

bcftools merge -O v -m file1.vcf.gz file2.vcf.gz > out.vcf

Thanks for your help!


> All of the positions in my second file are being set to missing (./.) during the merge

All? Can you confirm this? Is there any position that shouldn't be set to './.' (unknown)? See also: https://github.com/samtools/bcftools/issues/402


> All? Can you confirm this?

Thanks for the suggestion, I can confirm this. I looked closer at the output: while both input files contain about 85K loci, the output contains 171K. What is happening is that the two files are being concatenated: the file2 genotypes are set to unknown in the top half, and the file1 genotypes are set to unknown in the bottom half.
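An output of roughly twice the input size suggests that no records were matched between the two files. A minimal sketch for checking this, assuming records are paired on the CHROM and POS columns: it counts how many CHROM:POS keys the two files share, using only standard Unix tools. The function names `site_keys` and `shared_sites` are made up for this illustration; the file names are the ones from the question.

```shell
# Print one sorted, de-duplicated "CHROM:POS" key per record,
# skipping the header lines that start with '#'.
site_keys() {
    gzip -dc "$1" | grep -v '^#' | cut -f1,2 | tr '\t' ':' | LC_ALL=C sort -u
}

# Count the keys that occur in both files. Each key list is already
# unique, so any key seen twice in the combined stream is shared.
shared_sites() {
    { site_keys "$1"; site_keys "$2"; } | LC_ALL=C sort | uniq -d | wc -l | tr -d ' '
}

# Example usage (uncomment to run on the real files):
# shared_sites file1.vcf.gz file2.vcf.gz
# site_keys file1.vcf.gz | head -n 3
# site_keys file2.vcf.gz | head -n 3
```

If the count is 0, comparing the first few keys from each file may show why, for example chromosome names written as `chr1` in one file and `1` in the other. That is only one possible cause, not a confirmed diagnosis for these files.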

I thought perhaps the problem was that the ID column in file2 is set to '.' for every position, whereas in file1 the ID column is complete and reads CHR_POS. I added the ID field to file2 to see if that was the problem. Unfortunately, that did not fix it, so I'm back to square one. Do you know which field bcftools uses to merge? My files are not in the same order. Perhaps that's the problem?

thx.


> My files are not in the same order. Perhaps that's the problem?

It's always better to work with sorted files, so you should give that a try. The other thing I see is that you are using -m in your command, but its argument is missing, isn't it? The option expects one of these values:

-m none   ..  no new multiallelics, output multiple records instead
-m snps   ..  allow multiallelic SNP records
-m indels ..  allow multiallelic indel records
-m both   ..  both SNP and indel records can be multiallelic
-m all    ..  SNP records can be merged with indel records
-m id     ..  merge by ID
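Putting both suggestions together, a possible sketch: sort and index each input, then merge with an explicit value for -m. The wrapper function `merge_two` and the `.sorted.vcf.gz` output names are made up for this illustration, and whether sorting alone resolves the problem in the question is untested.

```shell
# Sort, index, and merge two bgzipped VCFs with an explicit -m value.
# Usage: merge_two MODE IN1 IN2 OUT
#   MODE is one of the -m values listed above (none, snps, indels, both, all, id).
merge_two() {
    mode=$1; in1=$2; in2=$3; out=$4
    sorted1="${in1%.vcf.gz}.sorted.vcf.gz"
    sorted2="${in2%.vcf.gz}.sorted.vcf.gz"

    # Sort each input by chromosome and position, writing bgzipped VCF.
    bcftools sort -O z -o "$sorted1" "$in1" || return 1
    bcftools sort -O z -o "$sorted2" "$in2" || return 1

    # bcftools merge needs indexed inputs; -t writes a .tbi index.
    bcftools index -t "$sorted1" || return 1
    bcftools index -t "$sorted2" || return 1

    # The value for -m comes right after the flag, before the file names.
    bcftools merge -m "$mode" -O v -o "$out" "$sorted1" "$sorted2"
}

# Example usage (uncomment to run on the real files):
# merge_two none file1.vcf.gz file2.vcf.gz out.vcf
```

The value `none` in the usage line is only an example. Leaving -m out entirely falls back to the default, which is `both`.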

fin swimmer
