Compare and merge VCF files
5.4 years ago
qwzhang0601 ▴ 80

We have SNP array data and whole-exome sequencing based SNP calls for the same group of samples.

We now have genotype data in VCF format from both techniques. The samples are the same, but the SNP lists differ, with some SNPs present in both data sets.

We want to merge the genotype VCF files from the array and the whole-exome sequencing data, and wonder whether there are tools that can do this for us (e.g., vcftools). In our case, about 26k SNPs were genotyped by both the array and the sequencing data. For those overlapping SNPs, I expect that some genotypes were called differently by the two techniques for certain SNPs and individuals, so I am also concerned about how to handle the inconsistent genotype calls when merging the two VCF files.

Thanks.

VCF genotype • 2.3k views

VCFtools has vcf-compare and vcf-merge. BCFtools has bcftools stats and bcftools merge. Both should do what you want.
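To get a first impression of how often the two techniques disagree before merging, a rough awk sketch like the one below may help. It is only an illustration and not a replacement for vcf-compare or bcftools stats. The function name gt_concordance and the file names array.vcf and exome.vcf are hypothetical. It assumes uncompressed VCFs, the same sample columns in the same order in both files, GT as the first FORMAT subfield, and identical REF/ALT alleles at shared sites. Phasing and allele order are ignored, so 1|0 and 0/1 count as the same call.

```shell
# Hypothetical helper: count concordant and discordant genotype calls at
# sites present in both VCFs (assumptions are listed in the text above).
gt_concordance() {
  awk -F'\t' '
    # Reduce a sample field to an unphased, allele-sorted GT string.
    function norm(field,   gt, a, n, t) {
      gt = field
      sub(/:.*/, "", gt)        # keep only the GT subfield
      gsub(/\|/, "/", gt)       # ignore phasing
      n = split(gt, a, "/")
      if (n == 2 && a[1] > a[2]) { t = a[1]; a[1] = a[2]; a[2] = t }
      if (n == 2) gt = a[1] "/" a[2]
      return gt
    }
    /^#/ { next }                              # skip header lines
    { key = $1 ":" $2 ":" $4 ":" $5 }          # CHROM:POS:REF:ALT
    FNR == NR {                                # first file: remember calls
      seen[key] = 1
      for (i = 10; i <= NF; i++) first[key, i] = norm($i)
      next
    }
    (key in seen) {                            # second file: compare calls
      shared++
      for (i = 10; i <= NF; i++) {
        g1 = first[key, i]; g2 = norm($i)
        if (g1 ~ /\./ || g2 ~ /\./) missing++  # no-call in either file
        else if (g1 == g2) same++
        else diff++
      }
    }
    END {
      printf "shared_sites=%d concordant=%d discordant=%d missing=%d\n",
             shared, same, diff, missing
    }
  ' "$1" "$2"
}

# Usage (placeholder file names):
#   gt_concordance array.vcf exome.vcf
```

The counts are per sample genotype, not per site, so one shared SNP typed in 100 samples contributes 100 comparisons. What to do with the discordant calls (keep one platform, set them to missing, or drop the site) is a separate decision that this sketch does not make.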


Thanks! I will take a look.

5.4 years ago
Shicheng Guo ★ 9.4k

Suppose you have two PLINK binary filesets, chr22.chip and chr22.imputation, to be merged. You can try the following:

# 1. Remove duplicate variants (same position and alleles) within each fileset
plink --bfile chr22.chip --list-duplicate-vars
awk '{print $4}' plink.dupvar | grep -v ID > plink.dupvar.id
plink --bfile chr22.chip --exclude plink.dupvar.id --make-bed --out chr22.chip.rmdup
plink --bfile chr22.imputation --list-duplicate-vars
awk '{print $4}' plink.dupvar | grep -v ID > plink.dupvar.id
plink --bfile chr22.imputation --exclude plink.dupvar.id --make-bed --out chr22.imputation.rmdup
# 2. First merge attempt; if alleles do not match, it stops and writes merge-merge.missnp
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup --make-bed --out merge
# 3. Flip the strand of the mismatching chip variants and try again
plink --bfile chr22.chip.rmdup --flip merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip
plink --bfile chr22.imputation.rmdup --bmerge chr22.chip.rmdup.flip --make-bed --out merge
# 4. Variants still listed in merge-merge.missnp were not fixed by flipping; exclude them from both filesets and merge
plink --bfile chr22.imputation.rmdup --exclude merge-merge.missnp --make-bed --out chr22.imputation.rmdup.rm3
plink --bfile chr22.chip.rmdup.flip --exclude merge-merge.missnp --make-bed --out chr22.chip.rmdup.flip.rm3
plink --bfile chr22.imputation.rmdup.rm3 --bmerge chr22.chip.rmdup.flip.rm3 --make-bed --out merge
# 5. Pairwise IBS/IBD estimates on the merged data
plink --bfile merge --genome --out merge.ibd

By the way, PLINK discards phase information, so be careful if you need to keep the genotypes phased.


It seems a little complex, but thanks.
