Multiple alleles in REF and ALT in VCF file
3.3 years ago
Volka ▴ 180

Hi all, I am looking at some data in my VCF file and came across this line below:

20      62855516        20:62855516:GC:AC       GC      AC      .       .       PR;AC=8;AN=70   GT      0/0     0/0     0/0     0/0     0/1     0/1     0/1   0/0      0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/0     0/1     0/1     0/1     0/0     0/0   0/0      0/1     0/0     0/0     0/0     0/0     0/0     0/0

Is this considered a multiallelic site, and how should I handle this entry? I also want to compare sites against another VCF, where the equivalent position has G in REF and A in ALT. Is there a way to clean the data so that only the first allele is considered for this entry?

I've tried to remove or fix these entries with bcftools view -m2 -M2 -v snps and bcftools norm -m -any, but neither seems to catch it.

Thanks.

vcf allele indel bcftools vcftools • 865 views

Instead of removing these entries, try decomposing the VCF with Vt. See the section on decomposing biallelic block substitutions here: https://genome.sph.umich.edu/wiki/Vt
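
To see why this record reduces to G>A: REF (GC) and ALT (AC) share their trailing C, so only the first base differs. It is a single ALT allele written with an extra shared base, not several ALT alleles. Below is a minimal Python sketch of that trimming, not what Vt or bcftools actually run internally. The function name trim_shared_bases is made up for illustration, and it assumes a biallelic record with plain base strings in REF and ALT.

```python
def trim_shared_bases(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Hypothetical helper for illustration; assumes a single ALT allele.
    """
    # Shared trailing bases can be dropped without changing POS.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Each shared leading base that is dropped shifts POS right by one.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The record from the question: 20:62855516 GC>AC
print(trim_shared_bases(62855516, "GC", "AC"))
```

For the record in the question this gives position 62855516 with REF G and ALT A, which matches the other VCF. Note that the ID column (20:62855516:GC:AC) would still carry the old alleles unless it is rewritten separately.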
