When should you left-align indels (and why)?
5.9 years ago

Say I have two VCFs with 100 samples in each file. Each VCF was joint-called separately and now I want to merge the variant calls.

Do I need to left-align the INDELs in the merged VCF? I've used bcftools norm in the past and got odd results. It seems that vt is a better tool for this.

Is left-aligning only useful for common variants? If I'm interested in rare variants (<0.5% AF) would left-alignment actually matter?

Thanks

Here's an example of bcftools norm output (columns are CHROM, POS, ID, REF, ALT):

Original VCF

chr7    157009949       .       AGCGGCGGCGGCG   AGCGGCGGCGGCGGCGGCG,A,AGCGGCGGCGGCGGCGGCGGCG,AGCGGCGGCG,AGCGGCGGCGGCGGCGGCGGCGGCG,AGCGGCGGCGGCGGCG

Left-Aligned VCF (with multiallelics split into biallelic calls)

chr7    157009949       .       A               AGCGGCGGCG
chr7    157009949       .       A               AGCGGCGGCGGCG
chr7    157009949       .       A               AGCGGCG
chr7    157009949       .       A               AGCG
chr7    157009949       .       A               AGCGGCGGCGGCGGCG
chr7    157009949       .       AGCG            A
chr7    157009949       .       AGCGGCG         A
chr7    157009949       .       AGCGGCGGCGGCG   A
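
For readers wondering where records like these come from, here is a minimal Python sketch of splitting a multiallelic record and trimming the bases shared by REF and ALT. It is an illustration only, not how bcftools or vt are implemented: the function names are made up, and it covers just the trimming step (shifting an indel further left would need the reference genome). It uses the six ALT alleles shown in the original record, written as the anchor base A followed by some number of GCG copies.

```python
def trim_shared_bases(ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Returns (offset, ref, alt), where offset is how far POS moves right.
    Only the trimming part of normalization is covered here.
    """
    # Drop the shared suffix first, so the anchor base stays on the left.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop any shared prefix, moving POS right by one each time.
    offset = 0
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        offset += 1
    return offset, ref, alt


def split_and_trim(pos, ref, alts):
    """Split a multiallelic record into trimmed biallelic (pos, ref, alt) tuples."""
    records = []
    for alt in alts:
        offset, short_ref, short_alt = trim_shared_bases(ref, alt)
        records.append((pos + offset, short_ref, short_alt))
    return records


# REF is "A" plus 4 copies of GCG; the six ALTs shown in the original
# record carry 6, 0, 7, 3, 8 and 5 copies respectively.
ref = "A" + "GCG" * 4
alts = ["A" + "GCG" * n for n in (6, 0, 7, 3, 8, 5)]

records = split_and_trim(157009949, ref, alts)
for pos, short_ref, short_alt in records:
    print("chr7", pos, ".", short_ref, short_alt, sep="\t")
```

Each ALT that gains repeat units ends up as a short insertion after the anchor base, and each ALT that loses units ends up as a short deletion, all at the same POS.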

Left align indel VCF • 5.2k views

Hello,

Could you please give an example of an "odd result" from bcftools norm?

fin swimmer


Edited the main query above.

5.9 years ago

Suppose one of your VCF files represents an insertion as REF=AG, ALT=AGT at position 99999, while the other file represents the same insertion as REF=G, ALT=GT at position 100000. Both records describe the same variant, but unless both files are normalized (left-aligned and trimmed to the shortest representation), a merge may not recognize them as the same variant, and downstream analysis will suffer.
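
To make this concrete, here is a rough Python sketch of the usual normalization procedure (right-trim shared bases, extend left from the reference when an allele becomes empty, then left-trim). It is a simplified illustration for biallelic records, not the actual code in bcftools or vt. The function name `normalize`, the `ref_start` parameter, and both toy reference sequences are assumptions made up for this example.

```python
def normalize(pos, ref, alt, ref_seq, ref_start=1):
    """Left-align and trim one biallelic variant.

    pos is 1-based; ref_seq is a reference string whose first base sits at
    position ref_start. Returns the normalized (pos, ref, alt).
    """
    if ref == alt:
        raise ValueError("REF and ALT are identical; nothing to normalize")

    def base_at(p):
        if p < ref_start:
            raise ValueError("ran off the start of the reference")
        return ref_seq[p - ref_start]

    changed = True
    while changed:
        changed = False
        # Right-trim: drop a last base shared by both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            pos -= 1
            left = base_at(pos)
            ref, alt = left + ref, left + alt
            changed = True

    # Left-trim: drop shared leading bases, keeping at least one base each.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Hypothetical reference: "CCAGCA" starting at position 99997, so that
# position 99999 is A and position 100000 is G, as in the example above.
toy_ref = "CCAGCA"
file1 = normalize(99999, "AG", "AGT", toy_ref, ref_start=99997)
file2 = normalize(100000, "G", "GT", toy_ref, ref_start=99997)
print(file1, file2)  # both are (100000, 'G', 'GT')

# Hypothetical repeat region: a deletion of one GCG unit written at the
# right end of the repeat shifts to the leftmost equivalent position.
repeat_ref = "TAGCGGCGGCGT"
shifted = normalize(8, "GGCG", "G", repeat_ref)
print(shifted)  # (2, 'AGCG', 'A')
```

Once both records normalize to the same (POS, REF, ALT), a merge can match them directly.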


Thanks for the clear answer. Do you recommend vt or bcftools for normalization?


Either will work (as long as you aren't using a very old bcftools version). The latest bcftools should be faster, especially if compiled with "libdeflate".
