Is indel realigning necessary for INDEL discovery?
6.0 years ago
deepti1rao ▴ 50

I understand that GATK's IndelRealigner tool helps in calling SNPs correctly. But does one need to use it when calling only indels?

indel GATK indelrealigner • 4.5k views
6.0 years ago

GATK's HaplotypeCaller can detect both SNVs and InDels using a method that performs local de novo assembly (kind of a local realignment) to call variants, although by default it doesn't output a realigned BAM. So, in summary, there's no need to use IndelRealigner if you are going to call variants with HaplotypeCaller.
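
As a rough illustration of what "calling directly with HaplotypeCaller" looks like, here is a minimal Python sketch that only builds the GATK4 command line and prints it. The file names are placeholders and the helper name `haplotypecaller_cmd` is made up for this example; nothing is executed.

```python
import shlex


def haplotypecaller_cmd(reference, bam, out_vcf):
    """Build a GATK4 HaplotypeCaller command as an argument list.

    Hypothetical helper: it only assembles the arguments, it does not run GATK.
    No IndelRealigner step precedes it; the aligned BAM is passed in directly.
    """
    return ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam, "-O", out_vcf]


# Placeholder file names, assumed for illustration only.
cmd = haplotypecaller_cmd("ref.fasta", "sample.bam", "sample.vcf.gz")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run` once the paths point at real files.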

That being said, note that GATK4 has removed IndelRealigner, as it is no longer needed if you are going to use GATK's own pipeline. As Devon says, using IndelRealigner does still make sense if you want to use any other variant caller (including GATK's UnifiedGenotyper) for whatever reason, although GATK4's HaplotypeCaller definitely produces higher-confidence calls than samtools+bcftools.
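
For anyone staying with another caller, realignment is a two-step job in GATK3 (the tool is gone from GATK4): first RealignerTargetCreator, then IndelRealigner. The sketch below only assembles the two GATK3-style commands; the jar path, the file names and the helper name `indel_realign_cmds` are assumptions for illustration, and nothing is run.

```python
import shlex


def indel_realign_cmds(reference, bam, intervals, realigned_bam,
                       jar="GenomeAnalysisTK.jar"):
    """Build the two GATK3 commands used for indel realignment.

    Hypothetical helper: returns argument lists only, it does not execute them.
    The jar path is an assumed location of the GATK3 jar.
    """
    base = ["java", "-jar", jar]
    # Step 1: find the intervals that look like they need realignment.
    create_targets = base + ["-T", "RealignerTargetCreator",
                             "-R", reference, "-I", bam, "-o", intervals]
    # Step 2: realign the reads inside those intervals.
    realign = base + ["-T", "IndelRealigner",
                      "-R", reference, "-I", bam,
                      "-targetIntervals", intervals, "-o", realigned_bam]
    return [create_targets, realign]


# Placeholder file names, assumed for illustration only.
for cmd in indel_realign_cmds("ref.fasta", "sample.bam",
                              "sample.intervals", "sample.realigned.bam"):
    print(shlex.join(cmd))
```

The realigned BAM from the second step is what would then go to the other caller (samtools+bcftools, UnifiedGenotyper, and so on).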


Hello,

GATK's HaplotypeCaller is both capable of detecting SNVs and InDels using a method that performs local realignments to call variants (although it doesn't output any realigned bam)

I often read that people say the HaplotypeCaller is doing local realignment. But that's not true. It's doing local de-novo assembly.

From the manual:

The HaplotypeCaller is capable of calling SNPs and indels simultaneously via local de-novo assembly of haplotypes in an active region. In other words, whenever the program encounters a region showing signs of variation, it discards the existing mapping information and completely reassembles the reads in that region.

fin swimmer


I question whether it's really de novo. I presume they're putting the reference sequence into their de Bruijn graph too (at least that's what I've done when implementing this sort of thing).


It's not 'true' de novo assembly, as the reads being assembled have already been mapped to a particular region of the genome. I've done a similar thing for SV calling, and I did it without incorporating the reference sequence into the de Bruijn graph. Whether you do or not, your assembly is already biased towards the reference allele because of the mapping step.
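
To make the "reference in the graph or not" point concrete, here is a toy Python sketch of a de Bruijn graph built from the reads of one region, with the reference sequence optionally threaded in as one more sequence. This is not GATK's implementation; the function names and the tiny sequences are invented for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_de_bruijn(reads, k, reference=None):
    """Toy de Bruijn graph: (k-1)-mer prefix -> {(k-1)-mer suffix: edge count}.

    If reference is given, it is added to the graph like an extra read,
    which is one simple way of "putting the reference into the graph".
    """
    graph = defaultdict(lambda: defaultdict(int))
    sequences = list(reads)
    if reference is not None:
        sequences.append(reference)
    for seq in sequences:
        for kmer in kmers(seq, k):
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


# Invented example: one read carrying a variant relative to the reference.
reads_only = build_de_bruijn(["ACGT"], k=3)
with_reference = build_de_bruijn(["ACGT"], k=3, reference="ACGA")
print(dict(reads_only["CG"]))
print(dict(with_reference["CG"]))
```

With the reference included, the node `CG` gains a second outgoing edge, so the reference path and the read path form a bubble in the graph. Without it, only the read path exists, though the reads were still selected by mapping to that region in the first place.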


I must agree with you both. I've updated my answer to be more precise about what GATK states and how I personally have always thought of it. Thank you for the clarification.

6.0 years ago

You don't even need to use it when finding SNPs, so there's no reason to use it for finding InDels. It's mostly still around for those using UnifiedGenotyper rather than HaplotypeCaller.


I am using samtools mpileup and bcftools to call variants. In this context, I want to know whether I should use IndelRealigner. Alternatively, do you suggest switching to GATK for variant calling, using HaplotypeCaller?
