Hello,
I am a beginner in variant calling with samtools and am trying to understand its usage. The options -m and -F are often used together with the option -p. This is what appears in the documentation for samtools mpileup:
-p, --per-sample-mF apply -m and -F per-sample for increased sensitivity
-m, --min-ireads INT minimum number gapped reads for indel candidates [1]
-F, --gap-frac FLOAT minimum fraction of gapped reads [0.002]
-m is explicit and it applies to indel detection, whereas -F deals with gapped reads, which I guess applies to indel detection as well. Also, since -p applies both options per-sample, I would guess that they address the same type of analysis (indel detection).
My questions are:
1. Do both -m and -F apply only to indel detection and have nothing to do with SNP detection?
2. How should I interpret the default value of -F? Is it correct to say that out of 500 reads overlapping a genomic region, at least 0.002 * 500 = 1 read should be gapped in order to do indel-related calculations?
3. What if the -m threshold is met but the -F threshold is not (i.e. there is 1 gapped read, but 505 reads overlap the region, so the gapped fraction is 1/505, just under 0.002)?
4. When would I want to modify the default values? They seem very low.
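To make my current understanding concrete, here is a small Python sketch of how I imagine the two thresholds combine. This is only my assumption, not taken from the samtools source: the function name passes_indel_filters is made up, and I am guessing that a site must satisfy both -m and -F to be considered as an indel candidate.

```python
def passes_indel_filters(gapped_reads, total_reads, min_ireads=1, gap_frac=0.002):
    """Hypothetical check of my reading of -m and -F (not the real samtools code).

    gapped_reads: number of reads with a gap at the site
    total_reads:  number of reads overlapping the site
    min_ireads:   value given to -m (default 1)
    gap_frac:     value given to -F (default 0.002)
    """
    if total_reads <= 0:
        return False
    # -m: at least this many gapped reads
    enough_reads = gapped_reads >= min_ireads
    # -F: gapped reads make up at least this fraction of all reads
    enough_fraction = gapped_reads / total_reads >= gap_frac
    # Assumption: both conditions must hold
    return enough_reads and enough_fraction


# The two cases from my questions, using the default thresholds
print(passes_indel_filters(1, 500))  # 1/500 = 0.002   -> True
print(passes_indel_filters(1, 505))  # 1/505 ~ 0.00198 -> False
```

If this reading is right, then with -p I would expect the same check to be done on the read counts of each sample separately rather than on the pooled counts, but that is also a guess on my part.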
Thanks!