I am analysing some cell-free DNA (cfDNA) samples in the hope of finding variants from circulating tumour DNA, and I am aware of the problems with calling variants on such fragmentary data. However, this paper: https://www.nature.com/articles/nature12065 (Murtaza et al., 2013, "Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA", Nature) describes a protocol for calling and filtering variants in DNA extracted from plasma samples.
My question is this: following their protocol (map -> samtools mpileup -> varscan -> filtering), what are the best values for the varscan filters to maximise sensitivity while keeping false positives down? And is it better to call SNPs and indels separately, or to go straight to varscan mpileup2cns after building the mpileup file?
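To make the question concrete, here is a rough sketch of the two commands I have in mind, written as a small Python helper that only builds the argument lists. The function name `build_commands`, the file names and every threshold value are placeholders of mine for illustration; they are not taken from the paper and are not recommendations.

```python
# Hypothetical helper: builds (but does not run) the samtools and VarScan commands.
# All threshold values below are placeholders, NOT values from Murtaza et al.
PLACEHOLDER_FILTERS = {
    "--min-coverage": 8,
    "--min-reads2": 2,
    "--min-avg-qual": 15,
    "--min-var-freq": 0.01,
    "--p-value": 0.99,
}

VARSCAN_MODES = {"mpileup2snp", "mpileup2indel", "mpileup2cns"}


def build_commands(bam, reference, pileup, mode="mpileup2cns",
                   filters=None, varscan_jar="VarScan.jar"):
    """Return (mpileup_cmd, varscan_cmd) as argument lists."""
    if mode not in VARSCAN_MODES:
        raise ValueError(f"unknown VarScan mode: {mode}")
    if filters is None:
        filters = PLACEHOLDER_FILTERS

    # samtools writes the pileup to stdout; the caller redirects it to `pileup`.
    mpileup_cmd = ["samtools", "mpileup", "-f", reference, bam]

    # VarScan takes the pileup file as a positional argument.
    varscan_cmd = ["java", "-jar", varscan_jar, mode, pileup]
    for flag, value in filters.items():
        varscan_cmd += [flag, str(value)]
    varscan_cmd += ["--output-vcf", "1"]
    return mpileup_cmd, varscan_cmd
```

Calling `build_commands("sample.bam", "ref.fa", "sample.mpileup", mode="mpileup2snp")` and then again with `mode="mpileup2indel"` would correspond to the "separate" route, while the default `mpileup2cns` is the "straight to consensus" route. The values in `PLACEHOLDER_FILTERS` are exactly the numbers I am unsure about.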
For some samples, we have matched tumour tissue and peripheral blood, so we can implement a white list (variants found in the tissue) and a black list (variants found in the hopefully-normal blood sample); downstream analysis for these is fine.
For the others, we don't have matched samples, so we could apply a black list built from a pooled panel of normals (as in Mutect2, which is not recommended for cfDNA analysis) and a white list around known hotspots for the cancer we are studying.
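For reference, this is roughly how I picture the white list / black list step working downstream, as a minimal Python sketch. The names `Variant`, `near_hotspot` and `label_calls`, the `padding` parameter, and the choice to let the black list take precedence over the white list are all assumptions of mine, and the coordinates in any example are toy values.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def near_hotspot(variant, hotspots, padding=0):
    """True if the variant lies within `padding` bases of any (chrom, start, end) hotspot."""
    return any(
        chrom == variant.chrom and start - padding <= variant.pos <= end + padding
        for chrom, start, end in hotspots
    )


def label_calls(calls, black_list, white_list=frozenset(), hotspots=(), padding=0):
    """Label each plasma call as 'blacklisted', 'whitelisted' or 'unlisted'.

    black_list: variants seen in matched blood or in a pooled panel of normals.
    white_list: variants seen in matched tumour tissue (may be empty).
    hotspots:   known hotspot intervals used when no tissue is available.
    Assumption: a black-listed variant is dropped even if it is also white-listed.
    """
    labels = {}
    for variant in calls:
        if variant in black_list:
            labels[variant] = "blacklisted"
        elif variant in white_list or near_hotspot(variant, hotspots, padding):
            labels[variant] = "whitelisted"
        else:
            labels[variant] = "unlisted"
    return labels
```

With matched samples, `white_list` and `black_list` would come from the tissue and blood calls; without them, `black_list` would come from the panel of normals and `hotspots` would stand in for the white list.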