Hello everyone,
I am using PoPoolation to calculate Tajima's Pi for 18 pools. I am running the Variance-sliding.pl script with these parameters for each pool:
perl Variance-sliding.pl --fastq-type sanger --measure pi --input Pool1.pileup --min-count 2 --min-coverage 10 --max-coverage 100 --min-covered-fraction 0.1 --pool-size 30 --window-size 300 --step-size 300 --min-qual 20 --output Pool1.pi --snp-output Pool1.snps
I chose 300 because I am not using a reference genome but a de novo assembly (created with Stacks) built from my own data (RAD-PE), with an average contig length of 300 bp. I set --min-covered-fraction to 0.1 because 0.6 (the default value) is too high for my data. Even so, many SNPs have a covered fraction below 0.1, and Tajima's Pi is not calculated for them.
So here is my question: how can I choose the best value for --min-covered-fraction? If I set it to 0, Tajima's Pi will be calculated for every SNP with a covered fraction above 0. But is that correct? Should I choose it in proportion to the window size (e.g. I noticed that when I increase the window size, the covered fraction of the SNPs decreases)?
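To make the window-size effect concrete, here is a small Python sketch of how I understand the covered fraction. It is only my assumption, not taken from the PoPoolation code: I assume it is the number of sites in a window whose depth lies between --min-coverage and --max-coverage, divided by the window size, and that sites beyond the end of a contig count as uncovered. The function name and the depth values are made up for illustration.

```python
def covered_fraction(depths, window_size, min_coverage=10, max_coverage=100):
    """Fraction of a window whose per-site depth lies in [min_coverage, max_coverage].

    `depths` holds one depth value per contig site that falls inside the window.
    Assumption: window positions past the end of the contig count as uncovered.
    """
    usable = sum(
        1 for d in depths[:window_size] if min_coverage <= d <= max_coverage
    )
    return usable / window_size


# Hypothetical 300 bp contig: 40 sites at depth 25, the other 260 at depth 3.
contig = [25] * 40 + [3] * 260

for window in (300, 600):
    print(window, round(covered_fraction(contig, window), 3))
```

Under these assumptions the same contig gives a smaller covered fraction when the window is larger than the contig, because the number of usable sites stays the same while the denominator grows. That would match what I see when I increase --window-size.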
I am sorry if the question is stupid, but I am new to genomic analysis.
Thank you
Maria