In a current project, I am trying to find suitable software for two specific tasks. Our first attempts to process the data have produced an overall number of alleles that is implausibly high, so we need to impose some stricter filters. Most of these I can handle by writing scripts or using existing tools. The two tasks I cannot find suitable tools for are:
1) Software that will filter FASTQ files based on the top BLAST hit for each read. I have tried using BLAST+, but I can't find an efficient way of going through the hits, converting each accession number to something meaningful to me, and then filtering on that (likely using grep and a few keywords I'm looking for).
2) Software for genotyping highly polymorphic regions. I've tried jMHC and had some success, but it was not very user-friendly, and even with the path to the MUSCLE binary set, it appears to have assigned separate genotypes to alleles that are probably just misaligned. I've also found AmpliSAS, but I have no experience with Perl, so it might be a little hard to implement.
Thanks in advance.
I have no idea how commonly people try to filter reads in the manner I describe in task 1. But in case anyone does, I have written an R script that parses BLAST+ output tables in that way, although it is a little specialised to my case.
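For anyone who would rather not use R, the general approach can be sketched in Python roughly as follows. This is a minimal illustration, not the R script itself, and it rests on several assumptions: that BLAST+ was run with `-outfmt "6 std stitle"` so the hit description sits in column 13, that the "top hit" is the one with the highest bit score, and that the FASTQ uses plain four-line records whose IDs match the BLAST query IDs. The function names and the keywords are made up for the example.

```python
from itertools import islice


def top_hits(blast_lines):
    """Map each read ID to the (bitscore, title) of its best-scoring hit.

    Assumes tab-separated rows from -outfmt "6 std stitle":
    column 1 = query ID, column 12 = bit score, column 13 = hit title.
    """
    best = {}
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 13:
            continue  # skip blank or malformed rows
        read_id, bitscore, title = fields[0], float(fields[11]), fields[12]
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, title)
    return best


def reads_matching(best, keywords):
    """Return read IDs whose top-hit title contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return {
        read_id
        for read_id, (_, title) in best.items()
        if any(k in title.lower() for k in wanted)
    }


def filter_fastq(fastq_lines, keep_ids):
    """Yield four-line FASTQ records whose read ID is in keep_ids."""
    lines = iter(fastq_lines)
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        # The read ID is the first whitespace-delimited token after '@'.
        read_id = record[0][1:].split()[0]
        if read_id in keep_ids:
            yield record
```

To use it on real files, the idea would be to open the BLAST table and the FASTQ as text, pass the file objects to `top_hits` and `filter_fastq`, and write out the records that are yielded. Reads with no BLAST hit at all are dropped, which may or may not be what you want.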