I'm trying to map genomic sequencing reads (Illumina HiSeq PE100) to a related reference genome. The coding region divergence is about 1% between the organism and the reference, so I allowed 5~8 mismatches in 100bp reads as well as allowing small indels, hoping this could accommodate the higher divergence expected outside the exons. But in the coverage plot, coding regions still got the most coverage. This bias is so severe that it looks like an mRNA-Seq experiment. Of course, there are regions with relatively uniform coverage outside the exons (so they should be true genomic reads), but they're much rarer than the coverage 'deserts' elsewhere. The overall coverage, based on kmers, is about 5X, which can be a reason why this is happening. Also, is there anything wrong I did in terms of the way I approach the mapping process?