6.5 years ago
Alternative
Dear all,
We ran targeted exome sequencing on ~100 samples divided into 6 sequencing batches. We noticed a batch effect whereby the samples from 2 of the 6 batches are heavily mutated. All samples were sequenced using the same protocol.
Any ideas on how to remove a batch effect from targeted exome sequencing samples?
Thanks for your help.
Did you look at any BAM QC?
Yes, I just ran some:
Duplication is very high but expected since we have targeted sequencing.
Additionally, since the samples are FFPE, I checked my VCFs for the count of C > T conversions, which are expected to be enriched in such samples. I found that the 2 over-mutated batches are the ones highly enriched for C > T. This now looks like the likely cause.
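For anyone wanting to repeat the check, here is a minimal sketch of how the C > T count per VCF could be computed. It is not the exact script used here. It assumes plain-text, tab-separated VCF lines, and it counts G > A together with C > T as the reverse-strand equivalent, which is an assumption of this sketch; the function name `ct_fraction` is made up for illustration.

```python
def ct_fraction(vcf_lines):
    """Return (n_ct, n_snv, fraction) for an iterable of VCF text lines.

    Only single-base substitutions are counted. G>A is counted together
    with C>T as its reverse-strand equivalent (assumption of this sketch).
    """
    n_ct = 0
    n_snv = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3].upper()
        # ALT may hold several comma-separated alleles.
        for alt in fields[4].upper().split(","):
            # Ignore indels, symbolic alleles, '*' and '.'.
            if len(ref) != 1 or len(alt) != 1:
                continue
            if ref not in "ACGT" or alt not in "ACGT":
                continue
            n_snv += 1
            if (ref, alt) in (("C", "T"), ("G", "A")):
                n_ct += 1
    fraction = n_ct / n_snv if n_snv else 0.0
    return n_ct, n_snv, fraction


# Tiny made-up example: 2 C>T-type SNVs, 1 other SNV, 1 indel (ignored).
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tC\tT\t50\tPASS\t.",
    "chr1\t200\t.\tG\tA\t50\tPASS\t.",
    "chr1\t300\t.\tA\tG\t50\tPASS\t.",
    "chr1\t400\t.\tAT\tA\t50\tPASS\t.",
]
print(ct_fraction(demo))
```

To use it on real data, pass an open file handle for each sample (for a gzipped VCF, `gzip.open(path, "rt")` from the standard library) and compare the fractions between batches.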
I believe the question now is: how can we remove or correct batch effects related to C > T conversions / FFPE in VCF files?
Thanks,