Question

BWA mapping parameters for DNA capture sequencing protocol

0

4.8 years ago

graeme.thorn ▴ 100

I want to map samples (70bp paired-end) to a 30-gene panel using BWA (mem). They were generated using a capture kit so should have few if any off-target reads.

Should I map the reads to the whole genome and just filter on the gene panel, or just restrict mapping to those genes (± some margin, say 1000 bp)?

This is for variant calling downstream, so accurate mapping to the genes of interest is most important.

Alternatively, are there any better algorithms I could use? (BWA aln/sampe for instance)

alignment bwa dna-seq capture • 2.2k views

ADD COMMENT • link 4.8 years ago by graeme.thorn ▴ 100

score 2 · Answer 1 · 2019-07-23

2

4.8 years ago

ATpoint 82k

Whole genome, then filter for reads mapping to the genes you captured. Please use the search function: How To Align Reads Obtained From Sequence Capture

ADD COMMENT • link 4.8 years ago by ATpoint 82k

0

Except that with short reads (70bp) and few target sequences (30), there are many more locations in the genome, outside the target, that the reads can map to, so that answer does not really apply. If I were capturing all exons, or more than 10K targets as in that previous answer, then I would map to the whole genome and filter afterwards, rather than mapping only to the 30-gene panel.

ADD REPLY • link 4.8 years ago by graeme.thorn ▴ 100

2

You have no control over what kinds of off-targets your capture array binds. Of course you design it to minimize off-targets, but this is not a perfect process. Therefore, you always align to the entire genome.

ADD REPLY • link 4.8 years ago by ATpoint 82k

0

Hopping in to provide support for ATpoint's comments - they're right. Map to the whole genome, then subset or restrict variant calling to regions of interest. You want your mapping qualities to reflect any uncertainty in genomic placement to produce the most accurate variant calls for your regions of interest. As a thought experiment, consider what might happen if your targets are similar to pseudogenes or contain any sort of duplicated/repetitive element.
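
To make the "map to the whole genome, then subset" step concrete, here is a minimal Python sketch. It assumes the reads have already been aligned to the full reference (for example with BWA-MEM) and written as SAM text. The target intervals, the `min_mapq` threshold of 20, and the function names are all hypothetical choices for illustration, not values from this thread.

```python
import re

# Hypothetical target intervals (chrom, 0-based start, end-exclusive),
# e.g. as they would be read from a BED file for the gene panel.
TARGETS = [("chr1", 1000, 2000), ("chr2", 500, 900)]

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def ref_span(pos_1based, cigar):
    """Return the 0-based half-open reference interval covered by an alignment."""
    length = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    start = pos_1based - 1
    return start, start + length


def keep_alignment(sam_line, targets=TARGETS, min_mapq=20):
    """True if a SAM record is mapped, has MAPQ >= min_mapq and overlaps a target."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
    mapq, cigar = int(fields[4]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read
        return False
    if mapq < min_mapq:  # ambiguous placement, e.g. pseudogene or repeat
        return False
    start, end = ref_span(pos, cigar)
    return any(c == chrom and start < t_end and end > t_start
               for c, t_start, t_end in targets)
```

The point of the sketch is that a read which aligns equally well to a pseudogene elsewhere in the genome gets a low MAPQ and is dropped here, whereas mapping against the panel alone would have placed it confidently on the target. In practice the same subsetting is usually done with existing tools, for instance `samtools view` with a BED file (`-L`) and a minimum mapping quality (`-q`).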

ADD REPLY • link 4.8 years ago by Brice Sarver ★ 3.8k

0

I am getting (target) mapping rates of between 13% (from a particularly degraded sample) and 40%. Is this normal for such an experiment? I have no experience with this (my previous work being on whole-genome, whole-exome or RNA sequencing), so I have no intuition as to whether this is a good rate or not.
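
When comparing such rates it helps to be explicit about the denominator, since "on-target rate" may mean the fraction of all sequenced reads or the fraction of mapped reads. A minimal Python sketch of both follows. The function name and the read counts in the usage example are made up for illustration and are not data from this experiment.

```python
def on_target_rates(total_reads, mapped_reads, on_target_reads):
    """Return (on-target / total reads, on-target / mapped reads) as fractions."""
    if not (0 <= on_target_reads <= mapped_reads <= total_reads):
        raise ValueError("expected 0 <= on_target <= mapped <= total")
    of_total = on_target_reads / total_reads if total_reads else 0.0
    of_mapped = on_target_reads / mapped_reads if mapped_reads else 0.0
    return of_total, of_mapped


# Hypothetical counts: 1000 reads sequenced, 800 mapped, 400 overlapping targets.
of_total, of_mapped = on_target_rates(1000, 800, 400)
print(f"on-target: {of_total:.0%} of all reads, {of_mapped:.0%} of mapped reads")
```

For a degraded sample with many unmapped reads the two numbers can differ considerably, so it is worth stating which one is being reported.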

ADD REPLY • link 4.8 years ago by graeme.thorn ▴ 100