align reads to regions with similar sequences
6.0 years ago
qwzhang0601 ▴ 80

When we align NGS data (e.g., RNA-seq, ChIP-seq) to the genome, we usually allow a certain number of mismatches.

Suppose we allow 2 mismatches and also accept reads that map to multiple loci. A read matches locus A of the genome with 0 mismatches, locus B with 1 mismatch, and locus C with 2 mismatches. What should we expect to get from the aligner (e.g., STAR, bowtie, tophat2)? Will only the best-matching locus be reported, or will all three loci be reported in the SAM file?

Thanks


If these reads have good mean quality (Phred score above 25-30), they may correspond to genuinely repetitive regions, which I think is not a common situation in RNA-seq or ChIP-seq. However, at least in Bowtie2 and HISAT2, you can decide how multi-hit reads are handled. Check the manual.
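
Since the answer depends on the aligner and its settings, one way to check what you actually got is to look at the SAM output itself. Below is a minimal Python sketch, not a definitive tool: it groups SAM records by read name and lists each reported locus with its `NM` tag and whether the record is flagged as secondary (`0x100`). The function name `summarize_sam_hits` and the example records are made up for illustration. Note that `NM` is the edit distance, so it also counts indels, and it is an optional tag that some outputs may not include.

```python
from collections import defaultdict

def summarize_sam_hits(sam_lines):
    """Group reported alignments by read name.

    Returns {read_name: [(ref, pos, nm, is_secondary), ...]}.
    nm is None when the optional NM tag is absent.
    """
    hits = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, ref, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:
            continue  # read is unmapped
        nm = None
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NM:i:"):
                nm = int(tag[5:])
        hits[name].append((ref, pos, nm, bool(flag & 0x100)))
    return dict(hits)

# Made-up records: one read reported at three loci (hypothetical data).
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t1\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
    "read1\t256\tchr2\t500\t1\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:1",
    "read1\t256\tchr3\t900\t1\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:2",
]
summary = summarize_sam_hits(example)
for ref, pos, nm, secondary in summary["read1"]:
    print(ref, pos, nm, secondary)
```

If a read shows up with a single record, the aligner reported only one locus for it; several records for the same read name mean multiple loci were written out. On a real file you could pass an open file handle, e.g. `summarize_sam_hits(open("aln.sam"))`, where `aln.sam` is a placeholder name.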
