Biostar Beta. Not for public use.
bowtie: poor mapping with high quality reads
0
15 months ago
European Union

Apologies for my inexperience with bowtie.

I have a series of map files all containing reads with very consistent mapping quality: ~35-40.

If the image is not showing, go to https://ibb.co/j1fOza

However, when I map them with a fairly generic bowtie command:

bowtie -t -v 2 -p 8 --solexa-quals hg19 -1 end1.fastq -2 end2.fastq out.map


I get consistently poor alignment rates:

the highest is: reads with at least one reported alignment: 537070 (11.58%)

the lowest is: reads with at least one reported alignment: 53707 (0.01%)

There is no documentation on the experiment specifying whether a primer is present in each of these reads, and I am certain the correct reference is hg19.

As you can see from the picture above, there is a dip in quality in the first 5 base pairs of the read concerned. This dip is present in all of the reads that I am studying, so I thought to get rid of these bases using fastq_quality_trimmer from the FASTX-Toolkit:

fastq_quality_trimmer -t 36 -i end1.fastq -o end1_trim.fastq
fastq_quality_trimmer -t 36 -i end2.fastq -o end2_trim.fastq


However, the mappings that resulted from these trimmed reads were consistently even poorer than the originals.

Can anyone critique my use of bowtie to see if I can fix this?

alignment bowtie • 861 views
1

--solexa-quals

Unless this data is ancient (in NGS terms), it is unlikely to be in Solexa (ASCII offset 64) format. You are also using an aligner that does not allow gapped alignments. I suggest that you give bbmap.sh from the BBMap suite a try instead of bowtie.
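One quick way to check the encoding before picking a quality flag is to look at the range of ASCII codes in the quality lines. Below is a minimal Python sketch of that rule of thumb; it is not part of bowtie, the name `guess_quality_encoding` is made up for illustration, and the cutoffs are the commonly quoted ones rather than a guarantee.

```python
def guess_quality_encoding(quality_lines):
    """Guess the FASTQ quality encoding from an iterable of quality strings.

    Rule of thumb (an assumption, not a guarantee):
      - any character below ';' (ASCII 59)      -> offset 33 (Sanger)
      - otherwise, any character above 'J' (74) -> offset 64
          (Solexa if something falls below '@', else Phred+64)
      - otherwise the observed range fits both  -> ambiguous
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters found")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        return "phred33"
    if highest > 74:
        return "solexa64" if lowest < 64 else "phred64"
    return "ambiguous"


def read_quality_lines(path, max_reads=10000):
    """Yield the quality line (every 4th line) of the first max_reads records."""
    with open(path) as handle:
        for index, line in enumerate(handle):
            if index >= 4 * max_reads:
                break
            if index % 4 == 3:
                yield line.rstrip("\n")
```

Something like `guess_quality_encoding(read_quality_lines("end1.fastq"))` would then give a hint. Note that uniformly high-quality reads can land in the "ambiguous" range, so it helps to scan enough records to include some low-quality bases.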

0

Try taking some of the unmapped reads and running blastn on them. After all, it could be any of a lot of issues. I've been given data from someone else's samples before, so rule out that possibility first.
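For getting reads to paste into blastn: bowtie can write the reads that fail to align to a file with its --un option, and those records then need converting to FASTA. A minimal sketch, assuming plain four-line FASTQ records; `fastq_to_fasta` is a hypothetical helper name, not an existing tool.

```python
from itertools import islice


def fastq_to_fasta(fastq_lines, max_reads=20):
    """Convert the first max_reads four-line FASTQ records to FASTA text.

    A trailing incomplete record (fewer than four lines) is ignored.
    """
    lines = [line.rstrip("\n") for line in islice(fastq_lines, 4 * max_reads)]
    records = []
    for start in range(0, len(lines) - 3, 4):
        header, sequence = lines[start], lines[start + 1]
        if not header.startswith("@"):
            raise ValueError(f"record at line {start + 1} does not start with '@'")
        # FASTA keeps the read name and the sequence; qualities are dropped.
        records.append(f">{header[1:]}\n{sequence}")
    return "\n".join(records) + ("\n" if records else "")
```

With a file handle opened on the unaligned-reads file (whatever name was given to --un), `print(fastq_to_fasta(handle))` produces a couple of dozen sequences that can go straight into a blastn search.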

0

The "dip" is expected on Illumina machines, since the Phred score of a base depends on that of the preceding bases, and those don't exist at the beginning of reads. Try local alignment with bowtie2 instead (--local), playing with --score-min if needed. Do blast a few reads too, though, as suggested by mforde84.
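For reference, --score-min takes a function of the read length. Per the bowtie2 manual, the defaults are L,-0.6,-0.6 in end-to-end mode and G,20,8 in --local mode, where L is linear and G is natural log. The sketch below just evaluates such a spec to show what threshold it implies; `min_score` is a hypothetical helper, not bowtie2 code.

```python
import math


def min_score(spec, read_length):
    """Evaluate a bowtie2-style function spec such as 'G,20,8'.

    Handles the three-field forms only: L (linear), S (square root),
    G (natural log). Result = constant + coefficient * f(read_length).
    """
    kind, constant, coefficient = spec.split(",")
    transforms = {
        "L": lambda x: x,
        "S": math.sqrt,
        "G": math.log,
    }
    if kind not in transforms:
        raise ValueError(f"unsupported function type: {kind!r}")
    return float(constant) + float(coefficient) * transforms[kind](read_length)
```

For a 123 bp read, `min_score("L,-0.6,-0.6", 123)` comes to -74.4 and `min_score("G,20,8", 123)` to about 58.5, which gives a feel for how far the threshold moves when the coefficients are changed.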

2
18 months ago
Walnut Creek, USA

There are lots of potential problems here. For one thing, how did you get 123bp reads? Are they preprocessed in some way? What platform are they from, and what year? What kind of experiment is it? And why are you using Bowtie1 on such long reads?

You do not need to trim the first 5bp; the dip in claimed quality scores for those bases is false. You may or may not need to do trimming, but the first thing you need to do is use the proper aligner; bowtie1 is fairly good for really short reads (30bp and less), but not for longer reads. Try bowtie2 instead. Also, pairs should never be trimmed independently, only together (e.g., using BBDuk), or the pairing gets broken. Also, you are probably setting the quality score flag incorrectly. All modern reads use Sanger (ASCII-33) quality scores, but you specified old Illumina (ASCII-64), so yeah, the trimming is butchering the data.
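To illustrate both points (trimming pairs together, and what a wrong offset does), here is a rough Python sketch. It is not how BBDuk works internally; the function names, the threshold of 20, and the minimum length of 25 are all assumptions picked for the example.

```python
def trim_3prime(sequence, quality, threshold=20, offset=33):
    """Trim bases from the 3' end while their decoded score is below threshold."""
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < threshold:
        end -= 1
    return sequence[:end], quality[:end]


def trim_pair(mate1, mate2, threshold=20, offset=33, min_length=25):
    """Trim both mates of a pair; return None if either ends up too short.

    Each mate is a (name, sequence, quality) tuple. Dropping the whole pair,
    rather than one mate, keeps the two output files in the same order.
    """
    trimmed = []
    for name, sequence, quality in (mate1, mate2):
        sequence, quality = trim_3prime(sequence, quality, threshold, offset)
        if len(sequence) < min_length:
            return None
        trimmed.append((name, sequence, quality))
    return tuple(trimmed)
```

The offset matters a lot here: the character 'I' decodes to 40 with offset 33 but only to 9 with offset 64, so `trim_3prime("ACGT", "IIII", threshold=36, offset=64)` removes the entire read even though every base is high quality.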

0

Marked accepted since OP didn't.

0
15 months ago
European Union

As per Brian Bushnell's suggestion, bowtie2 greatly increased the rate of alignment:

bowtie2 -x hg19 --very-fast -p 8 -1 end1.fastq -2 end2.fastq -S out.sam


Thank you all for your suggestions.