Problem mapping paired-end Illumina reads
20 months ago
biostart • 290
Germany

Hello, could you please advise on the following:

We have ChIP-seq data with paired-end Illumina reads. For some of the samples, only about 11% of reads could be mapped with Bowtie. When we remapped these samples with Bowtie2, up to 85% of reads could be mapped, but the pairing was lost: most mapped reads have no mapped mate. What could have gone wrong, and how can we fix it? Thanks!

Try BWA with just one file, or even a subset of your reads. BWA will estimate the insert size from the mapping, and its output may help you understand what went wrong with your Bowtie mapping. If you want to stick with Bowtie / Bowtie2, you can then use the BWA-estimated mean and sd of the insert size as input for Bowtie.
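
As a rough illustration of how mean and sd insert size values can be read off any paired-end SAM output, here is a minimal Python sketch. It is not what BWA does internally. It assumes a plain-text SAM stream, uses only pairs flagged as properly paired (flag bit 0x2), and counts each pair once by keeping the mate with a positive TLEN. The function name `insert_size_stats` is made up for this example.

```python
import statistics

def insert_size_stats(sam_lines):
    """Return (mean, sd) of TLEN over properly paired reads, or None if too few."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])   # column 2: bitwise FLAG
        tlen = int(fields[8])   # column 9: observed template length
        # Keep properly paired reads; TLEN > 0 picks one mate per pair.
        if flag & 0x2 and tlen > 0:
            sizes.append(tlen)
    if len(sizes) < 2:
        return None
    return statistics.mean(sizes), statistics.stdev(sizes)
```

Typical use would be something like `insert_size_stats(open("subset.sam"))`, where `subset.sam` is a hypothetical file name for an alignment of a small subset of the reads.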

I can estimate the average DNA fragment length as ~150 bp, based on the reads that aligned successfully.

19 months ago
swbarnes2 5.7k
United States

In my limited experience with Bowtie, the default settings allow only a very stringent range of acceptable insert sizes. Try changing these on your command line to be more generous.

I tried changing to "-X 1000", but it did not help.

The other possibility is that the reads in your two fastq files are out of sync. Do the files have exactly the same number of lines?
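
Equal line counts do not by themselves prove that the files are in sync, so it can help to compare read names record by record. A minimal Python sketch, assuming standard 4-line FASTQ records (no wrapped sequences) and mate names that differ only by a trailing "/1" or "/2" or by a comment after whitespace. The function names here are made up for this example.

```python
import gzip
import itertools

def read_ids(path):
    """Yield the read ID of each 4-line FASTQ record, without the mate suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read name
            # Header looks like "@ID/1" or "@ID 1:N:0:..."; keep the shared part.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name

def first_mismatch(r1_path, r2_path):
    """Return (record_index, id1, id2) for the first out-of-sync pair, or None."""
    pairs = itertools.zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return index, id1, id2
    return None
```

If `first_mismatch` returns `None`, the two files list the same reads in the same order. If one file is shorter, the missing side shows up as `None` in the returned tuple.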

The number of reads is the same, but their quality seems to differ. I mapped each of the two paired fastq files separately with Bowtie: for one file, 50% of reads had at least one reported alignment, whereas for the second file only 31% did. I guess this explains how I end up with an even smaller percentage of aligned pairs in paired-end alignment. So this is unrelated to the insert size... But the question is how to fix it.

Have a look at the quality of read 1 and read 2 with FastQC or a similar tool (e.g. fastp). Do the quality values of the second read drop off markedly along the read? If so, try trimming. Or try BWA, as suggested above. A bad R2 is pretty common, especially with >150 bp reads from some Illumina sequencers.
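
For a quick look at whether quality drops along the read without opening a full report, the mean Phred score per position can be computed directly. A minimal Python sketch, assuming 4-line FASTQ records and Phred+33 quality encoding. The function name `mean_quality_by_position` is made up for this example, and this is not a substitute for FastQC or fastp.

```python
def mean_quality_by_position(fastq_lines, offset=33):
    """Return the mean Phred quality at each read position (0-based list)."""
    totals = []  # sum of quality scores seen at each position
    counts = []  # number of reads that reach each position
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:
            continue  # the 4th line of each record holds the quality string
        quality = line.rstrip("\n")
        for pos, char in enumerate(quality):
            if pos == len(totals):
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [total / count for total, count in zip(totals, counts)]
```

Running this on R1 and R2 separately and comparing the two lists shows at which position, if any, the second read starts to fall behind, which gives a starting point for choosing a trimming length.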

The FastQC reports for both read 1 and read 2 of the problematic samples look similarly problematic; I am posting the images below. Any idea how to correct this?

FastQC quality score

per base sequence content

GC content

kmer content: k-mers
