Low mapping rate
9.2 years ago
hlsz.laszlo ▴ 50

Dear all,

Recently, I obtained several ChIP-seq datasets from Saccharomyces cerevisiae.

After Illumina sequencing, each FASTQ file contains about 20 million 50 bp reads. I aligned the reads to the sacCer3 genome with both BWA-MEM and Bowtie2, and in both cases the mapping rate was very low (20% mapped, 80% unmapped).
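For anyone who wants to double-check a mapping rate like this independently of the aligner's summary, here is a minimal sketch that counts it from SAM text. It assumes uncompressed SAM input; the function name `mapping_rate` is made up for illustration. The only facts taken from the SAM format are the FLAG bits: 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary).

```python
def mapping_rate(sam_lines):
    """Count mapped vs. total reads over primary records in SAM text lines.

    Returns (mapped, total, fraction). Secondary and supplementary
    records are skipped so that each read is counted once.
    """
    mapped = 0
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (header lines start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    fraction = mapped / total if total else 0.0
    return mapped, total, fraction
```

It can be fed an open file handle, e.g. `with open("sample.sam") as fh: print(mapping_rate(fh))`, where `sample.sam` is a placeholder name.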

I can't figure out what could make the reads unmappable. Even the input DNA aligns poorly to the genome (50%). I tried switching genomes, but I always got the same overall mapping rate.

What could have happened?

Kind Regards,
Laszlo

mapping ChIP-seq • 8.9k views
ADD COMMENT
2

Hi Laszlo, did you try taking the unmapped reads and BLASTing them? Check whether there is a high level of contamination; if they do map to S. cerevisiae, then you might have to tweak the alignment parameters.
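A minimal sketch of pulling a random handful of unmapped reads out of the aligner output so they can be pasted into BLAST. It assumes the alignments are available as uncompressed SAM text; the function name `sample_unmapped_fasta` and its parameters are made up for illustration. Only the FLAG bit 0x4 (read unmapped) comes from the SAM format.

```python
import random


def sample_unmapped_fasta(sam_lines, n=100, seed=0):
    """Return FASTA text for up to n randomly chosen unmapped reads.

    Uses reservoir sampling, so only n reads are held in memory even
    if the SAM input contains millions of unmapped records.
    """
    rng = random.Random(seed)
    sample = []
    seen = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header
        fields = line.rstrip("\n").split("\t")
        if not int(fields[1]) & 0x4:
            continue  # read is mapped, not interesting here
        record = (fields[0], fields[9])  # read name, read sequence
        seen += 1
        if len(sample) < n:
            sample.append(record)
        else:
            # Replace an existing pick with probability n / seen.
            j = rng.randrange(seen)
            if j < n:
                sample[j] = record
    return "".join(f">{name}\n{seq}\n" for name, seq in sample)
```

A hundred reads or so is usually enough to see whether the hits pile up on one unexpected species or are spread over many.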

ADD REPLY
1

Thank you for the answers.

The data is free of TruSeq adapters. First, I ran FastQC to check the quality and everything was OK.

I used the default parameters of the aligners.

I tried to align the reads to the human, mouse and E. coli genomes, but the alignment rate was under 1%.

I will try to BLAST the unmapped reads to find the source.

Thanks again for the answers. I'll update this thread with the BLAST results.

ADD REPLY
0

Maybe you need to clean your data?

ADD REPLY
0

Try blasting a few of the unmapped reads. Perhaps you got the wrong samples back or your samples had a high level of contamination by another species.

ADD REPLY
2
9.2 years ago
mxs ▴ 530

Hi,

to me this looks like a classic mappability problem caused by reads that come from repetitive regions. For example, if you map 30-mers to the human genome, approx. 25% of the genome will be unmappable if only unique positions are reported (check the Bowtie parameters). One of the first things I usually do is create a mappability track for the reference species (GEM-mappability tool). Then I map the reads, create a track of the mapped reads and upload it to one of the browsers (UCSC or Ensembl). The two tracks together show me which regions are mappable, which are not, and where the mapped reads align.

Unfortunately, UCSC does not provide a mappability track for S. cerevisiae, so you will need to make one yourself.
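As a rough stand-in for a proper mappability track, the idea can be sketched by counting how many k-mers of the reference occur exactly once on either strand. This is only a toy under stated assumptions: it is not the GEM-mappability tool, it considers exact matches only, and the function names (`unique_kmer_fraction`, `canonical`, `revcomp`) are made up for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    rc = revcomp(kmer)
    return kmer if kmer <= rc else rc


def unique_kmer_fraction(sequences, k=50):
    """Fraction of k-mer positions whose k-mer occurs exactly once.

    `sequences` is an iterable of chromosome sequences as strings.
    k-mers containing N are skipped. A k-mer and its reverse
    complement are counted as the same k-mer.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" in kmer:
                continue
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    unique = sum(1 for c in counts.values() if c == 1)
    return unique / total if total else 0.0
```

Storing every 50-mer as a string is memory-hungry, so for a full genome this would need a more compact representation; it is fine for checking a single chromosome or a smaller k.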

Cheers
mxs

ADD COMMENT
1

S. cerevisiae doesn't have that many repetitive regions. Even if you ignored them completely when mapping, you would get 80% mapped and 20% unmapped, not the other way around.

ADD REPLY
