Biostar Beta. Not for public use.
All reads with mapq value <= 1, htseq does not attribute counts to any feature
0
Entering edit mode
12 months ago
fhsantanna • 440
Brazil

I am doing an experiment of rnaseq. I have mapped Iontorrent reads (single end, 2 million reads, length average = 50) to a bacterial genome using bowtie2 and bbmap. Problem is that htseq does not attribute mapped reads to any gene, because all alignments have mapq scores =<1. Only if a utilize the option "-a 0" in htseq, I can see that the genes have mapped reads. I was reading throughout biostars that a mapq threshold should be set to 5, at least. Visually, I perceived that most of the alignments were on rRNA gene, demonstrating that the ribosomal reduction was not efficient. But, anyway, most of the genes contained aligned reads. So, is it acceptable to set a mapq score to 0? If not, what are my possibilities? How could I improve mapq scores?

rnaseq mapq • 1.5k views
ADD COMMENTlink
1
Entering edit mode

What is the reason for the low MAPQ scores? I think these scores tell you about how reliable the reads mapped to a certain position, and scores of 1 (or lower) means that the change is 80% (or more) that the read is wrongly positioned!!!

I don't know if it's worth continuing with these odds???

Read more about MAPQ here: http://www.acgt.me/blog/2014/12/16/understanding-mapq-scores-in-sam-files-does-37-42

ADD REPLYlink
0
Entering edit mode

I believe the mapq scores are low because reads are aligning to more than one point in the genome. However I do not understand why these scores are too low and how I can solve this problem.

ADD REPLYlink
2
Entering edit mode

The MAPQ score is defined by the PHRED scale:

Q=-10log10(P)

...where Q is the quality score (MAPQ) and P is the probability of being correct. So, if the mapper thinks a site has a 99% chance of being correct, it gets a mapq of 20. It the aligner thinks it has a 90% chance of being correct, it gets a mapq of 10. By definition, a mapq of 3 or lower has at most a 50% chance of being correct; and therefore, no read that aligns equally well to more than 1 location can ever have a mapq>3. A mapq of 0 is the minimum and it is very low, indicating that either the read aligned to many locations equally well, or that it did not match any location very well.

It would be great if you could easily see a number indicating how well a read aligns to a reference... but the SAM spec does not provide that in the required fields. Which is too bad. If you use BBMap's "idtag" flag, it will produce an optional tag (prefixed by "YI:") indicating the percent identity of the alignment. You will also need to use "ambig=all" to make it report multiple ambiguous alignments instead of just the first.

I don't know how htseq operates internally, but hopefully this will help.

ADD REPLYlink
1
Entering edit mode

I assume you have RNAseq data? Check this answer, and do the same: use BBDuk to check for rRNA contamination on your samples. You may also use SortMeRNA for this, I think it is slightly more sensitive, but a lot slower. Either way, you will know how bad your rRNA contamination was, and both programs output a "dirty" and a "clean" fastq files - you may then remap after removing the rRNA.

ADD REPLYlink
0
Entering edit mode

I filtered my reads using bbduk, at least ~50% of the reads are rRNA. However, this was not the problem, as you can see in my answer.

ADD REPLYlink
1
Entering edit mode

Dr. Simon Andrews just posted this blog article on mapq values as implemented by various aligners.

ADD REPLYlink
0
Entering edit mode
12 months ago
fhsantanna • 440
Brazil

I found a way to have better mapq scores using ion data. Here is the command for bowtie2: bowtie2 --local --very-sensitive-local -p 8 --mm -x example

Now I am able to count in htseq!

Source: https://ioncommunity.thermofisher.com/docs/DOC-7062

ADD COMMENTlink
0
Entering edit mode

Fair warning: The link above requires an account/login to view.

ADD REPLYlink
0
Entering edit mode

Yes, we have to subscribe to get access to the protocol.

ADD REPLYlink
0
Entering edit mode

What was your previous bowtie2 command? The command you posted seems pretty standard, I wonder what you were trying before.

ADD REPLYlink
0
Entering edit mode

Here is the previous command: bowtie2 -x example -U reads_1.fq -S eg1.sam

ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.1