RNA-seq mapping rate
1
0
5.4 years ago
afli ▴ 190

Hi, I have a basic question about RNA-seq analysis. If the read alignment rate is only about 40-50% (from bowtie2, hisat2, or another alignment tool), would it be appropriate to increase the sequencing depth to obtain enough aligned reads for the analysis? Or would such a low alignment rate introduce bias, so that these samples should be abandoned? Thank you!

The sample is rice, for which a high-quality reference genome is available. I used bowtie2 for the alignment; the summary is:

82280146 reads; of these:
  82280146 (100.00%) were paired; of these:
    41474464 (50.41%) aligned concordantly 0 times
    12443024 (15.12%) aligned concordantly exactly 1 time
    28362658 (34.47%) aligned concordantly >1 times
    ----
    41474464 pairs aligned concordantly 0 times; of these:
      1444965 (3.48%) aligned discordantly 1 time
    ----
    40029499 pairs aligned 0 times concordantly or discordantly; of these:
      80058998 mates make up the pairs; of these:
        73562998 (91.89%) aligned 0 times
        414858 (0.52%) aligned exactly 1 time
        6081142 (7.60%) aligned >1 times
55.30% overall alignment rate

The alignment rate is low because the samples are contaminated with a bacterium. I just want to know whether reads like these are appropriate for downstream analysis.
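
As a sanity check on the summary, the overall rate can be reproduced from the counts it reports: bowtie2 counts individual mates, so the overall alignment rate is one minus the share of mates that aligned 0 times. The short Python sketch below only re-does that arithmetic with the numbers copied from the summary; the variable names are made up for illustration.

```python
# Counts copied from the bowtie2 summary above.
total_pairs = 82_280_146
unaligned_mates = 73_562_998   # mates that "aligned 0 times" in the last block
unique_concordant = 12_443_024
multi_concordant = 28_362_658

# Each pair contributes two mates to the denominator.
total_mates = 2 * total_pairs
overall_rate = 1 - unaligned_mates / total_mates
print(f"overall alignment rate: {overall_rate:.2%}")

# Share of concordantly aligned pairs that map to more than one location.
multi_frac = multi_concordant / (unique_concordant + multi_concordant)
print(f"multi-mapped share of concordant pairs: {multi_frac:.1%}")
```

The first figure matches the 55.30% reported by bowtie2. The second shows that roughly 70% of the concordantly aligned pairs map to more than one location, which may be worth looking into alongside the unaligned fraction.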

RNA-seq • 6.7k views
ADD COMMENT
1

Would you mind adding the hisat2 alignment summary here?

ADD REPLY
0

I've added the information above.

ADD REPLY
1

It would help to know which species you're working with, since a mapping rate of 40% would be very low for human or mouse but less surprising for a species that is less well annotated. The tissue you are working with also plays into that evaluation.

ADD REPLY
1

Please be as complete as possible and add information such as:

  • organism
  • commands used
  • alignment summary data
  • read length
  • library prep method
  • ...
ADD REPLY
0

Thank you for your suggestion.

ADD REPLY
0

I don't think bowtie2 is a suitable aligner here: it is not splice-aware, and I assume rice RNA-seq data contains spliced reads.

ADD REPLY
1

In case of bacterial contamination, you can use e.g. BBSplit to separate out the reads originating from the bacterium. While continuing with the "host" reads, you may want to control for the bacterial influence (either directly on gene expression, or indirectly through distortion of the fragment ratios in the library). You can include it as a factor in your DE model and check it, as Devon suggested, with a PCA or a NLDA.
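
One way to turn the contamination into a factor is to compute, per sample, the fraction of reads assigned to the bacterium and add it as a column of the design matrix. The Python sketch below illustrates only that bookkeeping; the sample names, conditions, and read counts are hypothetical placeholders, and the resulting rows would still have to be passed to whatever DE tool you use.

```python
def bacterial_fraction(host_reads: int, bacterial_reads: int) -> float:
    """Fraction of binned reads that were assigned to the bacterium."""
    total = host_reads + bacterial_reads
    if total == 0:
        raise ValueError("sample has no binned reads")
    return bacterial_reads / total


def design_rows(samples: dict) -> dict:
    """Build one design-matrix row per sample.

    Each row is [intercept, treated indicator, bacterial fraction].
    """
    rows = {}
    for name, (condition, host, bact) in samples.items():
        treated = 1.0 if condition == "treated" else 0.0
        rows[name] = [1.0, treated, bacterial_fraction(host, bact)]
    return rows


# Hypothetical per-sample counts: (condition, host reads, bacterial reads).
samples = {
    "ctrl_1": ("control", 30_000_000, 1_000_000),
    "ctrl_2": ("control", 28_000_000, 12_000_000),
    "trt_1": ("treated", 31_000_000, 2_000_000),
    "trt_2": ("treated", 20_000_000, 20_000_000),
}

for name, row in design_rows(samples).items():
    print(name, [round(x, 3) for x in row])
```

If the bacterial fraction is strongly confounded with the condition of interest, adding it as a covariate will not rescue the comparison, so it is worth checking that first.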

ADD REPLY
1

Do the samples have a sufficient read length, i.e. > 50 bp? In my experience with downloaded data, low mapping rates can be primarily due to short read lengths (such as 36 bp or 25 bp).
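
A quick way to check this is to tabulate read lengths straight from the FASTQ file. The Python sketch below assumes plain four-line FASTQ records (no wrapped sequence lines); the function names and the file name in the usage note are placeholders.

```python
import gzip
import io
from collections import Counter


def read_length_counts(handle) -> Counter:
    """Count read lengths from an open FASTQ text handle."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each 4-line record is the sequence
            counts[len(line.strip())] += 1
    return counts


def open_fastq(path: str):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Tiny in-memory example with two reads of length 4 and 6.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
counts = read_length_counts(demo)
print(sorted(counts.items()))
```

For a real file you would call something like `read_length_counts(open_fastq("sample_R1.fastq.gz"))` and look at how many reads fall below about 50 bp, for example after adapter trimming.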

ADD REPLY
2
5.4 years ago

As a rule of thumb, if one of your samples has a much lower alignment rate than the others, you're probably going to exclude it from downstream analyses, since it will tend to have other problems. Make a PCA and see if it sticks out as an outlier. If so, exclude it. If not, then I guess you can keep it.
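
Before the PCA, the first half of that rule of thumb can be screened numerically. The Python sketch below flags samples whose alignment rate lies far below the median of the batch, measured in median absolute deviations (MAD); the MAD criterion, the cutoff of 3, and the sample rates are assumptions made for illustration, not an established standard. It complements the PCA check rather than replacing it.

```python
from statistics import median


def low_rate_outliers(rates: dict, n_mads: float = 3.0) -> list:
    """Return sample names whose rate is more than n_mads MADs below the median."""
    values = list(rates.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the samples share the median rate;
        # anything strictly below it is flagged.
        return [name for name, rate in rates.items() if rate < med]
    return [name for name, rate in rates.items() if (med - rate) / mad > n_mads]


# Hypothetical overall alignment rates (percent) for five samples.
rates = {"s1": 91.2, "s2": 89.8, "s3": 90.5, "s4": 55.3, "s5": 92.0}
print(low_rate_outliers(rates))
```

With these made-up numbers only `s4` is flagged. If every sample in the batch has a similarly low rate, nothing will be flagged, and the question becomes whether the whole batch is usable.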

ADD COMMENT
0

Thank you, this sounds reasonable.

ADD REPLY
