Question

Illumina Small RNA-Seq Experiment Using smRNA-Seq v1.5 Library Prep - Strange Results When Aligning Reads

12.2 years ago

Sebastian Kurscheid ▴ 300

Hi,

This is my first question here on BioStar, so hello everyone! :-)

We have performed a smRNA-Seq experiment using the Illumina v1.5 smRNA library prep kit.

The two groups are ticks (I. scapularis) fed on either a B. burgdorferi-infected mammalian host or an uninfected animal. We dissected the ticks and extracted total RNA from their midgut/hemocoel tissues, which contained the mammalian blood and, in the infected group, bacteria. RNA integrity was confirmed on an Agilent Bioanalyzer, and we sent total RNA to our core facility for library preparation and sequencing on a GAIIx.

The core facility multiplexed the samples and demultiplexed the data after sequencing. I took over the analysis from this point, starting with removal of the Illumina RNA adaptors and sorting the trimmed reads into two groups by length:

1) miRNA candidates: reads of 16-25 nt
2) ncRNA candidates: reads longer than 25 nt
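
The length-based sorting described above can be sketched roughly as follows. This is only an illustration, not the script used for the analysis: the function names and the "discarded" bin for reads shorter than 16 nt are assumptions, and the input is assumed to be adaptor-trimmed, 4-line FASTQ.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from strict 4-line FASTQ records."""
    it = iter(lines)
    # zip over the same iterator consumes four lines per record;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def bin_by_length(reads: Iterable[Read],
                  mirna_min: int = 16,
                  mirna_max: int = 25) -> Dict[str, List[Read]]:
    """Sort trimmed reads into miRNA candidates (16-25 nt),
    ncRNA candidates (>25 nt) and a hypothetical 'discarded' bin (<16 nt)."""
    bins: Dict[str, List[Read]] = {"mirna": [], "ncrna": [], "discarded": []}
    for read in reads:
        n = len(read[1])
        if mirna_min <= n <= mirna_max:
            bins["mirna"].append(read)
        elif n > mirna_max:
            bins["ncrna"].append(read)
        else:
            bins["discarded"].append(read)
    return bins
```

For real data the lists would be replaced by writing each read straight to an output file, since holding millions of reads in memory is unnecessary.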

For the infected group this resulted in approximately 2 million miRNA-candidate reads; the uninfected group had about 50% more.

Then I attempted to map these reads to
1. known mature I. scapularis miRNAs (all data from the latest miRBase release)
2. known I. scapularis hairpins
3. all other mature miRNAs/hairpin RNAs
using bowtie with modified parameters (-l 15 --seedmms 1)
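
For anyone wanting to reproduce the mapping step, a minimal sketch of how such a bowtie (version 1) call could be assembled is shown here. Only `-l 15 --seedmms 1` comes from the post; the `-S` (SAM output) and `--un` (write unaligned reads) options, the index prefix, and all file names are assumptions added for illustration.

```python
import shlex
from typing import List


def bowtie_command(index_prefix: str,
                   reads_fastq: str,
                   sam_out: str,
                   unaligned_out: str,
                   seed_len: int = 15,
                   seed_mismatches: int = 1) -> List[str]:
    """Build a bowtie 1 command line as an argument list."""
    return [
        "bowtie",
        "-l", str(seed_len),                # seed length, as in the post
        "--seedmms", str(seed_mismatches),  # mismatches allowed in the seed
        "-S",                               # assumption: SAM output
        "--un", unaligned_out,              # assumption: keep unaligned reads
        index_prefix,                       # hypothetical index prefix
        reads_fastq,
        sam_out,
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie_command("isc_mature", "mirna_candidates.fq",
                     "mature.sam", "mature_unaligned.fq")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where bowtie is installed and the index has been built. Writing the unaligned reads out at each step makes it easy to feed them into the next reference in the list.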

To my surprise, only a tiny fraction of the miRNA candidates mapped to known I. scapularis miRNAs:
infected: 250
uninfected: 354

The number of reads mapping to hairpins was higher, but nowhere near what I expected (one positive sign is that the reads mapped to mature miRNAs and those mapped to hairpins overlap):
infected: 4346
uninfected: 4515

The numbers improve again when mapping to all known miRNAs from miRBase, reaching the low five-digit range, but this still represents only a tiny fraction of the total reads.
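
To put the counts above in proportion, here is a small sketch that converts them to percentages. The totals are approximations taken from the post: about 2 million reads for the infected group, and 3 million for the uninfected group on the assumption that "about 50% more" means 1.5 times as many.

```python
def percent_mapped(mapped: int, total: int) -> float:
    """Return the percentage of reads that mapped."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * mapped / total


# Approximate totals (assumptions derived from the post, not exact counts).
totals = {"infected": 2_000_000, "uninfected": 3_000_000}

# Mapped read counts as reported in the post.
mapped = {
    "mature": {"infected": 250, "uninfected": 354},
    "hairpin": {"infected": 4346, "uninfected": 4515},
}

for reference, counts in mapped.items():
    for group, n in counts.items():
        pct = percent_mapped(n, totals[group])
        print(f"{reference:8s} {group:11s} {pct:.4f}%")
```

Under these assumed totals, every one of these rates is well below 1% of the miRNA-candidate reads.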

My next step was to map these reads to the transcriptome and genome of I. scapularis as well as of the mouse (the mammalian host). This results in roughly half of the miRNA-candidate reads in both groups being mapped. It still leaves me with the majority of reads being "unknowns".

My questions for people who have done smRNA-Seq experiments before:

Do you typically see any traces of mRNA in your data?

How well does the ligation of the 3'-OH Illumina adaptor discriminate between "true" small RNAs and other molecules?

Is there any way to salvage the data from this experiment, or should I consider it as invalid due to the mapping results so far?

Thanks everyone in advance!

illumina small next-gen sequencing • 4.2k views

ADD COMMENT • link updated 4.6 years ago by Biostar 20 • written 12.2 years ago by Sebastian Kurscheid ▴ 300

How many reads mapped to the B. burgdorferi genome? Or, how clean is the prep from the tick stomachs such that some reads aren't from another organism?

ADD REPLY • link 12.2 years ago by Larry_Parnell 16k

Take some reads that didn't align and BLAST them against nr.

ADD REPLY • link 12.2 years ago by Jeremy Leipzig 22k
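
The suggestion above, pulling a handful of unaligned reads for BLAST, could be sketched as follows. This is a hedged illustration using reservoir sampling over a 4-line FASTQ of unaligned reads; the function names, the sample size, and the fixed seed are assumptions, and the FASTA output is meant to be submitted to BLAST separately.

```python
import random
from typing import Iterable, List, Tuple


def sample_reads(fastq_lines: Iterable[str], k: int,
                 seed: int = 0) -> List[Tuple[str, str]]:
    """Uniformly sample up to k (name, sequence) pairs in a single pass."""
    rng = random.Random(seed)
    it = iter(fastq_lines)
    sample: List[Tuple[str, str]] = []
    for i, (header, seq, _plus, _qual) in enumerate(zip(it, it, it, it)):
        record = (header.strip()[1:], seq.strip())  # drop the leading '@'
        if i < k:
            sample.append(record)       # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                sample[j] = record      # replace with probability k/(i+1)
    return sample


def to_fasta(records: Iterable[Tuple[str, str]]) -> str:
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

Reservoir sampling avoids loading the whole file, which matters when the unaligned set runs to a million reads or more.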

Hi Larry:

About 10,000 reads mapped to the B. burgdorferi genome. The prep is pretty "dirty" and I am expecting to see RNA from other bacteria. We have some microbiome data, which I am using to put together as many bacterial genomes as possible to align the reads to.

ADD REPLY • link 12.2 years ago by Sebastian Kurscheid ▴ 300

Hi Jeremy: I am currently assembling contigs from the unaligned reads and am actually going to BLAST them against nt.

ADD REPLY • link 12.2 years ago by Sebastian Kurscheid ▴ 300