Should we assemble/merge R1 and R2 reads from Illumina MiSeq of fungal ITS amplicon before further analysis?
9.4 years ago
sentausa ▴ 650

Dear all,

I'm new to fungal ITS (internal transcribed spacer) metabarcoding analysis, so please ask if anything in my question is unclear.

Our lab received the Illumina MiSeq sequencing results for our 150 samples from a company, but they gave us only the R1 and R2 fastq files, without consensus sequences. Our lab previously used 454 sequencing, where we usually got the consensus sequences. My question is: should we assemble R1 and R2 into a consensus sequence before further analysis?

We have a downstream analysis pipeline that was built around 454 data, and I'm afraid that without consensus sequences the pipeline would treat the R1 and R2 reads of a pair as two different ITS sequences.

Any idea?

And thanks a lot in advance.

MiSeq metagenomics ITS Assembly • 6.8k views
9.0 years ago
sentausa ▴ 650

After a few months, I've learned a thing or two about ITS analysis, and one of them is that R1 and R2 reads can be used without merging them beforehand. In fact, we might fail to identify many species if we use only the merged read pairs (this is pointed out in this paper, for example).

9.4 years ago
5heikki 11k

What's your fragment size? Do the pairs overlap? They should, if you knew what you were ordering. If they do, you should most definitely merge the pairs, QC the merged sequences, and then proceed to OTU clustering and taxonomy assignment.
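Whether the pairs overlap is simple arithmetic on read length and fragment (amplicon) length. A minimal sketch, assuming both reads have the same length; the function names, the 10-base threshold, and the example numbers (2x250 reads, 400 bp and 600 bp amplicons) are illustrative assumptions, not values from this thread:

```python
def expected_overlap(read_len: int, fragment_len: int) -> int:
    """Number of bases covered by both R1 and R2.

    A value <= 0 means the two reads leave a gap and cannot be merged.
    If the fragment is shorter than one read, the overlap is capped at
    the fragment length (both reads run through the whole insert).
    """
    return min(fragment_len, 2 * read_len - fragment_len)


def can_merge(read_len: int, fragment_len: int, min_overlap: int = 10) -> bool:
    """True if the pair overlaps by at least min_overlap bases (illustrative threshold)."""
    return expected_overlap(read_len, fragment_len) >= min_overlap


if __name__ == "__main__":
    for fragment_len in (400, 600):
        print(fragment_len, expected_overlap(250, fragment_len), can_merge(250, fragment_len))
```

Because ITS length varies between fungal taxa, the same run can contain both kinds of amplicon, so it is worth checking the fragment size distribution rather than a single number.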

9.4 years ago

If you are talking about the QIIME pipeline, you should merge them. You can use FLASH to do that. Check the read quality after merging, though: you might have to trim the reads if the quality is not very good, and double-check that the adapters have been trimmed. Alternatively, you can use just one read of each pair, but if you want to take advantage of the paired-end data then you should merge them.
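To make clear what "merging" means here, the idea can be sketched in a few lines of Python. This is a simplified illustration and not how FLASH is implemented: it ignores base qualities, simply keeps the R1 bases inside the overlap, and the names `merge_pair`, `min_overlap`, and `max_mismatch_frac` are made up for this example. For real data, use a dedicated merging tool.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1):
    """Merge a read pair into one sequence, or return None if no overlap is found.

    R2 is reverse-complemented so that both reads are on the same strand,
    then the end of R1 is compared with the start of R2, longest candidate
    overlap first. The first overlap with few enough mismatches is accepted.
    """
    r1 = r1.upper()
    r2_rc = reverse_complement(r2)
    for olen in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        tail, head = r1[-olen:], r2_rc[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep R1 through the overlap, then append the rest of R2.
            return r1 + r2_rc[olen:]
    return None


if __name__ == "__main__":
    # Toy 40 bp fragment sequenced with 25 bp reads from each end.
    fragment = "ACGTACGGTCATTGCAGGCTAAGCTTCGATCCGATGTACA"
    r1 = fragment[:25]
    r2 = reverse_complement(fragment[15:])
    print(merge_pair(r1, r2) == fragment)
```

Pairs for which `merge_pair` returns None are the ones a merge-only workflow would lose, which is why it matters whether your amplicons are short enough to overlap.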

9.0 years ago

More details about the library structure are needed, but I completely agree with sentausa. For example, in this case:

  ----  ---- read1
 ----  ----  read2
----  ----   read3
======       consensus1
      ====== consensus2
...******... region of interest (barcode)

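the two mates of a pair do not overlap each other; only the consensus sequences assembled from several staggered reads can be overlapped. Requiring read pairs to overlap before assembly can therefore decrease barcode yield.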
