I have recently started working with RNA nanopore MinION long reads. However, at the moment I have more doubts than answers.
From my (weak) understanding, there are three main approaches to RNA sequencing on the nanopore MinION:
- direct RNA
- direct cDNA (PCR-free)
- cDNA (with a PCR step)
The data that I have are from the direct cDNA (PCR-free) protocol, and this is where my doubts start. From the nanopore website, it looks like this protocol generates 1D reads (that is, reads with TTTTTTT at the start OR AAAAAAA at the end). Can you confirm this? However, looking at my trimmed fastq file, I seem to have 2D reads, meaning reads that start with TTTTTTT AND end with AAAAAAA. Is that possible with the direct cDNA protocol? If so, can I infer strandedness from these 2D reads?
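To put numbers on the "TTTTTTT at the start OR AAAAAAA at the end" observation, a minimal sketch along these lines could tally the reads in a trimmed FASTQ file. The function names `classify_read` and `count_classes`, and the `min_run` and `window` thresholds, are assumptions chosen for illustration and are not part of any nanopore tool. As a rough guide only, a leading poly-T run suggests the read is the reverse complement of the transcript, while a trailing poly-A run suggests the sense orientation.

```python
import re
from collections import Counter

def classify_read(seq, min_run=8, window=40):
    """Label a read by homopolymer runs near its ends.

    min_run and window are illustrative thresholds, not recommended values.
    """
    seq = seq.upper()
    # Look for a run of T within the first `window` bases.
    has_t_start = re.search("T{%d,}" % min_run, seq[:window]) is not None
    # Look for a run of A within the last `window` bases.
    has_a_end = re.search("A{%d,}" % min_run, seq[-window:]) is not None
    if has_t_start and has_a_end:
        return "both"
    if has_t_start:
        return "polyT_start"
    if has_a_end:
        return "polyA_end"
    return "neither"

def count_classes(fastq_lines):
    """Count labels over an iterable of FASTQ lines (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the sequence is the second line of each record
            counts[classify_read(line.strip())] += 1
    return counts
```

With a real file this could be called as `count_classes(open("reads.fastq"))`, where `reads.fastq` is a placeholder path. A large "both" fraction would be the pattern described above and worth investigating further.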
Thank you all in advance,
Best,
Rui Luís
How did you conclude this?
That's biologically unlikely, although you could have rare chimeric molecules.
Thank you for your comment! Please correct me if I'm wrong, but as I understand it, a 2D read comes from a double-stranded cDNA molecule whose two strands are joined by a hairpin at one end. Since the majority of my reads start with TTTTT and end with AAAAA, and the reads look mirrored (the sequence up to the middle is the reverse complement of the sequence from the middle to the end), I concluded that I had 2D data. Is that the wrong way to think about it?
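The "mirror" impression can be checked on a sample of reads rather than by eye. The following is a minimal sketch that compares the first half of a read with the reverse complement of its second half; `mirror_score` and `reverse_complement` are hypothetical helper names, and the use of Python's `difflib` similarity ratio is an assumption made for simplicity, not an established method for nanopore data. It is slow on long reads, so it is only suited to a small subsample.

```python
from difflib import SequenceMatcher

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def mirror_score(seq):
    """Similarity (0 to 1) between the first half of a read and the
    reverse complement of its second half."""
    mid = len(seq) // 2
    first, second = seq[:mid], seq[mid:]
    if not first:
        return 0.0
    matcher = SequenceMatcher(None, first, reverse_complement(second), autojunk=False)
    return matcher.ratio()
```

A score close to 1 means the two halves are near-perfect reverse complements of each other, while unrelated halves score much lower. Where to draw the line for noisy nanopore reads is a judgment call and would need to be calibrated on the data.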
Please talk to whoever generated the data to find out exactly which protocol was used. If basecalling was performed correctly (and it looks like it wasn't), you should NOT get such mirrored reads, but rather the consensus of the template and complement reads.
That said, 2D sequencing is dead and deprecated, so your data must be rather old or non-standard.
Thank you so much for your answer. It helped a lot to focus my attention.
The data were produced using the most recent protocol, so it is probably entirely my fault. Right now I am working with the basecalled files sent by the facility. But, taking into account what you said, I think I should redo the basecalling step. Which software do you recommend?
Could it be 1D² data?
You should use the Guppy basecaller.
Is this the latest cDNA PCR-free nanopore protocol?
For 1D², are the "mirror reads" expected (even if the basecalling step was performed incorrectly)?
Are you asking me which protocol was used to generate your data?
No, that would surprise me.
Thank you for your advice!