PCR-cDNA sequence from MinION not aligned
7 weeks ago
marco.barr ▴ 80

Hello everyone, in my initial experiment, I aligned my cDNA sequence from ONT MinION using the following parameters:

./minimap2 -ax map-ont /home/my_reference_PCR.fasta /home/input_pcr_PI4K.fastq > output_PI4K_aligned.sam

Then I built the bam file. I noticed that the alignment focused only on a specific region of the reference with a 10X coverage. To examine other regions, I extracted the reads of maximum length from the fastq file using the following command:

seqkit seq -m $(seqkit stats -T /home/input_pcr_PI4K.fastq | tail -n1 | cut -f 8) /home/input_pcr_PI4K.fastq -o /home/output_PI4K.fastq

However, when I align this new file with either the -ax map-ont or the -x splice preset (since it's a cDNA sequence), samtools flagstat reports 0% mapped. I can't figure out why. Is there something wrong with the extraction, or do I need to adjust the alignment parameters further? I hope you can help me. Thanks a lot.
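A note on the extraction step: `seqkit seq -m N` keeps reads of length at least N, and column 8 of `seqkit stats -T` is `max_len`, so the command above keeps only the read(s) as long as the single longest read in the file. One outlier read may not be representative of the library. A minimal sketch for looking at the top of the length distribution instead, assuming an uncompressed FASTQ with exactly four lines per record; `longest_reads` and `count_reads_min_len` are hypothetical helper names, not part of seqkit or minimap2:

```shell
#!/bin/sh
# Assumption: plain (uncompressed) FASTQ, exactly 4 lines per record.

# Print "length<TAB>read_id" for the N longest reads (default N = 10).
longest_reads() {
    fastq=$1
    n=${2:-10}
    awk 'NR % 4 == 1 { id = substr($1, 2) }          # header line: drop the leading "@"
         NR % 4 == 2 { print length($0) "\t" id }    # sequence line: report its length
        ' "$fastq" |
        sort -k1,1nr | head -n "$n"
}

# Count reads whose length is >= MIN, i.e. how many reads a
# minimum-length filter with that threshold would keep.
count_reads_min_len() {
    awk -v min="$2" 'NR % 4 == 2 && length($0) >= min { c++ }
                     END { print c + 0 }' "$1"
}

# Example (paths as in the post above):
#   longest_reads /home/input_pcr_PI4K.fastq 20
#   count_reads_min_len /home/output_PI4K.fastq 1
```

If the second call reports only one or two reads, the 0% mapping describes those reads alone and not the data set as a whole.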

minimap2 alignment cDNA • 447 views
7 weeks ago
Michael 54k

It is possible that the longest reads do not align to the reference properly. They may be noisier than the rest of the data, or come from repetitive regions. You may also need to play with the options, as the alignment can be sensitive to the error rate. If you want to know what these reads are, you could extract the first few as FASTA sequences and BLAST them or align them with Exonerate. If they still don't match your reference, they are probably from contaminants or symbionts.
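Extracting the first few reads as FASTA can be sketched as follows. This assumes an uncompressed FASTQ with exactly four lines per record; `fastq_head_to_fasta` is a hypothetical helper name, and the output file name in the example is only illustrative:

```shell
#!/bin/sh
# Assumption: plain (uncompressed) FASTQ, exactly 4 lines per record.

# Write the first N reads (default N = 5) of a FASTQ file as FASTA to stdout.
fastq_head_to_fasta() {
    fastq=$1
    n=${2:-5}
    awk -v n="$n" 'NR > 4 * n  { exit }                      # stop after N records
                   NR % 4 == 1 { print ">" substr($0, 2) }   # "@id ..." becomes ">id ..."
                   NR % 4 == 2 { print }                     # sequence line, unchanged
                  ' "$fastq"
}

# Example (input path as in the question; output name is illustrative):
#   fastq_head_to_fasta /home/output_PI4K.fastq 5 > first_reads.fasta
```

The resulting FASTA file can then be pasted into a BLASTN search.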

I checked and I don't find any matches on BLAST. I keep changing various parameters but I get the same result. What if I tried a de novo assembly, using Canu since I work with long reads?

Did you check BLASTN vs GenBank (NT) too? It would help to give a little more detail about the species and samples involved. Which flowcell and basecaller versions were used? Were there spike-in controls? Have adapters been trimmed? Until now, I was convinced that sequencers don't simply make up sequences out of thin air. So yes, a de novo assembly may be something to try.

I checked both and found no match. The sequences were already cleaned by the MinKNOW settings, but here are the flowcell and basecalling model:

Flowcell_id: ALJ911_R9.4.1
basecall_model_version_id: 2021_05_17_dna_r9.4.1_minion_384_d37a2ab9

I know, I also think it's a contaminant. The idea of de novo assembly came to mind because I don't know what else to think. Perhaps the only thing left is to repeat the PCR.

That is odd, but it could be a basecalling artifact. If you don't mind, could you post the "ghost" sequences, or at least a few examples?

I've authorized you to acces
