Fastx_toolkit with nanopore data
7.2 years ago
glgowers ▴ 10

Hi, I am trying to convert a FASTQ file (containing multiple sequences) to a FASTA file using fastx_toolkit (Hannon lab). However, I get this error message:

fastq_to_fasta: Error: invalid quality score data on line 68 (quality_tok = "ATATGCGTGCCATTG...etc)

I noticed in another thread that you can pass -Q33 to tell it you are using Illumina quality scores. Does anyone know if there is an equivalent flag to tell it I am using nanopore data?

If not, can anyone recommend another way to convert these files?

Thank you!

nanopore fastx fastq fasta • 4.3k views

I think that this package could do what you want:

https://poretools.readthedocs.io/en/latest/index.html

It has a command called poretools fasta.


You could try -Q33 with nanopore data, which I am sure uses the Sanger FASTQ format (Phred+33).

You could also use reformat.sh in=your.fastq out=your.fasta from the BBMap suite to achieve the same result.

Edit: If you are using FAST5 format files as input, then use poretools as suggested by Iñigo Prada.


You might have to add qin=33 to the reformat.sh command, making it reformat.sh qin=33 in=your.fastq out=your.fasta. Otherwise you might get the warning: Warning! Changed from ASCII-33 to ASCII-64 on input ;: 59 -> 28.
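For context, -Q33 and qin=33 both refer to the ASCII offset used to encode Phred quality scores. The following is only a minimal sketch of that encoding, not how either tool implements it; the function name decode_phred is made up for illustration.

```python
def decode_phred(quality, offset=33):
    """Return integer Phred scores for a FASTQ quality string.

    offset=33 is the Sanger convention; offset=64 is the older Illumina one.
    decode_phred is a hypothetical helper, not part of any toolkit.
    """
    scores = [ord(ch) - offset for ch in quality]
    if any(score < 0 for score in scores):
        # A negative score means the string cannot use this offset.
        raise ValueError(f"quality string is not valid for offset {offset}")
    return scores


print(decode_phred("I;!"))  # [40, 26, 0]
```

Reading the same string with offset=64 raises an error, because ';' (ASCII 59) would decode to a negative score. That is the kind of mismatch these flags are meant to avoid.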

7.2 years ago
Botond Sipos ★ 1.7k

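You can easily do that conversion using Biopython:

from Bio import SeqIO

# "fastq" in Biopython means the Sanger variant (Phred+33 quality scores).
# SeqIO.convert returns the number of records written.
count = SeqIO.convert("input.fastq", "fastq", "output.fasta", "fasta")
print("Converted %i records" % count)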
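If Biopython is not installed, the same conversion can be sketched in plain Python. This assumes strict four-line FASTQ records (sequence and quality not wrapped over several lines), and the function name fastq_to_fasta is made up for illustration. It also reports the first record whose layout looks wrong, which may help locate a problem like the one reported for line 68.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Convert four-line FASTQ records to FASTA and return the record count.

    Hypothetical helper: assumes each record is exactly four lines
    (@header, sequence, + separator, quality).
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = fastq_handle.readline().rstrip("\r\n")
        plus = fastq_handle.readline()
        qual = fastq_handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record {count + 1}: {header.strip()!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in record {count + 1}")
        fasta_handle.write(">" + header[1:].rstrip("\r\n") + "\n" + seq + "\n")
        count += 1
    return count


# Small in-memory demonstration; real use would pass open file handles.
demo_in = io.StringIO("@read1 example\nACGT\n+\nIIII\n")
demo_out = io.StringIO()
print(fastq_to_fasta(demo_in, demo_out))  # 1
```

With files on disk, the handles would come from open("input.fastq") and open("output.fasta", "w") instead of io.StringIO.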