Reasonable Assumptions About Fastq File Integrity
12.5 years ago

Can I assume that the genomic sequences and quality sequences in a FASTQ file will be of the same length — not only within a read, but through the entire file, for all reads?

For example, here are a few reads from a sample file:

@IRIS:7:1:17:394#0/1
GTCAGGACAAGAAAGACAANTCCAATTNACATTATG
+IRIS:7:1:17:394#0/1
aaabaa`]baaaaa_aab]D^^`b`aYDW]abaa`^
@IRIS:7:1:17:800#0/1
GGAAACACTACTTAGGCTTATAAGATCNGGTTGCGG
+IRIS:7:1:17:800#0/1
ababbaaabaaaaa`]`ba`]`aaaaYD\\_a``XT
@IRIS:7:1:17:1757#0/1
TTTTCTCGACGATTTCCACTCCTGGTCNACGAATCC
+IRIS:7:1:17:1757#0/1
aaaaaa``aaa`aaaa_^a```]][Z[DY^XYV^_Y
...

Can I assume the file (or read) is bad, if the read has a shorter genomic and/or quality sequence, e.g. the second read in this example:

@IRIS:7:1:17:394#0/1
GTCAGGACAAGAAAGACAANTCCAATTNACATTATG
+IRIS:7:1:17:394#0/1
aaabaa`]baaaaa_aab]D^^`b`aYDW]abaa`^
@IRIS:7:1:17:800#0/1
GGAAACACTACTTAGGCTTATA
+IRIS:7:1:17:800#0/1
ababbaaabaaaaa`]`ba`]`
@IRIS:7:1:17:1757#0/1
TTTTCTCGACGATTTCCACTCCTGGTCNACGAATCC
+IRIS:7:1:17:1757#0/1
aaaaaa``aaa`aaaa_^a```]][Z[DY^XYV^_Y
...

Or can a FASTQ file deliberately contain reads (and quality strings) of variable lengths?

12.5 years ago

The FASTQ standard requires that, within any record, the length of the sequence line (line 2) matches the length of the quality line (line 4).
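As a rough illustration of that per-record rule, here is a minimal Python sketch. It assumes the common four-line layout (no wrapped sequence or quality lines), and the function name `parse_fastq` is made up for this example, not taken from any library.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from four-line FASTQ records.

    Assumption: each record is exactly four lines; wrapped (multi-line)
    records are not handled by this sketch.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, "").rstrip("\r\n")
        plus = next(it, "").rstrip("\r\n")
        qual = next(it, "").rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record near {header!r}")
        # The per-record rule: sequence and quality must be the same length.
        if len(seq) != len(qual):
            raise ValueError(
                f"{header}: sequence length {len(seq)} != quality length {len(qual)}"
            )
        yield header[1:], seq, qual
```

With a real file this could be called as `for name, seq, qual in parse_fastq(open(path)): ...`. A record whose sequence and quality lengths differ raises `ValueError`; a shorter read whose two lines still match passes.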

While instruments usually produce reads of identical length for all records, this cannot be assumed for every FASTQ file. For example, quality trimming may remove bases from the beginning or end of a read, leaving reads of different lengths in the same file.
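If you want to know whether a particular file happens to have uniform read lengths, you can tally them rather than assume. The sketch below again assumes plain four-line records; `read_length_counts` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def read_length_counts(lines: Iterable[str]) -> Counter:
    """Count how many reads have each sequence length.

    Assumption: plain four-line records with no blank or wrapped lines.
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    # Passing the same iterator four times makes zip emit
    # consecutive groups of four lines.
    records = zip(*[stripped] * 4)
    return Counter(len(seq) for _header, seq, _plus, _qual in records)
```

A file has uniform read lengths only if the returned counter has a single key, for example `len(read_length_counts(open(path))) == 1`. More than one key is not an error in itself; it may just mean the reads were trimmed.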


For example, Ion Torrent produces FASTQ files with reads of variable length.


Darn. I knew that the sequence and quality strings need to be of identical length, but I was hoping I could get away with assuming reads of the same length across the entire file. Thanks to you both for your answers.
