Awk command for fastq read length and number of reads?
3.0 years ago
oars • 150
@oars41179

Hello,

I am trying to do something similar to this old thread (https://www.biostars.org/p/72433/), where I want to determine both the read length and the number of reads in my fastq file. Here is my code:

gunzip -c SRR1060507_1.fastq.gz|awk 'NR%4==2{printlength($0)}'|uniq -c

But I keep getting the following error:

awk: cmd. line:1: (FILENAME=- FNR=2) fatal: function `printlength' not defined

I'm not sure what I've done incorrectly. I also tried Frederic's code from the old thread, and although I got it to run, it's not exactly the output I'm seeking. I should be getting something like 2420797 100.

Any help would be super appreciated!

awk fastq • 1.3k views
For the length of each read:

gunzip -c SRR1060507_1.fastq.gz | awk 'NR%4==2{print length($0)}'

For the number of sequences:

gunzip -c SRR1060507_1.fastq.gz | awk 'END {print NR/4}'
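
If both numbers are wanted in one pass, in the "count length" form the question asks for, a single awk program can tally reads per length. This is only a sketch: the toy file and read names are made up for illustration, and it assumes every record is exactly four lines (no wrapped sequences).

```shell
# Hypothetical toy input: three reads, two of length 4 and one of length 6
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n@r3\nTTTT\n+\nIIII\n' \
  | gzip -c > "$tmp/toy.fastq.gz"

# Sequence lines are lines 2, 6, 10, ... so NR%4==2 selects them.
# n[len] counts how many reads have each length; the END block prints
# "count length" pairs, and sort orders them by length because awk's
# for-in traversal order is unspecified.
summary=$(gunzip -c "$tmp/toy.fastq.gz" \
  | awk 'NR%4==2 {n[length($0)]++} END {for (len in n) print n[len], len}' \
  | sort -k2,2n)

echo "$summary"
rm -r "$tmp"
```

On a file where every read has the same length, this prints a single line with the read count followed by that length.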

3.0 years ago
cschu181 ♦ 1.7k
@cschu1818927

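Try adding a space between print and length. Without it, awk reads printlength as a call to an undefined function:

print length($0)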
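
For reference, here is roughly how that fix looks inside the full pipeline from the question. It is a sketch on a made-up two-read file, not the real SRR1060507 data. The added sort is an assumption about what is wanted: uniq -c only merges adjacent identical lines, so sorting first keeps the counts correct if read lengths vary. The final awk just trims the padding that uniq -c puts in front of the count.

```shell
# Hypothetical toy input: two reads, both of length 4
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n' | gzip -c > "$tmp/toy.fastq.gz"

# Note the space: "print length($0)", not "printlength($0)"
counts=$(gunzip -c "$tmp/toy.fastq.gz" \
  | awk 'NR%4==2 {print length($0)}' \
  | sort -n \
  | uniq -c \
  | awk '{print $1, $2}')

echo "$counts"
rm -r "$tmp"
```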

Thanks for the suggestion. This worked! I'm very new to both bioinformatics and bash, so I feel a bit silly, but I'm also very thankful for your help!
