Number of reads with fastq file name
11 months ago
Bioinfonext • 150
Korea

I want the number of reads in each fastq file printed together with the file name. The script below only prints the number of reads:

The file names look like this:

Soil-6_S22_L001.m150-p1.join.fq

Soil-7.m150-p1.join.fq

Soil-8_S32_L001.m150-p1.join.fq

I would like to get the number of reads next to the sample name, like this:

Soil-6              384994

Soil-7              205889

How should I modify the script below?

#!/bin/bash

for i in `ls *.fq`; do echo $(cat ${i} | wc -l)/4|bc; done

Kind Regards

linux bash • 219 views

input fastq:

$ cat test.fq 
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

with awk to print from one file:

$ awk -v OFS="\t" 'NR % 4 == 1 {count++} END {print FILENAME, count}' test.fq
test.fq 3

from multiple files:

$ parallel "awk -v OFS='\t' 'NR % 4 == 1 {count++} END {print FILENAME, count}' {}" ::: *.fq
test10.fq   3
test1.fq    3
test2.fq    3
test3.fq    3
test4.fq    3
test5.fq    3
test6.fq    3
test7.fq    3
test8.fq    3
test9.fq    3


$ find . -type f -name "*.fq" -exec awk -v OFS="\t" 'NR % 4 == 1 {count++} END {print FILENAME, count}' {} \;
./test5.fq  3
./test4.fq  3
./test2.fq  3
./test8.fq  3
./test7.fq  3
./test3.fq  3
./test9.fq  3
./test6.fq  3
./test.fq   3
./test10.fq 3
./test1.fq  3
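
A note on counting: `@` is also a valid quality character in FASTQ, so a pattern that matches `@` can overcount on real data even though it gives 3 on the test file here. A minimal sketch that divides the line count by four instead is shown below. The function name `count_reads` is made up for illustration, and the sketch assumes plain, uncompressed fastq files with exactly four lines per record.

```shell
# count_reads FILE...
# Print "<file name><TAB><read count>" for each fastq file given.
# Assumption: every record is exactly 4 lines (no wrapped sequences).
count_reads() {
    local f lines
    for f in "$@"; do
        lines=$(wc -l < "$f")                 # total number of lines in the file
        printf '%s\t%d\n' "$f" "$(( lines / 4 ))"
    done
}
```

It could be called as `count_reads *.fq` in the directory that holds the files.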
7 months ago
India
#!/bin/bash

for i in *.fq; do file_name=$(basename -s .fq "$i"); printf '%s\t%s\n' "$file_name" "$(( $(wc -l < "$i") / 4 ))"; done

Explanation

file_name=$(basename -s .fq $i)

basename = Linux command that strips the directory and suffix from file names

-s = SUFFIX, remove a trailing SUFFIX

file_name=$(basename -s .fq $i) = remove the suffix .fq from the given file and store the name in the variable called file_name
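
Removing `.fq` still leaves names such as `Soil-6_S22_L001.m150-p1.join`, while the question asks for `Soil-6`. One possible way to get the short name is bash parameter expansion, sketched below. The helper names `sample_name` and `report_reads` are hypothetical, and the sketch assumes the sample name is everything before the first `_` or `.` in the file name and that each record is four lines.

```shell
# sample_name FILE
# Print the part of the file name before the first "_" or ".".
# Assumption: sample names themselves contain neither "_" nor ".".
sample_name() {
    local base
    base=$(basename "$1")            # drop any leading directory
    printf '%s\n' "${base%%[_.]*}"   # remove the longest suffix starting at "_" or "."
}

# report_reads FILE...
# Print "<sample name><TAB><read count>" for each fastq file given.
report_reads() {
    local f
    for f in "$@"; do
        printf '%s\t%d\n' "$(sample_name "$f")" "$(( $(wc -l < "$f") / 4 ))"
    done
}
```

Running `report_reads *.fq` would then print one line per file, with the short sample name in the first column.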

12 months ago
France/Nantes/Institut du Thorax - INSE…
for i in *.fq; do echo -n "$i "; paste - - - - < "$i" | wc -l; done
11 months ago
JC 7.9k
Mexico
for i in *.fq; do echo "$(echo "$i" | perl -pe 's/[_.].*//')  $(( $(wc -l < "$i") / 4 ))"; done