Merge a large number of fastq files into a single one
9.1 years ago
catherine ▴ 250

I have 30 small fastq files from the same sample, and I want to merge them into one file. I know the command is

cat file1.fastq file2.fastq > bigfile.fastq

but is there a shortcut for doing it? It just looks silly to type 30 file names one by one...

Thank you for any idea!

ChIP-Seq fastq • 92k views

Those with Windows can use this GUI tool (works also on Linux via wine): http://www.dnabaser.com/download/Merge%20Fasta/index.html

9.1 years ago
cat file*.fastq > bigfile.fastq

Oh yeah! I was so stupid!


Be cautious about this approach! Depending on your system, you can enter an endless loop of concatenating the new file to itself. I strictly do:

cat *.fq > merged.fastq

or

cat *.fastq > merged.fq

...or whatever is needed to ensure the pattern does not match the new file being created.
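Another way to guarantee that the pattern cannot match the output is to write the merged file to a different directory. The following is only a sketch, assuming bash or another POSIX shell; the temporary directories and the two tiny FASTQ records are made-up stand-ins for real data.

```shell
# Hypothetical input directory with two tiny FASTQ files (stand-ins for real data).
indir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' > "$indir/file1.fastq"
printf '@r2\nTTGA\n+\nIIII\n' > "$indir/file2.fastq"

# Write the merged file to a separate directory, so that no input
# pattern can ever match it, even if the command is run again later.
outdir=$(mktemp -d)
cat "$indir"/*.fastq > "$outdir/merged.fastq"

# Sanity check: the merged file should hold exactly as many lines as the inputs.
in_lines=$(cat "$indir"/*.fastq | wc -l | tr -d ' ')
out_lines=$(wc -l < "$outdir/merged.fastq" | tr -d ' ')
echo "input lines: $in_lines, merged lines: $out_lines"
```

With real data, the line count of the merged file should also be a multiple of 4, assuming the usual four-line FASTQ records.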


Does this happen? My understanding is that the shell first expands "*.fq", and at that time "merged.fq" has not been generated yet. I bet a lot of people must have typed "cat *.txt > out.txt". Shell developers should have been aware of such an issue for many years. I could be wrong, though.


Actually, it happened to me once. That's why I put the 'file' as prefix for the input and 'bigfile' for the output. But I didn't know that it is system dependent. Thanks for mentioning it, Brian.


I was wrong. You and Brian are right. I can reproduce this endless loop.

9.1 years ago

It just looks silly to type 30 file names one by one...

With file globbing:

cat file*.fastq > bigfile.fastq

Note: This also works with fastq.gz files, because several gzip files concatenated together still form a valid gzip file (http://stackoverflow.com/questions/8005114).

cat file*.fastq.gz > bigfile.fastq.gz
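A quick way to convince yourself that the concatenated .gz file is still readable is to test it with gzip and count the decompressed lines. This is a minimal sketch, assuming gzip is installed; the temporary directory and the two tiny records are invented for the demonstration.

```shell
# Hypothetical working directory with two tiny gzipped FASTQ files.
workdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip > "$workdir/file1.fastq.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip > "$workdir/file2.fastq.gz"

# Concatenate the compressed files directly, without decompressing them first.
# The pattern file*.fastq.gz does not match the output name bigfile.fastq.gz.
cat "$workdir"/file*.fastq.gz > "$workdir/bigfile.fastq.gz"

# gzip -t checks the integrity of the merged archive (exit status 0 means OK).
gzip -t "$workdir/bigfile.fastq.gz" && echo "archive OK"

# Decompress to stdout and count lines: both records should be present.
merged_lines=$(gzip -dc "$workdir/bigfile.fastq.gz" | wc -l | tr -d ' ')
echo "lines in merged file: $merged_lines"
```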
Error while using: cat *.R1_unmapped.fq > unmapped_R1.fq

216_7W_Ca1_R1_unmapped.fq
216_9W_Co2_R1_unmapped.fq
218_5W_Pa1_R1_unmapped.fq
218_7W_Pa2_R1_unmapped.fq

[root@psgl unmapped]# cat *.R1_unmapped.fq > unmapped_R1.fq

cat: *.R1_unmapped.fq: No such file or directory

(The pattern has an extra dot: the file names contain _R1, not .R1.)

cat *_R1_unmapped.fq > unmapped_R1.fq
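To catch this kind of mistake before running cat, you can print what a pattern expands to first. A small sketch, assuming bash or POSIX sh with default settings (where a pattern that matches nothing is passed on unchanged); the temporary directory and the empty files are stand-ins that only reuse two of the file names listed above.

```shell
# Hypothetical directory containing empty files named like the ones in the listing.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 216_7W_Ca1_R1_unmapped.fq 216_9W_Co2_R1_unmapped.fq

# The pattern with the dot matches nothing, so the shell leaves it as a
# literal string. That literal is what cat then reports as a missing file.
unmatched=$(printf '%s\n' *.R1_unmapped.fq)
echo "pattern with dot expands to: $unmatched"

# The corrected pattern expands to the real file names, one per line.
printf '%s\n' *_R1_unmapped.fq
matched_count=$(printf '%s\n' *_R1_unmapped.fq | wc -l | tr -d ' ')
echo "files matched: $matched_count"
```

If the preview shows the pattern itself instead of file names, the pattern is wrong and cat will fail in the same way.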

Nice solution. Yes. Basically you need to do a 'dumb' file merge.
