Unzipping bz2 files
4.4 years ago
espop23 • 60
Switzerland

Hello,

I have around 50 files that end with .bz2. What command can I run in the terminal to decompress them all at once? I have tried bzip2 -d NT*.bz2 (the filenames all start with "NT_..."), but it is rather slow.

Thanks

zip terminal • 1.7k views

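ls NT*.bz2 | xargs -I{} echo bzip2 -d {} | parallel -j8
This decompresses 8 files in parallel. Note that it uses -d rather than -dc: with -c, bzip2 writes the decompressed data to stdout instead of to files. You can also use parallel directly, without xargs:
parallel -j8 bzip2 -d ::: NT*.bz2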
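If GNU parallel is not installed, xargs alone can run the jobs concurrently. The following is only a sketch: the function name decompress_nt and its defaults are made up for illustration, and it assumes a find/xargs that supports -print0, -0, -r and -P (GNU findutils does; these options are not in POSIX).

```shell
#!/bin/sh
# decompress_nt DIR [NJOBS]
# Hypothetical helper: decompress every DIR/NT*.bz2, up to NJOBS files at once.
decompress_nt() {
    dir=$1
    njobs=${2:-8}    # assumed default, matching the -j8 used above
    # -print0 / -0 : pass file names NUL-separated, so spaces are safe
    # -r           : do nothing if no file matched
    # -n 1         : one file per bzip2 invocation
    # -P "$njobs"  : run up to NJOBS invocations concurrently
    find "$dir" -maxdepth 1 -name 'NT*.bz2' -print0 |
        xargs -0 -r -n 1 -P "$njobs" bzip2 -d
}
```

Calling decompress_nt . 8 in the directory holding the files should behave like the parallel command above.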


Hello espop23!

We believe that this post does not fit the main topic of this site.

Sorry, this isn't a bioinformatics question.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!


I think people should refrain from closing posts that have bioinformatics relevance, especially after the post has been answered.

Unzipping lots of files is an extremely common task in bioinformatics, and it is annoying that the normal operation the OP would want does not work. Sending them away, especially when most people here can answer the question and it is relevant to day-to-day work in bioinformatics, does not help improve the content of the site.


A difference of opinion, I guess. Having come from the original Stack* sites, where protocol is more rigorously enforced, to me this is hugely off-topic. I don't particularly want Biostars to turn into a command-line helpdesk for people. I felt even better about closing it because there was an accepted answer and a secondary correct answer; there's no need to add to the thread any further, so why not close it?

Still abiding by the 'we'll be happy to talk about it' ;)

4 weeks ago
genomax 68k
United States

Use the following.
bunzip2 *.bz2
If "slow" means the files are processed serially, then you could try starting a few jobs via a for loop and putting them in the background. This may still be slow, since you are probably running this on a single computer and will saturate the I/O after a certain number of jobs.
If you have access to a cluster, then you could start them as 50 independent jobs.
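
The for-loop approach could look roughly like this. It is a sketch, not a tuned solution: the function name decompress_all and the default batch size of 4 are assumptions, and the right number of simultaneous jobs depends on how much I/O the disk can sustain.

```shell
#!/bin/sh
# decompress_all DIR [MAX_JOBS]
# Hypothetical helper: decompress every .bz2 file in DIR, MAX_JOBS at a time.
decompress_all() {
    dir=$1
    max_jobs=${2:-4}    # assumed default; tune to what the disk can sustain
    count=0
    for f in "$dir"/*.bz2; do
        [ -e "$f" ] || continue    # the glob matched nothing
        bunzip2 "$f" &             # start one job in the background
        count=$((count + 1))
        if [ "$count" -ge "$max_jobs" ]; then
            wait                   # let the current batch finish
            count=0
        fi
    done
    wait                           # wait for the final, partial batch
}
```

For example, decompress_all . 4 processes the current directory four files at a time. Batching like this is simple but waits for the slowest file in each batch before starting the next one.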