Biostar Beta. Not for public use.
Changing file name with sed command
12 months ago

Hi all. I need a little bit of help. I have a list of files in a folder that look like this:

H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3

H3K4me2/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_965.gff3

H3K4me3/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_3761.gff3

H3K9ac/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_3765.gff3

I would like to rename them with sed so that the final names look like this:

H3K4me1_modENCODE_304.gff3

H3K4me2_modENCODE_965.gff3

H3K4me3_modENCODE_3761.gff3

H3K9ac_modENCODE_3765.gff3

I know I can play with sed's s command and the / delimiter, but I can't get it to work. I would really appreciate it if you could briefly explain the code you give as an answer.

Thank you in advance! Best wishes,

Jordi

ChIP-Seq • 221 views

Every time you put a '#' or a '=' in a filename, God kills a kitten.


They came named like this from the modENCODE page, but thanks for the information. I will never dare to do it myself (poor kittens!)


Just be a little cautious with all the /'s in those names. That'll look like a directory structure to bash.

e.g. touch 'H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3' fails miserably, even when hard quoted.
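The failure is presumably down to the missing parent directories rather than the quoting: everything before the last / is treated as a directory path, and touch does not create directories. A minimal sketch, assuming a scratch directory from mktemp so that nothing real is touched (the variable names path and workdir are made up for the illustration):

```shell
# The first "file name" from the question; the shell sees it as a nested path.
path='H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3'

# Scratch directory (an assumption for the demo, not part of the original post).
workdir=$(mktemp -d)

(
    cd "$workdir" || exit 1

    # Fails here because the parent directories do not exist yet.
    touch "$path" 2>/dev/null || echo "touch failed: parent directories are missing"

    # Create the parent directories first; after that touch succeeds.
    mkdir -p "$(dirname "$path")"
    touch "$path" && echo "touch succeeded after mkdir -p"
)
```

The doubled // in the path is harmless here; it is treated like a single separator.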


+1 Pierre

every time you put a '#' or a '=' in a filename, god kills a kitten.

Then we have the culprit for wars and famines: spaces in filenames.

3 months ago
Joe 12k
United Kingdom

This is all you need in terms of an expression:

sed 's|/.*/|_|gi'

e.g.

$ echo "H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3" | sed 's|/.*/|_|gi'
H3K4me1_modENCODE_304.gff3

Pro tip: use that regular expression with the rename program.
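If rename is not installed, the same expression can be applied with a plain shell loop instead. This is only a sketch under some assumptions: the files really do sit in nested directories below the current one, they all end in .gff3, and none of the paths contain newlines. The function name flatten_name is made up for the illustration. The loop only prints the mv commands, so it is safe to run as a dry run first.

```shell
# Hypothetical helper (not from the original answer): turn a nested path
# into a flat name by replacing everything between the first and last / with _.
flatten_name() {
    printf '%s\n' "$1" | sed 's|/.*/|_|'
}

# Dry run over every .gff3 file below the current directory.
# Remove the "echo" once the printed commands look right.
find . -type f -name '*.gff3' | while IFS= read -r path; do
    path=${path#./}                  # strip the leading "./" that find adds
    new=$(flatten_name "$path")
    # Skip paths the expression does not change (fewer than two slashes).
    if [ "$new" != "$path" ]; then
        echo mv -- "$path" "$new"
    fi
done
```

With the echo removed, each file is moved into the current directory under its new flat name; the now-empty directory tree is left behind and can be cleaned up separately.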

Explained:

s = substitute

| = sed delimiter so as to avoid confusion with the more traditional /

/ = match the first forward slash on the line

.* = followed by any character, any number of times (greedy, so it takes as much as it can)

/ = up to the last / on the line, not merely the next one.

|_| = replace all the previously matched stuff with an underscore.

gi = globally and case-insensitively. You don't strictly need either flag here; I include them by force of habit. Note that the i flag is a GNU sed extension.
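One detail worth seeing in action: .* is greedy, so the match runs from the first / all the way to the last / on the line, which is exactly why a single substitution is enough. A small sketch comparing it with a non-greedy-style pattern, [^/]*, which stops at the next slash (the variable names are made up for the illustration, and the flags are left off since they are not needed):

```shell
path='H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3'

# Greedy: .* swallows everything between the first and the last slash.
greedy=$(printf '%s\n' "$path" | sed 's|/.*/|_|')

# [^/]* cannot cross a slash, so only the first directory component is replaced.
first_only=$(printf '%s\n' "$path" | sed 's|/[^/]*/|_|')

echo "$greedy"       # H3K4me1_modENCODE_304.gff3
echo "$first_only"   # H3K4me1_ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3
```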


Thank you! Your answer is very much appreciated


Glad it helped. Be sure to accept it to provide closure to the thread if it resolved your problem.
