Dividing A Protein Sequence Into Equal Parts
11.6 years ago

Dear all, could anyone suggest a one-line command (sed or awk) to divide each line of a file into an equal number of parts? For example, divide each line into 4 parts.

Input:

ATGCATHLMNPHLNTPLML

Output:

ATGCA THLMN PHLNT PLML
command-line • 2.7k views
11.6 years ago
JC 13k
echo ATGCATHLMNPHLNTPLML | perl -lpe '$s = int((length($_) + 3) / 4); s/(.{$s})(?=.)/$1 /g'
ATGCA THLMN PHLNT PLML
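
Since the question asked for sed or awk, here is a rough awk sketch of the same chunk-width idea: take the ceiling of length/4 as the chunk width and cut the line at that width. The function name `split_fixed` and its optional part-count argument are made up for illustration, and this is only one way to write it.

```shell
# Hypothetical helper: split each input line into chunks of width ceil(len/n).
# n defaults to 4; the last chunk may be shorter than the others.
split_fixed() {
  awk -v n="${1:-4}" '{
    len = length($0)
    w = int((len + n - 1) / n)          # ceiling of len/n
    out = ""
    for (i = 1; i <= len; i += w)       # empty lines skip the loop entirely
      out = out (i > 1 ? " " : "") substr($0, i, w)
    print out
  }'
}

echo ATGCATHLMNPHLNTPLML | split_fixed 4
# ATGCA THLMN PHLNT PLML
```

As with the Perl version, any shortfall lands entirely in the last chunk, so a 38-character line comes out as 10, 10, 10 and 8.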

When I give a 38-character input, it produces 4 parts: 3 parts of 10 characters and a last part of 8. I was expecting 2 parts of 10 characters followed by 2 parts of 9 characters.

Input: TGCATHLMNPHLNTPLMLATGCATHLMNPHLNTPLML

Output: ATGCATHLMN PHLNTPLMLA TGCATHLMNP HLNTPLML (10, 10, 10 and 8 characters)


Try my unpack answer below.

11.6 years ago

More than one way to skin a cat in perl :-)

echo ATGCATHLMNPHLNTPLML | perl -lane '$lag = "a4" x((length $_)/4); @seqs = unpack $lag, $_; print join " ", @seqs'

Your second example works fine with my code.

Input: TGCATHLMNPHLNTPLMLATGCATHLMNPHLNTPLML

Output: TGCA THLM NPHL NTPL MLAT GCAT HLMN PHLN TPLM
11.6 years ago

When I give a 38-character input, it produces 4 parts: 3 parts of 10 characters and a last part of 8. I was expecting 2 parts of 10 characters followed by 2 parts of 9 characters.

Input: TGCATHLMNPHLNTPLMLATGCATHLMNPHLNTPLML

Output: ATGCATHLMN PHLNTPLMLA TGCATHLMNP HLNTPLML (10, 10, 10 and 8 characters)


Please do not post new queries as answers; use the comments area below answers from others. Also, what you state you want is different to what you then illustrate by example. Please be precise.


Why do you want to do this? Could you please state your requirements more precisely? It sounds like: "I want to break a sequence into 4 parts, not necessarily of equal length, with the shortest part being as long as possible." Is this right?
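
If that reading is right, a balanced split could be sketched in awk roughly as follows: every part gets floor(length/4) characters, and the first (length mod 4) parts get one extra. The function name `split_balanced` and its optional part-count argument are hypothetical, and putting the longer parts first is an assumption based on the 10, 10, 9, 9 layout requested in the thread.

```shell
# Hypothetical helper: split each input line into n parts whose lengths
# differ by at most one character, with the longer parts first. n defaults to 4.
split_balanced() {
  awk -v n="${1:-4}" '{
    len   = length($0)
    base  = int(len / n)                # minimum length of every part
    extra = len % n                     # this many leading parts get +1
    out = ""; pos = 1
    for (k = 1; k <= n && pos <= len; k++) {
      w = base + (k <= extra ? 1 : 0)
      out = out (k > 1 ? " " : "") substr($0, pos, w)
      pos += w
    }
    print out
  }'
}

echo ATGCATHLMNPHLNTPLMLATGCATHLMNPHLNTPLML | split_balanced 4
# ATGCATHLMN PHLNTPLMLA TGCATHLMN PHLNTPLML
```

Lines shorter than the number of parts simply yield one character per part, so no empty fields are printed.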
