FASTA sequence manipulation
8.0 years ago
poonam.bi01 ▴ 20

I have a FASTA file like this:

>1001796365 4.F.1.1.5
MDSIRPATFQIPAAVRELGWAALLLFFVLLSVHEWFSPPGWFGLLAILIFATQGALILTR
WPARQNFGWANRTTLLRSILVVSLVAWAPFLPAADSSALWIYGVACLIALILDGVDGKVA
>1002048002 2.A.4.2.8
MSPSRTARLYFLLVLDLLFFVLEISIGYAVGSLALVADSFHMLNDVVSLIIALYAIKLAA
SSTPTTRYSYGWHRAEILAALVNGVFLLALCFTITLEALERFFSTPEISNPKLIVLVGSL
>1002048004 2.A.4.5.2
IASDIRRILHRHGIHSSTIQPEYHPVRDTILEERSKDVNCLISCPPDSACCEVQACCPSY
AGT

The header order in the FASTA file is:

 >+first_id+\t+second_id

I want my sequences in this format:

 >4.F.1.1.5
MDSIRPATFQIPAAVRELGWAALLLFFVLLSVHEWFSPPGWFGLLAILIFATQGALILTR
WPARQNFGWANRTTLLRSILVVSLVAWAPFLPAADSSALWIYGVACLIALILDGVDGKVA
 >2.A.4.2.8
MSPSRTARLYFLLVLDLLFFVLEISIGYAVGSLALVADSFHMLNDVVSLIIALYAIKLAA
SSTPTTRYSYGWHRAEILAALVNGVFLLALCFTITLEALERFFSTPEISNPKLIVLVGSL
 >2.A.4.5.2
IASDIRRILHRHGIHSSTIQPEYHPVRDTILEERSKDVNCLISCPPDSACCEVQACCPSY
AGT

That is, only the second ID in the header:

 >+second_id+\n+sequence
sequence alignment • 1.4k views
  1. No greater-than sign means that it's not fasta
  2. The format you request looks very random in your example

The greater-than sign gets auto-formatted, I think, so I guess the post doesn't reflect what the OP had in mind.

8.0 years ago
Daniel ★ 4.0k

To code golf-ify the answer, you could do it in fewer keystrokes with sed:

# 21 Keystrokes (+infile.fa)
sed -i 's/^>.\+ />/g' infile.fa

EDIT: Golfing harder:

# 19 Keystrokes (+infile.fa)
sed -i 's/>.\+ />/' infile.fa
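
For anyone not golfing: the sed one-liners above rely on GNU sed (`\+` and a bare `-i`) and on a literal space between the two IDs. A more portable sketch with awk is below. It assumes the second whitespace-separated field of each header is the ID to keep, and it writes to a new file rather than editing in place. The names `demo.fa` and `demo_out.fa` are made up for illustration.

```shell
# Build a small hypothetical test file; the second header uses a tab separator.
printf '>1001796365 4.F.1.1.5\nMDSIRPATFQ\nWPARQNFGWA\n>1002048002\t2.A.4.2.8\nMSPSRTARLY\n' > demo.fa

# On header lines, print ">" plus the second whitespace-separated field
# (awk's default field splitting treats spaces and tabs alike);
# sequence lines pass through unchanged, so line wrapping is preserved.
awk '/^>/ { print ">" $2; next } { print }' demo.fa > demo_out.fa

cat demo_out.fa
```

Because awk splits on any run of blanks by default, this should behave the same whether the IDs are separated by a space or a tab.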
8.0 years ago
venu 7.1k

If I understand it properly, something like this should work.

# Assumes a linearized FASTA: each record is one header line plus one sequence line
paste - - < file.fa | awk '{print ">"$2"\n"$3}' > new_file.fa

PS: When I copy and paste your sequences, there is a gap inside them. If that is only a formatting problem, fine; if not, check that nothing has gone wrong in your file.

After the reformatting of the post (by genomax2): first linearize the FASTA file, then use the command above.
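
One possible way to do the linearization step is sketched below, followed by the paste/awk command from this answer. This is only an illustration under the assumption that headers start with `>` and contain exactly two whitespace-separated fields; `wrapped.fa` and `linear.fa` are hypothetical file names.

```shell
# Small hypothetical input with sequences wrapped over two lines.
printf '>1001796365 4.F.1.1.5\nMDSIRPATFQ\nWPARQNFGWA\n>1002048004 2.A.4.5.2\nIASDIRRILH\nAGT\n' > wrapped.fa

# Linearize: concatenate the wrapped sequence lines of each record,
# so every record becomes exactly one header line and one sequence line.
awk '/^>/ { if (seq != "") print seq; seq = ""; print; next }
     { seq = seq $0 }
     END { if (seq != "") print seq }' wrapped.fa > linear.fa

# Pair each header with its sequence (tab-joined), then keep only the second ID.
paste - - < linear.fa | awk '{ print ">" $2 "\n" $3 }' > new_file.fa

cat new_file.fa
```

Note that the output keeps each sequence on a single line; if the original line wrapping matters, it would have to be re-applied afterwards.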
