6.1 years ago
kinggang
Hi, I want to rearrange the headers of a multi-FASTA file. Is there a command (awk, sed) or a script (Perl, Python) for doing this job? Please help.
Input file:
>lcl|NC_025993.1_cds_XP_019709722.1_4 [gene=LOC105045870] [db_xref=GeneID:105045870] [protein=glucan endo-1,3-beta-glucosidase 14 isoform X1] [protein_id=XP_019709722.1] [location=join(18965..19073,19961..21000,30376..30516)] [OIL PALM]
ATGGACGGCGTCGGAGGAGCGGCGGTGTTTTTCTTTTGCGGACCAGGAGGAGCCCATGTGATGCGGGCCTGCCGATGCTT
GTTCATTCTTCTCCTCTTTCTTCACGGCGGCCTTGTGACAGTTGAGGCGTTTACTGGAACCTATGGAATAAACTATGGCA
>lcl|NC_025993.1_cds_XP_019709703.1_5 [gene=LOC105045870] [db_xref=GeneID:105045870] [protein=glucan endo-1,3-beta-glucosidase 14 isoform X2] [protein_id=XP_019709703.1] [location=join(18965..19073,19961..21000,29762..29797)] [OIL PALM]
ATGGACGGCGTCGGAGGAGCGGCGGTGTTTTTCTTTTGCGGACCAGGAGGAGCCCATGTGATGCGGGCCTGCCGATGCTT
GTTCATTCTTCTCCTCTTTCTTCACGGCGGCCTTGTGACAGTTGAGGCGTTTACTGGAACCTATGGAATAAACTATGGCA
Output should be:
>lcl|NC_025993.1_cds_XP_019709722.1_4 [gene=LOC105045870] [protein=glucan endo-1,3-beta-glucosidase 14 isoform X1] [protein_id=XP_019709722.1] [OIL PALM]
ATGGACGGCGTCGGAGGAGCGGCGGTGTTTTTCTTTTGCGGACCAGGAGGAGCCCATGTGATGCGGGCCTGCCGATGCTT
GTTCATTCTTCTCCTCTTTCTTCACGGCGGCCTTGTGACAGTTGAGGCGTTTACTGGAACCTATGGAATAAACTATGGCA
>lcl|NC_025993.1_cds_XP_019709703.1_5 [gene=LOC105045870] [protein=glucan endo-1,3-beta-glucosidase 14 isoform X2] [protein_id=XP_019709703.1] [gbkey=CDS][OIL PALM]
ATGGACGGCGTCGGAGGAGCGGCGGTGTTTTTCTTTTGCGGACCAGGAGGAGCCCATGTGATGCGGGCCTGCCGATGCTT
GTTCATTCTTCTCCTCTTTCTTCACGGCGGCCTTGTGACAGTTGAGGCGTTTACTGGAACCTATGGAATAAACTATGGCA
Thank you.
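For illustration, here is a minimal Python sketch of one way to do this. It assumes the goal is to drop the `[db_xref=...]` and `[location=...]` fields from each header and leave everything else, including the sequence lines, untouched. The names `DROP_KEYS`, `clean_header` and `clean_fasta` are made up for this example. Note that the second expected header above also contains a `[gbkey=CDS]` field that is not in the input; this sketch only removes fields and does not add that one.

```python
import re

# Keys of the bracketed fields to drop. This choice is an assumption
# inferred from the example headers in the question.
DROP_KEYS = ("db_xref", "location")

# Matches one "[key=value]" field together with the whitespace before it.
# Fields without "=" (such as "[OIL PALM]") are never matched.
_FIELD = re.compile(r"\s*\[(\w+)=[^\]]*\]")


def clean_header(line, drop_keys=DROP_KEYS):
    """Return the header line with the unwanted [key=value] fields removed."""
    def repl(match):
        # Delete the field if its key is unwanted, otherwise keep it as-is.
        return "" if match.group(1) in drop_keys else match.group(0)
    return _FIELD.sub(repl, line)


def clean_fasta(lines, drop_keys=DROP_KEYS):
    """Yield FASTA lines, rewriting only header lines (those starting with '>')."""
    for line in lines:
        yield clean_header(line, drop_keys) if line.startswith(">") else line
```

To apply it to a file, something like `sys.stdout.writelines(clean_fasta(open("input.fasta")))` should work after `import sys`, where `input.fasta` is a placeholder file name; redirect the output to a new file to save it.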