replace headers in a fasta file
8.5 years ago

Hi,

I would like to replace the first field of the headers in my fasta file with a species prefix and concatenate it to the 2nd field (my gene ID), such that I start with this:

> maker-scaffold_0-snap-gene-0.23-mRNA-1 gene=maker-scaffold_0-snap-gene-0.23
ATGGTGAAGCTCGTGGCGTTCTCGCCGTTCCGCTCGGCGCAGAGCGCGCTGGAGAACATGAACGCCGTGT
CCGAGGGGGTCCTGCACGAGGACCTGCGGCTGCTGCTGGACACGGCGCTGCCCCCCAAGAGGAA....

and get this:

>Species1_gene=maker-scaffold_0-snap-gene-0.23
ATGGTGAAGCTCGTGGCGTTCTCGCCGTTCCGCTCGGCGCAGAGCGCGCTGGAGAACATGAACGCCGTGT
CCGAGGGGGTCCTGCACGAGGACCTGCGGCTGCTGCTGGACACGGCGCTGCCCCCCAAGAGGAA....

I tried this:

awk '{ $2="Species1_" $2; print }'

but it adds Species1_ to the end of every line, including the sequence lines. I assume it shouldn't be too complicated, but I can't seem to find the solution.

Thanks a lot!

fasta • 2.6k views
8.5 years ago
venu 7.1k

If the maker-scaffold_0-snap-gene-0.23-mRNA-1 part is the same in all the headers, you can do the following; otherwise, modify the pattern after the first slash to match your headers.

perl -pe 's/^>\s*maker-scaffold_0-snap-gene-0\.23-mRNA-1\s+/>Species1_/' file.fa > modified_file.fa
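If the first field differs between headers, a literal match will not cover every record. Here is a minimal awk sketch that restricts the change to header lines. It assumes the gene=... entry is always the last whitespace-separated field of each header; the small file.fa written at the top is only a stand-in for your real input.

```shell
# Illustrative input only: one record taken from the question.
cat > file.fa <<'EOF'
> maker-scaffold_0-snap-gene-0.23-mRNA-1 gene=maker-scaffold_0-snap-gene-0.23
ATGGTGAAGCTCGTGGCGTTCTCGCCGTTCCGCTCGGCGCAGAGCGCGCTGGAGAACATGAACGCCGTGT
EOF

# Header lines (starting with ">"): print ">Species1_" followed by the
# last field (assumed to be gene=...), then skip to the next line.
# All other lines (the sequence) are printed unchanged.
awk '/^>/ { print ">Species1_" $NF; next } { print }' file.fa > modified_file.fa

cat modified_file.fa
```

The /^>/ pattern is what the original attempt was missing: without it, awk applies the assignment to every line, so the sequence lines get modified too. Using $NF rather than $2 means it does not matter whether there is a space after the ">".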