Transfer annotation from fasta headers to associated gff
1
0
Entering edit mode
2.9 years ago
EJB • 0

Hi Everyone,

I have a .fasta file with functional and GO annotations in the headers. I also have an associated GFF3 file with the locations of these genes in the genome. The IDs in the GFF3 and the fasta match. I need to append the annotation information from each fasta header to the attributes column (column 9) of the matching lines in the GFF3 file, something like this:

Fasta Headers:

>evm.model.Scaffold_003599.4 Protein=Olfactory_receptor GO=GO:0004930,...

Current GFF3:

Scaffold_003599 EVM mRNA 187035 187979 . + . ID=evm.model.Scaffold_003599.4

Desired GFF3:

Scaffold_003599   EVM   mRNA   187035   187979 . + .  ID=evm.model.Scaffold_003599.4 Protein=Olfactory_receptor GO=GO:0004930,

Basically, I need to replace each

ID=... (in the gff)

with

ID=... Protein=... GO=... (from the fasta headers)

I feel like this should not be that difficult a task, but it is just out of my range of scripting skills at the moment (or it would take me a long time to figure out how to do this by trial and error).

Does anyone know of a current tool or script to accomplish this?

Thanks!

gff annotation fasta genome • 1.0k views

reference this post .

2.9 years ago
Juke34 8.5k

You can transfer this information with agat_sq_add_attributes_from_tsv.pl from AGAT. It takes a csv/tsv file with the feature identifier (e.g. gene ID or mRNA ID) in the first column, so you would first need to convert the fasta headers into such a table. See the tool's help for more information.
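In case it helps, here is a minimal Python sketch of the conversion step, assuming the headers look like the example in the question (ID first, then whitespace-separated key=value tokens). The function names are made up for illustration. The TSV layout (a header row with "ID" in the first column and one column per attribute) is my assumption about what the AGAT script expects, so check it against the tool's help before relying on it. The last function is an alternative that skips AGAT and appends the attributes to column 9 directly, using the semicolon separator that GFF3 requires rather than the spaces shown in the desired output.

```python
import csv


def parse_fasta_headers(lines):
    """Map each sequence ID to the key=value annotations in its header."""
    annotations = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        tokens = line[1:].split()
        if not tokens:
            continue  # bare ">" with no ID
        seq_id, fields = tokens[0], tokens[1:]
        # Keep only key=value tokens, splitting on the first "=".
        annotations[seq_id] = dict(f.split("=", 1) for f in fields if "=" in f)
    return annotations


def write_tsv(annotations, handle):
    """Write ID plus one column per annotation key (assumed AGAT input layout)."""
    keys = sorted({k for attrs in annotations.values() for k in attrs})
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID"] + keys)
    for seq_id, attrs in annotations.items():
        writer.writerow([seq_id] + [attrs.get(k, "") for k in keys])


def annotate_gff3(lines, annotations):
    """Yield GFF3 lines with matching annotations appended to column 9."""
    for line in lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9:
            yield line  # comments, directives, malformed lines pass through
            continue
        attrs = dict(a.split("=", 1) for a in cols[8].split(";") if "=" in a)
        extra = annotations.get(attrs.get("ID"), {})
        if extra:
            added = ";".join(f"{k}={v}" for k, v in extra.items())
            cols[8] = cols[8].rstrip(";") + ";" + added
        yield "\t".join(cols)
```

To use it, open the fasta and pass the file handle to parse_fasta_headers, then either write the table with write_tsv and feed it to AGAT, or loop over annotate_gff3 with the GFF3 handle and print each line. Note that annotate_gff3 expects a real tab-separated GFF3 and only touches features whose ID matches a fasta header exactly.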
