EMBOSS getORF not keeping trinity identifiers
8.0 years ago

I created an assembly with Trinity, which gives each sequence a TR# identifier. When I run the FASTA file through EMBOSS getorf, it generates the ORFs but drops those identifiers. Is there an EMBOSS option to keep the Trinity identifiers? Sample data are shown below.

>TR4|c0_g1_i1 len=258 path=[236:0-257] [-1, 236, -2]
GTCTGCATTCAGTAGAATTAGAAAACATCAGCCCTGTTTTCCCGATGTTTGATGAACATT
GTTTGGACCATATTCAATATATACATAGGCTGTAGCTTGCCTACATAGTTTTGTTGTCAC
ACTGGGCTATGACCATGAGCTTTCACATCTGACCTTGGACTTCACCCAGGTTTAGGGTCT
CCCCTACCCAGTCGTAGGTCTATCAAGCTTCAAACACGATATGTACATACAGGCAAAGAT
GAACACAAAAGGCTTTAG
>TR5|c0_g1_i1 len=266 path=[244:0-265] [-1, 244, -2]
AAAAAATCGAACGATCACTGCACTTCTCCACTTCTCTCCCTCACCCCCATTCACATACAC
AGGCATCATGGCTCCATCATTTGTCTCGGCTGGTCTTTCCAAGTAAACCTGACCACAGCT
GTGTTGGGCGGACATCCAACTCCTAAGACATGGTTTGGGAGTGATGATGGTTATTGGGTG
TGGCTCAACAAGTACAGACATAATGGGCGGATGCTGAACAGTTGAGAGAGTGGTCGGGAC
AGGTGATGAATATTGGGAGTGGCTGG
RNA-Seq Assembly orf protein • 2.0k views

Have you tried replacing the spaces with "_" (and replacing |, : and the brackets with "_" too)? That would convert the ID into a single string, which should force EMBOSS to keep it intact.
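If you want to try the renaming suggested above without editing headers by hand, here is a minimal Python sketch. The function names (sanitize_header, sanitize_fasta) are made up for illustration and are not part of EMBOSS or Trinity. It assumes a plain-text FASTA file and that collapsing every run of non-alphanumeric characters into a single "_" is acceptable for your downstream tools.

```python
import re


def sanitize_header(header: str) -> str:
    """Turn a FASTA header line into a single token (hypothetical helper).

    Every run of characters other than letters, digits and "_" (spaces,
    |, :, =, commas, brackets, hyphens) becomes one underscore.
    """
    body = header[1:].strip()  # drop the leading ">"
    token = re.sub(r"[^A-Za-z0-9_]+", "_", body)
    return ">" + token.strip("_")


def sanitize_fasta(lines):
    """Yield FASTA lines with header lines sanitized and sequence lines unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield sanitize_header(line) if line.startswith(">") else line


if __name__ == "__main__":
    # Example header taken from the question above.
    example = [
        ">TR4|c0_g1_i1 len=258 path=[236:0-257] [-1, 236, -2]",
        "GTCTGCATTCAGTAGAATTAGAAAACATCAGCCCTGTTTTCCCGATGTTTGATGAACATT",
    ]
    for out_line in sanitize_fasta(example):
        print(out_line)
```

To use it on a real file, pass an open file handle to sanitize_fasta, write the yielded lines to a new FASTA file, and run getorf on that file instead of the original. Whether the full string then survives in the getorf output is worth checking on a couple of sequences first.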
