getorf: Remove Overlapping Open Reading Frames
2.7 years ago
JMac • 20
Ireland

Hi,

I am using getorf to extract open reading frames (at least 70 amino acids long) from genomes/scaffolds. I am using the command:

getorf -find 1 -minsize 210 -sequence genome.fasta -outseq orfs.fasta


My question is, does getorf report overlapping ORFs within the same reading frame?

Out of a set of alternative overlapping ORFs within the same reading frame, we only wish to extract the longest ORF.

Thanks for any help

I have read the manual but it doesn't mention anything about alternative/overlapping ORFs.

What do the IDs in your FASTA look like? If they follow a consistent pattern, such as different isoforms from Trinity where you want the longest isoform per gene cluster, I have a script that may help.

20 months ago
utsafar • 20

I used getorf in Galaxy. It reports all ORFs in all reading frames, including those on the complementary strand. In my case, I extracted all ORFs from the contigs of interest and then used Microsoft Excel to delete every ORF for each contig except the longest one.
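
The same "keep only the longest ORF per contig" step could also be scripted instead of done in a spreadsheet. Below is a minimal Python sketch, not a tested pipeline. It assumes getorf names each ORF as the parent sequence ID plus an underscore and a running number (for example `contig1_3 [120 - 400]`); check this against your own output first. The function names `read_fasta`, `parent_id` and `longest_orf_per_contig` are made up for this illustration.

```python
import io
import re

# ASSUMPTION: getorf IDs look like "<parent id>_<number>", e.g. "contig1_3".
ORF_ID_RE = re.compile(r"^(?P<parent>.+)_(?P<index>\d+)$")


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def parent_id(header):
    """Strip the trailing "_<number>" from the ORF ID to get the contig ID."""
    orf_id = header.split()[0]
    match = ORF_ID_RE.match(orf_id)
    return match.group("parent") if match else orf_id


def longest_orf_per_contig(records):
    """Return {contig: (header, sequence)} holding the longest ORF per contig.

    On a tie in length, the ORF seen first is kept.
    """
    best = {}
    for header, seq in records:
        contig = parent_id(header)
        if contig not in best or len(seq) > len(best[contig][1]):
            best[contig] = (header, seq)
    return best


if __name__ == "__main__":
    # Tiny made-up example; replace with open("orfs.fasta") for real data.
    demo = io.StringIO(
        ">contig1_1 [1 - 9]\nMKV\n"
        ">contig1_2 [20 - 40]\nMKVLAAA\n"
        ">contig2_1 [30 - 10] (REVERSE SENSE)\nMAA\n"
    )
    for contig, (header, seq) in longest_orf_per_contig(read_fasta(demo)).items():
        print(">" + header)
        print(seq)
```

Note that this keeps one ORF per contig across all frames and both strands, as in the Excel approach. To keep the longest ORF per reading frame instead, the grouping key would also need the strand and frame, which would have to be derived from the coordinates in the header.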