Biostar Beta. Not for public use.
Question: Perl Script To Retrieve All ORFs?

Hello,

I have a draft assembly. Does anyone know of a script to retrieve all ORFs, in protein format, from a FASTA file of contigs?

Adrian

5.9 years ago by Adrian Pelin ♦ 2.3k • updated 9 months ago by Abhishek • 20

Unless this is a prokaryote, extracting all open reading frames from a draft assembly is not an informative analysis (because of splicing and low gene density), and even for bacteria it is of very limited use. Instead, look into gene prediction, e.g. on BioStar: gene-prediction

5.9 years ago by Michael Dondrup • 46k

I work on microsporidia. They are eukaryotes, but they have small genomes and very few introns (at most about 20 genes with introns; some species have none at all).

5.9 years ago by Adrian Pelin ♦ 2.3k

Then you can use getorf, as suggested by R@hul, but you should still attempt a proper gene prediction.

5.9 years ago by Michael Dondrup • 46k

Please share a fully functional Perl script that translates cDNA to ORFs (protein) and selects only the longest one. I have ActivePerl installed.

9 months ago by Abhishek • 20

Hi

From the EMBOSS toolkit:

getorf -sequence genome.fasta -outseq genome.ORFs -minsize 180 -find 1 &
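
If EMBOSS is not available, the idea behind the getorf command above can be sketched in a few lines of Python. This is only a rough illustration, not a reimplementation of getorf: it assumes the standard genetic code, ATG as the only start codon, ORF length counted without the stop codon, and it skips ORFs that run off the end of a contig. The function names (read_fasta, find_orfs, longest_orf, write_orfs) and the output header format are made up for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(path):
    """Yield (name, sequence) pairs; name is the first word of the header."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                fields = line[1:].split()
                name, chunks = (fields[0] if fields else "unnamed"), []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def find_orfs(seq, min_nt=180):
    """Yield (strand, frame, protein) for each ATG..stop ORF of >= min_nt nt.

    On the "-" strand, frame is the offset within the reverse complement.
    """
    seq = seq.upper()
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(dna) - 2, 3):
                codon = dna[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if i - start >= min_nt:
                        yield strand, frame, translate(dna[start:i])
                    start = None  # look for the next ORF in this frame


def longest_orf(seq, min_nt=180):
    """Return the ORF with the longest protein, or None if there is none."""
    return max(find_orfs(seq, min_nt), key=lambda orf: len(orf[2]), default=None)


def write_orfs(fasta_path, out_path, min_nt=180):
    """Write every ORF found in fasta_path to out_path as protein FASTA."""
    with open(out_path, "w") as out:
        for name, seq in read_fasta(fasta_path):
            for n, (strand, frame, protein) in enumerate(find_orfs(seq, min_nt), 1):
                out.write(f">{name}_{n} strand={strand} frame={frame}\n{protein}\n")
```

Calling write_orfs("genome.fasta", "genome.ORFs", min_nt=180) would mirror the file names and size cutoff used above, and longest_orf(seq) picks a single ORF per sequence if only the longest one is wanted. For real work getorf or a proper gene predictor is the better choice.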

Cheers!

5.9 years ago by Rahul Sharma • 580
