Extract sequences from a FASTA file using a gtf file
5.5 years ago
sdbaney ▴ 10

Hi, I have a de novo assembled FASTA file that I used with Cuffdiff. I now have a sorted GTF file in which I retained only the transcripts that were significantly differentially expressed.

Is there a program that I can use to:

  1. pull only the transcripts listed in the GTF file from the FASTA file
  2. find the longest ORFs

so that I can run some manual BLAST searches to validate my semi-automatic BLAST.

Is there an easy way to do this? Perhaps I just need to search the FASTA file for the transcript IDs listed in my sorted GTF file.

RNA-Seq • 5.3k views
5.5 years ago

The first part looks like a job for bedtools getfasta.

Duh duh duh duh duh duh doo doo doo da! BEDTools!

Love the comments xD

Dun dun DUNNNNNNNNN!

5.5 years ago
  1. Pull only the transcripts listed in the gtf file from the FASTA file

Use bedtools getfasta.
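
bedtools getfasta extracts by coordinates, along the lines of `bedtools getfasta -fi assembly.fa -bed significant.gtf -fo out.fa` (the file names are placeholders). If you would rather do the plain ID lookup suggested in the question, here is a minimal Python sketch. It assumes the `transcript_id` values in the GTF match the first word of the FASTA headers, which may not hold for your assembly; if your contig names sit in column 1 of the GTF instead, collect `fields[0]`. The function names are made up for illustration.

```python
import re


def transcript_ids_from_gtf(gtf_lines):
    """Collect transcript_id values from the attribute column (column 9) of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def _record_id(header):
    """Assumption: the sequence ID is the first whitespace-delimited word of the header."""
    words = header.split()
    return words[0] if words else ""


def filter_fasta(fasta_lines, wanted_ids):
    """Yield (header, sequence) for FASTA records whose ID is in wanted_ids."""
    header, chunks = None, []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None and _record_id(header) in wanted_ids:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    # emit the final record if it is wanted
    if header is not None and _record_id(header) in wanted_ids:
        yield header, "".join(chunks)
```

To use it on real files, pass open file handles (for example `open("significant.gtf")` and `open("assembly.fa")`, both placeholder names) and write each yielded record back out as `>header` followed by the sequence.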

  2. Find the longest ORFs

See this related thread for some options.
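
For a quick check without extra tools, a six-frame scan is easy to sketch in Python. This is only an illustration under simple assumptions: an ORF starts at `ATG`, ends at the first in-frame `TAA`, `TAG`, or `TGA`, and ORFs that run off the end of the sequence without a stop codon are ignored. Dedicated ORF finders handle alternative start codons, partial ORFs, and other genetic codes, so treat the result as a rough guide.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_in_strand(seq):
    """Longest ATG...stop ORF across the three frames of one strand ('' if none)."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]  # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


def longest_orf(seq):
    """Longest ORF over all six frames, reported 5'->3' on its own strand."""
    forward = longest_orf_in_strand(seq)
    reverse = longest_orf_in_strand(reverse_complement(seq))
    return forward if len(forward) >= len(reverse) else reverse
```

Calling `longest_orf` on each extracted transcript gives one candidate sequence per transcript that can be pasted into a manual BLAST search.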
