Find Protein Id From Gff File For Results Of Cufflinks
10.1 years ago
ferris.us ▴ 20

I have a transcripts.gtf file from Cufflinks and a GFF file from JGI. How can I find the protein ID in the GFF file for each transcript in transcripts.gtf?

protein id gff cufflinks • 3.5k views

Are you saying that the same transcript ID appears in both files and you want to know how to match them? It would help to see an example line and example IDs from each file.

10.1 years ago

The BEDOPS tools gtf2bed, gff2bed and bedmap could help, provided the GTF and GFF inputs follow their respective specifications:

$ gtf2bed < transcripts.gtf > transcripts.bed
$ gff2bed < proteinIds.gff > proteinIds.bed
$ bedmap --echo --echo-map-id-uniq transcripts.bed proteinIds.bed > answer.bed

The file answer.bed will contain each element from the Cufflinks GTF file, followed by a pipe character (|) and a semicolon-delimited list of the unique IDs of GFF elements that overlap that element by one or more bases. If the GFF file stores the protein ID in its ID attribute, that list is the set of protein IDs for the transcript.
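If a plain two-column table is easier to work with, the bedmap output could be flattened with a little awk. This is only a sketch under some assumptions: the function name pairs_from_bedmap is made up for illustration, the GTF attribute text (including transcript_id "...") is assumed to be carried through into each BED line, and the part after the last pipe is assumed to be the semicolon-delimited ID list described above.

```shell
# Hypothetical helper: convert bedmap output lines of the form
#   <BED columns with GTF attributes>|id1;id2;...
# into unique "transcript_id<TAB>mapped_id" pairs.
pairs_from_bedmap() {
  awk -F'|' '
    {
      # Skip lines without a pipe or without a transcript_id attribute.
      if (NF < 2) next
      if (!match($0, /transcript_id "[^"]*"/)) next

      # Strip the leading  transcript_id "  (15 chars) and the closing quote.
      tid = substr($0, RSTART + 15, RLENGTH - 16)

      # The last pipe-delimited field holds the overlapping IDs.
      # It is empty when nothing overlaps, so n is 0 and nothing is printed.
      n = split($NF, ids, ";")
      for (i = 1; i <= n; i++)
        if (ids[i] != "") print tid "\t" ids[i]
    }
  ' "$1" | sort -u
}
```

It could then be run as: pairs_from_bedmap answer.bed > transcript_to_protein.tsv. Because Cufflinks writes both transcript and exon records, the same pair can appear several times, which is why the output is passed through sort -u.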
