Transcript features and annotations

5.8 years ago

Sergio Martínez Cuesta ▴ 230

I am planning a transcriptome alignment of iCLIP sequencing data. How can I link UCSC hg19 transcript IDs with transcript features, e.g. coding, non-coding, lncRNA, antisense, pseudogene ...?

I downloaded the iGenomes UCSC hg19 reference genome and used the genes.gtf file included in the download to prepare reference sequences for RSEM:

rsem-prepare-reference --gtf genes.gtf --bowtie2 genome.fa ../RSEM_bowtie2/genome

The above command generates several files, one of them being genome.transcripts.fa, a FASTA file containing 51398 sequences, one for each transcript (NM_130786, NR_015380, NM_001198818 ...) as defined in the genes.gtf file.
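Since the transcript IDs look like RefSeq accessions, a rough first split into coding and non-coding can be read off the accession prefix (NM_ for mRNA, NR_ for non-coding RNA, XM_/XR_ for the predicted equivalents). The sketch below assumes each FASTA header starts with the transcript ID; the function names and the in-memory example are only illustrative. Note that the prefix alone cannot separate lncRNA, antisense or pseudogene transcripts, so finer classes would need another annotation source.

```python
import io
from collections import Counter

# Broad meaning of RefSeq accession prefixes (RefSeq naming convention).
REFSEQ_PREFIX_CLASS = {
    "NM_": "coding (curated mRNA)",
    "NR_": "non-coding (curated RNA)",
    "XM_": "coding (predicted mRNA)",
    "XR_": "non-coding (predicted RNA)",
}


def classify_accession(transcript_id):
    """Return a broad class based on the first three characters of the ID."""
    return REFSEQ_PREFIX_CLASS.get(transcript_id[:3], "unknown")


def fasta_ids(handle):
    """Yield the first whitespace-delimited token of every FASTA header."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def classify_fasta(handle):
    """Map every transcript ID found in a FASTA handle to its broad class."""
    return {tid: classify_accession(tid) for tid in fasta_ids(handle)}


# Tiny in-memory stand-in for genome.transcripts.fa (sequences are placeholders).
example = io.StringIO(">NM_130786\nACGT\n>NR_015380\nGGCC\n>NM_001198818\nTTAA\n")
classes = classify_fasta(example)
counts = Counter(classes.values())
print(counts)
```

For the real file, the same function could be called as `classify_fasta(open("genome.transcripts.fa"))`, assuming the file sits in the working directory.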

Once I perform a transcriptome alignment using rsem-calculate-expression, how can I then link each transcript id with transcript features such as the ones mentioned above?
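To make the goal concrete: rsem-calculate-expression writes a tab-separated `*.isoforms.results` table whose first column is `transcript_id`, so the linking step amounts to a lookup keyed on that column. The following is a minimal sketch, assuming a transcript-to-class mapping is already available from some annotation source; `annotate_isoforms`, the class labels, the gene IDs and all numbers in the toy table are made-up placeholders, not real results.

```python
import csv
import io


def annotate_isoforms(results_handle, transcript_class):
    """Add a 'feature_class' field to each row of an RSEM isoforms.results table.

    transcript_class is a dict mapping transcript_id -> class label;
    IDs missing from the dict are labelled 'unknown'.
    """
    reader = csv.DictReader(results_handle, delimiter="\t")
    rows = []
    for row in reader:
        row["feature_class"] = transcript_class.get(row["transcript_id"], "unknown")
        rows.append(row)
    return rows


# Toy stand-in for sample.isoforms.results; all values are placeholders.
toy_results = io.StringIO(
    "transcript_id\tgene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM\tIsoPct\n"
    "NM_130786\tGENE_A\t1000\t900.00\t10.00\t1.50\t1.20\t100.00\n"
    "NR_015380\tGENE_B\t2000\t1900.00\t4.00\t0.50\t0.40\t100.00\n"
    "NM_001198818\tGENE_C\t1500\t1400.00\t0.00\t0.00\t0.00\t0.00\n"
)

# Hypothetical mapping; in practice it would come from an annotation table.
toy_classes = {"NM_130786": "coding", "NR_015380": "non-coding"}

annotated = annotate_isoforms(toy_results, toy_classes)
for row in annotated:
    print(row["transcript_id"], row["TPM"], row["feature_class"])
```

Keeping the lookup as a plain dict means the same function works whichever annotation source ends up supplying the classes.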

Any ideas would be helpful.

transcript alignment
