Conversion of gene IDs into Ensembl IDs
16 months ago

Hi, I ran a differential gene expression analysis following the protocol "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". After running Ballgown I have a gene list file in which the gene IDs look like this:

> MSTRG.28632
> MSTRG.3615
> MSTRG.7507
> MSTRG.70532
> MSTRG.49954
> MSTRG.60656

Next I want to perform a gene ontology analysis with the tool agriGO, but these gene IDs are not recognized in any database. I tried using bioDBnet to convert them into Ensembl gene IDs, but it returned no results. Could you suggest how I can replace them with Ensembl IDs?

16 months ago
jean.elbers

The identifiers you show here are arbitrary; StringTie generates them when it assembles transcripts. You need to BLAST the corresponding sequences against your desired database to functionally annotate them with a potential gene name or NCBI/Ensembl/UniProt identifier. Please search through Biostars if you need more information. There are bound to be similar, if not identical, questions on Biostars that can guide your next steps.

Thanks, Jean Elbers.

Can you tell me which file contains these sequences? I have run StringTie and Ballgown, but I have not seen any file that contains the gene sequences.

The "sequences" are stored as intervals in the GFF files from StringTie. You need to use something like gffread, together with your reference genome FASTA, to extract the sequences (i.e., transcript sequences) from the GFF file. See http://ccb.jhu.edu/software/stringtie/gff.shtml#gffread_ex
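To make "stored as intervals" concrete, here is a minimal Python sketch of what extracting a transcript sequence involves: collect the exon intervals for each transcript_id, cut them out of the genome, join them, and reverse-complement transcripts on the minus strand. The toy genome, the GTF lines, and the function names are all invented for illustration. This is not how gffread is implemented, and for real data you should use gffread itself, e.g. `gffread -w transcripts.fa -g genome.fa merged.gtf` (file names are placeholders).

```python
import re
from collections import defaultdict

# Toy genome and GTF records, invented purely for illustration.
GENOME = {"chr1": "AAACCCGGGTTTAACCGGTT"}
GTF = """\
chr1\tStringTie\ttranscript\t1\t12\t.\t+\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";
chr1\tStringTie\texon\t1\t3\t.\t+\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";
chr1\tStringTie\texon\t7\t12\t.\t+\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1";
chr1\tStringTie\texon\t13\t16\t.\t-\t.\tgene_id "MSTRG.2"; transcript_id "MSTRG.2.1";
"""

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def transcript_sequences(gtf_text, genome):
    """Splice exon intervals from `genome` into one sequence per transcript.

    GTF coordinates are 1-based and inclusive, so an exon from `start`
    to `end` corresponds to the Python slice [start - 1:end].
    """
    exons = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Only exon records carry the intervals that make up the sequence.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        exons[match.group(1)].append(
            (fields[0], int(fields[3]), int(fields[4]), fields[6])
        )

    sequences = {}
    for transcript_id, parts in exons.items():
        parts.sort(key=lambda part: part[1])  # order exons by start position
        chrom, strand = parts[0][0], parts[0][3]
        spliced = "".join(genome[chrom][start - 1:end] for _, start, end, _ in parts)
        sequences[transcript_id] = (
            reverse_complement(spliced) if strand == "-" else spliced
        )
    return sequences


# Print the result in FASTA layout, ready to be used as a BLAST query.
for transcript_id, seq in transcript_sequences(GTF, GENOME).items():
    print(f">{transcript_id}\n{seq}")
```

The FASTA records keyed by transcript ID (MSTRG.x.y) are what you would then BLAST against your database of choice; the gene ID is the transcript ID without its last numeric suffix.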