How to get short and sweet gene names/description for gene IDs?
5.2 years ago
WUSCHEL ▴ 750

I am working on an Arabidopsis thaliana omics project. How can I get a short, proper gene description for each AGI number (gene ID)? The database downloaded from TAIR10 has rather lengthy names that are difficult to work with in downstream steps (plotting, summarizing).

e.g.

AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana (sp|q93v56|ipyr1_arath : 419.0)

AT1G01800   Enzyme classification.EC_1 oxidoreductases.EC_1.1 oxidoreductase acting on CH-OH group of donor(50.1.1 : 434.7) & (+)-neomenthol dehydrogenase OS=Arabidopsis thaliana (sp|q9m2e2|sdr1_arath : 357.0) (original description: none)

Where can I get short names, or how can I shorten the existing ones?

RNA-Seq R proteomics • 1.7k views
5.2 years ago
  • Go to Ensembl's BioMart
  • Choose Dataset: Ensembl Plant Genes and Arabidopsis thaliana genes
  • Choose Filters -> Gene -> Input external references ID list -> Gene stable ID(s) and paste your IDs into the text field
  • Choose Attributes -> Gene and select Gene stable ID and Gene name
  • Click Results and download in the format you like
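Once the result is downloaded, the short names can be joined back onto your IDs. Below is a minimal Python sketch, assuming the export was saved as TSV with the column headers "Gene stable ID" and "Gene name"; the helper names load_gene_names and short_label are hypothetical, and the names in the demo are invented placeholders, not real annotations. Genes without a name fall back to the AGI ID so every row still gets a label.

```python
import csv
import io


def load_gene_names(handle):
    """Map gene stable ID -> gene name from a BioMart TSV export.

    Assumes the header row contains "Gene stable ID" and "Gene name".
    Genes with an empty name are mapped to their own ID.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    names = {}
    for row in reader:
        gene_id = (row.get("Gene stable ID") or "").strip()
        name = (row.get("Gene name") or "").strip()
        if gene_id:
            names[gene_id] = name or gene_id
    return names


def short_label(gene_id, names):
    """Return the short name for gene_id, or the ID itself if unknown."""
    return names.get(gene_id, gene_id)


if __name__ == "__main__":
    # Placeholder content standing in for the downloaded file;
    # EXAMPLE1 is an invented name used only for illustration.
    demo = io.StringIO(
        "Gene stable ID\tGene name\n"
        "AT1G01050\tEXAMPLE1\n"
        "AT1G01800\t\n"
    )
    names = load_gene_names(demo)
    for gene_id in ("AT1G01050", "AT1G01800"):
        print(gene_id, short_label(gene_id, names), sep="\t")
```

With a real export, open the file with open("mart_export.txt", newline="") (file name is an assumption) and pass the handle to load_gene_names.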

You can extract the IDs from the example file above with a simple cut -f1 input_file > gene_ids.txt (cut splits on tabs by default, so this assumes the columns are tab-separated).
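If the file mixes tabs and spaces, a pattern-based extraction may be more forgiving than cut. The following is a rough Python sketch, assuming every annotation line starts with an AGI locus ID of the form AT1G01050; the function name extract_gene_ids and the pattern are illustrative choices, not part of any library.

```python
import re

# AGI locus IDs: "AT", a chromosome code (1-5, C or M), "G", then five digits.
AGI_PATTERN = re.compile(r"^(AT[1-5CM]G\d{5})\b", re.IGNORECASE)


def extract_gene_ids(lines):
    """Return the unique AGI IDs found at the start of each line, in input order."""
    seen = set()
    ids = []
    for line in lines:
        match = AGI_PATTERN.match(line.strip())
        if match:
            gene_id = match.group(1).upper()
            if gene_id not in seen:
                seen.add(gene_id)
                ids.append(gene_id)
    return ids


if __name__ == "__main__":
    # Two lines shaped like the example in the question (descriptions shortened).
    example = [
        "AT1G01050       Soluble inorganic pyrophosphatase 1 OS=Arabidopsis thaliana",
        "AT1G01800\tEnzyme classification.EC_1 oxidoreductases",
    ]
    print("\n".join(extract_gene_ids(example)))
```

The printed list can be pasted straight into the BioMart ID filter; lines that do not start with an AGI ID (headers, blank lines) are skipped.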

fin swimmer


Thanks a heap, finswimmer :)
