Question

Get all chemical compounds that interact with a target

5.7 years ago

emanismail.92 • 0

I want to get all PubChem chemical compounds (CIDs) that interact with a given UniProt target. I need this to study the relationship between chemical compounds and targets.

Thanks in advance

cid uniprot interaction • 3.9k views

ADD COMMENT • link updated 3.7 years ago by bhavya.v • 0 • written 5.7 years ago by emanismail.92 • 0

What have you tried?

ADD REPLY • link 5.7 years ago by Ram 43k

I tried the PubChem PUG REST API, but I can't reach such a service.
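For reference, PUG REST can be queried by protein target in two steps: list the assays (AIDs) deposited against a UniProt accession, then list the compounds tested active in each assay. The sketch below only builds the request URLs and parses the JSON. The helper names are hypothetical, and the response layouts used in the two parsers are assumptions that should be checked against a live response.

```python
import json
from urllib.parse import quote

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def aids_for_target_url(accession):
    """URL listing the assay IDs (AIDs) whose target is this protein accession."""
    return f"{PUG}/assay/target/accession/{quote(accession)}/aids/JSON"


def active_cids_url(aid):
    """URL listing the CIDs that were tested active in one assay."""
    return f"{PUG}/assay/aid/{int(aid)}/cids/JSON?cids_type=active"


def parse_aids(payload):
    """Pull AIDs out of an IdentifierList response (layout assumed)."""
    return json.loads(payload).get("IdentifierList", {}).get("AID", [])


def parse_cids(payload):
    """Pull unique CIDs out of an InformationList response (layout assumed)."""
    info = json.loads(payload).get("InformationList", {}).get("Information", [])
    cids = set()
    for entry in info:
        cids.update(entry.get("CID", []))
    return sorted(cids)
```

Each URL can be fetched with, for example, `requests.get(url, timeout=30).text` and the text passed to the matching parser. PubChem asks clients to stay under five requests per second, so a loop over many AIDs should be throttled.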

ADD REPLY • link 5.7 years ago by emanismail.92 • 0

What do you mean by "can't reach"? Where are you based? Is the website blocked there?

ADD REPLY • link 5.7 years ago by Ram 43k

Why are you interested in PubChem, which contains compounds linked to < 1,000 targets? Have you thought of exploring ChEMBL, which has a much larger proportion of active compounds identified using dose–response assays and linked to > 4,000 targets (see Gaulton et al. 2012 for the stats and more details on PubChem and ChEMBL)?

Because some of the PubChem data is available in ChEMBL, you could perhaps give the Open Targets Platform a go and get the (ChEMBL) drug compounds (with known mechanism of action) that modulate a target (as UniProt IDs or Ensembl gene IDs).

Drug information is also available via the Open Targets batch search or the REST API. Check this short animation to see how easy it is to run (and interpret) the batch search too. Or read the post on Open Targets and programmatic access. If REST is the way for you, this endpoint is one example of how to get the evidence used to link ENSG00000145335 and diseases when filtering for drug information (from ChEMBL):

https://api.opentargets.io/v3/platform/public/evidence/filter?target=ENSG00000145335&datasource=chembl
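The same query can be composed for any Ensembl gene ID. This is a minimal sketch that only builds the URL; the function name is hypothetical, the parameter names `target` and `datasource` are taken from the URL above, and whether the v3 endpoint still responds is not checked here.

```python
from urllib.parse import urlencode

BASE = "https://api.opentargets.io/v3/platform/public/evidence/filter"


def evidence_filter_url(target, datasource="chembl"):
    """Compose the evidence/filter query for one Ensembl gene ID."""
    query = urlencode({"target": target, "datasource": datasource})
    return f"{BASE}?{query}"


print(evidence_filter_url("ENSG00000145335"))
```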

This is how one can visualise the associated diseases in the user interface.

If you are not interested in associations with these diseases, I'd recommend exploring the ChEMBL web services API documentation.

ADD REPLY • link 5.7 years ago by Denise CS ★ 5.2k

Thanks a lot for your reply. I need the PubChem CID for each chemical compound. If I use ChEMBL, is there any way to get the PubChem CID from a ChEMBL ID?

ADD REPLY • link 5.7 years ago by emanismail.92 • 0

There is the PubChem Identifier Exchange Service. If you have a list of ChEMBL IDs, select the option "Synonyms" and CIDs as the output (or vice versa). I've tried CHEMBL1000, aka CETIRIZINE, which gives me PubChem CID 2678.
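The same conversion can be scripted. This is a minimal sketch, assuming the ChEMBL ID is registered as a synonym in PubChem so that the PUG REST compound name lookup resolves it. The function names are hypothetical, and the parser assumes the usual `IdentifierList` JSON layout.

```python
import json
from urllib.parse import quote

PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(chembl_id):
    """URL for a name (synonym) lookup that returns matching CIDs as JSON."""
    return f"{PUG}/compound/name/{quote(chembl_id)}/cids/JSON"


def parse_cid_list(payload):
    """Return the CIDs in the response, or an empty list if none were found."""
    return json.loads(payload).get("IdentifierList", {}).get("CID", [])
```

Fetching `cid_lookup_url("CHEMBL1000")` with any HTTP client and passing the body to `parse_cid_list` should give the CID quoted above, provided the synonym assumption holds.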

Check more on the theme of Converting between drug identifier formats here on Biostars, as another tool (Cactvs Cheminformatics Toolkit) has been mentioned as well.

I will check with ChEMBL whether they have a web interface toolkit (or plan to release one), since they already have everything cross-referenced anyway. Perhaps their web services could also do the conversion.

And last but not least, I also recommend checking Pierre Lindenbaum's comments ;-)

ADD REPLY • link 5.7 years ago by Denise CS ★ 5.2k

Ram · Answer 1 · 2020-08-18

Hi,

I have a similar problem. I had PubChem CIDs and wanted to extract the ChEMBL targets for those compounds. I used the PubChem Identifier Exchange Service to convert the PubChem CIDs to ChEMBL IDs in order to get the ChEMBL targets.

The problem is that I can't find an API call to get compound-related targets from ChEMBL. I don't know which endpoint extracts targets from a compound input. Please help me with this.

I used the following code to get targets. I get most of the targets, but I am not able to extract the UniProt ID and gene symbol with this API. Could you please help me?

Thank you in advance

import pandas as pd
from chembl_webresource_client.new_client import new_client

path = 'chembl.txt'  # input file: one ChEMBL compound ID per line

# Read the compound IDs; strip() copes with a missing final newline and blank lines
with open(path, 'r') as f:
    chembl_ids = [line.strip() for line in f if line.strip()]

# Collect one record per activity measured against a human target
records = []
for chembl_id in chembl_ids:
    res = new_client.activity.filter(molecule_chembl_id=chembl_id).only([
        'molecule_chembl_id',
        'target_chembl_id',
        'target_organism',
        'molecule_pref_name',
        'target_pref_name'
    ])

    for activity in res.filter(target_organism='Homo sapiens'):
        records.append({
            'Compound ID': activity['molecule_chembl_id'],
            'Compound name': activity['molecule_pref_name'],
            'Target ID': activity['target_chembl_id'],
            'Target name': activity['target_pref_name'],
        })

emblout_df = pd.DataFrame(records, columns=['Compound ID', 'Compound name', 'Target ID', 'Target name'])

# Keep one row per compound-target pair
emblout_df = emblout_df.drop_duplicates(['Compound ID', 'Target ID'])
emblout_df.reset_index(drop=True, inplace=True)

# Create the annotation columns up front so they exist even if no target matches
emblout_df['Uniprot accession'] = None
emblout_df['Gene symbol'] = None

# Look up the UniProt accession and gene symbol of each target
target = new_client.target
for row, chembl in enumerate(emblout_df['Target ID']):
    a = target.get(chembl)
    if not a:
        continue

    b = a['target_components']

    # Only single-component targets are annotated; complexes and families stay empty
    if len(b) == 1:
        c = b[0]
        emblout_df.at[row, 'Uniprot accession'] = c.get('accession')

        for symbol in c['target_component_synonyms']:
            if symbol['syn_type'] == 'GENE_SYMBOL':
                emblout_df.at[row, 'Gene symbol'] = symbol['component_synonym']
                break

# Write the table after the lookup so the new columns are included
emblout_df.to_csv('boswellia_targets.tsv', sep='\t', index=False)
print(emblout_df)
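The loop above only fills in targets with exactly one component, so protein complexes and families end up without an accession. One way around that, as a minimal sketch, is a helper that returns every component of a target record. The function name is hypothetical, and the sample record is hand-written with only the fields the code reads; it is not a full ChEMBL response.

```python
def target_components(target_record):
    """Return (accession, gene_symbol) for every component of a target record."""
    pairs = []
    for component in target_record.get('target_components') or []:
        gene_symbol = None
        for synonym in component.get('target_component_synonyms') or []:
            if synonym.get('syn_type') == 'GENE_SYMBOL':
                gene_symbol = synonym.get('component_synonym')
                break
        pairs.append((component.get('accession'), gene_symbol))
    return pairs


# Hand-written sample shaped like the fields used above (not real API output)
sample = {
    'target_chembl_id': 'CHEMBL203',
    'target_components': [{
        'accession': 'P00533',
        'target_component_synonyms': [
            {'syn_type': 'UNIPROT', 'component_synonym': 'Epidermal growth factor receptor'},
            {'syn_type': 'GENE_SYMBOL', 'component_synonym': 'EGFR'},
        ],
    }],
}

print(target_components(sample))
```

With the real client, `target_components(new_client.target.get(chembl))` would give one pair per component, which can then be written out as several rows or joined into one cell.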