Converting free-form molecule names into SMILES
4.9 years ago
mrbelt • 10

I have a list of names of molecules, some of them rather free-form (e.g. "Thiolated L-alanine"). I would like to convert these names into SMILES strings. Is there any tool that could help with these ambiguous names?

12 months ago
cdsouthan ♦ 1.8k

The PubChem Identifier Exchange Service will certainly help with the more standardised names that map to CIDs, which you can then download as SMILES.
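For the standardised names, the lookup can also be scripted. Below is a minimal sketch, assuming PubChem's PUG REST interface (a different route from the Identifier Exchange web form) and the `CanonicalSMILES` property name, which PubChem may rename over time. The function names are my own, not part of any library.

```python
import urllib.error
import urllib.parse
import urllib.request

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_smiles_url(name, prop="CanonicalSMILES"):
    """Build a PUG REST URL asking for one property of a compound by name."""
    # Percent-encode the name so spaces and slashes cannot break the path.
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_BASE}/compound/name/{quoted}/property/{prop}/TXT"


def name_to_smiles(name, timeout=30):
    """Return the SMILES strings PubChem lists for a name, or [] if it has none."""
    try:
        with urllib.request.urlopen(build_smiles_url(name), timeout=timeout) as resp:
            text = resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed: PubChem answers 404 for unknown names
            return []
        raise
    # One SMILES per line; a name can map to more than one CID.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example (needs network access): name_to_smiles("L-alanine")
```

An empty list, or a list with several entries, is the signal that the name needs a human look rather than a blind copy-paste.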

You can also try (if the SMILES download is working)

However, there is (by definition) no solution to ambiguous and/or non-standard names (free-form, as you put it). You will have to eyeball them or, better still, find the person who invented the free-form terms you are dealing with and get them to substitute clean standard ones such as IUPAC names (e.g. "Thiolated L-alanine" returns no hits even on Google).
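To keep the eyeballing manageable, it may help to separate the names that resolve cleanly from the ones that do not. This is only a rough sketch: `triage` is a hypothetical helper, and `resolve` stands for whatever lookup function you supply (it is assumed to return a list of candidate SMILES for a name).

```python
def triage(names, resolve):
    """Split names into {name: smiles} hits and a list needing manual review."""
    resolved, needs_review = {}, []
    for name in names:
        hits = resolve(name)
        if len(hits) == 1:
            resolved[name] = hits[0]
        else:
            # No hit, or several candidates: a person has to decide.
            needs_review.append(name)
    return resolved, needs_review
```

The `needs_review` list is what you take back to whoever coined the names.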

Nothing much I can add to this. BTW, this is exactly the reason why we need the literature to list InChIs or InChIKeys for all compounds, just like PDB identifiers or official gene names for proteins and DNA snippets.


