A naive biochemist wakes up to the closed world of chemical abstracts and such

We have a project in the lab that involves screening, on a “lab scale”, for small molecules that inhibit the transport activity of a membrane protein. Having identified one such inhibitor, we intended to look for similar molecules that share the same substructure. Substructure query is a standard procedure in chemical informatics. In the past I have screencast the use of the Sigma-Aldrich service to identify molecules from Sigma's catalog based on similarity. However, considering the wealth of biochemically relevant information PubChem offers, I was curious to try out its substructure query. This PubChem service works great and is very feature rich (screencast coming soon), and it gave me several molecules that could be of interest in my screen.

The next step, I assumed, was to locate these compounds in the catalogs of the many chemical providers using a suitable lookup ID. Naively, I assumed this would be the CAS ID, which is the “unique id” associated with each molecule. An hour of googling later, I woke up to the realization that CAS is a closed, subscription-based service which has fought many political battles against the PubChem database. Also, while PubChem fortunately (and, I guess, surprisingly) allows lookups of its data by CAS ID, sadly it does not spit out CAS IDs for the molecules it identifies as related (at least as far as I could tell).

I am glad for the Entrez-provided services that help look up the CID (PubChem's ID) for a given CAS ID, and am now wishing I could go the other way, i.e. CID to CAS.
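One partial workaround for the CID to CAS direction is to scan whatever names or synonyms a database lists for a compound and keep the ones that look like CAS numbers. The sketch below is only that, a sketch: it assumes the synonym list has already been obtained somehow (the list in the usage note is made up for illustration), and the function names `is_valid_cas` and `cas_candidates` are hypothetical. It relies on the published CAS format (2 to 7 digits, 2 digits, 1 check digit) and the check-digit rule, where each digit before the check digit is weighted by its position counted from the right and the sum is taken modulo 10.

```python
import re

# CAS numbers have the shape NNNNNNN-NN-N: 2 to 7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(text):
    """Return True if text is formatted like a CAS number and its check digit matches."""
    match = CAS_PATTERN.match(text.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


def cas_candidates(synonyms):
    """Keep only the synonyms that pass the CAS format and check-digit test."""
    return [s.strip() for s in synonyms if is_valid_cas(s)]
```

For example, `cas_candidates(["sodium cacodylate", "6131-99-3"])` keeps only `"6131-99-3"`. Passing the check only means a string is well formed; it does not prove that CAS actually assigned that number to the compound, so any hit still needs checking against a vendor catalog.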

It's been almost 10 years since I used the CAS abstracts, since I mostly use the literature search available for free at PubMed. I guess I am finally waking up to the closed world of the chemical abstracts offered by the CAS service of the American Chemical Society. Seeing a non-profit service be this closed makes me thankful that Entrez and the NCBI are this open. With all this talk of open source drug discovery, I would think that the least we can do is make our unique ID lookups freely interconvertible and public.

refs: The Ridiculous Battles (my words) of PubChem vs CAS

Who has got the Bottle 

9 responses to “A naive biochemist wakes up to the closed world of chemical abstracts and such”

  1. Molecules don’t have IDs – they have structures. The only thing worth using is an ID calculated from the structure, or else just the structure itself. You can use eMolecules or ChemSpider to query vendors' catalogues using the structure or SMILES.

  2. Thanks for the pointer to ChemSpider.
    The reason I use the CAS ID is simple: it's easier to clue into similarity based on a numerical ID, which the CAS ID represents, especially for someone not used to SMILES strings or any such structural representation.

    Also, most catalogs only give the canonical name, the molecular formula and the CAS ID, so it's easy to see that, say, sodium cacodylate and sodium dimethylarsinate trihydrate are the same compound, since they generally have the same CAS ID (6131-99-3).

  3. Harijay, that CAS number search gets you here.. http://www.chemspider.com/Chemical-Structure.21111.html and all the associated names including both you listed (see under the list of names and synonyms). You may be interested in this: http://www.chemspider.com/blog/cas-registry-numbers-and-how-confused-we-are.html

  4. Thanks, useful material. Has added your blog in bookmarks.

  5. I use the CAS ID !

  6. It's nice to read interesting and intelligent thoughts in Russian. I have been living in England for five years now.

  7. The good site very interesting!!! Write to a thicket, it is pleasant to me!

  8. Thanks, useful material. Has added your blog in bookmarks.
    Thanks for the advice, I'll try applying it myself.

  9. Oh, I see the buttons now! Stitches looks like it was fabulous — the Madelinetosh yarns are especially gorgeous. Click http://d2.ae/hool090630
