Why Google may be better to find Uniprot sequences than the NCBI

My good friend Deepak had a quote on his blog from Lincoln Stein about making bioinformatics as much an everyday tool for the practicing biologist as a pipettor (a device used by experimental biologists and chemists to dispense liquids).

I totally agree, but I think we are still quite far away. For example, this morning I had to obtain the sequences of 772 Swiss-Prot entries that were part of an alignment, for some downstream analysis. Of course, my first choice was to query the NCBI Entrez database. I soon realized that the NCBI query box did not return any results for the first few queries I tried, all of which were probably new UniProt/Swiss-Prot IDs (e.g. sequence IDs Q57T52_SALCH, Q325Y4_SHIBS).

Disappointed, I turned to the EBI search engine. Within seconds I realized that the EBI does indeed serve up all of the entries. So there is a subset of UniProt entries that the NCBI does not have in its database.
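For a list as long as 772 IDs, typing each one into a search box gets old quickly, so the lookup is worth scripting. Here is a minimal sketch in Python of how that could look. It makes two assumptions that are not from this post: that the present-day UniProt REST URL pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` is the one to use, and that the IDs are of the `ACCESSION_SPECIES` form seen above (Q57T52_SALCH), so that the part before the underscore is the accession. Reviewed Swiss-Prot names such as INS_HUMAN do not follow that pattern and would need a different lookup. The helper names are made up for illustration.

```python
from urllib.request import urlopen

# Assumption: present-day UniProt REST URL pattern, not something used in this post.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def accession_from_entry_name(entry_name):
    """Return the part before the underscore, e.g. 'Q57T52' for 'Q57T52_SALCH'.

    Only valid for IDs of the ACCESSION_SPECIES form; names with a
    mnemonic prefix (INS_HUMAN) are not accessions and will not resolve.
    """
    prefix, separator, _species = entry_name.strip().partition("_")
    if not separator or not prefix:
        raise ValueError(f"not an ACCESSION_SPECIES style ID: {entry_name!r}")
    return prefix


def fasta_url(entry_name):
    """Build the FASTA download URL for one entry name."""
    return UNIPROT_FASTA_URL.format(accession=accession_from_entry_name(entry_name))


def fetch_fasta(entry_name, timeout=30):
    """Download one entry as FASTA text (needs network access)."""
    with urlopen(fasta_url(entry_name), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # The two IDs mentioned above; a real run would loop over all 772.
    for name in ["Q57T52_SALCH", "Q325Y4_SHIBS"]:
        print(name, "->", fasta_url(name))
```

Entries can be merged or retired over time, so a real batch run should expect the occasional failed download and log it rather than stop.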

Out of sheer curiosity I entered the queries that drew a blank at the NCBI into Google.

Wonder of wonders, Google pulled up all of the hard-to-find UniProt entries as the very first match.
Thanks to the increasing use of publicly accessible web service APIs, Google is becoming more and more aware of a lot of very specific sequence data.

I will be very happy when I can type Q57T52_SALCH calc=MW and get an answer back right inside Google. Maybe that day bioinformatics will move one step closer to becoming just another tool.
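Until a search box understands calc=MW, the calculation itself is small enough to do locally once the sequence is in hand. Here is a rough sketch in Python. It assumes a single-record FASTA string containing only the 20 standard amino acids, and it uses the commonly tabulated average residue masses plus one water for the free termini. The function names are made up for illustration, and the result ignores any post-translational modifications.

```python
# Average residue masses in daltons (each amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal H and OH


def sequence_from_fasta(fasta_text):
    """Join the sequence lines of a single-record FASTA string.

    Header lines (starting with '>') and blank lines are skipped.
    """
    parts = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if line and not line.startswith(">"):
            parts.append(line)
    return "".join(parts).upper()


def molecular_weight(sequence):
    """Average molecular weight in daltons of an unmodified peptide chain."""
    if not sequence:
        raise ValueError("empty sequence")
    try:
        residue_total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    except KeyError as err:
        # Ambiguity codes such as X or B have no single mass.
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return residue_total + WATER_MASS


if __name__ == "__main__":
    # Made-up toy record, not a real UniProt entry.
    toy_fasta = ">toy|example\nGAG\nAG\n"
    toy_sequence = sequence_from_fasta(toy_fasta)
    print(toy_sequence, round(molecular_weight(toy_sequence), 2))
```

Feeding it the FASTA text of a real entry works the same way, as long as the sequence has no ambiguity codes.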

Till then, I am stuck learning about Equery, WSDL, SOAP, and so on.


4 responses to “Why Google may be better to find Uniprot sequences than the NCBI”

  1. Pingback: business|bytes|genes|molecules

  2. Cool, incorporate gapped alignment [ http://en.wikipedia.org/wiki/Dynamic_programming ] and the extreme value distribution [ http://en.wikipedia.org/wiki/Generalized_extreme_value_distribution ], roll them in with their AI expertise [ http://norvig.com/ ], and we have something cooler 🙂
    A query like http://www.google.co.in/search?hl=en&as_qdr=all&q=ccatcagcaa+http%3A%2F%2Fwww.ncbi.nlm.nih.gov&btnG=Search&meta= should bring up all the sequences close to the pattern “ccatcagcaa”!

  3. Pingback: Back on the NCBI horse « The Omics world

  4. After checking out a handful of the articles on your web page, I truly like your technique of blogging.
    I added it to my bookmark site list and will be
    checking back soon. Please visit my web site
    as well and tell me what you think.
