Brief Overview of SolrSherlock and Open DeepQA

jackpark

Sep 4, 2013, 10:17:33 AM

to qa-...@googlegroups.com

This is an overview of the SolrSherlock project. Working on it made me recognize that there were already projects 'out there' that fall into a general category I call Open DeepQA, in the same spirit as Open Access, Open Science, and so forth.

SolrSherlock started life as SolrWatson, but Jim Spohrer at IBM pointed out that the Watson project's name has its intellectual and IP roots in Watson, the early chairman of IBM; he suggested SolrDrWatson, and Tom Munnecke rounded that out with SolrSherlock, which slides off one's tongue quite nicely! The "Solr" in SolrSherlock refers to the original project's core persistence/index mechanism: Apache Solr. Thinking ahead, it's not clear that Solr is the only approach, so a daughter project envisions other back ends, and I've named that OpenSherlock. For now, my own work develops around the Solr platform.

The notion that Watson itself will go open source is not well founded; some places on the web suggest that students at rpi.edu might replace internal code with code that could be open sourced, but that remains to be seen. In any case, IBM already gifted the world with UIMA, a core part of the Watson architecture.

SolrSherlock itself started out as a merge agent for a topic map platform. In some topic maps, merge decisions are based on comparing certain topic property values, much as one determines sameness in an OWL ontology by comparing objects' RDF-IDs. But what about topics which do not carry the same, say, RDF-ID, but which are really about the same subject? In the database world this is a richly studied problem, known variously as record reconciliation, database merge, database federation, and so forth. Merge processes are complex; as I worked on merging the truly complex objects I described in my thesis proposal, I began to realize that I was working on a project which would exhibit behaviors not unlike those demonstrated by Watson. Thus, SolrSherlock is an outgrowth of topic map merge agent development, and serving as a merge agent is one of its use cases. There, it must answer this question: "Have I seen this before?"
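To make the "Have I seen this before?" question a bit more concrete, here is a minimal Python sketch of a two-step merge decision: first the identity rule (shared identifiers), then a fallback for topics that carry different identifiers but look like the same subject. This is only an illustration under stated assumptions, not SolrSherlock's code: the Topic class, the seen_before function, the token-overlap (Jaccard) heuristic, the 0.5 threshold, and the example.org identifiers are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical, simplified topic; not the SolrSherlock data model."""
    identifiers: set = field(default_factory=set)  # e.g. RDF-IDs
    label: str = ""
    description: str = ""


def tokens(text):
    """Lowercased word set, used for a crude similarity measure."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Jaccard overlap of two token sets, in the range [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def seen_before(candidate, known, threshold=0.5):
    """Return a known topic judged to be the same subject, else None.

    Step 1: identity rule; any shared identifier means "same topic".
    Step 2: fallback heuristic; best token overlap of label plus
            description, accepted only at or above the threshold.
    The threshold is an arbitrary illustrative value.
    """
    for topic in known:
        if candidate.identifiers & topic.identifiers:
            return topic

    cand_tokens = tokens(candidate.label + " " + candidate.description)
    best, best_score = None, 0.0
    for topic in known:
        topic_tokens = tokens(topic.label + " " + topic.description)
        score = jaccard(cand_tokens, topic_tokens)
        if score > best_score:
            best, best_score = topic, score
    return best if best_score >= threshold else None


# Illustrative data only.
known = [
    Topic({"http://example.org/id/solr"}, "Apache Solr",
          "open source search platform"),
    Topic({"http://example.org/id/uima"}, "Apache UIMA",
          "unstructured information management architecture"),
]
same_id = Topic({"http://example.org/id/solr"}, "Solr")
no_id = Topic(set(), "Apache Solr", "search platform")
unrelated = Topic(set(), "Topic maps", "subject-centric knowledge representation")
```

With this data, seen_before(same_id, known) matches on the shared identifier, seen_before(no_id, known) falls through to the overlap heuristic, and seen_before(unrelated, known) finds nothing. A real merge agent would need far richer evidence than word overlap, which is exactly where the DeepQA-like behavior comes in.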

I first described SolrSherlock on my blog. The project's primary thinking document is on DebateGraph. Code for the core project resides on GitHub.

Adam Gibson

Sep 4, 2013, 10:22:29 AM

to qa-...@googlegroups.com

Great information. I'm going to take some of this and put it on the GitHub wiki (at least until we can get a dedicated solrsherlock.org), so that we have somewhere to link to and this doesn't get buried.

--
You received this message because you are subscribed to the Google Groups "qa-oss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qa-oss+un...@googlegroups.com.
To post to this group, send email to qa-...@googlegroups.com.
Visit this group at http://groups.google.com/group/qa-oss.
To view this discussion on the web visit https://groups.google.com/d/msgid/qa-oss/a77a895b-fb2a-42d1-bcf1-ba330765f41a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Jack Park

Sep 5, 2013, 12:34:48 PM

to Adam Gibson, qa-...@googlegroups.com

SolrSherlock.org is already owned; I will ask my ISP to redirect it to
http://solrsherlock.github.io/SolrSherlock/
for the time being.

I'd like to think that it will eventually become a "condo" in the
knowledge garden.

