http://bigdata.sourceforge.net/pubs/bigdata-oscon-7-23-08.pdf
It's open-source, and the presentation looks intriguing; it definitely
looks worth trying out. It's apparently also among the largest
RDF triple stores deployed (http://esw.w3.org/topic/LargeTripleStores).
Chris - has anyone tried to layer the OBD API on top of this?
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at duke dot edu :
===========================================================
> the core API capabilities have progressed substantially,
> particularly for statistical and semantic similarity measures. In
> the OBDSQL implementation, these make heavy use of
> aggregate query operators (count, group by). There's no equivalent
> in SPARQL, so to be honest I don't have a clue how to go about
> supporting these in the RDFShard other than expensive read-all-
> objects-into-memory-then-count methods.
That's a good point. Personally, I tend to think that statistical
and similarity searches may be better approached the way they are
for sequence databases, or for databases of high-dimensional data
such as expression profiling. That is, they should probably be
external tools with their own optimized indexes rather than being
layered on top of the capabilities of SQL. Otherwise algorithmic
innovation may be too constrained as well.
So I'm not too concerned about that part, actually, because I think
eventually this will be outside of SQL (or whatever storage model one
uses for the assertions and inferences) anyway.
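
To make the trade-off in the quoted text concrete, here is a minimal
sketch contrasting an in-database aggregate (count, group by) with the
read-everything-then-count approach. The table name, column names and
toy data are hypothetical and are not taken from OBD or its schema;
SQLite merely stands in for whatever SQL backend is used.

```python
import sqlite3
from collections import Counter

# Hypothetical toy data: (entity, annotated class) pairs, not real OBD content.
ANNOTATIONS = [
    ("gene1", "fin"),
    ("gene2", "fin"),
    ("gene2", "eye"),
    ("gene3", "eye"),
    ("gene3", "fin"),
]


def counts_via_sql(rows):
    """Let the database aggregate; only one row per class is returned."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE annotation (entity TEXT, cls TEXT)")
    conn.executemany("INSERT INTO annotation VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT cls, COUNT(*) FROM annotation GROUP BY cls"
    )
    result = dict(cursor)  # each result row is a (cls, count) pair
    conn.close()
    return result


def counts_via_client(rows):
    """Pull every row to the client and count there (no aggregate operator)."""
    return dict(Counter(cls for _entity, cls in rows))


if __name__ == "__main__":
    print(counts_via_sql(ANNOTATIONS))
    print(counts_via_client(ANNOTATIONS))
```

Both functions return the same per-class counts; the difference is
where the work happens and how much data has to cross the wire, which
is the cost the quoted text worries about for a store without
aggregate operators.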
- Jim