See inline...
On Wednesday, September 4, 2013 6:22:33 PM UTC-7, Wes Freeman wrote:
Maybe I'm not fully qualified to answer this, as I don't have a deep understanding of triple store implementations, but here are a few discussion points:
a) neo4j aims to capture the 95% use case of very fast localized traversals
Where did the 95% figure come from? Do you have any comparisons of how neo4j accomplishes these traversals relative to one of the standard graph notations?
Also, I am not sure how "very fast localized traversals" compares with other graph databases. I think it is fair to say the programming style of traversing a small part of a large graph is *different* in neo4j than it is using, say, the sparql query language. I am not sure you can say it is faster, especially given that neo4j is a single implementation, while sparql is a standard query language that may be running on a wide variety of implementations, from browsers/javascript to various server implementations.
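To make "localized traversal" concrete, here is a minimal sketch in plain Python. It is not neo4j or sparql code; the adjacency dictionary and the `local_traversal` name are made up for illustration. The point is only that the work done depends on the neighborhood around the start node, not on the size of the whole graph, and any implementation behind either style of query is free to do something like this.

```python
from collections import deque


def local_traversal(adjacency, start, max_depth):
    """Breadth-first walk from `start`, visiting nodes at most `max_depth` hops away.

    Returns a dict mapping each reached node to its hop distance from `start`.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_depth:
            # Do not expand beyond the requested neighborhood.
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = depth + 1
                queue.append(neighbor)
    return seen


# Hypothetical toy graph: node "e" is three hops from "a" and is never touched.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}
reached = local_traversal(graph, "a", 2)
```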
b) neo4j is a pragmatic graph database that implements a "property graph" model, with nodes and relationships that can have properties on each, along with labels on nodes in 2.0--I understand that this is all possible in RDF, but it is a bit harder to grasp how properties on predicates work; the concept of RDF reification is a bit unfriendly, at least to newcomers
How are you measuring "pragmatic" in this case? I would say that neo4j will likely be more immediately familiar to a java (or ruby or... OO) programmer than would a logic-based language like sparql, owl2, etc., although people with SQL experience may be more immediately comfortable with sparql.
I have not seen any anecdotes about pragmatism once people have gone through the learning curve of either neo4j or other graph systems. I can only speak for myself, and I have relatively little neo4j experience.
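For readers who have not run into the reification point in (b), here is a rough sketch of the difference in plain Python. The `relationship` dict and the `reify` helper are hypothetical and only for illustration; the `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms are the standard RDF reification vocabulary. Identifiers such as `ex:alice` are invented.

```python
# Property-graph style: the relationship itself carries the property.
relationship = {
    "start": "alice",
    "type": "KNOWS",
    "end": "bob",
    "properties": {"since": 2010},
}


def reify(subject, predicate, obj, statement_id, properties):
    """Return triples that describe a statement and attach properties to it."""
    triples = [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]
    # Each property becomes one more triple about the statement node.
    for key, value in properties.items():
        triples.append((statement_id, key, value))
    return triples


# RDF style: one annotated relationship turns into five triples.
triples = reify("ex:alice", "ex:knows", "ex:bob", "ex:stmt1", {"ex:since": 2010})
```

Note that the reification triples only describe the statement; the original triple (`ex:alice ex:knows ex:bob`) would still have to be asserted separately, which is part of why newcomers find it awkward.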
c) under the hood, neo4j stores nodes and relationships separately from their properties (with pointers between them), so traversals don't need to be bogged down by the properties if you don't need to inspect them
RDF per se is a conceptual data model and so does not have anything under the hood. Specific implementations of graph databases that read/write RDF serializations and that provide the sparql query language can employ a wide variety of in-memory, disk-based, distributed, etc. strategies. These are implementation distinctions and there are a variety of them in the wild.
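As one example of such an implementation strategy, the layout described in (c) can be sketched roughly as follows. This is a toy model in plain Python under my own assumptions, not neo4j's actual record format: `ToyStore`, `NodeRecord`, `RelRecord` and the `prop_reads` counter are all invented names. It only shows that a traversal can follow relationship records without ever touching the property store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NodeRecord:
    rel_ids: List[int] = field(default_factory=list)  # pointers into the relationship store
    prop_id: Optional[int] = None                     # pointer into the property store


@dataclass
class RelRecord:
    start: int
    end: int
    rel_type: str
    prop_id: Optional[int] = None


class ToyStore:
    def __init__(self):
        self.nodes: Dict[int, NodeRecord] = {}
        self.rels: Dict[int, RelRecord] = {}
        self.props: Dict[int, dict] = {}
        self.prop_reads = 0  # counts how often the property store is read

    def _store_props(self, props):
        if not props:
            return None
        prop_id = len(self.props)
        self.props[prop_id] = dict(props)
        return prop_id

    def add_node(self, node_id, **props):
        self.nodes[node_id] = NodeRecord(prop_id=self._store_props(props))

    def add_rel(self, rel_id, start, end, rel_type, **props):
        self.rels[rel_id] = RelRecord(start, end, rel_type, self._store_props(props))
        self.nodes[start].rel_ids.append(rel_id)

    def neighbors(self, node_id):
        # Follows relationship pointers only; no property lookups happen here.
        return [self.rels[r].end for r in self.nodes[node_id].rel_ids]

    def node_props(self, node_id):
        self.prop_reads += 1
        prop_id = self.nodes[node_id].prop_id
        return {} if prop_id is None else self.props[prop_id]


store = ToyStore()
store.add_node(1, name="Alice")
store.add_node(2, name="Bob")
store.add_node(3)
store.add_rel(10, 1, 2, "KNOWS", since=2010)
store.add_rel(11, 1, 3, "KNOWS")
hops = store.neighbors(1)          # traversal without property access
reads_after_traversal = store.prop_reads
alice = store.node_props(1)        # properties are fetched only on demand
```

A triple store could use a similar separation, or something entirely different; the sketch says nothing about which is faster in practice.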
e) cypher is a compelling new query language designed to match graph patterns from start points, designed by the neo team for neo4j
Yes, I would summarize the difference this way: neo4j is an implementation of a graph database, while RDF, sparql, owl2, etc. are specifications covering a fairly vast array of capabilities (inference, logic, schema, serialization, universal identifiers, etc.) that have many implementations and that go well beyond what neo4j per se provides. But there's no reason these standards and capabilities could not be fully implemented with neo4j at the core.