Hi,
*** EXPERIMENTAL FEATURE -- NO RELEASE DATE -- REQUEST FOR COMMENT ***
I have been working with Luca (of OrientDB) for the last couple of days, and we have added some more methods to GraphQuery and VertexQuery. They provide richer semantics for querying a graph/vertex, depending on the underlying database's support for graph-centric and vertex-centric indices. I would like to get people's thoughts on these methods. The work is currently in the queryfeature/ branch available here:
Here are the new methods:
has(key) = "return those elements that have a property with the provided key."
hasNot(key) = "return those elements that do not have a property with the provided key."
has(key,values…) = "return those elements that have their property equal to any of the var arg values."
has(key, compare, values…) = "return those elements whose property satisfies the given comparison against any of the var arg values."
limit(skip, total) = "skip the first 'skip' elements and then return at most 'total' elements."
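To make the intended semantics concrete, here is a minimal sketch in plain Java. It is not the Blueprints API: SketchQuery, its Compare enum, and results() are hypothetical names made up for illustration, elements are modelled as plain property maps, and every filter is a linear scan. Treating a missing property as matching no comparison is my assumption, not something settled above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for the proposed methods; NOT the Blueprints API.
class SketchQuery {
    // Hypothetical comparison tokens (NOT_EQUAL is the only one named in this post).
    enum Compare { EQUAL, NOT_EQUAL, GREATER_THAN, LESS_THAN }

    private final List<Map<String, Object>> elements;
    private final List<Predicate<Map<String, Object>>> filters = new ArrayList<>();
    private long skip = 0;
    private long total = Long.MAX_VALUE;

    SketchQuery(List<Map<String, Object>> elements) { this.elements = elements; }

    // has(key): keep elements that have a property with this key.
    SketchQuery has(String key) {
        filters.add(e -> e.containsKey(key));
        return this;
    }

    // hasNot(key): keep elements that lack a property with this key.
    SketchQuery hasNot(String key) {
        filters.add(e -> !e.containsKey(key));
        return this;
    }

    // has(key, values...): property equal to ANY of the values (OR).
    SketchQuery has(String key, Object... values) {
        return has(key, Compare.EQUAL, values);
    }

    // has(key, compare, values...): comparison holds against ANY of the values (OR).
    SketchQuery has(String key, Compare compare, Object... values) {
        filters.add(e -> {
            Object actual = e.get(key);
            if (actual == null) return false; // assumption: missing property never matches
            return Arrays.stream(values).anyMatch(v -> matches(actual, compare, v));
        });
        return this;
    }

    // limit(skip, total): skip the first `skip` matches, return at most `total`.
    SketchQuery limit(long skip, long total) {
        this.skip = skip;
        this.total = total;
        return this;
    }

    // Chained filters are ANDed together; evaluation is a plain linear scan.
    List<Map<String, Object>> results() {
        return elements.stream()
                .filter(e -> filters.stream().allMatch(f -> f.test(e)))
                .skip(skip)
                .limit(total)
                .collect(Collectors.toList());
    }

    @SuppressWarnings("unchecked")
    private static boolean matches(Object actual, Compare compare, Object value) {
        switch (compare) {
            case EQUAL: return actual.equals(value);
            case NOT_EQUAL: return !actual.equals(value);
            case GREATER_THAN: return ((Comparable<Object>) actual).compareTo(value) > 0;
            case LESS_THAN: return ((Comparable<Object>) actual).compareTo(value) < 0;
            default: throw new IllegalArgumentException("unknown compare: " + compare);
        }
    }
}
```

With this sketch, new SketchQuery(elements).has("name", "marko", "luca").has("age").limit(0, 10).results() would read as "name is marko OR luca, AND an age property exists, first ten matches."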
My personal notes:
1. I've noticed has(key,NOT_EQUAL,null) in numerous client code bases and this is bad as we don't want to make "null" be a token for "emptiness."
- hence the inclusion of has(key) and hasNot(key).
- also, with var args in has(), "null" as an Object becomes ambiguous.
2. While developing Faunus, has(key,values…) was needed to make ORing efficient in MapReduce. It's generally pleasing as it moves us away from filter{ it.value == x || it.value == y || …}.
- we now have AND/OR semantics in querying -- has()-chaining is AND and has(values…) is OR.
- these method signatures, while not a "query" language in themselves, will find their way through to Gremlin, as Gremlin may compile such chains into the respective queries as needed.
3. OrientDB's SKIP/LIMIT functionality is very handy for paging results and for those databases that support non-iterative "jumping," this will be efficient.
Note that not all databases will support all of these efficiently; at minimum, they will be implemented with linear scans. As always, Blueprints provides default implementations for those databases that don't have anything more intelligent than what can be accomplished via the Blueprints-specific methods. Finally, these default implementations can be extended as needed (e.g. OrientGraphQuery/OrientVertexQuery) to weave in database-specific optimizations that implement the same semantics efficiently.
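As a rough illustration of that default-plus-extension idea, here is a small self-contained Java sketch. Everything in it is hypothetical (DefaultSketchQuery, IndexedSketchQuery, and the single-key hash index are made up for this example) and it says nothing about how OrientGraphQuery/OrientVertexQuery are actually written. The default class answers has(key, value) with a linear scan; the subclass keeps the same semantics but answers from an index when the key happens to be indexed, falling back to the scan otherwise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical default: a linear scan over every element.
class DefaultSketchQuery {
    protected final List<Map<String, Object>> elements;

    DefaultSketchQuery(List<Map<String, Object>> elements) { this.elements = elements; }

    List<Map<String, Object>> has(String key, Object value) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> e : elements) {
            if (value.equals(e.get(key))) out.add(e);
        }
        return out;
    }
}

// Hypothetical database-specific extension: same results, index lookup where possible.
class IndexedSketchQuery extends DefaultSketchQuery {
    private final String indexedKey;
    private final Map<Object, List<Map<String, Object>>> index = new HashMap<>();

    IndexedSketchQuery(List<Map<String, Object>> elements, String indexedKey) {
        super(elements);
        this.indexedKey = indexedKey;
        // Build a value -> elements index for the one indexed key.
        for (Map<String, Object> e : elements) {
            Object v = e.get(indexedKey);
            if (v != null) index.computeIfAbsent(v, k -> new ArrayList<>()).add(e);
        }
    }

    @Override
    List<Map<String, Object>> has(String key, Object value) {
        if (key.equals(indexedKey)) {
            // Indexed key: direct lookup instead of scanning.
            return new ArrayList<>(index.getOrDefault(value, new ArrayList<>()));
        }
        // Anything else falls back to the default linear scan.
        return super.has(key, value);
    }
}
```

Callers see identical results from either class; only the cost of the lookup differs.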
Thank you,
Marko.