[Blueprints] Edge Filters in Vertex API -- Looking to the Future of Distributed Graph Databases

Marko Rodriguez

Apr 25, 2012, 12:33:19 PM
to gremli...@googlegroups.com
Hello,

With theoretical and design help from Matthias Broecheler, the core Blueprints API has changed to prepare TinkerPop for massive-scale, distributed graph databases. Do not fear, though: the API is backwards compatible. Here is the new Vertex API:

public Iterable<Edge> getOutEdges(Object... filters);
public Iterable<Edge> getInEdges(Object... filters);

Each element of the Object[] can be either a String (an edge label) or a Filter. Filter is a new class (not an interface) in the core API.


In this model, you can now filter the edges returned by more than just their label.

marko.getOutEdges("likes", new Filter().property("stars", 5))
marko.getOutEdges("likes", new Filter().property("stars", 5).range("date", 1, 100))

Because the edge label (String) is itself a filtering criterion, you can also express it inside the Filter:

marko.getOutEdges(new Filter().label("likes").property("stars",5))

For graph engines that do not support low-level filtering (i.e. where such filtering must be done in-memory), there is a class called FilteredEdgeIterable. If you are a graph database vendor that does not maintain your Blueprints implementation with TinkerPop, please contact me if you have trouble updating your implementation to support Filter. For those who do maintain their implementation with TinkerPop, it has already been updated, but it currently uses FilteredEdgeIterable. If you can support the "push down" of Filters, please do so, as this will make your implementation faster.
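
To make the in-memory fallback concrete, here is a minimal, self-contained sketch. It is not the actual FilteredEdgeIterable or Filter source: the Edge, Filter, and filterEdges names below are simplified stand-ins written for illustration. It also assumes that range() is start-inclusive and end-exclusive and that several String labels are OR'd together, neither of which is stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FilterSketch {

    // Hypothetical stand-in for a Blueprints Edge: a label plus a property map.
    static final class Edge {
        final String label;
        final Map<String, Object> properties = new HashMap<>();

        Edge(String label) { this.label = label; }

        Edge with(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    // Hypothetical Filter mirroring the fluent calls used in the examples above.
    static final class Filter {
        private final List<Predicate<Edge>> tests = new ArrayList<>();

        Filter label(String label) {
            tests.add(e -> e.label.equals(label));
            return this;
        }

        Filter property(String key, Object value) {
            tests.add(e -> value.equals(e.properties.get(key)));
            return this;
        }

        // Assumption: start is inclusive, end is exclusive.
        Filter range(String key, long start, long end) {
            tests.add(e -> {
                Object v = e.properties.get(key);
                return v instanceof Number
                        && ((Number) v).longValue() >= start
                        && ((Number) v).longValue() < end;
            });
            return this;
        }

        // An edge passes only if every condition added so far holds.
        boolean matches(Edge e) {
            for (Predicate<Edge> t : tests) {
                if (!t.test(e)) return false;
            }
            return true;
        }
    }

    // In-memory filtering: touch every edge, keep the ones that pass.
    static List<Edge> filterEdges(Iterable<Edge> edges, Object... filters) {
        List<String> labels = new ArrayList<>();
        List<Filter> objectFilters = new ArrayList<>();
        for (Object f : filters) {
            if (f instanceof String) labels.add((String) f);
            else if (f instanceof Filter) objectFilters.add((Filter) f);
            else throw new IllegalArgumentException("Expected String or Filter: " + f);
        }
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            // Assumption: multiple labels are alternatives (OR).
            boolean ok = labels.isEmpty() || labels.contains(e.label);
            for (Filter f : objectFilters) ok = ok && f.matches(e);
            if (ok) result.add(e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> out = Arrays.asList(
                new Edge("likes").with("stars", 5).with("date", 50),
                new Edge("likes").with("stars", 3).with("date", 50),
                new Edge("knows").with("stars", 5).with("date", 500));
        System.out.println(filterEdges(out, "likes", new Filter().property("stars", 5)).size());
        System.out.println(filterEdges(out, new Filter().label("likes").range("date", 1, 100)).size());
    }
}
```

Note that this approach still iterates over every edge of the vertex, which is exactly the cost a native "push down" implementation avoids.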


The benefit of this model is that as we move into supporting distributed, massive-scale graph databases (e.g. in clustered environments), we need to be able to provide more information to the database so that the database can do as much filtering as possible on the edges touched. In this respect, various problems commonly seen in massive-scale graphs are rectified:

1. Super node problem: with sufficient filtering and intelligent disk layout, there may never be a need to return all edges of a vertex, and thus a vertex (from a filtered vantage point) is no longer a "super node."
e.g. marko.getOutEdges("tweeted", new Filter().range("date", tenDaysAgo, today))
2. Respect the dimensionality of a multi-relational graph: if a graph is laid out on the disk properly, then the bounds of the spinning platter's retrieval are sufficient to filter edges. In this way, the "Big Data" speed pattern of sequential disk scans can be respected and taken advantage of.
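
As a rough illustration of both points, the sketch below keeps the edges of each label sorted by a "date" key, so a range filter becomes one contiguous scan instead of a pass over every edge. This is an in-memory analogy only, not how any particular database lays out its disk. SortedEdgeStoreSketch and its methods are hypothetical names, and the range is assumed to be start-inclusive and end-exclusive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SortedEdgeStoreSketch {

    // label -> (date -> targets of edges with that date), kept sorted by date.
    private final Map<String, NavigableMap<Long, List<String>>> byLabel = new HashMap<>();

    void addEdge(String label, long date, String target) {
        byLabel.computeIfAbsent(label, k -> new TreeMap<>())
               .computeIfAbsent(date, k -> new ArrayList<>())
               .add(target);
    }

    // Only the entries inside [start, end) are visited; edges with other
    // labels or dates are never touched.
    List<String> range(String label, long start, long end) {
        List<String> out = new ArrayList<>();
        NavigableMap<Long, List<String>> sorted = byLabel.get(label);
        if (sorted == null || start >= end) return out;
        for (List<String> bucket : sorted.subMap(start, true, end, false).values()) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        SortedEdgeStoreSketch store = new SortedEdgeStoreSketch();
        store.addEdge("tweeted", 10L, "tweet-a");
        store.addEdge("tweeted", 20L, "tweet-b");
        store.addEdge("tweeted", 90L, "tweet-c");
        store.addEdge("follows", 20L, "user-x");
        System.out.println(store.range("tweeted", 15L, 100L));
    }
}
```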

Finally, not now, but before the release of Blueprints 1.3, there will also be another change to the Vertex API. We plan to support:

public Iterable<Edge> getOutVertices(Object... filters);
public Iterable<Edge> getInVertices(Object... filters);

...to allow for vertex-adjacency (and thus, edge jumping).

*** This has been pushed to GitHub and deployed as Blueprints 1.3-SNAPSHOT in the Sonatype repo. ***

Thanks everyone,
Marko.

Pierre De Wilde

Apr 25, 2012, 12:46:21 PM
to gremli...@googlegroups.com
Hi,

Thank you, Marko.

Blueprints' API gets much stronger with this addition. I hope that vendors will quickly implement their native support.

Regarding the up-coming vertex-adjacency API, do you mean

public Iterable<Vertex> getOutVertices(Object... filters);
public Iterable<Vertex> getInVertices(Object... filters);

Thanks again for this great addition,
Pierre

Marko Rodriguez

Apr 25, 2012, 12:47:49 PM
to gremli...@googlegroups.com
Hey,

> Regarding the up-coming vertex-adjacency API, do you mean
>
> public Iterable<Vertex> getOutVertices(Object... filters);
> public Iterable<Vertex> getInVertices(Object... filters);

Ah! Yea. Good eye. You are right, <Vertex> not <Edge>.

Thanks Pierre,
Marko.

Luca Garulli

Apr 26, 2012, 9:07:59 AM
to gremli...@googlegroups.com
Wow,
I can't wait to support it at low level in OrientDB! This API is simple, readable, clear and powerful.

Lvc@

Marko Rodriguez

Apr 26, 2012, 12:34:09 PM
to gremli...@googlegroups.com
Cool Luca. I knew you would be able to natively support these notions in OrientDB. 

Marko.

pShah

May 14, 2012, 11:53:09 AM
to gremli...@googlegroups.com
Has this feature been removed completely or moved to a different module?

Thanks

Marko Rodriguez

May 14, 2012, 11:55:26 AM
to gremli...@googlegroups.com
Hi,

This feature is in TinkerPop 2 and is seen in the following method:

Vertex.query() -> Query

Marko.

pShah

May 14, 2012, 12:34:48 PM
to gremli...@googlegroups.com
I did look at Query in Snapshot-1.3.

However, I did not see a method to specify multiple filters (especially combined as an OR), and that's why I was curious.

Thanks