Blueprints implementation on top of Datomic


Davy Suvee

Apr 6, 2012, 12:47:14 PM
to Gremlin-users
Hi guys,

Last week I spent some time implementing the Blueprints interface
on top of Datomic (www.datomic.com). The RDF and SPARQL feel of the
Datomic data model and query approach makes it a good target for
implementing a property graph. I finished the implementation and all
unit tests are passing. What makes it really cool is that it is
the only distributed "temporal" graph database that I'm aware of. It
lets you run queries against a past version of the graph.
To give you a concrete example:

// Create the datomic-based graph
DatomicGraph graph = new DatomicGraph("datomic:mem://tinkerpop");

// Create the vertex for Davy
Vertex davy = graph.addVertex(null);
davy.setProperty("name", "Davy");

// Create the vertex for Stuart
Vertex stuart = graph.addVertex(null);
stuart.setProperty("name", "Stuart");

// Add the knows-relationship between Davy and Stuart
Edge davy_stuart = graph.addEdge(null, davy, stuart, "knows");

// Take a checkpoint that lies clearly between the two sets of changes
Thread.sleep(1000);
Date checkpoint = Calendar.getInstance().getTime();
Thread.sleep(1000);

// Reset the name of Davy
davy.setProperty("name", "DavyUpdated");

// Create the vertex for James
Vertex james = graph.addVertex(null);
james.setProperty("name", "James");

// Add the knows-relationship between Stuart and James
Edge stuart_james = graph.addEdge(null, stuart, james, "knows");

// Retrieve all "knows" relationships that are "currently" in the database
System.out.println("Current relationships:");
Iterator<Edge> edgesit = graph.getEdges().iterator();
while (edgesit.hasNext()) {
    Edge edge = edgesit.next();
    System.out.println(edge.getOutVertex().getProperty("name")
            + " -> " + edge.getLabel() + " -> "
            + edge.getInVertex().getProperty("name"));
}

// Set the checkpoint in the past
graph.setCheckPoint(checkpoint);

// Retrieve all "knows" relationships that were in the database at the specified checkpoint
System.out.println("Relationships at checkpoint " + checkpoint + ":");
edgesit = graph.getEdges().iterator();
while (edgesit.hasNext()) {
    Edge edge = edgesit.next();
    System.out.println(edge.getOutVertex().getProperty("name")
            + " -> " + edge.getLabel() + " -> "
            + edge.getInVertex().getProperty("name"));
}

// Shutdown graph
graph.shutdown();

This outputs:

Current relationships:
DavyUpdated -> knows -> Stuart
Stuart -> knows -> James
Relationships at checkpoint Fri Apr 06 17:29:55 CEST 2012:
Davy -> knows -> Stuart

It supports versioning not only of the vertices and edges, but
also of the properties of individual vertices/edges. I will donate my
code to the TinkerPop project sometime next week.
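
To illustrate how per-property versioning can produce the output above, here is a minimal, self-contained sketch. It does not use Datomic or the DatomicGraph code. The class `TemporalPropertySketch`, its `Fact` type, the `valueAsOf` method and the numeric timestamps are all hypothetical; the sketch only models the idea that every assertion keeps its transaction time, so a read "as of" a checkpoint ignores anything asserted later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model (not the DatomicGraph implementation):
// an append-only log of property assertions, each stamped with a logical time.
final class TemporalPropertySketch {

    static final class Fact {
        final String entity;
        final String attribute;
        final String value;
        final long time; // logical transaction time

        Fact(String entity, String attribute, String value, long time) {
            this.entity = entity;
            this.attribute = attribute;
            this.value = value;
            this.time = time;
        }
    }

    private final List<Fact> log = new ArrayList<Fact>();

    // Setting a property never overwrites the old value; it only appends a new fact.
    void set(String entity, String attribute, String value, long time) {
        log.add(new Fact(entity, attribute, value, time));
    }

    // Returns the latest value asserted at or before asOf, or null if there is none.
    String valueAsOf(String entity, String attribute, long asOf) {
        Fact best = null;
        for (Fact f : log) {
            boolean sameProperty = f.entity.equals(entity) && f.attribute.equals(attribute);
            if (sameProperty && f.time <= asOf && (best == null || f.time >= best.time)) {
                best = f;
            }
        }
        return best == null ? null : best.value;
    }

    public static void main(String[] args) {
        TemporalPropertySketch graph = new TemporalPropertySketch();
        graph.set("davy", "name", "Davy", 1);
        long checkpoint = 2;
        graph.set("davy", "name", "DavyUpdated", 3);

        System.out.println(graph.valueAsOf("davy", "name", Long.MAX_VALUE)); // current value
        System.out.println(graph.valueAsOf("davy", "name", checkpoint));     // value at the checkpoint
    }
}
```

Reading with the largest possible time plays the role of the "current" graph, while reading with the checkpoint time plays the role of `setCheckPoint`.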

Any feedback would be welcome,

Davy Suvee
Datablend.be

James Thornton

Apr 6, 2012, 1:18:23 PM
to gremli...@googlegroups.com
Hi Davy -

This is great. Have you been able to test drive it on the distributed DynamoDB service?

I am eager to see the code because I just started to dig into Datomic myself -- is the pre-release code on GitHub somewhere? :)
 
BTW: There's a Datomic thread on this too (https://groups.google.com/d/topic/datomic/Sv7adJBSDZc/discussion).

- James

James Thornton

Apr 6, 2012, 1:21:41 PM
to gremli...@googlegroups.com
FYI -

Datomic DB (http://datomic.com) is a new Java-based immutable, transactional, distributed database by Rich Hickey, the guy who created Clojure.

Its query language is Datalog (http://en.wikipedia.org/wiki/Datalog),
which is a subset of Prolog.

The datastore is decoupled, and it currently supports DynamoDB
(http://aws.amazon.com/dynamodb/), Amazon's new distributed SSD
datastore-as-a-service.

Here are some video overviews...

http://www.infoq.com/interviews/hickey-datomic
http://datomic.com/company/resources/tutorial_video

Some interesting discussions...

http://nosql.mypopescu.com/post/19310504456/thoughts-about-datomic
http://blog.fogus.me/2012/03/05/datomic/
http://news.ycombinator.com/item?id=3667049

- James

Davy Suvee

Apr 6, 2012, 1:32:19 PM
to Gremlin-users
Not yet ... Only on the in-memory one and the dev appliance ... I
have mainly focused on correctness until now ... Will deal with
performance later.

Will put it on GitHub early next week so it can be pulled into the
TinkerPop project.

Davy

Pierre De Wilde

Apr 6, 2012, 4:49:46 PM
to gremli...@googlegroups.com
Hi Davy,

Cool. Let us know when pushed on github.

Thanks,
Pierre

Joshua Shinavier

Apr 6, 2012, 5:00:23 PM
to gremli...@googlegroups.com
Very nice. I look forward to trying out the code.

Note: if you have implemented IndexableGraph, we can look at layering
GraphSail on top of your impl. The temporal aspect of Datomic would
be very interesting (perhaps even unique?) in a general-purpose RDF
triple store.

Josh

Davy Suvee

Apr 7, 2012, 3:10:18 PM
to gremli...@googlegroups.com
So far, I have only implemented the Graph interface. I don't see any immediate problems with implementing IndexableGraph as well. Will have a look next week.

Davy Suvee

Apr 11, 2012, 1:35:21 PM
to Gremlin-users
Hi all,

I just published the Datomic Blueprints implementation on Github:
https://github.com/datablend/blueprints/tree/master/blueprints-datomic-graph
I've also sent a pull request.

Feedback on the implementation is welcome,

Davy

Pierre De Wilde

Apr 11, 2012, 2:58:05 PM
to gremli...@googlegroups.com
Hey Davy,

Thanks for this great contribution. 

Can you provide a wiki page to explain how to use it? When your pull request is merged, you should add a reference to your Datomic wiki page from:


Thanks,
Pierre

Davy Suvee

Apr 11, 2012, 3:01:19 PM
to gremli...@googlegroups.com
Sure, will write something up tomorrow.

Greetings,

Davy

James Thornton

Apr 11, 2012, 3:19:26 PM
to gremli...@googlegroups.com
Hi Davy -

Thanks for doing this. I had to add Clojure to the dependency list, but I'm still getting a build error....


- James

Marko Rodriguez

Apr 11, 2012, 3:21:08 PM
to gremli...@googlegroups.com
Hi Davy,

Thanks for the contribution. Here are some preliminary notes:
1. All single-step methods (e.g. getOutEdges(), getInVertex(), etc.) query Datomic by pushing a java.lang.String with bound and unbound variables.
   - This is very triple-store style and, as such, is slow for traversal?
   - This is very RexsterGraph-like in that it's over HTTP and, as such, might have performance issues?
2. It's nice that you have the primary test suite methods implemented. What troubles are you having with the transaction- and index-based test suites?

Others -- I'm pointing at you, James :) -- it would be good if someone could take this for a test drive and see how it "feels" with some Gremlin queries, etc.

Thanks again Davy, I super appreciate all the work you have done for TinkerPop now and in the past (-- our transaction model is the way it is because of an old blog post of yours).

Take care,
Marko.

http://markorodriguez.com

Davy Suvee

Apr 11, 2012, 4:22:32 PM
to gremli...@googlegroups.com
Hi James,

You need to manually add the Datomic dependency as it's not available on public repositories. Mentioned it in the pom file.

<!-- datomic (use mvn install:install-file from the datomic download to install datomic into your local maven repo) -->
<!-- mvn install:install-file -DgroupId=com.datomic -DartifactId=datomic -Dfile=datomic-0.1.2753.jar -DpomFile=pom.xml -->

Davy

Davy Suvee

Apr 11, 2012, 4:37:34 PM
to gremli...@googlegroups.com
Hi Marko,

The implementation is probably not really fast :-). You are right that, because of the queries, it will be rather slow for traversals (as you will see when running the DatomicBenchmarkTestSuite). I will try to have a look at the performance soon.

Regarding transactions: I tried to get them working in my initial implementation. When working with Datomic, you basically build a list of everything you want to commit in a single transaction. These can be additions, updates, or removals of facts. However, the current implementation of Datomic does not seem to be able to handle the addition of a new entity (vertex or edge) that is then immediately removed (retracted). Queries in that situation return inconsistent results. I could probably fix it by doing some manual bookkeeping myself, but I didn't want to complicate the initial implementation.
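
As a rough illustration of what such "manual bookkeeping" could look like, the sketch below buffers operations and cancels out an addition when the same entity is removed before the commit. This is an assumption about one possible approach, not the DatomicGraph code: `PendingTransactionSketch`, `addEntity`, `removeEntity` and `commit` are hypothetical names, and the strings stand in for whatever would really be submitted to Datomic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical bookkeeping: buffer operations and drop add-then-remove pairs
// so they are never sent to the database together.
final class PendingTransactionSketch {

    private static final class Op {
        final String kind;   // "add" or "retract"
        final String entity;

        Op(String kind, String entity) {
            this.kind = kind;
            this.entity = entity;
        }
    }

    private final List<Op> pending = new ArrayList<Op>();
    private final Set<String> createdHere = new HashSet<String>();

    void addEntity(String entity) {
        createdHere.add(entity);
        pending.add(new Op("add", entity));
    }

    void removeEntity(String entity) {
        if (createdHere.remove(entity)) {
            // The entity only exists in this buffer: discard its pending addition
            // instead of queueing a retraction for something never stored.
            for (Iterator<Op> it = pending.iterator(); it.hasNext();) {
                Op op = it.next();
                if (op.kind.equals("add") && op.entity.equals(entity)) {
                    it.remove();
                }
            }
        } else {
            pending.add(new Op("retract", entity));
        }
    }

    // Returns what would be submitted as one transaction, then resets the buffer.
    List<String> commit() {
        List<String> submitted = new ArrayList<String>();
        for (Op op : pending) {
            submitted.add(op.kind + " " + op.entity);
        }
        pending.clear();
        createdHere.clear();
        return submitted;
    }
}
```

With this buffer, adding and then removing a vertex before committing submits nothing for that vertex, while removing an already stored entity still queues a retraction.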

Regarding indexes: I had a short look, and the implementation will be somewhat similar to the Dex implementation. In essence, it is possible to find any element (vertex or edge) that has a specific value for a particular attribute. No "explicit" index is required to do so. I was thinking of supporting just automatic indexes for all attributes, which under the hood will execute a query to retrieve the elements that match.
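
The idea of an "index" that is really just a lookup over all attributes can be sketched as follows. This is a simplified, in-memory stand-in under stated assumptions: `AttributeLookupSketch` and its methods are hypothetical, and the linear scan in `get` takes the place of the Datomic query the real implementation would run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: every attribute is searchable, no explicit index is created.
final class AttributeLookupSketch {

    // element id -> (property key -> property value)
    private final Map<String, Map<String, Object>> elements =
            new LinkedHashMap<String, Map<String, Object>>();

    void setProperty(String elementId, String key, Object value) {
        Map<String, Object> props = elements.get(elementId);
        if (props == null) {
            props = new LinkedHashMap<String, Object>();
            elements.put(elementId, props);
        }
        props.put(key, value);
    }

    // Returns the ids of all elements whose property `key` equals `value`.
    // In the real implementation this scan would be a query against the database.
    List<String> get(String key, Object value) {
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<String, Map<String, Object>> entry : elements.entrySet()) {
            if (value.equals(entry.getValue().get(key))) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```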

Happy to help out the project. Was thinking of doing another implementation on top of MongoDB (which in a way would be similar to the OrientDB implementation). Would this also be of interest to the community?

Davy 

James Thornton

Apr 11, 2012, 6:06:07 PM
to gremli...@googlegroups.com
Hi Davy -

Yeah, I manually installed Datomic on the first build attempt (I got a build message saying to do so), but I'm still getting the error -- will continue to investigate. 

Regarding the traversal queries, what about using the get() method to get edge references and reducing by label from there:

 outgoingEdges = vertex.get(":graph.edge/_outVertex"); 
 incomingEdges = vertex.get(":graph.edge/_inVertex"); 

- James

Davy Suvee

Apr 11, 2012, 7:11:44 PM
to Gremlin-users
Not exactly sure, but I would think that the get method is just
syntactic sugar for performing a Peer.q() query ... No?
If so, wouldn't the query be identical to the one I use now?

Davy

project2501

Apr 11, 2012, 8:36:18 PM
to Gremlin-users
This is great. I'm happy to try it out when you guys think it's settled
down, builds, etc.

Awesome news.

Davy Suvee

Apr 12, 2012, 3:02:25 AM
to gremli...@googlegroups.com
I updated the DatomicGraph to now support the IndexableGraph API. As mentioned in the message above, it only seems to make sense to support automatic indexes on all attribute values. Hence, the implementation does not support the manual creation of indexes. Will provide a usage example in the documentation.

Davy



 

Joshua Shinavier

Apr 12, 2012, 3:12:30 AM
to gremli...@googlegroups.com
That's quick... when will you push your changes?

Josh

Davy Suvee

Apr 12, 2012, 3:15:40 AM
to gremli...@googlegroups.com
Already pushed it ... :-) It's also part of the pull request.

Davy

Joshua Shinavier

Apr 12, 2012, 3:23:44 AM
to gremli...@googlegroups.com
Yeah, scratch that. Your changes are in, but createManualIndex and
createAutomaticIndex throw UnsupportedOperationExceptions. I wonder
if you could implement createAutomaticIndex, trivially, by returning
an object which provides a partial index over one of the built-in
automatic indices.
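
One way to read this suggestion is as a thin wrapper: the built-in lookup already covers every key, and the object returned for a named automatic index simply restricts it to the requested keys. The sketch below shows that shape only. The `Lookup` interface and `restrictTo` method are hypothetical and are not the Blueprints `AutomaticIndex` API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical shape of a "partial index over the built-in one".
final class PartialIndexSketch {

    // Stand-in for a lookup that can answer for any key.
    interface Lookup {
        List<String> get(String key, Object value);
    }

    // Returns a view that answers only for the given keys and delegates the rest of the work.
    static Lookup restrictTo(final Lookup builtIn, Set<String> keys) {
        final Set<String> allowed = new HashSet<String>(keys);
        return new Lookup() {
            public List<String> get(String key, Object value) {
                if (!allowed.contains(key)) {
                    return Collections.emptyList(); // key is not part of this partial index
                }
                return builtIn.get(key, value);
            }
        };
    }
}
```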

Josh

Davy Suvee

Apr 12, 2012, 4:23:08 AM
to gremli...@googlegroups.com
Good idea. Pushed an updated implementation that allows the creation of automatic indexes (a partial index over the built-in one).

Davy

Davy Suvee

Apr 12, 2012, 3:55:34 PM
to gremli...@googlegroups.com
For people who would be interested in understanding the Datomic data and query model: I just published an introductory article at http://datablend.be/?p=1641

Greetings,

Davy

Joshua Shinavier

Apr 13, 2012, 3:10:29 AM
to gremli...@googlegroups.com
Well done! All GraphSail tests pass on DatomicGraph. There's a new
RDF triple store on the block :-)

Josh

Davy Suvee

Apr 13, 2012, 3:47:05 AM
to gremli...@googlegroups.com
Cool! Now people can perform temporal SPARQL queries :-)

Davy

James Thornton

Apr 18, 2012, 4:55:52 PM
to gremli...@googlegroups.com
Datomic 0.1.3007 was just released (https://groups.google.com/d/msg/datomic/ZM9UcUwrBsA/oibMgCeBoCgJ), and it provides raw index access, so it looks like you don't have to go through the query interface...

Raw Index Access

While this release was heavily focused on durability and storage options, we have exposed an important new capability in the API. The Database.datoms() method provides fast access to the raw datoms in the various indexes, with seeking and narrowing capabilities, e.g. letting you rip through all of the values of a particular attribute etc. It provides the raw materials for fast streaming batch operations, or even alternative query engines! And, it builds upon the database-as-value model, and thus respects asOf, since, with etc.


- James

Davy Suvee

Apr 19, 2012, 2:14:46 AM
to gremli...@googlegroups.com
Will have a look in the coming days to update my implementation to make use of the latest features.

Davy

Davy Suvee

Apr 19, 2012, 10:35:56 AM
to gremli...@googlegroups.com
Performed a first set of updates, primarily in the retrieval of, for instance, the in and out vertex of an edge. By using the raw indexes (instead of Datomic queries), the in and out vertex of an edge are retrieved around 20x faster! A huge improvement, especially once you start traversing big graphs.

Hope to finish the updates sometime tomorrow.

Davy




Pierre De Wilde

Apr 19, 2012, 10:47:37 AM
to gremli...@googlegroups.com
Good news. When done, make sure to document it for newbies.

Thanks,
Pierre

Davy Suvee

Apr 19, 2012, 3:58:22 PM
to gremli...@googlegroups.com
Committed the updated implementation that makes use of the latest datomic build. Additionally, it uses the new raw data access feature where appropriate. The pull request has been updated.
Pierre, I started writing the documentation. Will commit it tomorrow.

Davy

Pierre De Wilde

Apr 19, 2012, 4:55:41 PM
to gremli...@googlegroups.com
Hi Davy,

Both Datomic and TinkerPop have high potential. You have made a link between them.

Thanks,
Pierre

Joshua Shinavier

Apr 19, 2012, 11:02:25 PM
to gremli...@googlegroups.com
Nice. Will you make corresponding changes to the
DatomicVertex.getInEdges(String...) and
DatomicVertex.getOutEdges(String...) methods? Btw. I bumped the
Datomic version in the tinkerpop/blueprints 'datomic' branch as the
version available on the website had changed. Do you know if there
are any plans to push releases and/or snapshots to a publicly
available Maven repo? IMO, the need to create the Maven artifacts
locally is the only issue w.r.t. merging DatomicGraph into the master
branch (apart from the question of how much effort it will take to
maintain over time).

Best,

Josh

Davy Suvee

Apr 20, 2012, 12:50:17 AM
to gremli...@googlegroups.com
Hi,

- I only used the raw index where the lookup could be executed in a single go. In the case of getInEdges and getOutEdges, multiple calls to the raw index would be required:
  1. one to retrieve all the edges where the vertex is used as in/out vertex
  2. a loop through all found edges, with a call for each to check whether its label equals one of the provided labels.
At that point, I would basically be rewriting the Datomic reification engine, and I thought the one in Datomic is probably doing things more intelligently ;-).
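
The two steps above can be sketched against a simple in-memory stand-in for the raw index. This is only an illustration under stated assumptions: `InEdgesSketch`, its `Edge` type and the `byInVertex` map are hypothetical, and in the real setting step 2 would cost one additional index call per candidate edge rather than a field read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of label-filtered edge retrieval in two steps.
final class InEdgesSketch {

    static final class Edge {
        final String id;
        final String outVertex;
        final String inVertex;
        final String label;

        Edge(String id, String outVertex, String inVertex, String label) {
            this.id = id;
            this.outVertex = outVertex;
            this.inVertex = inVertex;
            this.label = label;
        }
    }

    // Stand-in for an index keyed on the in vertex of each edge.
    private final Map<String, List<Edge>> byInVertex = new HashMap<String, List<Edge>>();

    void addEdge(String id, String outVertex, String inVertex, String label) {
        List<Edge> edges = byInVertex.get(inVertex);
        if (edges == null) {
            edges = new ArrayList<Edge>();
            byInVertex.put(inVertex, edges);
        }
        edges.add(new Edge(id, outVertex, inVertex, label));
    }

    List<Edge> getInEdges(String vertex, String... labels) {
        // Step 1: fetch every edge that has this vertex as its in vertex.
        List<Edge> candidates = byInVertex.get(vertex);
        if (candidates == null) {
            return Collections.emptyList();
        }
        // Step 2: keep only the edges whose label is one of the requested labels
        // (no labels given means no filtering).
        Set<String> wanted = new HashSet<String>(Arrays.asList(labels));
        List<Edge> result = new ArrayList<Edge>();
        for (Edge edge : candidates) {
            if (wanted.isEmpty() || wanted.contains(edge.label)) {
                result.add(edge);
            }
        }
        return result;
    }
}
```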

- As far as I know, the Datomic team does not intend to push the jars to public Maven repositories. From a thread in the Datomic google group: "The instructions for use from maven or leiningen are at http://datomic.com/company/resources/integrating-peer-lib.  Note that Datomic is not redistributable, and can not be placed on public sites such as Clojars."

- Maintenance-wise, I think we should be ok. I'm certainly willing to take the lead in keeping it up-to-date and using the latest Datomic features.

Davy

James Thornton

Apr 20, 2012, 1:11:33 AM
to gremli...@googlegroups.com
Hi Davy -

The release notes say that the datoms() method has "seeking and narrowing capabilities," but the docs don't really go into detail about the optional datom "components" -- have you found how that's done?

Maybe we should contact them about the dev jar to see if that's going to be a permanent policy or if something can be worked out.

- James

Davy Suvee

Apr 20, 2012, 6:50:24 AM
to gremli...@googlegroups.com
Well, the seeking and narrowing via the optional datom components is very straightforward ...

Imagine a specific index, for instance EAVT (Entity - Attribute - Value - Transaction). If you do not provide any components as input, the "query" will basically return all datoms in the database. By providing the entity id, you limit it to all datoms that declare an attribute belonging to that entity ... By specifying both the entity id and the attribute id, you retrieve all the datoms that declare that specific attribute for that specific entity id. So depending on the index you want to use, you can "narrow" the search by providing arguments (supplying them in the order of the employed index).
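
The narrowing described above amounts to prefix matching on the index order. The sketch below models that with a plain list and a linear scan; it does not reproduce the signature of the real `Database.datoms()` method, and `EavtNarrowingSketch`, its `Datom` type and the sample attribute names are hypothetical. A real index would be sorted and would seek to the prefix instead of scanning.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of narrowing an EAVT index by leading components.
final class EavtNarrowingSketch {

    static final class Datom {
        final Long entity;
        final String attribute;
        final String value;
        final Long tx;

        Datom(long entity, String attribute, String value, long tx) {
            this.entity = entity;
            this.attribute = attribute;
            this.value = value;
            this.tx = tx;
        }

        // Components in EAVT order.
        Object[] components() {
            return new Object[] {entity, attribute, value, tx};
        }
    }

    private final List<Datom> index = new ArrayList<Datom>();

    void add(long entity, String attribute, String value, long tx) {
        index.add(new Datom(entity, attribute, value, tx));
    }

    // Returns the datoms whose leading components equal the given prefix.
    // No components: everything. One: by entity. Two: by entity and attribute. And so on.
    List<Datom> datoms(Object... prefix) {
        if (prefix.length > 4) {
            throw new IllegalArgumentException("EAVT has at most 4 components");
        }
        List<Datom> result = new ArrayList<Datom>();
        for (Datom datom : index) {
            Object[] own = datom.components();
            boolean matches = true;
            for (int i = 0; i < prefix.length; i++) {
                if (!own[i].equals(prefix[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                result.add(datom);
            }
        }
        return result;
    }
}
```

Entity ids and transaction ids must be passed as `long` values (for example `1L`) in this sketch, since components are compared with `equals`.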

Hope this makes some sense ;-).

Davy

Davy Suvee

Apr 20, 2012, 10:31:41 AM
to gremli...@googlegroups.com
I wrote some documentation on the use of the Datomic Blueprints Graph implementation. It can be found here:


Greetings,

Davy

Marko Rodriguez

Apr 20, 2012, 10:51:56 AM
to gremli...@googlegroups.com

Well done. Thank you for making our lives easier by replicating the Blueprints pattern.

Marko.

http://markorodriguez.com

James Thornton

Apr 28, 2012, 8:39:36 PM
to gremli...@googlegroups.com
Datomic gets database functions, which enables:
  • Atomic transformation functions in transactions
  • Integrity checks and constraints
  • Predicates and generative functions for queries
  • Database-driven dynamic code distribution to peers


- James

Davy Suvee

May 2, 2012, 3:23:40 AM
to gremli...@googlegroups.com
I should have a look at whether these new features allow me to do some things in a more intelligent way ...

Anyway, bumped the datomic-graph implementation to the most recent datomic build. The pull request has been updated.

Davy

Davy Suvee

Jul 13, 2012, 7:34:00 AM
to gremli...@googlegroups.com
FYI,

Datomic-graph now supports the latest Blueprints 2.1.0 API and uses the latest Datomic build. The pull request has been updated accordingly.

Davy


Joshua Shinavier

Jul 14, 2012, 1:01:49 AM
to gremli...@googlegroups.com
Thanks, Davy. I have just verified that GraphSail works perfectly on
this revision of DatomicGraph, as well. GraphSail has become a kind
of litmus test for the consistency and transaction safety of
Blueprints graph implementations, above and beyond the unit tests, so
this is saying something :-)

I hope we can merge your pull request fairly soon after the 2.1.0 release.

Josh

Marc Limotte

May 16, 2014, 12:53:51 PM
to gremli...@googlegroups.com
I see this thread is quite old.  Was this pull-request ever merged?  Any hope for the future?

Joshua Shinavier

May 16, 2014, 3:17:50 PM
to gremli...@googlegroups.com
Hi Marc,

There are no longer any plans to merge DatomicGraph, now FluxGraph, into Blueprints proper.  We moved away from the model of pulling new vendor graph implementations into the code base, and TinkerPop3 includes no vendor-specific implementations at all.  However, there are multiple forks of fluxgraph which bring it up to date with recent versions of Blueprints.  See e.g.


and also this thread:


Bumping the TinkerPop version to 2.5.0 in that fork and adding GraphSail tests poses no problems.

HTH.

Josh



    

