Does pacer work with a cassandra cluster?

anthony...@gmail.com

Sep 24, 2013, 5:02:12 PM
to pacer...@googlegroups.com
Hello everyone!

I'm wondering whether Pacer (and Titan, by the way) works with a Cassandra cluster.

I hope that's not a stupid question.

I have found how to set up a TinkerGraph (via the "tg" function), but I am not sure whether I can configure Titan and Pacer to use Cassandra as a backend.

Thanks!

Anthony.

Darrick Wiebe

Sep 24, 2013, 5:14:55 PM
to Pacer Group
Hi Anthony,

Cassandra is quite different from a graph database and I wouldn't expect to be able to use it as one without putting in a lot of work to essentially mimic one on top of Cassandra, which you probably wouldn't want to do without a very good reason. So the basic answer to your question is that no, Pacer can't be used with Cassandra. 

That said, Pacer is very extensible and is excellent at representing streaming operations of all sorts. If stream processing suits your problem, you may well be able to leverage Pacer together with Cassandra (or any other data store) by creating an extension. For a good example of just that, see my pacer-xml repo: https://github.com/xnlogic/pacer-xml

Cheers,
Darrick

anthony...@gmail.com

Sep 24, 2013, 6:20:42 PM
to pacer...@googlegroups.com
Thanks a lot for the answer.

I've read that Titan can implement a graph over Berkeley DB, HBase, or Cassandra as a backend, or locally with an XML file (and things like TinkerGraph).
I hoped that Titan could be used as an interface between Pacer and Cassandra (the same way that it is an interface between Gremlin and Cassandra).

Thanks again,

Anthony.

Darrick Wiebe

Sep 24, 2013, 6:45:54 PM
to Pacer Group
Oh, yes you can use Pacer over Titan.

Cheers,
Darrick



Mark McCraw

Sep 24, 2013, 9:52:48 PM
to pacer...@googlegroups.com
Hey Anthony,

I've had very good luck with pacer and Titan/Cassandra (in fact, I've monkeyed around with all the back end options except hadoop, plus a recently introduced datastore called persistit).  The coupling you're concerned about here is between pacer and Titan, and that works great.  I have been super happy with it.  You can get started by doing something like this:



config = org.apache.commons.configuration.MapConfiguration.new(
  "storage.backend" => "embeddedcassandra",
  "storage.cassandra-config-dir" => "file://#{File.absolute_path(File.dirname(__FILE__))}/config/environments/#{ENV['RACK_ENV']}/cassandra.yaml",
  "storage.index.search.backend" => "elasticsearch",
  "storage.index.search.directory" => "/tmp/searchindex",
  "storage.index.search.client-only" => "false",
  "storage.index.search.local-mode" => "true"
)
G = Pacer.titan config

The code above references a Pacer module that looks like this:

require "java"
require "pacer"
require "titan/titan_persistent_object"

module Pacer
  class << self
    attr_reader :titan_g

    # Returns a proc that reuses the graph found in Pacer.open_graphs
    # for this path, or opens a new Titan graph from @config.
    def open(path)
      proc do
        graph = Pacer.open_graphs[path]
        unless graph
          # java_send selects the TitanFactory.open(Configuration) overload explicitly.
          args = [org.apache.commons.configuration.Configuration.java_class]
          @titan_g = graph = com.thinkaurelius.titan.core.TitanFactory.java_send(:open, args, @config)
        end
        graph
      end
    end

    # Returns a proc that shuts down the underlying graph and removes
    # its entry from Pacer.open_graphs.
    def shutdown(path)
      proc do |graph|
        graph.blueprints_graph.shutdown
        Pacer.open_graphs.delete path
      end
    end

    def titan(config, path="foo")
      @config = config
      PacerGraph.new(Pacer::SimpleEncoder, open(path), shutdown(path))
    end

    private :open, :shutdown
  end
end

This assumes you're running Cassandra embedded, which used to be the recommended method, but now they (the Titan folks) are recommending running Cassandra as a separate process (and are likely to drop Cassandra embedded in a near future release). So you'll need to tweak that config just a hair, but the documentation for that on the Titan wiki is pretty good.
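A minimal sketch of that tweak, assuming Cassandra runs as its own process on the local machine. The helper name remote_cassandra_settings is hypothetical, and the key names ("storage.backend" set to "cassandra", plus "storage.hostname") are an assumption based on the Titan configuration options of that era, so check them against the Titan wiki for the version in use:

```ruby
# Hypothetical helper: builds the settings for a Cassandra that runs as a
# separate process. The key names are assumptions; verify them against the
# Titan wiki for your Titan version.
def remote_cassandra_settings(hostname = "127.0.0.1")
  {
    "storage.backend"  => "cassandra", # instead of "embeddedcassandra"
    "storage.hostname" => hostname     # node that Titan should connect to
  }
end

settings = remote_cassandra_settings
puts settings.inspect

# Under JRuby, with the Titan jars on the classpath, the hash would feed the
# Pacer.titan method from the module above:
#   config = org.apache.commons.configuration.MapConfiguration.new(settings)
#   G = Pacer.titan config
```

The search index settings from the embedded example would be added to the same hash if they are still needed.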

There is one big gotcha that bit me and cost me a fair amount of time with Cassandra/Titan specifically. You'll be running with JRuby, and when running with a back end data store like Berkeley DB, I was able to do something like this within my ruby source to load all the dependencies dynamically at runtime:

Dir["#{Main.root}/titan-berkeleyje-0.3.0/lib/*.jar"].each { |jar| require jar }

For reasons I can't really be sure of, that will *not* work with Cassandra. My best theory is that Cassandra has dependencies on some jars with native components inside them, and those pieces don't load/execute correctly when classloaded as above (it also makes things a pain if you're trying to run it inside a container like JBoss). At any rate, Cassandra complains in all kinds of ways unless you're very explicit about setting up your classpath at launch time. I ended up creating a script to launch my application that looks something like:

#!/bin/sh
java -Xmx3072m -XX:MaxPermSize=2048m -classpath \
lib/java/akiban-persistit-3.3.0.jar:\
lib/java/blueprints-core-2.4.0.jar:\
lib/java/titan-core-0.4.0-SNAPSHOT.jar:\
lib/java/titan-persistit-0.4.0-SNAPSHOT.jar:\
...
lib/java/titan-cassandra-0.3.1.jar:\
lib/java/titan-core-0.3.1.jar org.jruby.Main -S lib/ruby/main.rb "$@"

Of course, if you're packaging your project in its own runnable jar, you can hide all of this in your project's jar's manifest, but when developing, I wanted my code exploded out as uncompiled ruby source, and I wanted ease of launching it, which this script does.
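If listing every jar by hand gets tedious, the classpath string could be generated instead. This is only a sketch: build_classpath is a hypothetical helper, and the default lib/java directory is taken from the script above. It sorts the jars alphabetically, so if two versions of the same jar sit in the directory (as with titan-core above), the order may not be the one you want.

```ruby
# Hypothetical helper: joins every jar in a directory into one value
# suitable for java's -classpath option.
def build_classpath(jar_dir)
  Dir[File.join(jar_dir, "*.jar")].sort.join(File::PATH_SEPARATOR)
end

if __FILE__ == $0
  # Print the classpath so a launch script can capture it.
  puts build_classpath(ARGV.fetch(0, "lib/java"))
end
```

Saved as, say, classpath.rb (an assumed file name), the launch line might become: java -classpath "$(ruby classpath.rb lib/java)" org.jruby.Main -S lib/ruby/main.rb "$@". The classpath is still fixed at launch time, which is what Cassandra seems to need.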

I hope this helps you avoid some of the pains I went through. I also subscribed to the titan-users mailing list, which is very active, and like this list, full of helpful people.

Good luck!
Mark

David Colebatch

Sep 24, 2013, 9:57:49 PM
to pacer...@googlegroups.com
What a great email - Thanks for sharing this Mark!

Regards,
David

--
David Colebatch
Co-Founder

anthony...@gmail.com

Sep 25, 2013, 4:46:21 PM
to pacer...@googlegroups.com
Wow. Thank you very much for all that information.

There is no doubt that it'll be helpful for me.

I'll give you feedback on Cassandra/Titan/Pacer someday if it ends up working (or I'll come back soon with another question).

Cheers,
Anthony.