Measuring performance of Titan graph database


Karthik Sharma

Mar 15, 2015, 5:06:36 AM
to gremli...@googlegroups.com
I have Titan running on my system with embedded Cassandra.

cd titan-cassandra-0.3.1
bin/titan.sh config/titan-server-rexster.xml config/titan-server-cassandra.properties

I have installed bulbs on my system as follows.

sudo apt-get install python2.7-dev
sudo apt-get install libyaml-dev

sudo pip install https://github.com/espeed/bulbs/tarball/master

After the above setup I run the following python application to create a very simple graph.

from bulbs.titan import Graph
g = Graph()
switch = g.vertices.create(name="switch")
device = g.vertices.create(name="device")
g.edges.create(switch, "connected to", device)

I measure the time taken to run the above Python application with the Linux `time` command:

time python graph.py

Creating the two vertices and the edge connecting them takes about 1.620 seconds, which seems high. I am looking for ways to bring this down.
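
Note that `time python graph.py` covers the whole process: interpreter startup, importing bulbs, constructing `Graph()`, and the three create calls. A minimal sketch for timing each step separately, assuming the same calls as in graph.py above; the `timed`, `build_graph`, and `run` helpers are hypothetical names introduced here and are not part of bulbs:

```python
from timeit import default_timer as timer  # wall-clock timer, works on Python 2.7 and 3


def timed(label, fn, timings):
    """Call fn(), append (label, elapsed_seconds) to timings, return fn's result."""
    start = timer()
    result = fn()
    timings.append((label, timer() - start))
    return result


def build_graph(g, timings):
    """Create the same two vertices and one edge as graph.py, timing each call."""
    switch = timed("create switch", lambda: g.vertices.create(name="switch"), timings)
    device = timed("create device", lambda: g.vertices.create(name="device"), timings)
    timed("create edge", lambda: g.edges.create(switch, "connected to", device), timings)
    return timings


def run():
    """Time the bulbs import, Graph() construction, and each create call."""
    timings = []
    start = timer()
    from bulbs.titan import Graph  # imported here so the import cost is measured too
    timings.append(("import bulbs", timer() - start))
    g = timed("Graph()", Graph, timings)
    build_graph(g, timings)
    for label, seconds in timings:
        print("%-14s %.3f s" % (label, seconds))
```

Calling `run()` against the running Titan server should show how much of the 1.620 seconds is spent in the three create calls and how much is startup and connection overhead.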

My assumptions are as follows.

1. Creating vertices and edges are blocking operations, and I assume they consume quite a bit of CPU. In my case the Titan server and the Python application run on the same virtual machine. What would be the impact, if any, of moving Titan to a separate database server? Can I expect the time to improve?

2. Is there anything else you could recommend to improve the execution time of my application?

Stephen Mallette

Mar 16, 2015, 7:20:52 AM
to gremli...@googlegroups.com
From my understanding of bulbs, you are making three REST API calls in that code block. Your timing for that code block still seems high, but I can't fully explain that. I'd consider reducing those three API calls to just one by submitting the work as a parameterized Gremlin script (or by placing the script server-side in Rexster and calling it from the client).



Karthik Sharma

Mar 25, 2015, 4:17:04 PM
to gremli...@googlegroups.com
I see. Is the argument, then, that a parameterized Gremlin script doing the "same set of operations" will be more efficient because it is written in a JVM-native language?

Stephen Mallette

Mar 25, 2015, 4:21:56 PM
to gremli...@googlegroups.com
The efficiency lies in:

1. a single REST API call instead of three
2. a batched transaction of three inserts with a single commit() (instead of a commit() called after each request)
3. the parameterized script would be compiled and thus cached for future requests (compilation is kinda expensive)
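
The three points above could be sketched from the bulbs side roughly as follows. This is untested against Titan 0.3.1: the Gremlin script body, the parameter names (`switch_name`, `device_name`, `edge_label`), and the helper names are assumptions for illustration, and it assumes bulbs exposes `g.gremlin.execute(script, params)` for sending a raw script with bound parameters.

```python
# Hypothetical Gremlin-Groovy script. switch_name, device_name and edge_label
# are parameters bound by the client, so the script text stays constant
# across requests and only the parameter values change.
CREATE_LINK_SCRIPT = """
def s = g.addVertex([name: switch_name])
def d = g.addVertex([name: device_name])
g.addEdge(s, d, edge_label)
g.commit()  // one commit for all three inserts
"""


def build_params(switch_name, device_name, edge_label):
    """Return the parameter bindings sent alongside the script."""
    return {
        "switch_name": switch_name,
        "device_name": device_name,
        "edge_label": edge_label,
    }


def create_link(graph, switch_name, device_name, edge_label):
    """Create both vertices and the edge in a single request.

    Assumes `graph` is a bulbs Graph whose `gremlin.execute(script, params)`
    sends the script and its parameters to the server in one call.
    """
    params = build_params(switch_name, device_name, edge_label)
    return graph.gremlin.execute(CREATE_LINK_SCRIPT, params)
```

With a running server, `create_link(Graph(), "switch", "device", "connected to")` would stand in for the three create calls in graph.py. Passing the names as parameters rather than formatting them into the script string is what would let the server reuse the compiled script on later requests.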


