How does JanusGraph store data in Cassandra/HBase?

Kushal Agrawal

Dec 13, 2019, 7:33:57 AM

to JanusGraph users

I have read the description of the JanusGraph data model here, but I would like to clarify what exactly JanusGraph ends up storing in a Cassandra/HBase backend.

I tried investigating a bit, and loaded the Graph of the Gods into a local instance of Cassandra. Then I ran a few select queries to see what is stored in the tables.

This is what I saw:

[Attached image: Screenshot 2019-12-13 at 4.48.52 PM.png]

[Attached image: Screenshot 2019-12-13 at 4.49.21 PM.png]

Now I'm not sure how to interpret all of this, but two things stood out to me.

One is that JanusGraph is actually creating only 3 columns in Cassandra, namely key, column1, and value.

The other is that the values in the columns are of varying lengths, and so might be representing what the docs say but not making it readable for the sake of compression/efficiency.
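To make the three-column layout easier to reason about, here is a minimal Python sketch of a wide-column table. It is only a mental model under my own assumptions: the class name `WideColumnTable`, the byte values, and the idea that `key` holds an encoded vertex id are all hypothetical, and the real byte encoding JanusGraph uses is not reproduced here.

```python
# Hypothetical in-memory model of a table with the three columns seen in
# the query output: key, column1, value. Not JanusGraph's actual encoding.
#   key     -> row (partition) key; assumed here to be an encoded vertex id
#   column1 -> column name within the row; cells are read back sorted by it
#   value   -> opaque bytes stored in the cell
class WideColumnTable:
    def __init__(self):
        self.rows = {}  # key -> {column1: value}

    def put(self, key, column1, value):
        # Writing the same (key, column1) twice overwrites the cell.
        self.rows.setdefault(key, {})[column1] = value

    def get_row(self, key):
        # Return the cells of one row ordered by column1 (byte order).
        return sorted(self.rows.get(key, {}).items())


table = WideColumnTable()
table.put(b"\x01", b"\x02b", b"v2")  # made-up bytes, inserted out of order
table.put(b"\x01", b"\x02a", b"v1")
table.put(b"\x07", b"\x02a", b"v3")

for column1, value in table.get_row(b"\x01"):
    print(column1, value)
```

The point of the sketch is that one row can hold any number of cells, so a fixed set of three table columns is enough to store many properties and edges per vertex, and values of varying length are expected.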

The reason I was looking at all this is that I wanted to know how vertex-centric indices actually achieve the O(log n) lookup time they claim.

What exactly happens when I want to traverse an adjacent edge from a vertex, given some constraints?
What is contained in the sort key mentioned in the data model docs?

If I can find answers to these questions from the community it would really help me convince my colleagues about the log n complexity claim.

Thanks and regards,

Kushal Agrawal

marc.de...@gmail.com

Dec 14, 2019, 10:36:44 AM

to JanusGraph users

Hi Kushal,

Maybe not an answer to your question, but I believe you are looking in too much detail for an explanation. Use of indices is a very general technique, the simplest example being the HashMap in your daily Java program, which has the log(n) lookup performance you mention. So, you can imagine for yourself that the graphindex column family somehow persists a HashMap. There is no reason the Titan/JanusGraph people would select an implementation that does not scale as log(n).

The sort order from your question only refers to the order in which you get back the results.
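For a rough picture of why keeping things sorted matters for lookup cost, here is a small Python sketch. As far as I understand the data model docs, each vertex's row is kept in sort order, and that order is what lets a subset of the adjacency list be retrieved efficiently. The tuple layout `(label, sort_key_value, adjacent_vertex)` and the function `edges_in_range` are hypothetical simplifications, the sort key is assumed to be an integer, and the sample edges are loosely modelled on hercules in the Graph of the Gods.

```python
from bisect import bisect_left

# Hypothetical edge columns of a single vertex, kept sorted as
# (label, sort_key_value, adjacent_vertex). This is an illustration,
# not the byte layout JanusGraph writes to the backend.
edges = sorted([
    ("battled", 1, "nemean"),
    ("battled", 12, "cerberus"),
    ("battled", 2, "hydra"),
    ("father", 0, "jupiter"),
    ("mother", 0, "alcmene"),
])


def edges_in_range(columns, label, lo, hi):
    """Return edges with this label whose integer sort key is in [lo, hi]."""
    # Two binary searches locate the slice boundaries in O(log n) steps.
    # A shorter tuple sorts before any longer tuple it is a prefix of,
    # so (label, lo) lands on the first edge with sort key >= lo.
    start = bisect_left(columns, (label, lo))
    end = bisect_left(columns, (label, hi + 1))
    return columns[start:end]


print(edges_in_range(edges, "battled", 2, 12))
```

With n edges on the vertex and k matches, this costs O(log n + k) instead of scanning all n edges. A hash map, by contrast, gives fast point lookups but cannot answer a range constraint like "time between 2 and 12" without visiting every entry.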

HTH, Marc

On Friday, December 13, 2019 at 13:33:57 UTC+1, Kushal Agrawal wrote:
