Creating a JanusGraph cluster


Dilan Ranasinghe

Sep 20, 2017, 6:05:18 AM
to JanusGraph users
Hello,

I'm currently struggling to create a JanusGraph cluster. I have read the official documentation, but it is still not clear to me how a cluster is created.

As I understand it, most of the documentation refers to the HBase or Cassandra cluster as the JanusGraph cluster.

My question is: how can we create a JanusGraph/Gremlin Server cluster?
     1) Do I only need to run the JanusGraph/Gremlin Servers in the same network so that they form a cluster automatically?
     2) Or is there a way to configure a cluster so that the JanusGraph instances communicate with each other?
     3) If there is no communication between the JanusGraph/Gremlin Servers, can't there be transaction issues? For example, one server adds a new node to a graph while another server adds a node at the same time, which could cause an inconsistency.

Thanks and regards,
Dilan.

Ankur Goel

Sep 21, 2017, 5:28:52 AM
to JanusGraph users
The best approach is to create the Cassandra cluster and the Solr/ES cluster individually.

Then connect the JanusGraph instance to the respective Cassandra/ES/Solr cluster.

I am not sure how to run multiple instances of JanusGraph Server. Right now I am using a single instance.

~AnkurG

Robert Dale

Sep 21, 2017, 1:52:23 PM
to Ankur Goel, JanusGraph users
JanusGraph (Gremlin) Servers are independent, standalone servers that do not communicate with each other. However, they can be clustered, for example for load balancing.

For your data consistency questions, you should read the documentation with special attention to HBase and data consistency. One topic that may be overlooked is the ID allocation options: http://docs.janusgraph.org/latest/config-ref.html#_ids

If there is something that the documentation doesn't address, let us know what should be enhanced.


Robert Dale

--
You received this message because you are subscribed to the Google Groups "JanusGraph users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-users/3ff40ca8-1ea4-45cd-a48b-f9c066231d27%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Dilan Ranasinghe

Sep 21, 2017, 10:40:02 PM
to JanusGraph users
Thanks.

I'm confused about how your explanation fits with the section http://docs.janusgraph.org/0.1.0/failure-recovery.html#_janusgraph_instance_failure . It says "However, some schema related operations - such as installing indexes - require the coordination of all JanusGraph instances". If the servers don't communicate with each other, is this coordination achieved through the underlying backend (HBase)?

Dilan.



Jason Plurad

Sep 22, 2017, 9:07:55 AM
to JanusGraph users
Correct, the coordination is achieved through the underlying storage backend.

Documentation on the engine internals is pretty sparse. This is an old reference, but has some decent pointers.

https://github.com/BillBaird/delftswa-aurelius-titan
Message has been deleted

Ankur Goel

Dec 18, 2017, 11:36:37 PM
to Liping Huang, JanusGraph users
I have not seen load-balancing properties in JG for Cassandra.

For failover:
config.set("storage.hostname", "IP,IP,IP");

This can be used; with it, JG connects to the next IP if the previous one is not reachable. The list only provides contact points through which the driver discovers the nodes participating in the cluster.


I have not seen any property like withLoadBalancingPolicy(new RoundRobinPolicy()) in JG.

Currently JG connects to a single coordinator node for all operations.

~





On Tue, Dec 19, 2017 at 8:46 AM, Liping Huang <liping.h...@gmail.com> wrote:
So we can use JG with load balancing, but it is actually the storage that is clustered, right? But how do we connect to the Cassandra cluster? From the Java driver, it seems that only one Cassandra instance can be connected:

// First configure the graph
JanusGraphFactory.Builder config = JanusGraphFactory.build();
config.set("storage.backend", "cassandrathrift");
config.set("storage.hostname", "IP"); // IP address where Cassandra is installed; how to connect to more instances?
config.set("storage.port", "9160");
config.set("storage.username", "cassandra");
config.set("storage.password", "cassandra");
config.set("storage.cassandra.keyspace", "janusgraph");


For ES, more IPs can be configured here:
// Elasticsearch config
config.set("index.search.backend", "elasticsearch");
config.set("index.search.hostname", "IP, IP, IP");




lakshay....@gmail.com

Oct 5, 2018, 5:13:30 AM
to JanusGraph users
Hi, I wanted to know how JanusGraph can be clustered for load balancing purposes.



Florian Hockmann

Oct 5, 2018, 5:24:48 AM
to JanusGraph users
Have you read the part of the documentation about deployment? You can deploy JanusGraph in any of the described scenarios, or with a combination of those. And, as described in the linked chapter, it's also possible to put a load balancer in front of the JanusGraph Server instances.

Lakshay Sharma

Oct 5, 2018, 5:35:43 AM
to janusgra...@googlegroups.com
Hi,
Thanks for the reply. When you mention a "load balancer", do you mean load balancers like nginx? And could you share a configuration for that?
Regards,
Lakshay Sharma


Florian Hockmann

Oct 5, 2018, 6:03:18 AM
to JanusGraph users
Yes, nginx works for example. Our nginx config for JanusGraph looks like this:

upstream upstream {
    server janusgraph:8182;
}

server {
    listen 8182 ssl;

    server_name janusgraph-proxy.example.com;

    ssl_certificate     /etc/nginx/cert.crt;
    ssl_certificate_key /etc/nginx/cert.key;

    location / {
        proxy_pass http://upstream;
        proxy_http_version 1.1;
    }
}

If you don't want to use SSL, then the config becomes even simpler. Note that we only have one server registered as upstream because that's the DNS name for all JanusGraph Server Docker containers in our setup. That allows us to scale the number of JanusGraph Server instances without having to change the nginx config. If you don't use Docker, then you can simply provide a list of your servers there.

This config makes your JanusGraph Servers available on janusgraph-proxy.example.com.
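
For a setup without a shared Docker DNS name, the list of servers mentioned above might look like the following sketch. The host names janusgraph-1.example.com to janusgraph-3.example.com and the upstream name janusgraph_servers are placeholders, not part of the setup described in this thread. By default nginx distributes requests across the listed servers round-robin.

```nginx
# Hypothetical upstream block listing each JanusGraph Server explicitly.
upstream janusgraph_servers {
    server janusgraph-1.example.com:8182;
    server janusgraph-2.example.com:8182;
    server janusgraph-3.example.com:8182;
}

# In the server block, proxy_pass would then point at this name:
#     proxy_pass http://janusgraph_servers;
```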

Lakshay Sharma

Oct 5, 2018, 6:08:40 AM
to janusgra...@googlegroups.com
Hi Florian,

Thank you for sharing the config. I have one more question: the above configuration would work for running JanusGraph Server in webserver (ws) mode, right?

Regards,
Lakshay Sharma

Florian Hockmann

Oct 5, 2018, 6:11:01 AM
to JanusGraph users
WS usually stands for WebSocket and I assume that that's what you mean. Yes, it works with WebSockets. That's how we are using it in my team and we didn't have any problems with nginx (Traefik however didn't work with WebSockets which is what we tried first).
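
For reference, nginx's documentation on WebSocket proxying describes setting the Upgrade and Connection headers explicitly, because they are hop-by-hop headers that are not forwarded to the proxied server by default. The config quoted earlier in the thread does not show these lines, so the location block below is an assumption about what a WebSocket-capable proxy in front of Gremlin Server may need, not the exact config in use:

```nginx
location / {
    proxy_pass http://upstream;
    proxy_http_version 1.1;

    # Forward the WebSocket handshake headers to the JanusGraph Server.
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```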

Lakshay Sharma

Oct 5, 2018, 6:12:59 AM
to janusgra...@googlegroups.com
Hi,
Sorry for the wrong abbreviation. Thank you for your config, I will try it out.

Regards,
Lakshay Sharma

Lakshay Sharma

Oct 8, 2018, 2:51:29 AM
to janusgra...@googlegroups.com
Hi,
I was trying out load balancing and have one question: what will the URL be when connecting to Gremlin using DriverRemoteConnection in Python? Currently it is
x.x.x.x:8182/gremlin. Thanks for the help.

Florian Hockmann

Oct 8, 2018, 4:18:51 AM
to JanusGraph users

It's the same. You just need to set x.x.x.x to the hostname under which your nginx proxy is reachable. In my example earlier it was janusgraph-proxy.example.com.
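
As a rough sketch of what that looks like from Python: only the host part of the URL changes. The helper names gremlin_url and connect are made up for this example, "g" is assumed to be the traversal source name configured on the server, and the exact gremlinpython entry point (Graph().traversal().withRemote here) depends on the driver version you have installed.

```python
def gremlin_url(host: str, port: int = 8182, tls: bool = True) -> str:
    """Build the Gremlin Server WebSocket URL for a given host."""
    # Use wss when the proxy terminates SSL, plain ws otherwise.
    scheme = "wss" if tls else "ws"
    return f"{scheme}://{host}:{port}/gremlin"


def connect(url: str):
    """Open a remote traversal source; requires the gremlinpython package."""
    # Imported here so gremlin_url() can be used without gremlinpython installed.
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.structure.graph import Graph

    # "g" is the traversal source name on the server (assumed default).
    return Graph().traversal().withRemote(DriverRemoteConnection(url, "g"))


# Direct connection to a single server, as before:
#   g = connect(gremlin_url("x.x.x.x", tls=False))
# Through the nginx proxy from the example config (SSL on port 8182):
#   g = connect(gremlin_url("janusgraph-proxy.example.com"))
```

The application then only knows the proxy's host name, so JanusGraph Server instances can be added or removed behind nginx without touching client code.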

Please don't post the same question multiple times here on the list. That won't make it any likelier that you get an answer but it makes it harder for others in the future who have the same problem to find an answer.

andy.r....@gmail.com

Oct 9, 2018, 5:19:21 AM
to JanusGraph users
Can I ask what database you're using for your JanusGraph cluster and how it's hosted? We're looking to achieve basically the same thing with JanusGraph containers within Kubernetes.

Lakshay Sharma

Oct 9, 2018, 5:30:33 AM
to janusgra...@googlegroups.com
Hi,
I am using Cassandra, hosted as a 3-node cluster. I am running JanusGraph instances on top of that, load balanced using nginx.

Florian Hockmann

Oct 9, 2018, 8:19:37 AM
to JanusGraph users
Sure, we're using ScyllaDB as the storage backend and Elasticsearch as our index backend. You can find our full architecture here. But it doesn't really matter for nginx which backends you have behind JanusGraph as nginx should only handle the traffic between your applications and JanusGraph. JanusGraph will then communicate with your backend.
Message has been deleted

andy.r....@gmail.com

Oct 10, 2018, 5:34:41 AM
to JanusGraph users
Thank you Florian, that's very helpful. Can I ask what you're using Apache Spark for?

Florian Hockmann

Oct 10, 2018, 10:03:13 AM
to JanusGraph users
Currently, we're not actively using Spark, in part because of #1228, but also because we currently don't have a use case that requires it. We used it before mainly for a research project in which a student wrote his master's thesis in our team on whether we can classify malware with the data in our graph database, based on TinkerPop VertexPrograms. You can find a short overview of this in the slides I linked earlier and a much more complete description in his master's thesis. We will probably also publish a blog post about this topic in the coming weeks or months.

I guess, in general machine learning / analysis queries and clean-up jobs are the two most frequent use cases where Spark makes sense in the context of TinkerPop.

andy.r....@gmail.com

Oct 11, 2018, 3:55:13 AM
to JanusGraph users
Thanks again Florian, you've been very helpful. I'd be very interested in reading that blog post!