Where is the 'distributed' part of JanusGraph?


jx ping

Feb 1, 2018, 1:00:16 AM
to JanusGraph users
I have been using JanusGraph for almost two weeks, and now I am wondering where the 'distributed' part of JanusGraph is.
As far as I know, the storage backend of JanusGraph can be HBase or Cassandra, and the index backend can be Elasticsearch, which is a distributed database with indexing.
Does that mean the 'distributed' part of JanusGraph is just the storage?
Can I run multiple JanusGraph instances in a cluster of more than 10 machines? If so, how do the instances communicate with each other, and how do they work together to finish one job?
If not, what should I do to scale JanusGraph so that it can serve other services? Should I build a project to manage a JanusGraph cluster? Could anyone help?

Misha Brukman

Feb 1, 2018, 2:42:48 AM
to jcbm...@gmail.com, JanusGraph users list
On Wed, Jan 31, 2018 at 10:00 PM jx ping <jcbm...@gmail.com> wrote:
I have been using JanusGraph for almost two weeks, and now I am wondering where the 'distributed' part of JanusGraph is.
As far as I know, the storage backend of JanusGraph can be HBase or Cassandra, and the index backend can be Elasticsearch, which is a distributed database with indexing.
Does that mean the 'distributed' part of JanusGraph is just the storage?
Can I run multiple JanusGraph instances in a cluster of more than 10 machines?

Yes, certainly. For example, you can create a cluster of N JanusGraph nodes, put a load balancer in front of that cluster, and use the load balancer's endpoint to distribute your graph queries.
 
If so, how do the instances communicate with each other, and how do they work together to finish one job?

JanusGraph nodes do not communicate with each other. For example, if you look at the JanusGraph+Cassandra docs, you'll see that JanusGraph nodes are independent and do not communicate amongst themselves, but only with their storage and indexing backends.
 
If not, what should I do to scale JanusGraph so that it can serve other services? Should I build a project to manage a JanusGraph cluster? Could anyone help?

Here are a few approaches:
  • build a JanusGraph container and deploy it on Kubernetes or another container manager of your choice, and scale it up and down as you need
  • build a custom VM image, e.g., using HashiCorp's Packer, and create several VMs from that image
  • build an RPM or DEB or other package, and install it on several machines to run JanusGraph from
In any case, as mentioned above, you'll probably want a load balancer in front of the JanusGraph cluster to distribute your queries.
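
As a rough illustration of what the load balancer does here, below is a minimal Python sketch of round-robin distribution across independent JanusGraph nodes. The node addresses and the `RoundRobinBalancer` class are hypothetical, made up for this example; in practice a dedicated load balancer would do this job rather than application code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend endpoints in rotation, one per request."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._rotation = cycle(self._endpoints)

    def next_endpoint(self):
        # Each call returns the next node in the rotation, wrapping around.
        return next(self._rotation)


# Hypothetical addresses of three independent JanusGraph nodes.
JANUSGRAPH_NODES = [
    "ws://janusgraph-0.example.internal:8182/gremlin",
    "ws://janusgraph-1.example.internal:8182/gremlin",
    "ws://janusgraph-2.example.internal:8182/gremlin",
]

balancer = RoundRobinBalancer(JANUSGRAPH_NODES)

# Five incoming requests are spread over the three nodes in order.
assigned = [balancer.next_endpoint() for _ in range(5)]
for endpoint in assigned:
    print(endpoint)
```

Because the nodes share no state with each other, any node can serve any request, which is why a simple rotation like this is enough.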

If you want to update or roll out new versions of this software, you will need a deployment management tool; for example, you can use Spinnaker or another tool of your choice for this.

Best,
Misha

jx ping

Feb 1, 2018, 10:15:20 PM
to JanusGraph users


On Thursday, February 1, 2018 at 3:42:48 PM UTC+8, Misha Brukman wrote:

Yes, certainly. For example, you can create a cluster of N JanusGraph nodes, put a load balancer in front of that cluster, and use the load balancer's endpoint to distribute your graph queries.
  
Do you have a suggestion for a load balancer? From what I see in the source code, JanusGraph uses only one thread to execute a traversal query. Have you implemented an asynchronous method, or has someone implemented a ThreadPoolExecutor for queries? If so, I could serve more people at the same time.

Misha Brukman

Feb 4, 2018, 8:20:35 PM
to jx ping, JanusGraph users
On Thu, Feb 1, 2018 at 10:15 PM, jx ping <jcbm...@gmail.com> wrote:
On Thursday, February 1, 2018 at 3:42:48 PM UTC+8, Misha Brukman wrote:

Yes, certainly. For example, you can create a cluster of N JanusGraph nodes, put a load balancer in front of that cluster, and use the load balancer's endpoint to distribute your graph queries.
  
Do you have a suggestion for a load balancer?

There are many options for load balancers, such as HAProxy and nginx, and many cloud providers also offer managed load balancer products. Here are some options: https://geekflare.com/open-source-load-balancer/
 
From what I see in the source code, JanusGraph uses only one thread to execute a traversal query. Have you implemented an asynchronous method, or has someone implemented a ThreadPoolExecutor for queries? If so, I could serve more people at the same time.

You can configure a thread pool for Gremlin Server to handle multiple concurrent requests:
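
To illustrate the general idea of a thread pool serving several requests at once, here is a minimal Python sketch. It is not Gremlin Server's implementation and not its configuration: Gremlin Server sizes its pools through its own server configuration, while this example only shows the concept. The `run_query` function is a hypothetical stand-in that fakes a result instead of contacting a graph, and `pool_size` is an assumed parameter name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_query(query):
    """Hypothetical stand-in for executing one graph query; returns a fake result."""
    return f"result of {query}"


def serve_concurrently(queries, pool_size=4):
    """Run each query on a worker thread and return results in submission order."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # pool.map preserves the order of the input queries in its output.
        return list(pool.map(run_query, queries))


# Two independent requests handled by the same pool.
results = serve_concurrently(["g.V().count()", "g.E().count()"])
for line in results:
    print(line)
```

With a pool like this, a slow request occupies one worker while the other workers keep serving new requests, up to the pool size.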

If you're asking about multi-threaded processing for a single transaction, see:

Hope this helps,
Misha

lakshay....@gmail.com

Oct 8, 2018, 3:05:23 AM
to JanusGraph users
Hi,
I used nginx for load balancing. One thing I can't figure out is what the URL should be when connecting to Gremlin using DriverRemoteConnection in Python. Currently it is x.x.x.x:8182/gremlin.