ArangoDB-3.1.9 clustering not working !!

gurpreet....@gmail.com

Feb 3, 2017, 11:15:21 PM
to ArangoDB
Team,

I am testing ArangoDB clustering on a 3-machine setup, following the ArangoDB documentation (https://docs.arangodb.com/3.1/Manual/Deployment/Distributed.html).

Below are the commands I ran on each machine, one by one, but the agency/cluster does not come up across the machines.


10.0.1.52:
arangod --server.endpoint tcp://0.0.0.0:5001 --server.authentication false --agency.activate true --agency.size 3 --agency.supervision true --database.directory agency1 &

10.0.1.137:
arangod --server.endpoint tcp://0.0.0.0:5002 --server.authentication false --agency.activate true --agency.size 3 --agency.supervision true --database.directory agency2 &

10.0.1.228:
arangod --server.endpoint tcp://0.0.0.0:5003 --server.authentication false --agency.activate true --agency.size 3 --agency.endpoint tcp://10.0.1.52:5001 --agency.endpoint tcp://10.0.1.137:5002 --agency.endpoint tcp://10.0.1.228:5003 --agency.supervision true --database.directory agency3 &


10.0.1.52:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 --cluster.my-address tcp://10.0.1.52:8529 --cluster.my-local-info db1 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory primary1 &

10.0.1.137:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8530 --cluster.my-address tcp://10.0.1.137:8530 --cluster.my-local-info db2 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory primary2 &

10.0.1.228:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8531 --cluster.my-address tcp://10.0.1.228:8531 --cluster.my-local-info coord1 --cluster.my-role COORDINATOR --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory coordinator &



I get the errors below every time, on all machines:

2017-02-04T02:47:12Z [29952] INFO {agency} f5c1433a-cec2-4089-829e-1c42fef92942: following  in term 58
2017-02-04T02:47:13Z [29952] INFO {agency} f5c1433a-cec2-4089-829e-1c42fef92942: candidating in term 58
2017-02-04T02:47:13Z [29952] INFO {cluster} cannot create connection to server '' at endpoint 'tcp://localhost:5002'
2017-02-04T02:47:13Z [29952] INFO {cluster} cannot create connection to server '' at endpoint 'tcp://localhost:5001'
2017-02-04T02:47:13Z [29952] INFO {agency} f5c1433a-cec2-4089-829e-1c42fef92942: following  in term 59
2017-02-04T02:47:15Z [29952] INFO {agency} f5c1433a-cec2-4089-829e-1c42fef92942: candidating in term 59
2017-02-04T03:28:47Z [17164] ERROR {cluster} cannot create connection to server '' at endpoint 'tcp://localhost:5003'
2017-02-04T03:28:47Z [17164] ERROR {cluster} cannot create connection to server '' at endpoint 'tcp://localhost:5001'
2017-02-04T03:28:51Z [17164] ERROR {cluster} cannot create connection to server '' at endpoint 'tcp://localhost:5003'


Can anybody help?

and...@arangodb.com

Feb 6, 2017, 10:26:18 AM
to ArangoDB
Hi,

The documentation was wrong. We forgot to recheck these parts when we did the 3.1 release.

The missing piece in this setup is --agency.my-address for the agents (it is present in the local setup example but not in the distributed one).

Without it, the agents advertise that they are reachable via localhost:*, which is of course wrong.

This should work:

arangod --server.endpoint tcp://0.0.0.0:5001 --agency.my-address tcp://10.0.1.52:5001 --server.authentication false --agency.activate true --agency.size 3 --agency.supervision true --database.directory agency1
arangod --server.endpoint tcp://0.0.0.0:5002 --agency.my-address tcp://10.0.1.137:5002 --server.authentication false --agency.activate true --agency.size 3 --agency.supervision true --database.directory agency2
arangod --server.endpoint tcp://0.0.0.0:5003 --agency.my-address tcp://10.0.1.228:5003 --server.authentication false --agency.activate true --agency.size 3 --agency.endpoint tcp://10.0.1.52:5001 --agency.endpoint tcp://10.0.1.137:5002 --agency.endpoint tcp://10.0.1.228:5003 --agency.supervision true --database.directory agency3
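
As a rough sketch of the pattern in the three commands above: every agent gets the same flags, and only the routable IP, the port, the database directory and the optional --agency.endpoint list differ. The helper name `agent_cmd` is hypothetical and is not part of ArangoDB; it only prints the command line, built from the flags already shown in this thread, so the --agency.my-address value can be checked before anything is started.

```shell
#!/bin/sh
# agent_cmd is a hypothetical helper (not an ArangoDB tool).
# It prints the arangod command line for one agent and starts nothing.
#   $1 = routable IP of this machine (must not be localhost)
#   $2 = agent port
#   $3 = database directory
#   remaining arguments = optional --agency.endpoint values
agent_cmd() {
  ip="$1"; port="$2"; dir="$3"
  shift 3
  cmd="arangod --server.endpoint tcp://0.0.0.0:${port}"
  # The address the agent advertises to the other agents.
  cmd="${cmd} --agency.my-address tcp://${ip}:${port}"
  cmd="${cmd} --server.authentication false --agency.activate true"
  cmd="${cmd} --agency.size 3"
  for ep in "$@"; do
    cmd="${cmd} --agency.endpoint ${ep}"
  done
  cmd="${cmd} --agency.supervision true --database.directory ${dir}"
  printf '%s\n' "$cmd"
}

agent_cmd 10.0.1.52 5001 agency1
agent_cmd 10.0.1.137 5002 agency2
agent_cmd 10.0.1.228 5003 agency3 \
  tcp://10.0.1.52:5001 tcp://10.0.1.137:5002 tcp://10.0.1.228:5003
```

Each printed line can be compared against the command for the matching machine; the address after --agency.my-address should be one the other two machines can reach.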

Wilfried Gösgens

Feb 6, 2017, 12:45:25 PM
to ArangoDB
Hi,
please note that there have been numerous bugfixes in 3.1.10 and that you should upgrade your installation.

Cheers,
Willi

gurpreet....@gmail.com

Feb 7, 2017, 11:02:00 AM
to ArangoDB
I tried those 3 steps and they work. But the remaining steps below still do not work properly:


10.0.1.52:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 --cluster.my-address tcp://10.0.1.52:8529 --cluster.my-local-info db1 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory /var/lib/arangodb3/databases/primary1 &

10.0.1.137:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8530 --cluster.my-address tcp://10.0.1.137:8530 --cluster.my-local-info db2 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory /var/lib/arangodb3/databases/primary2 &

10.0.1.228:
arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8531 --cluster.my-address tcp://10.0.1.228:8531 --cluster.my-local-info coord1 --cluster.my-role COORDINATOR --cluster.agency-endpoint tcp://10.0.1.52:5001 --cluster.agency-endpoint tcp://10.0.1.137:5002 --cluster.agency-endpoint tcp://10.0.1.228:5003 --database.directory /var/lib/arangodb3/databases/coordinator &


They show the errors below across the 3 nodes:

10.0.1.52:
2017-02-07T15:10:26Z [46675] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-02-07T15:10:26Z [46675] INFO Starting up with role PRIMARY
2017-02-07T15:10:26Z [46675] INFO {cluster} Fresh start. Persisting new UUID PRMR-6a191801-b72d-424a-872e-0bdfdd2c9d3b
2017-02-07T15:10:26Z [46675] INFO file-descriptors (nofiles) hard limit is 65536, soft limit is 1024
2017-02-07T15:10:26Z [46675] FATAL cannot lock the database directory, please check the lock file '/var/lib/arangodb3/primary1/LOCK': system error

---
10.0.1.137:
2017-02-07T15:11:05Z [37975] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-02-07T15:11:05Z [37975] DEBUG file-descriptors (nofiles) hard limit is 65536, soft limit is 1024
2017-02-07T15:11:05Z [37975] INFO Starting up with role PRIMARY
2017-02-07T15:11:05Z [37975] DEBUG connecting to ip endpoint 'http+tcp://10.0.1.52:5001'
2017-02-07T15:11:05Z [37975] DEBUG connecting to ip endpoint 'http+tcp://10.0.1.137:5002'
2017-02-07T15:11:05Z [37975] INFO {cluster} Fresh start. Persisting new UUID PRMR-23c15420-9e30-42ed-94b6-bd1632582e63
2017-02-07T15:11:05Z [37975] DEBUG permanently changing the uid to 999
2017-02-07T15:11:05Z [37975] INFO file-descriptors (nofiles) hard limit is 65536, soft limit is 1024
2017-02-07T15:11:05Z [37975] DEBUG using default language 'en_US'
2017-02-07T15:11:05Z [37975] FATAL cannot lock the database directory, please check the lock file '/var/lib/arangodb3/primary2/LOCK': system erro
----
10.0.1.228:
2017-02-07T15:11:45Z [38310] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-02-07T15:11:45Z [38310] DEBUG file-descriptors (nofiles) hard limit is 65536, soft limit is 1024
2017-02-07T15:11:45Z [38310] INFO Starting up with role COORDINATOR
2017-02-07T15:11:45Z [38310] DEBUG connecting to ip endpoint 'http+tcp://10.0.1.52:5001'
2017-02-07T15:11:45Z [38310] DEBUG connecting to ip endpoint 'http+tcp://10.0.1.137:5002'
2017-02-07T15:11:45Z [38310] INFO {cluster} Fresh start. Persisting new UUID CRDN-75fa7168-ab79-4446-a083-63720507d4b5
2017-02-07T15:11:45Z [38310] INFO Waiting for DBservers to show up...
2017-02-07T15:11:45Z [38310] INFO Found 3 DBservers.
2017-02-07T15:11:45Z [38310] DEBUG permanently changing the uid to 999
2017-02-07T15:11:45Z [38310] INFO file-descriptors (nofiles) hard limit is 65536, soft limit is 1024
2017-02-07T15:11:45Z [38310] DEBUG using default language 'en_US'
2017-02-07T15:11:45Z [38310] FATAL cannot lock the database directory, please check the lock file '/var/lib/arangodb3/coordinator/LOCK': system error

Why is this error coming up?

AYUSH RASTOGI

Feb 24, 2017, 7:47:35 AM
to ArangoDB
I am facing the same problem. Please fix it ASAP.

m...@arangodb.com

Feb 24, 2017, 9:36:32 AM
to ArangoDB
It seems your servers cannot write to their database directories. Either some other, already running server holds a lock on these directories, or the permissions are wrong. Which user are you starting the processes as? Which user owns the directories and any files in them? What are the permission bits?
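
One way to collect the answers to these questions is sketched below. The function name `describe_dbdir` is hypothetical, and the path in the usage comments is taken from the FATAL messages earlier in the thread; adjust it to the directory your server actually reports.

```shell
#!/bin/sh
# describe_dbdir is a hypothetical helper, not an ArangoDB tool.
# It shows owner, group and permission bits of a database directory
# and reports whether a LOCK file exists inside it.
describe_dbdir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # Owner, group and permission bits of the directory itself.
  ls -ld "$dir"
  if [ -e "$dir/LOCK" ]; then
    # Owner and permission bits of the lock file.
    ls -l "$dir/LOCK"
  else
    echo "no LOCK file in $dir"
  fi
}

# Example usage (path assumed from the error message above):
#   describe_dbdir /var/lib/arangodb3/primary1
#   id                        # the user you start arangod as
#   ps -ef | grep '[a]rangod' # any arangod that is already running
```

If the owner shown by `ls` differs from the user the server ends up running as, or another arangod already uses the same directory, that would be consistent with the lock error.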

In the meantime we have devised a new tool to simplify cluster startup, see https://github.com/arangodb-helper/ArangoDBStarter for details. We recommend using this tool.

AYUSH RASTOGI

Feb 27, 2017, 2:32:14 AM
to ArangoDB
Hello Sir,
I tried this tool. I downloaded the source code as a tar.gz, but I am not able to run make local. I am a beginner with clusters and ArangoDB. Please help me.
Thanks

Mahesh Kumar

Feb 27, 2017, 5:19:42 AM
to ArangoDB
After changing the ownership and group to "arangodb", the service started properly:

"/var/lib/arangodb3/primary1" is created when you run the following command on server1 (it still fails with "FATAL cannot lock the database directory ..."):

arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 --cluster.my-address tcp://192.168.1.1:8529 --cluster.my-local-info db1 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://192.168.1.1:5001 --cluster.agency-endpoint tcp://192.168.1.2:5001 --cluster.agency-endpoint tcp://192.168.1.3:5001 --database.directory primary1 &
Now modify the group/ownership of the folder:
  # chown arangodb:arangodb /var/lib/arangodb3/primary1

Kill the server (the process is named arangod):
  # killall -9 arangod

Start it again.

Repeat the same for the other two nodes.

AYUSH RASTOGI

Feb 27, 2017, 5:25:00 AM
to ArangoDB
Thanks, it's working fine now