Deploy Stardog 2.2.2 as a Cluster


Phil C

Nov 13, 2014, 8:05:55 AM
to sta...@clarkparsia.com
Hi,

I've read through the answers to Stardog-2.2.2 Cluster Deployment, but it seems to be a different set of problems to what we're seeing.

First of all, it looks like we have to use a version of RHEL internally and are unable to use Ubuntu for much other than very localised testing.

I tried to set up a 3-node Stardog cluster using the automated process described in the Stardog docs - Setting up a cluster with Starman - but this seems to have failed on what looks like an OS check (it expects Ubuntu everywhere, but finds RHEL).

After this I tried to set up using the manual process described on GitHub. I followed this process and got a 2 node cluster working on RHEL, but it fell over some time during the following weekend.

Next step was to try the same process over 3 separate RHEL instances.

These are all configured as per the instructions and start up, but after that I can't see them externally on port 5820.

I have run (on each node):

./stardog-admin cluster zkstart  --home /data/stardog/2.2.2/cluster-node

./stardog-admin server start --home /data/stardog/2.2.2/cluster-node --port 6000

Then I tried on a single node:

./stardog-admin cluster proxystart --zkconnstr 192.168.2.1:2180,192.168.2.2:2180,192.168.2.3:2180 --user pack --password admin --port 5820

Things seem to have started up somewhat, but I'm not sure how much and if I'm seeing the right messages.

I tried with the above on each node too, with no difference.
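
One rough check of whether the ZooKeeper ensemble itself is up, independent of Stardog, is ZooKeeper's four-letter-word 'stat' command against the client port (the hosts and port 2180 here are from my config, and this assumes nc is installed):

for h in 192.168.2.1 192.168.2.2 192.168.2.3; do
  echo "== $h =="
  echo stat | nc "$h" 2180 | grep Mode   # "Mode: leader" or "Mode: follower" means that node is serving
done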

When I got to the next step:

./stardog-admin db create -n dm

This just bails with the message:

Internal security error.

There is nothing in the logs about what type of error this is.

Each machine in the cluster can perform keyless ssh to the next, and each machine has passwordless sudo.

What are we doing wrong?

Many thanks,

Phil.



Fernando Hernandez

Nov 13, 2014, 3:08:15 PM
to sta...@clarkparsia.com
Hi,

On Thu, Nov 13, 2014 at 8:05 AM, Phil C <philip...@epeirogenic.com> wrote:
Hi,

I've read through the answers to Stardog-2.2.2 Cluster Deployment, but it seems to be a different set of problems to what we're seeing.

First of all, it looks like we have to use a version of RHEL internally and are unable to use Ubuntu for much other than very localised testing.

I tried to set up a 3-node Stardog cluster using the automated process described in the Stardog docs - Setting up a cluster with Starman, but this seems to have failed with what looks like a check of the OS (looking for Ubuntu everywhere, but then finding RHEL)

After this I tried to set up using the manual process described on GitHub. I followed this process and got a 2 node cluster working on RHEL, but it fell over some time during the following weekend.

Next step was to try the same process over 3 separate RHEL instances.

These are all configured as per the instructions and start up, but after that I can't see them externally on port 5820.

I have run (on each node):

./stardog-admin cluster zkstart  --home /data/stardog/2.2.2/cluster-node

./stardog-admin server start --home /data/stardog/2.2.2/cluster-node --port 6000

Then I tried on a single node:

./stardog-admin cluster proxystart --zkconnstr 192.168.2.1:2180,192.168.2.2:2180,192.168.2.3:2180 --user pack --password admin --port 5820
Things seem to have started up somewhat, but I'm not sure how much and if I'm seeing the right messages.

If you set the debug level to 'INFO' it should tell you something about the state of the cluster when you start any of these components (in stardog.log, proxy.log, or zookeeper.log).
 

I tried with the above on each node too, with no difference.

When I got to the next step:

./stardog-admin db create -n dm

This just bails with the message:

Internal security error.

Are you able to issue a 'pack info' command – or other query commands –  to any of the Stardog nodes directly or through the proxy? 

E.g.

stardog-admin --server snarl://<ipaddress>:5820/ pack info

Cheers,
Fernando

Phil C

Nov 14, 2014, 7:18:31 AM
to sta...@clarkparsia.com, fern...@clarkparsia.com
Hi Fernando,

Thanks for the suggestions?


On Thursday, 13 November 2014 20:08:15 UTC, Fernando Hernandez wrote:

If you set the debug level to 'INFO' it should tell you something about the state of the cluster when you start any of these components (in stardog.log, proxy.log, or zookeeper.log).

This sounds like a great idea, but the docs are somewhat light on how to do this (unless I've missed something) - what property needs to be set?
 
 


Are you able to issue a 'pack info' command – or other query commands –  to any of the Stardog nodes directly or through the proxy? 

E.g.

stardog-admin --server snarl://<ipaddress>:5820/ pack info

I tried this, as per the documentation, but "pack" is an unrecognized command:

/data/stardog/2.2.2/bin> stardog-admin --server snarl://192.186.2.1:5820/ pack info
Unknown command pack

/data/stardog/2.2.2/bin> ./stardog-admin --server snarl://192.186.2.1:5820/ cluster info
Internal security error.



Phil C

Nov 14, 2014, 7:20:06 AM
to sta...@clarkparsia.com, fern...@clarkparsia.com
Hi,


On Friday, 14 November 2014 12:18:31 UTC, Phil C wrote:
Hi Fernando,

Thanks for the suggestions! The original version of this wasn't supposed to end with a question mark! 

Fernando Hernandez

Nov 14, 2014, 9:27:35 AM
to sta...@clarkparsia.com
On Fri, Nov 14, 2014 at 7:20 AM, Phil C <philip...@epeirogenic.com> wrote:
Hi,

On Friday, 14 November 2014 12:18:31 UTC, Phil C wrote:
Hi Fernando,

Thanks for the suggestions! The original version of this wasn't supposed to end with a question mark! 

On Thursday, 13 November 2014 20:08:15 UTC, Fernando Hernandez wrote:

If you set the debug level to 'INFO' it should tell you something about the state of the cluster when you start any of these components (in stardog.log, proxy.log, or zookeeper.log).

This sounds like a great idea, but the docs are somewhat light on how to do this (unless I've missed something) - what property needs to be set?

You can just create a 'logging.properties' file – if there isn't one – in your $STARDOG_HOME. The following lines should do it:

handlers = java.util.logging.ConsoleHandler java.util.logging.FileHandler
java.util.logging.FileHandler.level = INFO
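
For example, something like this on each node should be enough (the ConsoleHandler line is optional, and these are all standard java.util.logging keys rather than anything Stardog-specific):

cd /data/stardog/2.2.2/cluster-node    # i.e. the directory you pass as --home
cat > logging.properties <<'EOF'
handlers = java.util.logging.ConsoleHandler java.util.logging.FileHandler
java.util.logging.ConsoleHandler.level = INFO
java.util.logging.FileHandler.level = INFO
EOF

Then restart the components and the INFO messages should show up in the logs.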
 
 
 


Are you able to issue a 'pack info' command – or other query commands –  to any of the Stardog nodes directly or through the proxy? 

E.g.

stardog-admin --server snarl://<ipaddress>:5820/ pack info

I tried this, as per the documentation, but "pack" is an unrecognized command:

/data/stardog/2.2.2/bin> stardog-admin --server snarl://192.186.2.1:5820/ pack info
Unknown command pack

/data/stardog/2.2.2/bin> ./stardog-admin --server snarl://192.186.2.1:5820/ cluster info
Internal security error.


It looks like the stardog-admin script is using a different Stardog version than the one you're executing the script from. Can you verify that your $STARDOG environment variable is either empty or that it points to '/data/stardog/2.2.2'?
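
A quick way to check (adjust the install path if yours differs):

echo "$STARDOG"                                 # should print nothing, or /data/stardog/2.2.2
which stardog-admin                             # check which copy of the script is first on your PATH
/data/stardog/2.2.2/bin/stardog-admin version   # invoking it by full path avoids any PATH/env confusion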

Cheers,
Fernando

Phil C

Nov 14, 2014, 9:45:52 AM
to sta...@clarkparsia.com, fern...@clarkparsia.com
Thanks for the help with the logging - I'll give that a go.
 

You can just create a 'logging.properties' file – if there isn't one – in your $STARDOG_HOME. The following lines should do it:

handlers = java.util.logging.ConsoleHandler java.util.logging.FileHandler
java.util.logging.FileHandler.level = INFO
 
 


It looks like the stardog-admin script is using a different Stardog version than the one you're executing the script from. Can you verify that your $STARDOG environment variable is either empty or that it points to '/data/stardog/2.2.2'?

Running 'stardog-admin version' gives 'Stardog 2.2.2', and echoing $STARDOG from the stardog-admin script gives '/data/stardog/2.2.2/bin/..', so essentially the same thing.

I'll put some more logging in and we can see if that gives anything useful.

Thanks,

Phil.
 

Phil C

Nov 14, 2014, 9:58:43 AM
to sta...@clarkparsia.com, fern...@clarkparsia.com
After starting everything up, I see the following in zookeeper.log on 192.168.2.1 (node1):

WARNING: Cannot open channel to 2 at election address /192.168.2.2:7888
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:388)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:765)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)

Nov 14, 2014 2:51:32 PM org.apache.zookeeper.server.quorum.QuorumCnxManager connectOne
WARNING: Cannot open channel to 3 at election address /192.168.2.3:5888
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:388)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:765)

When I look at the nodes themselves, it looks like 192.168.2.2 is listening on 7888 and 192.168.2.3 is listening on 5888 (checked with netstat -tulpn).

Does this give more of a clue?

Thanks,

Phil.

Fernando Hernandez

Nov 14, 2014, 10:21:32 AM
to Phil C, sta...@clarkparsia.com
Are those ports reachable from each machine? A quick way to test is with telnet, for example: telnet 192.168.2.2 7888.
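
A loop version of the same check, using bash's built-in /dev/tcp so telnet isn't required (the host:port pairs here are just the two election addresses from your zookeeper.log warnings - add or change them to match your zookeeper.properties):

for hp in 192.168.2.2:7888 192.168.2.3:5888; do
  host=${hp%:*}; port=${hp#*:}
  if (echo > "/dev/tcp/$host/$port") 2>/dev/null; then
    echo "$hp reachable"
  else
    echo "$hp NOT reachable"
  fi
done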

Another reason could be that the ZooKeeper data folder doesn't contain a 'myid' file with the ID that corresponds to that node; so if in your zookeeper.properties you have:

tickTime=2000
dataDir=/path/to/data/
clientPort=2180
initLimit=5
syncLimit=2
server.1=192.168.2.1:2888:3888
server.2=192.168.2.2:6888:7888
server.3=192.168.2.3:6888:7888

then the contents of the file /path/to/data/myid at 192.168.2.1 should be '1' (just the character 1, which corresponds to the ID in 'server.1'). Similarly, at 192.168.2.2 and 192.168.2.3 you should have the IDs 2 and 3, respectively.
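
So, assuming the dataDir from the example above (substitute your actual path), the setup on each node is just:

echo 1 > /path/to/data/myid    # on 192.168.2.1
echo 2 > /path/to/data/myid    # on 192.168.2.2
echo 3 > /path/to/data/myid    # on 192.168.2.3

That's plain ZooKeeper behaviour rather than anything Stardog-specific.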

Cheers,
Fernando

Phil C

Nov 14, 2014, 10:48:40 AM
to sta...@clarkparsia.com, philip...@epeirogenic.com, fern...@clarkparsia.com
I was hoping this might be the answer, but my zookeeper.properties looks like:

# zookeeper.properties
tickTime=2000
dataDir=/tmp/zookeeperdata/
clientPort=2180
initLimit=5
syncLimit=2
server.1=192.168.2.1:2888:3888
server.2=192.168.2.2:6888:7888
server.3=192.168.2.3:4888:5888

Which looks reasonably OK.

Fernando Hernandez

Nov 14, 2014, 10:58:26 AM
to Phil C, sta...@clarkparsia.com
Were you able to verify that the ports are reachable from the other machine and that the IDs correspond to the ones in the zookeeper.properties file?

Phil C

Nov 14, 2014, 11:02:39 AM
to sta...@clarkparsia.com, philip...@epeirogenic.com, fern...@clarkparsia.com
Yes

Fernando Hernandez

Nov 17, 2014, 11:26:09 AM
to Phil C, sta...@clarkparsia.com
Phil,

It would probably be easier if you try VirtualBox or AWS with Starman first, just to get started with Stardog as a cluster. Then we can figure out the details of your Red Hat setup.


Cheers,
Fernando