node for /job-schedule/scheduler isn't created on zookeeper

art...@appsflyer.com

Dec 2, 2015, 5:46:21 PM
to Onyx
I'm trying to run a simple job locally against a local Apache ZooKeeper v3.4.6 server.

This is my peer-config:

{:zookeeper/address "127.0.0.1:2181"
 :onyx/id "001"
 :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
 :onyx.messaging.aeron/embedded-driver? true
 :onyx.messaging/allow-short-circuit? false
 :onyx.messaging/impl :aeron
 :onyx.messaging/peer-port 40200
 :onyx.messaging/bind-addr "localhost"}

After building and running the uberjar, I get this ZooKeeper exception: "org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /onyx/001/job-scheduler/scheduler".

Looking at the ZooKeeper nodes, I see that "/onyx/001/job-scheduler" exists, but it is indeed empty.

This is the output:

INFO [onyx.log.zookeeper] - Starting ZooKeeper client connection. If Onyx hangs here it may indicate a difficulty connecting to ZooKeeper.
INFO [onyx.log.zookeeper] - Stopping ZooKeeper client connection
Job is running, job-id:  #uuid "71308bdd-c4f4-46a4-aaba-a0b4ec02a5ea"
INFO [onyx.log.zookeeper] - Starting ZooKeeper client connection. If Onyx hangs here it may indicate a difficulty connecting to ZooKeeper.
WARN [onyx.log.zookeeper] - 
                                                 java.lang.Thread.run              Thread.java:  745
                   java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  617
                    java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1142
                                                                  ...                               
                                    clojure.core.async/thread-call/fn                async.clj:  434
                                             onyx.log.zookeeper/fn/fn            zookeeper.clj:  233
                                onyx.log.zookeeper/find-job-scheduler            zookeeper.clj:  189
                             onyx.log.zookeeper/find-job-scheduler/fn            zookeeper.clj:  190
                                                                  ...                               
                                                onyx.log.zookeeper/fn            zookeeper.clj:  503
                         onyx.monitoring.measurements/measure-latency         measurements.clj:   11
                                             onyx.log.zookeeper/fn/fn            zookeeper.clj:  504
                       onyx.log.zookeeper/clean-up-broken-connections            zookeeper.clj:   73
                                          onyx.log.zookeeper/fn/fn/fn            zookeeper.clj:  507
                                                onyx.log.curator/data              curator.clj:  127
       org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath  GetDataBuilderImpl.java:  138
       org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath  GetDataBuilderImpl.java:  142
         org.apache.curator.framework.imps.GetDataBuilderImpl.forPath  GetDataBuilderImpl.java:  279
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground  GetDataBuilderImpl.java:  287
                           org.apache.curator.RetryLoop.callWithRetry           RetryLoop.java:  107
          org.apache.curator.framework.imps.GetDataBuilderImpl$4.call  GetDataBuilderImpl.java:  291
          org.apache.curator.framework.imps.GetDataBuilderImpl$4.call  GetDataBuilderImpl.java:  302
                               org.apache.zookeeper.ZooKeeper.getData           ZooKeeper.java: 1155
                          org.apache.zookeeper.KeeperException.create     KeeperException.java:   51
                          org.apache.zookeeper.KeeperException.create     KeeperException.java:  111
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /onyx/001/job-scheduler/scheduler
    code: -101
    path: "/onyx/001/job-scheduler/scheduler"

WARN [onyx.log.zookeeper] - Job scheduler couldn't be discovered. Backing off 500ms and trying again...



Mike Drogalis

Dec 2, 2015, 5:48:35 PM
to art...@appsflyer.com, Onyx
Hi there!

One question, and a request:

- Are you running the dashboard, by any chance?
- Can you please send me your program here in a Gist? I think the fix is very simple (peer group hasn't been started), but I'd like to see.

Thanks!

--
You received this message because you are subscribed to the Google Groups "Onyx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to onyx-user+...@googlegroups.com.
To post to this group, send email to onyx...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/onyx-user/0f09e6d5-3e54-476b-a593-a768e8e44375%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

art...@appsflyer.com

Dec 2, 2015, 6:03:01 PM
to Onyx
1. No, I'm not running the dashboard. I just started playing with Onyx today.

You're right, I don't start a peer group. I see this in the API doc:
(start-peer-group peer-config) Starts a peer group for use in cases where an env is not started (e.g. distributed mode)
I missed it.
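
A minimal sketch of the missing step, for anyone who lands here with the same NoNode error. It uses onyx.api/start-peer-group and onyx.api/start-peers from the Onyx API. The namespace, the function names start! and stop!, and the peer count are assumptions made for illustration, and it needs Onyx on the classpath plus a reachable ZooKeeper to actually run.

```clojure
(ns example.launcher ; hypothetical namespace
  (:require [onyx.api]))

;; The peer-config from the first message in this thread.
(def peer-config
  {:zookeeper/address "127.0.0.1:2181"
   :onyx/id "001"
   :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
   :onyx.messaging.aeron/embedded-driver? true
   :onyx.messaging/allow-short-circuit? false
   :onyx.messaging/impl :aeron
   :onyx.messaging/peer-port 40200
   :onyx.messaging/bind-addr "localhost"})

(defn start!
  "Starts a peer group first, then n-peers virtual peers on top of it.
  Hypothetical helper, not part of Onyx."
  [n-peers]
  (let [peer-group (onyx.api/start-peer-group peer-config)
        peers      (onyx.api/start-peers n-peers peer-group)]
    {:peer-group peer-group :peers peers}))

(defn stop!
  "Shuts down the peers, then the peer group that hosted them."
  [{:keys [peer-group peers]}]
  (doseq [peer peers]
    (onyx.api/shutdown-peer peer))
  (onyx.api/shutdown-peer-group peer-group))
```

Calling (start! 3) before the job needs its peers, and (stop! ...) on the returned map at exit, is one way to wire this into an uberjar's -main. The number of peers depends on the job.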
Thank you for your quick response, btw.

Mike Drogalis

Dec 2, 2015, 6:13:40 PM
to art...@appsflyer.com, Onyx
Cool, glad to see it working.

For some context: when a peer group starts up, it races against other peer groups to write some information about the messaging layer to ZooKeeper. Only one peer group needs to succeed. If the information has already been written (e.g. the group loses the race), that group simply carries on, because the data is already present. In your case, that information wasn't found, and the peers couldn't proceed because they needed that data to fully boot up.
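
The write-if-absent idea in that paragraph can be sketched in plain Clojure. This is only an in-memory analogy, not Onyx's implementation: an atom stands in for ZooKeeper, and write-if-absent! and the {:written-by ...} payloads are hypothetical names invented for the example.

```clojure
;; In-memory stand-in for ZooKeeper: a map from node path to node data.
(def store (atom {}))

(defn write-if-absent!
  "Writes data at path unless something is already there.
  Returns true if this caller created the node, false if it lost the race.
  Uses swap-vals!, which needs Clojure 1.9 or later."
  [store path data]
  (let [[old _new] (swap-vals! store
                               (fn [m]
                                 (if (contains? m path)
                                   m
                                   (assoc m path data))))]
    (not (contains? old path))))

(def path "/onyx/001/job-scheduler/scheduler")

;; Two peer groups try to write the same node, one after the other.
(def first-result  (write-if-absent! store path {:written-by :group-a})) ; true
(def second-result (write-if-absent! store path {:written-by :group-b})) ; false

;; The loser's data is discarded; the winner's data stays in place.
(def final-data (get @store path)) ; {:written-by :group-a}
```

Either outcome leaves the node populated, which is all a later reader needs. If no group ever runs the write, a read of that path finds nothing, which matches the NoNode error at the top of the thread.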

A few other pre-emptive things from that Gist:

- You might want to make this a sliding/dropping buffer. If Onyx tries to write a message to this channel, and the buffer is full, everything will block. This will make the Onyx peer halt execution.
- You can use with-test-env, as seen in the Onyx workshop, for bullet-proof REPL development. With it, you won't need to control the startup and shutdown of the env, peer group, and peers manually.
- A new version of the onyx-template will be out soon (see the feature/new-idioms branch) with all of this advice preloaded into your code. :)
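
On the first point, a small core.async sketch of the difference between the two buffer types. It is not taken from the Gist: the namespace, the drain helper, and the buffer size of 2 are assumptions chosen to keep the example short.

```clojure
(ns example.buffers ; hypothetical namespace
  (:require [clojure.core.async
             :refer [chan >!! <!! close! sliding-buffer dropping-buffer]]))

(defn drain
  "Closes ch and returns a vector of everything still buffered in it."
  [ch]
  (close! ch)
  (loop [acc []]
    (if-some [v (<!! ch)]
      (recur (conj acc v))
      acc)))

(def sliding  (chan (sliding-buffer 2)))  ; full buffer: oldest value is evicted
(def dropping (chan (dropping-buffer 2))) ; full buffer: newest value is discarded

;; Three puts into two slots. With (chan 2) the third >!! would block
;; until someone takes a value; with these buffers it returns immediately.
(doseq [ch [sliding dropping]
        n  [1 2 3]]
  (>!! ch n))

(def sliding-result  (drain sliding))  ; [2 3]
(def dropping-result (drain dropping)) ; [1 2]
```

Both variants trade lost messages for a writer that never stalls, so which one fits depends on whether the newest or the oldest output matters more.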

Happy Onyx'ing!

art...@appsflyer.com

Dec 2, 2015, 6:17:06 PM
to Onyx
Great! Thanks again :)

