Druid + Docker + Tranquility - Zookeeper znode issue


ram.th...@gmail.com

Jan 31, 2016, 10:49:10 AM
to Druid User
I am trying to ingest data into Druid from Spark. Druid and Spark each run in their own Docker container. I am using Tranquility to ingest data into Druid. The Docker image for Druid is the official one from druid-io: https://github.com/druid-io/docker-druid

I am seeing this error:

2016-01-31 14:50:57,637 DEBG 'zookeeper' stdout output:
2016-01-31 14:50:57,636 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.99.1:49197
2016-01-31 14:50:57,636 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /192.168.99.1:49197; will be dropped if server is in r-o mode
2016-01-31 14:50:57,636 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@841] - Refusing session request for client /192.168.99.1:49197 as it has seen zxid 0x50 our last zxid is 0x23 client must try another server
2016-01-31 14:50:57,636 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.99.1:49197 (no session established for client)

2016-01-31 14:50:58,795 DEBG 'zookeeper' stdout output:
2016-01-31 14:50:58,789 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x152981fdc7a0004 type:create cxid:0x2 zxid:0x24 txntype:-1 reqpath:n/a Error Path:/tranquility/beams/overlord/test Error:KeeperErrorCode = NoNode for /tranquility/beams/overlord/test

2016-01-31 14:50:58,856 DEBG 'zookeeper' stdout output:
2016-01-31 14:50:58,856 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x152981fdc7a0004 type:create cxid:0xc zxid:0x2a txntype:-1 reqpath:n/a Error Path:/tranquility/beams/overlord/test/mutex/locks Error:KeeperErrorCode = NoNode for /tranquility/beams/overlord/test/mutex/locks

2016-01-31 14:50:58,902 DEBG 'zookeeper' stdout output:
2016-01-31 14:50:58,902 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x152981fdc7a0004 type:create cxid:0x17 zxid:0x2e txntype:-1 reqpath:n/a Error Path:/tranquility/beams/overlord/test/mutex/leases Error:KeeperErrorCode = NoNode for /tranquility/beams/overlord/test/mutex/leases


I am assuming that the required path, /tranquility/beams/overlord/*, is not found on ZooKeeper. I tried grepping for the properties file on the docker-machine (on Mac OS X) with no luck.

I was able to get the list of paths from ZooKeeper, and I could see that /discovery/druid:overlord is present. This is what my BeamFactory code looks like:

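// Imports assume Tranquility 0.7.x with Druid 0.8.x; package names may differ in other versions.
import com.metamx.common.Granularity
import com.metamx.tranquility.beam.{Beam, ClusteredBeamTuning}
import com.metamx.tranquility.druid.{DruidBeams, DruidLocation, DruidRollup, SpecificDruidDimensions}
import com.metamx.tranquility.spark.BeamFactory
import io.druid.granularity.QueryGranularity
import io.druid.query.aggregation.CountAggregatorFactory
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.retry.BoundedExponentialBackoffRetry
import org.joda.time.{DateTime, Period}

class EventRDDBeamFactory extends BeamFactory[Map[String,String]] {

  lazy val makeBeam: Beam[Map[String,String]] = {
    // ZooKeeper connection used by Tranquility for coordination and service discovery
    val curator = CuratorFrameworkFactory.newClient(
      "192.168.99.100:3181",
      new BoundedExponentialBackoffRetry(100, 3000, 5))
    curator.start()

    val indexService = "druid/overlord"
    val discoveryPath = "/discovery"

    val dataSource = "test"
    val dimensions = IndexedSeq("ip")
    val aggregators = Seq(new CountAggregatorFactory("website"))

    // Extracts the event timestamp from the "time" field of each message
    val timestampFn = (message: Map[String,String]) => new DateTime(message.get("time").get)

    DruidBeams
      .builder(timestampFn)
      .curator(curator)
      .discoveryPath(discoveryPath)
      .location(DruidLocation.create(indexService, dataSource))
      .rollup(DruidRollup(SpecificDruidDimensions(dimensions), aggregators, QueryGranularity.MINUTE))
      .tuning(
        ClusteredBeamTuning(
          segmentGranularity = Granularity.HOUR,
          windowPeriod = new Period("PT10M"),
          partitions = 1,
          replicants = 1
        )
      )
      .buildBeam()
  }
}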



Can anyone please help me understand what could be the issue here?

I also have a few questions:

1) Looking at the supervisord config for the Docker image, I didn't see any command-line overrides specified for the overlord's service name and path. I could not find the required *.properties file in the Docker container. Is there any way (maybe an API) to find out these values, and perhaps override them? The overlord console only lists the set of tasks etc., not the config.
2) I also experimented by creating the required path manually on ZooKeeper and granting complete access to it (/tranquility/beams/overlord/test/*). Even then, I got the same error. This makes me think it is not necessarily an issue with the ZooKeeper paths?

Fangjin Yang

Jan 31, 2016, 2:51:52 PM
to Druid User
I'm not 100% sure if anyone has actually been able to get the Druid docker image working. I think you'll have much better luck with the Docker distribution here http://imply.io/download if you are just starting out with Druid.

ram.th...@gmail.com

Feb 10, 2016, 8:16:33 AM
to Druid User
Thanks, Fangjin, for the response. I am starting with the Imply packaged distribution. I have started a single-machine Druid instance using the quickstart.

I have made the following changes to the quickstart.conf:

:verify bin/verify-java
:verify bin/verify-default-ports

!p10 zk bin/run-zk conf-quickstart
coordinator bin/run-druid coordinator conf-quickstart
broker bin/run-druid broker conf-quickstart
historical bin/run-druid historical conf-quickstart
!p80 overlord bin/run-druid overlord conf-quickstart
!p90 middleManager bin/run-druid middleManager conf-quickstart
#tranquility-server bin/tranquility server -configFile conf-quickstart/tranquility/server.json

# Uncomment to use Tranquility Kafka
#tranquility-kafka bin/tranquility kafka -configFile conf-quickstart/tranquility/kafka.json


Apart from this, I haven't made changes to any other config files. I am able to get requests through to Druid via the Tranquility library, but I see the following error on ZooKeeper:


2016-02-10 12:52:31,558 INFO [SyncThread:0] org.apache.zookeeper.server.ZooKeeperServer - Established session 0x152cb3c23a20005 with negotiated timeout 40000 for client /10.196.192.38:49836
2016-02-10 12:52:33,943 INFO [ProcessThread(sid:0 cport:-1):] org.apache.zookeeper.server.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x152cb3c23a20005 type:create cxid:0x2 zxid:0x44 txntype:-1 reqpath:n/a Error Path:/tranquility/beams/druid:overlord/druid_ingest Error:KeeperErrorCode = NoNode for /tranquility/beams/druid:overlord/druid_ingest
2016-02-10 12:52:33,963 INFO [ProcessThread(sid:0 cport:-1):] org.apache.zookeeper.server.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x152cb3c23a20005 type:create cxid:0xc zxid:0x4a txntype:-1 reqpath:n/a Error Path:/tranquility/beams/druid:overlord/druid_ingest/mutex/locks Error:KeeperErrorCode = NoNode for /tranquility/beams/druid:overlord/druid_ingest/mutex/locks
2016-02-10 12:52:33,980 INFO [ProcessThread(sid:0 cport:-1):] org.apache.zookeeper.server.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x152cb3c23a20005 type:create cxid:0x17 zxid:0x4e txntype:-1 reqpath:n/a Error Path:/tranquility/beams/druid:overlord/druid_ingest/mutex/leases Error:KeeperErrorCode = NoNode for /tranquility/beams/druid:overlord/druid_ingest/mutex/leases


My datasource name is druid_ingest. I don't see any real-time indexing tasks on the coordinator either, and no datasources are returned from the broker API at <host>:8082/druid/v2/datasources.
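For anyone who wants to repeat the broker check from a script, here is a minimal sketch. It assumes the broker is reachable over plain HTTP on port 8082 and that the endpoint returns a JSON array of datasource names; the object name DatasourceCheck, its helper functions, and the "localhost" default are hypothetical names used only for illustration.

```scala
import scala.io.Source
import scala.util.Try

// Sketch only: DatasourceCheck and its helpers are illustrative names, not part of Druid.
object DatasourceCheck {
  // Builds the URL of the broker endpoint mentioned in this thread.
  def datasourcesUrl(host: String, port: Int = 8082): String =
    s"http://$host:$port/druid/v2/datasources"

  // Naive extraction of quoted strings from a JSON array such as ["a","b"].
  // Good enough for a quick check; use a JSON library for anything serious.
  def parseNames(json: String): List[String] =
    "\"([^\"]+)\"".r.findAllMatchIn(json).map(_.group(1)).toList

  // Fetches the datasource list; a Failure means the broker could not be reached.
  def fetchNames(host: String, port: Int = 8082): Try[List[String]] =
    Try {
      val src = Source.fromURL(datasourcesUrl(host, port))
      try parseNames(src.mkString) finally src.close()
    }

  def main(args: Array[String]): Unit = {
    // Assumption: the broker runs on localhost unless a host is passed as the first argument.
    val host = args.headOption.getOrElse("localhost")
    println(fetchNames(host))
  }
}
```

An empty list means the broker answered but knows of no datasources, which is a different situation from a Failure caused by the broker being unreachable.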


I see this in the overlord logs, when the overlord starts up:


2016-02-10T13:05:49,020 WARN [main] com.metamx.common.RetryUtils - Failed on try 1, retrying in 2,054ms.
org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: java.sql.SQLException: Cannot create PoolableConnectionFactory (java.net.ConnectException : Error connecting to server localhost on port 1,527 with message Connection refused.)
        at org.skife.jdbi.v2.DBI.open(DBI.java:230) ~[jdbi-2.63.1.jar:2.63.1]
        at org.skife.jdbi.v2.DBI.withHandle(DBI.java:279) ~[jdbi-2.63.1.jar:2.63.1]
        at io.druid.metadata.SQLMetadataConnector$2.call(SQLMetadataConnector.java:108) ~[druid-server-0.8.3-iap1.jar:0.8.3-iap1]
        at com.metamx.common.RetryUtils.retry(RetryUtils.java:38) [java-util-0.27.4.jar:?]
        at io.druid.metadata.SQLMetadataConnector.retryWithHandle(SQLMetadataConnector.java:113) [druid-server-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.metadata.SQLMetadataConnector.createTable(SQLMetadataConnector.java:157) [druid-server-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.metadata.SQLMetadataConnector.createConfigTable(SQLMetadataConnector.java:231) [druid-server-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.metadata.SQLMetadataConnector.createConfigTable(SQLMetadataConnector.java:374) [druid-server-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.guice.JacksonConfigManagerModule$1.start(JacksonConfigManagerModule.java:56) [druid-common-0.8.3-iap1.jar:0.8.3-iap1]
        at com.metamx.common.lifecycle.Lifecycle.start(Lifecycle.java:244) [java-util-0.27.4.jar:?]
        at io.druid.guice.LifecycleModule$2.start(LifecycleModule.java:155) [druid-api-0.3.13.jar:0.8.3-iap1]
        at io.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:71) [druid-services-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.cli.ServerRunnable.run(ServerRunnable.java:38) [druid-services-0.8.3-iap1.jar:0.8.3-iap1]
        at io.druid.cli.Main.main(Main.java:99) [druid-services-0.8.3-iap1.jar:0.8.3-iap1]
Caused by: java.sql.SQLException: Cannot create PoolableConnectionFactory (java.net.ConnectException : Error connecting to server localhost on port 1,527 with message Connection refused.)
        at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2152) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:1903) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1413) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.skife.jdbi.v2.DataSourceConnectionFactory.openConnection(DataSourceConnectionFactory.java:36) ~[jdbi-2.63.1.jar:2.63.1]
        at org.skife.jdbi.v2.DBI.open(DBI.java:212) ~[jdbi-2.63.1.jar:2.63.1]
        ... 13 more
Caused by: java.sql.SQLNonTransientConnectionException: java.net.ConnectException : Error connecting to server localhost on port 1,527 with message Connection refused.
        at org.apache.derby.client.am.SQLExceptionFactory.getSQLException(Unknown Source) ~[derbyclient-10.11.1.1.jar:?]
        at org.apache.derby.client.am.SqlException.getSQLException(Unknown Source) ~[derbyclient-10.11.1.1.jar:?]
        at org.apache.derby.jdbc.ClientDriver.connect(Unknown Source) ~[derbyclient-10.11.1.1.jar:?]
        at org.apache.commons.dbcp2.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:39) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.validateConnectionFactory(BasicDataSource.java:2162) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2148) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:1903) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1413) ~[commons-dbcp2-2.0.1.jar:2.0.1]
        at org.skife.jdbi.v2.DataSourceConnectionFactory.openConnection(DataSourceConnectionFactory.java:36) ~[jdbi-2.63.1.jar:2.63.1]
        at org.skife.jdbi.v2.DBI.open(DBI.java:212) ~[jdbi-2.63.1.jar:2.63.1]
        ... 13 more


Is my datasource not being created because of the above? Is there any way I can debug this?

ram.th...@gmail.com

Feb 10, 2016, 8:19:08 AM
to Druid User
Adding my code for reference:

  val DRUID_INDEX_SERVICE = "druid/overlord"
  val DRUID_DISCOVERY_PATH = "/discovery"
  val DATA_SOURCE = "druid_ingest"

  val DRUID_TRANQUILITY_RETRY_POLICY = new BoundedExponentialBackoffRetry(100, 3000, 5)
  val DRUID_TRANQUILITY_TUNING = ClusteredBeamTuning(
    segmentGranularity = Granularity.HOUR,
    windowPeriod = new Period("PT10M"),
    partitions = 1,
    replicants = 1)

  lazy val BeamInstance: Beam[Map[String, String]] = {
    val curator = CuratorFrameworkFactory.newClient(
      ConfigConstants.DRUID_ZOOKEEPER,
      ConfigConstants.DRUID_TRANQUILITY_RETRY_POLICY)
    curator.start()

    val dimensions = <keys>
    val aggregators = <metrics>

    val timestampFn = (message: Map[String, String]) => CommonUtil.convertMillisToJodaDate(message.get("timestamp").get)

    DruidBeams
      .builder(timestampFn)
      .curator(curator)
      .discoveryPath(ConfigConstants.DRUID_DISCOVERY_PATH)
      .location(DruidLocation.create(ConfigConstants.DRUID_INDEX_SERVICE, ConfigConstants.DATA_SOURCE))
      .rollup(DruidRollup(SpecificDruidDimensions(dimensions), aggregators, QueryGranularity.MINUTE))
      .tuning(ConfigConstants.DRUID_TRANQUILITY_TUNING)
      .buildBeam()
  }
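A note on how these settings relate to the znode in the ZooKeeper log: the index service name "druid/overlord" shows up there as "druid:overlord", followed by the datasource name. The sketch below reproduces that mapping. It is inferred only from the log lines in this thread, not from the Tranquility source, and BeamPaths and beamPath are hypothetical names.

```scala
// Sketch only: the mapping is inferred from the log lines in this thread,
// not taken from the Tranquility source. BeamPaths is an illustrative name.
object BeamPaths {
  // Builds the znode path seen in the ZooKeeper log from the two settings.
  // Assumption: "/" in the index service name is replaced by ":".
  def beamPath(indexService: String,
               dataSource: String,
               base: String = "/tranquility/beams"): String =
    s"$base/${indexService.replace('/', ':')}/$dataSource"

  def main(args: Array[String]): Unit = {
    // prints /tranquility/beams/druid:overlord/druid_ingest
    println(beamPath("druid/overlord", "druid_ingest"))
  }
}
```

Comparing the computed path with the path in the NoNode messages is a quick way to confirm which index service name and datasource a running job is actually using.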

ram.th...@gmail.com

Feb 11, 2016, 7:24:51 AM
to Druid User
Any help here would be greatly appreciated. I also posted this on the IRC channel, but I didn't get any response.

Gian Merlino

Feb 11, 2016, 1:20:51 PM
to druid...@googlegroups.com
Hey Ram,

That error makes it sound like your coordinator isn't starting up properly (the default single-machine setup involves the coordinator running a derby server on port 1527). Is there anything interesting in the coordinator logs?

Gian


ram.th...@gmail.com

Feb 11, 2016, 11:01:52 PM
to Druid User
Thanks Gian for quickly jumping to help. 

I don't see anything weird in the coordinator logs. In fact, I see that the Derby server is running on port 1527, and netstat shows that the coordinator process, which runs on 8081, is managing the Derby server as well. But I see a connect exception to the Derby server in the overlord logs, as I mentioned above.
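One way to cross-check the netstat output against the overlord's "Connection refused" is to try a plain TCP connection to the same host and port from the machine the overlord runs on. This is a minimal sketch using only the JDK; PortCheck and canConnect are hypothetical names, and the localhost/1527 defaults are taken from the error message above.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Sketch only: PortCheck is an illustrative helper, not part of Druid or Derby.
object PortCheck {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def canConnect(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Covers connection refused, timeouts and unresolvable hosts.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumption: same host and port as in the overlord error message.
    println(canConnect("localhost", 1527))
  }
}
```

Running it once right after startup and again a minute later would show whether the refusal only happens while the cluster is still coming up.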

Anyway, I am attaching the complete var/sv folder, zipped up. The logs were taken after a few rounds of my Spark job trying to pump data into Druid through Tranquility. I have replaced my Druid host with 'hostname.com' and my client with 'CLIENT_IP'.

Please let me know if you find anything missing. Thanks again for the help.

- Ram
sv.zip

Gian Merlino

Feb 18, 2016, 1:57:02 AM
to druid...@googlegroups.com
Hey Ram,

Actually looking through those logs I think the exceptions you see are "normal" startup things (they're mostly transient errors caused by the fact that the cluster hasn't fully started up yet). It seems from the logs that things are OK once the cluster has got going. What exactly is not working? What happens when you run your Tranquility program?

Gian

Jakub Liska

Apr 29, 2016, 5:13:29 AM
to Druid User
Guys, what are the !p10, !p80 and !p90 prefixes in the supervise configuration? Is it an ordering? I cannot figure that out from bin/supervise.

It looks like a start or kill ordering, but why such odd numbers as 10, 80 and 90?