Unable to get Tranquility & overlord/middle manager working

Preetam Rao

Jul 15, 2014, 3:51:43 AM
to druid-de...@googlegroups.com
Hi,

I can run with the overlord in local mode fine, but I am having trouble running with an overlord and a middle manager.

Exception in topology
---------------------------
2014-07-15 07:44:22,864 WARN [Hashed wheel timer #1] com.metamx.tranquility.beam.ClusteredBeam - Emitting alert: [anomaly] Failed to propagate events: overlord/druid_test
{
  "eventCount" : 16,
  "timestamp" : "2014-07-15T07:00:00.000Z",
  "beams" : "HashPartitionBeam(DruidBeam(timestamp = 2014-07-15T07:00:00.000Z, partition = 0, tasks = [index_realtime_druid_test_2014-07-15T07:00:00.000Z_0_0_mhoacioc/druid_test-07-0000-0000]))"
}
com.twitter.finagle.GlobalRequestTimeoutException: exceeded 1.minutes+30.seconds to druid:local:firehose:druid_test-07-0000-0000 while waiting for a response for the request, including retries (if applicable)
        at com.twitter.finagle.NoStacktrace(Unknown Source)
2014-07-15 07:44:22,881 INFO [Hashed wheel timer #1] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2014-07-15T07:44:22.874Z","service":"tranquility","host":"localhost","severity":"anomaly","description":"Failed to propagate events: overlord/druid_test","data":{"eventCount":16,"timestamp":"2014-07-15T07:00:00.000Z","beams":"HashPartitionBeam(DruidBeam(timestamp = 2014-07-15T07:00:00.000Z, partition = 0, tasks = [index_realtime_druid_test_2014-07-15T07:00:00.000Z_0_0_mhoacioc/druid_test-07-0000-0000]))","exception":"com.twitter.finagle.GlobalRequestTimeoutException: exceeded 1.minutes+30.seconds to druid:local:firehose:druid_test-07-0000-0000 while waiting for a response for the request, including retries (if applicable)\n\tat com.twitter.finagle.NoStacktrace(Unknown Source)\n"}}]
2014-07-15 07:44:22,885 INFO [BeamBolt-Emitter-tranquillity-0] com.metamx.tranquility.storm.BeamBolt - Sent 0, ignored 16 queued events.


JVMs:
-------
java -Xmx2g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:config/overlord io.druid.cli.Main server overlord
java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath lib/*:config/middlemanager io.druid.cli.Main server middleManager

overlord/runtime
---------------------
druid.host=localhost
druid.port=8087
druid.service=overlord

druid.zk.service.host=localhost

druid.extensions.coordinates=["io.druid.extensions:druid-kafka-seven:0.6.121"]

druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd

druid.selectors.indexing.serviceName=overlord
druid.indexer.queue.startDelay=PT0M
druid.indexer.runner.javaOpts="-server -Xmx256m"
druid.indexer.fork.property.druid.processing.numThreads=1
druid.indexer.fork.property.druid.computation.buffer.size=100000000

druid.indexer.task.chathandler.type=announce
druid.indexer.runner.type=remote

middlemanager/runtime
---------------------------------
user.timezone=UTC
file.encoding=UTF-8
druid.host=localhost
druid.port=8092
druid.service=middleManager
druid.zk.service.host=localhost
druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd
druid.selectors.indexing.serviceName=overlord
druid.indexer.runner.startPort=8093
druid.indexer.fork.property.druid.computation.buffer.size=268435456
druid.indexer.task.chathandler.type=announce

logs
-------

I did see this line in the overlord logs:
2014-07-15 07:40:28,565 INFO [PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Worker[localhost:8092] reportin' for duty!

And this in the middle manager logs:
2014-07-15 07:40:29,326 INFO [main] org.eclipse.jetty.server.Server - Started @3381ms

But then nothing else gets printed, nor do I see anything recent under the logs/ folder.

I do not see any tasks in the overlord console.

DruidBeam code
----------------------
            final DruidBeams.Builder<Map<String, String>> builder = DruidBeams
                    .builder(new Timestamper<Map<String, String>>() {
                        @Override
                        public DateTime timestamp(Map<String, String> theMap) {
                            PrintInfo.getInstance().print("Parsing timestamp from event " + theMap.toString());
                            return new DateTime(theMap.get("timestamp"));
                        }
                    })
                    .curator(curator)
                    .location(
                            new DruidLocation(new DruidEnvironment("overlord", "druid:local:firehose:%s"),
                                    dataSource))
                    .rollup(DruidRollup.create(dimensions, aggregators, QueryGranularity.MINUTE))
                    .tuning(ClusteredBeamTuning.create(Granularity.HOUR, new Period("PT0M"), new Period("PT10M"), 1, 1));

            return builder.buildBeam();


This code works fine in local mode.

Can you please let me know what could be going wrong? I think the Beam is unable to submit tasks to the overlord, but I can't find any logs or errors.


Gian Merlino

Jul 15, 2014, 10:31:57 AM
to druid-de...@googlegroups.com
Can you try shutting down the programs, clearing out your tranquility path in ZK (the default is /tranquility/beams; alternatively, just use a different datasource, which has the same effect), setting druid.indexer.storage.type=db on your overlord, and seeing if that makes things more stable in general? That setting switches the overlord to storing task metadata in your MySQL database instead of in memory, so it persists across overlord restarts. The default in-memory task storage gets wiped every time you restart the overlord, which leads tranquility to look for a task that the overlord no longer knows about.
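
For reference, the overlord change suggested above might look like the fragment below. This is only a sketch: it assumes the overlord's existing druid.db.connector.* settings (as posted earlier in the thread) already point at a reachable MySQL database, and the comments are illustrative rather than taken from any shipped config.

```properties
# overlord runtime.properties (sketch; add alongside the existing settings)

# Store task metadata in the database instead of in memory,
# so it survives overlord restarts
druid.indexer.storage.type=db

# Already present in the overlord config posted above; repeated for context only
druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd
```

The overlord has to be restarted for the setting to take effect. Stale beam state under /tranquility/beams still has to be cleared (or a new datasource name used) as described above.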

Preetam Rao

Jul 15, 2014, 11:54:04 AM
to druid-de...@googlegroups.com
Thanks a lot :-) Setting the storage type to db and changing the data source name worked. I really appreciate your help. I had spent a couple of hours trying to fix this, to no avail.
