How does realtime with the indexing service compare to using a realtime node?


Wayne Adams

Feb 19, 2014, 4:19:32 PM
to druid-de...@googlegroups.com
I see in the 0.6 docs on the Indexing Service that there is a "Realtime Index Task", which I'm thinking of using.  I have a couple of questions about it, though...

1) Since a realtime node

  (a) ingests data,
  (b) makes it available in realtime, and
  (c) arranges to push to deep storage,

does this mean that the Realtime Index task only ingests data?  Is that data considered the same as batch-ingested data, i.e. requests for it get serviced only by Historical nodes, or is it immediately available for realtime access? 

-- Do I still need a realtime node to handle realtime requests from the broker, or does the Indexing Service's middle manager become a fully-capable realtime node (in which case, it would seem the broker would have to route realtime requests to that middle manager)? 

-- Also, if I do still need a realtime node, how do I configure it without a realtime ingest spec, since I assume it wouldn't need one in this case?

2) Assuming I resolve the issues above, are there any differences in the robustness of ingesting realtime with one approach vs the other?  In our current 0.5.x deployment, we sometimes get handoff failures ("Not enough _default_tier servers or node capacity to assign segment"), so I'm interested in what steps we could take to make this process more robust/stable.

Thanks -- Wayne


Gian Merlino

Feb 19, 2014, 6:19:32 PM
to druid-de...@googlegroups.com
1) The realtime index task is intended to totally replace realtime nodes for cases where you'd rather interact with ingestion through an API than by manually writing configuration files and starting up processes. It does all the things a realtime node does: ingests data, makes it available to the broker in realtime, and pushes segments to deep storage. So for any given datasource you'd use realtime index tasks, or realtime nodes, but not both.
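[Editor's note: for orientation, here is a rough sketch of what a realtime index task submission to the 0.6-era indexing service might look like. Every value (dataSource, Kafka topic, Zookeeper address) is made up, and the field names follow my reading of the docs of that era, so treat this as illustrative only and check the Indexing Service documentation for your version.]

```json
{
  "type": "index_realtime",
  "schema": {
    "dataSource": "example_datasource",
    "aggregators": [ { "type": "count", "name": "events" } ],
    "indexGranularity": "minute",
    "shardSpec": { "type": "none" }
  },
  "firehose": {
    "type": "kafka-0.8",
    "consumerProps": {
      "zookeeper.connect": "localhost:2181",
      "group.id": "druid-example"
    },
    "feed": "example_topic",
    "parser": {
      "timestampSpec": { "column": "timestamp" },
      "data": { "format": "json" }
    }
  },
  "fireDepartmentConfig": {
    "maxRowsInMemory": 500000,
    "intermediatePersistPeriod": "PT10M"
  },
  "windowPeriod": "PT10M",
  "segmentGranularity": "hour"
}
```

POSTing a spec like this to the overlord replaces writing a standalone realtime node's spec file and starting that process by hand.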

2) The handoff process is essentially the same in either method, so I don't think you'll see any difference there. The best thing you can do to reduce those handoff failures is to set up monitoring to make sure that you always have a working coordinator and enough capacity in your historical cluster.

The main difference between the realtime task and the standalone node is the ability to programmatically create and destroy tasks. This has a couple of nice effects:

- It can give you more flexibility in resource management. The indexing service will take the pool of worker machines you give it and assign work appropriately, without you having to decide what runs on what machine. If you are running in a cloud, you can also have your indexing service automatically provision and terminate machines based on how much work you give it.

- It can let you get creative with how you manage task lifecycles. For example, in our environment we use tranquility (https://github.com/metamx/tranquility) to manage sets of short-lived realtime index tasks rather than a single long-lived process. This lets us do configuration and schema changes without any downtime at all, as the stream of data seamlessly moves from the old set of tasks to the new.

There is also one somewhat annoying disadvantage, currently, if you choose to have long-lived realtime index tasks: https://github.com/metamx/druid/issues/401. Patches welcome :)

Wayne Adams

Feb 19, 2014, 7:23:16 PM
to druid-de...@googlegroups.com
Thanks very much, Gian!

Tranquility looks really cool -- is there any documentation other than the readme and the unit tests?  I'm wondering how I'd use this with a Kafka source...

Thanks again -- Wayne

Gian Merlino

Feb 19, 2014, 10:44:26 PM
to druid-de...@googlegroups.com
Not yet, but there are two main APIs, both of which can work with kafka:

- a Storm Bolt. If you're using storm, which does support kafka, you can hook this right into your topology.

- a Finagle-based client. If you're not using storm, then you can just write a program that uses the kafka consumer api to get messages, does stuff to them, and then passes them to the client (which Finagle calls a Service). The Service will asynchronously send them to Druid and return a Future<Int> that will eventually resolve to the number of events sent. If you want to send events synchronously, you can call Await.result(future) to sit there and wait for the events to send. If that's all you want to do, you don't need to study the Finagle docs, but if you want to do fancier stuff they are here: http://twitter.github.io/finagle/guide/. The docs are somewhat biased towards Scala examples, but Finagle works totally fine in Java too.

The readme has examples for both APIs. Some parts may be a bit cryptic, so feel free to ask if you run into trouble.

jiqing yao

Feb 20, 2014, 2:08:23 PM
to druid-de...@googlegroups.com
Is Kafka required here? I am trying to set up Druid so it can consume log data in real time. Due to company policy, we cannot use Kafka. Is it a good idea to use Storm between the machines generating logs and Druid's real-time ingestion? Are you suggesting using tranquility in this case?

Thanks

Jiqing Yao

Wayne Adams

Feb 20, 2014, 6:44:57 PM
to druid-de...@googlegroups.com
Thanks again, Gian!

One more point, going back to the similarity between a standalone real-time node and using the indexing service...  For both batch ingest and real-time tasks, it appears that when you configure them through the indexer, there are some configuration properties you leave out (compared to when you run a standalone real-time node, or a standalone batch indexer).  One example is leaving out the "segmentOutputPath" when doing Hadoop indexing through the service.  In the case of using the Indexing Service for real-time, I'm not sure how and where the segment directory gets defined.  It doesn't appear to be in the example real-time task spec.  I placed the following in the Overlord runtime properties:

druid.storage.type=local
druid.storage.storageDirectory=/mapr/system/druid/eg/segments
druid.segmentCache.locations=/mapr/system/druid/eg/locations
druid.segmentCache.infoDir=/mapr/system/druid/eg/info

thinking that the Overlord should be responsible for configuring the middle manager for handoff, but I'm several hours overdue for my first handoff and don't see anything even being attempted.  Should these properties be set on the middle manager?  Or do they need to be prepended with something like a "fork" prefix to ensure they get passed to the Task that will actually perform the ingest?

Thanks -- Wayne

Gian Merlino

Feb 20, 2014, 7:36:50 PM
to druid-de...@googlegroups.com
Kafka is not required but it solves an important problem: decoupling your data producers and data consumers. You can solve this using other approaches if you can't use kafka. You can also just not decouple your producers and consumers, but this can lead to operational headaches (producers may suffer from backpressure if the consumers are down, and it may be difficult to manage lots of different kinds of producers and consumers).

Storm is solving a different problem from kafka. Kafka decouples producers and consumers, storm lets you inject business logic between your producers and consumers.

Gian Merlino

Feb 20, 2014, 7:48:42 PM
to druid-de...@googlegroups.com
In the indexing service, the middle managers push segments out to deep storage, and the overlord updates metadata in your database. So your middle manager needs the properties for segment storage and your overlord needs properties for the database. It shouldn't hurt to provide all properties on both though. You shouldn't need any "fork" prefixed druid properties since the middle manager passes down all druid-related properties to its peons automatically.

Btw, the rationale behind the missing properties in the indexing service is that they are generally all storage- or operations-related things, and those things make more sense to specify in a global properties file rather than in each task.
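[Editor's note: the split Gian describes can be sketched as two property sets; the paths and connection values below are the localhost examples used elsewhere in this thread, not recommendations.]

```properties
# Middle manager runtime.properties -- needs the deep storage settings,
# since middle managers (via their peons) push segments out:
druid.storage.type=local
druid.storage.storageDirectory=/mapr/system/druid/eg/segments

# Overlord runtime.properties -- needs the metadata database settings,
# since the overlord writes the segment metadata:
druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd
```

As Gian notes, duplicating all of these on both services shouldn't hurt.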

Wayne Adams

Feb 21, 2014, 1:36:02 PM
to druid-de...@googlegroups.com
Hi, Gian:

I had configured the Overlord, but not the Middle Manager, for the segment locations directory, and after running nearly 24 hours (with hour granularity and a 10-minute window), I never saw any segments pushed to deep storage.

I see in the documentation on the Overlord that if you are running with "druid.indexer.runner.type=local", which is the default value, the Overlord directly launches the tasks without using the Middle Manager at all.  This seems to confirm the experience I had on my first attempt, since when I posted my request to the Overlord, a real-time node was launched and it didn't appear the Middle Manager got used.  But if that is the case, why didn't my segments get pushed to deep storage, since the Overlord contains all the required properties?  Are there some other properties required other than druid.storage.type, druid.storage.storageDirectory, druid.segmentCache.locations, and druid.segmentCache.infoDir?

This morning, when it became obvious that I wasn't going to get segments pushed out, I brought down everything and restarted without a Middle Manager at all.  Overlord runtime properties include both the MySQL and segment-location properties.  I posted my realtime indexer request to the Overlord and saw that it stood up a realtime node, and I was immediately able to query the data.  After the first segment should have been pushed out, however, nothing happened.  But I did see the following in the Overlord log:

2014-02-21 18:11:51,512 INFO [Coordinator-Exec--0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-02-21T18:11:51.512Z","service":"coordinator","host":"localhost:8082","metric":"coordinator/dropQueue/count","value":0,"user1":"localhost:8081"}]
2014-02-21 18:12:51,161 WARN [DatabaseSegmentManager-Exec--0] io.druid.db.DatabaseSegmentManager - No segments found in the database!
2014-02-21 18:12:51,367 INFO [DatabaseRuleManager-Exec--0] io.druid.db.DatabaseRuleManager - Polled and found rules for 1 datasource(s)
2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant create queue is empty.
2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant terminate queue is empty.
2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.helper.DruidCoordinatorBalancer - [_default_tier]: One or fewer servers found.  Cannot balance.
2014-02-21 18:12:51,514 INFO [Coordinator-Exec--0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-02-21T18:12:51.513Z","service":"coordinator","host":"localhost:8082","metric":"coordinator/overShadowed/count","value":0}]

at about the time I expected the first segment to be pushed.  Does this shed any light?  I wouldn't expect there to be any segments in the database, yet, as this was the time I expected the first segment to be pushed.  But I'm wondering if that "One or fewer servers found." statement is really a problem or not, or even related to my issue.

I've noticed other people having issues where either they never see historical segments, or the segments appear 6 hours late, and I don't recall seeing those issues resolved (at least, in the text of the thread).  Are there any other places I should be looking, log-wise?  I should mention, if it's not obvious, that the segments table in MySQL is empty.

If this helps, this is the output from "ps" for the realtime node that starts when I submit my request to Overlord:

system    3472  3195  5 12:09 pts/2    00:04:43 java -cp /a/apps/druid-0.6.62-build/services/target/druid-services-0.6.62-SNAPSHOT-selfcontained.jar:/a/apps/druid-0.6.62-build/hll/target/druid-hll-0.6.62-SNAPSHOT.jar:/a/apps/druid-0.6.62-build/kafka-eight/target/druid-kafka-eight-0.6.62-SNAPSHOT.jar:/a/apps/druid-0.6.62-build/lib/trove4j-3.0.3.jar:/a/apps/druid-0.6.62-build/lib/kafka_2.9.2-0.8.0.jar:/a/apps/druid-0.6.62-build/lib/scala-library.jar:/a/apps/druid-0.6.62-build/lib/zkclient-0.1.jar:/a/apps/druid-0.6.62-build/lib/metrics-annotation-2.2.0.jar:/a/apps/druid-0.6.62-build/lib/metrics-core-2.2.0.jar:config/overlord -Ddruid.indexer.runner.javaOpts="-server -Xmx1g" -Ddruid.segmentCache.infoDir=/mapr/system/druid/eg/info -Ddruid.host=localhost -Ddruid.indexer.fork.property.druid.indexer.taskDir=/tmp/system/persistent -Ddruid.indexer.fork.property.druid.indexer.baseDir=/tmp/system/base -Ddruid.indexer.taskDir=/tmp/system/persistent -Ddruid.indexer.baseDir=/tmp/system/base -Ddruid.indexer.hadoopWorkingPath=/tmp/system/hadoop -Ddruid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid -Duser.timezone=UTC -Ddruid.db.connector.password=diurd -Dfile.encoding.pkg=sun.io -Ddruid.indexer.fork.property.zookeeper.connect=localhost:6181 -Ddruid.database.ruleTable=druid_rules -Ddruid.storage.storageDirectory=/mapr/system/druid/eg/segments -Ddruid.selectors.indexing.serviceName=overlord -Ddruid.indexer.queue.startDelay=PT0M -Ddruid.port=8087 -Ddruid.indexer.fork.property.druid.indexer.runner.taskDir=/tmp/system/persistent -Ddruid.indexer.fork.property.java.io.tmpdir=/tmp/system/overlord -Ddruid.database.segmentTable=druid_segments -Ddruid.indexer.fork.property.druid.indexer.hadoopWorkingPath=/tmp/system/hadoop -Ddruid.database.configTable=druid_config -Ddruid.service=overlord -Djava.io.tmpdir=/tmp/system/overlord -Ddruid.indexer.runner.startPort=8088 -Ddruid.indexer.runner.taskDir=/tmp/system/persistent -Ddruid.zk.service.host=localhost:6181 
-Ddruid.segmentCache.locations=/mapr/system/druid/eg/locations -Dfile.encoding=UTF-8 -Ddruid.storage.type=local -Ddruid.db.connector.user=druid -Ddruid.indexer.fork.property.druid.computation.buffer.size=268435456 -Ddruid.indexer.taskDir=/tmp/system/persistent -Ddruid.indexer.baseDir=/tmp/system/base -Dzookeeper.connect=localhost:6181 -Ddruid.indexer.runner.taskDir=/tmp/system/persistent -Djava.io.tmpdir=/tmp/system/overlord -Ddruid.indexer.hadoopWorkingPath=/tmp/system/hadoop -Ddruid.computation.buffer.size=268435456 -Ddruid.host=localhost:8088 -Ddruid.port=8088 io.druid.cli.Main internal peon /tmp/system/persistent/playback_slice_v3/249ef1bc-d8db-491f-8cc1-f32a8f276293/task.json /tmp/system/persistent/playback_slice_v3/249ef1bc-d8db-491f-8cc1-f32a8f276293/status.json --nodeType realtime

I know some of these properties are unnecessary here, as they relate to my earlier attempts to redirect "/tmp" output in batch indexer tasks.

Thanks, Gian!

-- Wayne

Wayne Adams

Feb 21, 2014, 3:24:09 PM
to druid-de...@googlegroups.com
One other thing I should add is that, for a real-time indexing request to the Indexing Service, I don't see any configuration for the Plumber, as there is for a standalone real-time node.  Maybe this is in some way taken care of by the Overlord, but if there are no properties for it, either in the Overlord runtime properties or in the real-time indexing request spec, how would it be configured?  I've noticed that when real-time node users have issues with segments not getting handed off, at least they are able to see log messages about the IndexMerger and writing/deleting messages, but I don't see anything like this in my Overlord logs.

Wayne Adams

Feb 21, 2014, 4:20:25 PM
to druid-de...@googlegroups.com
OK, I've got a little more information...  I finally found the Overlord-launched real-time node's log file.  It is in

  /tmp/system/persistent/playback_slice_v3/<big UUID>

BTW I had set, in my Overlord properties,

  druid.indexer.taskDir=/tmp/system/persistent

However, other than the real-time log and the task.json files, which end up there, everything else for this task ends up in

  /tmp/persistent/task/playback_slice_v3/work/persist/playback_slice_v3

So if I set druid.indexer.taskDir to some value other than /tmp or maybe /tmp/persistent, it appears the task-related files get spread out over the two directories; that is, the property change only affects the destination of some of the output files, not all of them.  Is this possibly the reason that I see a "directory does not exist" error each hour (see below)?  I have ensured all the segment, etc. output directories exist and are writable by the user running the process, so I know it's not that.  But I'm wondering if it's possible that the splitting of the real-time task's output files over these two directories is causing some kind of 404 because of using a relative path to find a task-related file.  I'm going to restart everything with the druid.indexer.taskDir property >not< set, just to see if that solves the problem, but meanwhile can anyone comment on this, including the below stack trace I just got?  Thanks much!

-- Wayne

2014-02-21 21:09:00,219 INFO [task-runner-0] io.druid.segment.realtime.plumber.RealtimePlumber - Submitting persist runnable for dataSource[playback_slice_v3]
2014-02-21 21:09:00,221 INFO [playback_slice_v3-incremental-persist] io.druid.segment.realtime.plumber.RealtimePlumber - DataSource[playback_slice_v3], Interval[2014-02-21T21:00:00.000Z/2014-02-21T22:00:00.000Z], persisting Hydrant[FireHydrant{index=io.druid.segment.incremental.IncrementalIndex@bb26d37, queryable=io.druid.segment.IncrementalIndexSegment@6f203be0, count=0}]
2014-02-21 21:09:00,230 INFO [playback_slice_v3-incremental-persist] io.druid.segment.IndexMerger - Starting persist for interval[2014-02-21T21:00:00.000Z/2014-02-21T21:09:00.000Z], rows[80,889]
2014-02-21 21:09:00,694 INFO [playback_slice_v3-incremental-persist] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/playback_slice_v3/work/persist/playback_slice_v3/2014-02-21T21:00:00.000Z_2014-02-21T22:00:00.000Z/0/v8-tmp] completed index.drd in 8 millis.
2014-02-21 21:09:00,700 ERROR [playback_slice_v3-incremental-persist] io.druid.segment.realtime.plumber.RealtimePlumber - dataSource[playback_slice_v3] -- incremental persist failed: {class=io.druid.segment.realtime.plumber.RealtimePlumber, interval=2014-02-21T21:00:00.000Z/2014-02-21T22:00:00.000Z, count=0}
2014-02-21 21:09:00,704 INFO [playback_slice_v3-incremental-persist] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2014-02-21T21:09:00.702Z","service":"overlord","host":"localhost:8088","severity":"component-failure","description":"dataSource[playback_slice_v3] -- incremental persist failed","data":{"class":"io.druid.segment.realtime.plumber.RealtimePlumber","interval":"2014-02-21T21:00:00.000Z/2014-02-21T22:00:00.000Z","count":0}}]
Exception in thread "plumber_persist_0" java.lang.RuntimeException: java.io.IOException: No such file or directory
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:688)
        at io.druid.segment.realtime.plumber.RealtimePlumber$3.doRun(RealtimePlumber.java:295)
        at io.druid.common.guava.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: No such file or directory
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1006)
        at java.io.File.createTempFile(File.java:1981)
        at java.io.File.createTempFile(File.java:2032)
        at io.druid.segment.data.TmpFileIOPeon.makeOutputStream(TmpFileIOPeon.java:44)
        at io.druid.segment.data.GenericIndexedWriter.open(GenericIndexedWriter.java:67)
        at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:466)
        at io.druid.segment.IndexMerger.merge(IndexMerger.java:306)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:149)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:119)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:104)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:668)
        ... 5 more

Wayne Adams

Feb 21, 2014, 5:35:18 PM
to druid-de...@googlegroups.com
I've partially solved my problem.  It appears there's a bit of a bug somewhere...  If you set the property "druid.indexer.taskDir", some of the code will use the new property value, and some of it will not.  So the files for a realtime task end up going under 2 different directory structures (the specified one, and the default one), and when you try to do an incremental persist, it fails because it is looking under the wrong choice (of the 2 directories) for the file it's trying to persist.

So now, with "druid.indexer.taskDir" >not< set, my incremental persists succeed, every 10 minutes per my spec.

However, I still don't get any segments.  I get the single line:

2014-02-21 22:10:00,002 INFO [playback_slice_v3-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Starting merge and push.

every hour at 10 minutes after (I have hour granularity and a 10-minute window), but then nothing gets pushed.

I've seen comments about rejection policy possibly being related to issues like this.  I guess I only partially understand this concept.  I used an example from the doc site to start, so what I have is

  "rejectionPolicy": {
    "type": "messageTime"
  }

Is this possibly the reason that nothing happens when my merge and push phase starts?

Thanks -- Wayne

Fangjin Yang

Feb 21, 2014, 8:01:07 PM
to druid-de...@googlegroups.com
Hi Wayne, see inline.


On Friday, February 21, 2014 10:36:02 AM UTC-8, Wayne Adams wrote:
> Hi, Gian:
>
> I had configured the Overlord, but not the Middle Manager, for the segment locations directory, and after running nearly 24 hours (with hour granularity and a 10-minute window), I never saw any segments pushed to deep storage.
>
> I see in the documentation on the Overlord that if you are running with "druid.indexer.runner.type=local", which is the default value, the Overlord directly launches the tasks without using the Middle Manager at all.  This seems to confirm the experience I had on my first attempt, since when I posted my request to the Overlord, a real-time node was launched and it didn't appear the Middle Manager got used.  But if that is the case, why didn't my segments get pushed to deep storage, since the Overlord contains all the required properties?  Are there some other properties required other than druid.storage.type, druid.storage.storageDirectory, druid.segmentCache.locations, and druid.segmentCache.infoDir?

There are a few configs to make sure you've set:
1) druid.extensions.coordinates - make sure you've included the deep storage module.
2) druid.storage.type - make sure the deep storage type is specified and that all configs related to the deep storage type are there as well.

> This morning, when it became obvious that I wasn't going to get segments pushed out, I brought down everything and restarted without a Middle Manager at all.  Overlord runtime properties include both the MySQL and segment-location properties.  I posted my realtime indexer request to the Overlord and saw that it stood up a realtime node, and I was immediately able to query the data.  After the first segment should have been pushed out, however, nothing happened.  But I did see the following in the Overlord log:
>
> 2014-02-21 18:11:51,512 INFO [Coordinator-Exec--0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-02-21T18:11:51.512Z","service":"coordinator","host":"localhost:8082","metric":"coordinator/dropQueue/count","value":0,"user1":"localhost:8081"}]
> 2014-02-21 18:12:51,161 WARN [DatabaseSegmentManager-Exec--0] io.druid.db.DatabaseSegmentManager - No segments found in the database!
> 2014-02-21 18:12:51,367 INFO [DatabaseRuleManager-Exec--0] io.druid.db.DatabaseRuleManager - Polled and found rules for 1 datasource(s)
> 2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant create queue is empty.
> 2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.ReplicationThrottler - [_default_tier]: Replicant terminate queue is empty.
> 2014-02-21 18:12:51,513 INFO [Coordinator-Exec--0] io.druid.server.coordinator.helper.DruidCoordinatorBalancer - [_default_tier]: One or fewer servers found.  Cannot balance.
> 2014-02-21 18:12:51,514 INFO [Coordinator-Exec--0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-02-21T18:12:51.513Z","service":"coordinator","host":"localhost:8082","metric":"coordinator/overShadowed/count","value":0}]
>
> at about the time I expected the first segment to be pushed.  Does this shed any light?  I wouldn't expect there to be any segments in the database, yet, as this was the time I expected the first segment to be pushed.  But I'm wondering if that "One or fewer servers found." statement is really a problem or not, or even related to my issue.

This means that Druid is trying to balance segments but since there is only one historical node, it can't do anything. This should be okay.
 
> I've noticed other people having issues where either they never see historical segments, or the segments appear 6 hours late, and I don't recall seeing those issues resolved (at least, in the text of the thread).  Are there any other places I should be looking, log-wise?  I should mention if it's not obvious that the segments table in MySQL is empty.

This definitely points to some configuration issue. What is your segment granularity and window period? 

Fangjin Yang

Feb 21, 2014, 8:04:37 PM
to druid-de...@googlegroups.com
One other important config on the overlord is:
druid.indexer.storage.type=db
along with the standard database user/pw configs.
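[Editor's note: putting this together, a minimal overlord sketch might look like the following; the connection values are the localhost examples used elsewhere in this thread.]

```properties
# Tell the overlord to persist task state in the metadata database
druid.indexer.storage.type=db
# Standard database connector settings
druid.db.connector.connectURI=jdbc:mysql://localhost:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=diurd
```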

FWIW, our production configs for the indexing service are posted here:

Wayne Adams

Feb 24, 2014, 11:59:54 AM
to druid-de...@googlegroups.com
Hi Fangjin:

  Everything's running now.  My segment granularity is 1 hour, window period 10 minutes, intermediate persist 10 minutes and rollup 1 minute.

  Everything started working when I removed the following:

"rejectionPolicy": {
    "type": "messageTime"
  }

I was looking at the code and suppose I'd need to run in debug to see exactly what's happening, but it appears that this rejection policy results in everything being rejected when it comes time for the "merge and push" task.  In the online docs, under "rejectionPolicy" there's a link to Realtime, but no discussion there of rejectionPolicy.  It would be helpful to have a little discussion of that somewhere.  For now I'm just going to leave it out, but the example realtime indexing service configuration in the docs includes the above policy, so I think anyone using it without understanding it is likely to have some issues.

On the other point -- that really is a bug in Druid, re the setting of the druid.indexer.taskDir property.  If you set that to any value other than "/tmp", part of the codebase writes to the new directory, and part writes to "/tmp", with the result being that the incremental persists will fail due to "directory not found".  I would try a fix, but right now I might break more than I fix -- just as in the batch indexer task, someone would need to run through all places where files are written to "/tmp", either directly or indirectly via JDK API, and ensure they use the property instead.  I'll revisit this later, since right now I have to have really large /tmp partitions to be able to run some of these jobs.

Thanks again -- Wayne

P.S. Agree with Gian or whoever it was who said standing up a realtime node from the indexing service has an issue with log-file size.... I stood up a cluster on Friday afternoon and this morning, I woke up to a 158-GB-and-counting realtime log file!

Fangjin Yang

Feb 24, 2014, 4:43:00 PM
to druid-de...@googlegroups.com
Hi Wayne, see inline.


On Monday, February 24, 2014 8:59:54 AM UTC-8, Wayne Adams wrote:
> Hi Fangjin:
>
>   Everything's running now.  My segment granularity is 1 hour, window period 10 minutes, intermediate persist 10 minutes and rollup 1 minute.
>
>   Everything started working when I removed the following:
>
> "rejectionPolicy": {
>     "type": "messageTime"
>   }
>
> I was looking at the code and suppose I'd need to run in debug to see exactly what's happening, but it appears that this rejection policy results in everything being rejected when it comes time for the "merge and push" task.  In the online docs, under "rejectionPolicy" there's a link to Realtime, but no discussion there of rejectionPolicy.  It would be helpful to have a little discussion of that somewhere.  For now I'm just going to leave it out, but the example realtime indexing service configuration in the docs includes the above policy, so I think anyone using it without understanding it is likely to have some issues.

Some of the examples use "messageTime" as the rejection policy because it allows events for different time ranges (not the current time) to be accepted. The problem with this rejection policy is that handoff does not occur until after segmentGranularity + windowPeriod and after more events are ingested. So things are really wonky if you do not have a constant stream of events. "serverTime" is what we use in production for real-time nodes. It accepts events based on the concept of "current time". I'm surprised we didn't document this but we totally should. Apologies for the confusion.
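[Editor's note: in spec terms, the production-style choice Fangjin describes just swaps the policy type in the fragment shown earlier in the thread:]

```json
"rejectionPolicy": {
  "type": "serverTime"
}
```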

> On the other point -- that really is a bug in Druid, re the setting of the druid.indexer.taskDir property.  If you set that to any value other than "/tmp", part of the codebase writes to the new directory, and part writes to "/tmp", with the result being that the incremental persists will fail due to "directory not found".  I would try a fix, but right now I might break more than I fix -- just as in the batch indexer task, someone would need to run through all places where files are written to "/tmp", either directly or indirectly via JDK API, and ensure they use the property instead.  I'll revisit this later, since right now I have to have really large /tmp partitions to be able to run some of these jobs.
 
Hmm, we are probably making a mkdirs() call somewhere.  Thanks for this catch.

> Thanks again -- Wayne
>
> P.S. Agree with Gian or whoever it was who said standing up a realtime node from the indexing service has an issue with log-file size.... I stood up a cluster on Friday afternoon and this morning, I woke up to a 158-GB-and-counting realtime log file!

Wow! Yeah, we should absolutely fix that. 

Wayne Adams

Feb 25, 2014, 2:14:18 PM
to druid-de...@googlegroups.com
Hi Fangjin and Gian:

  Concerning the large-log-file issue for indexing-service-launched real-time tasks (https://github.com/metamx/druid/issues/401), I do have a workaround that seems fairly reasonable.  I have two log4j properties files: one for overlord, and one for any real-time processes it launches.  At overlord launch time, I pass in

-Dlog4j.configuration=file:/a/apps/druid/config/overlord/overlord.log4j.properties -Ddruid.indexer.fork.property.log4j.configuration=file:/a/apps/druid/config/overlord/realtime.log4j.properties

The realtime.log4j.properties file has one-hour rollover:

log4j.appender.RTLOG.DatePattern='.'yyyy-MM-dd-HH

which is fairly fine-grained, but with more than one real-time process running and one-hour segment granularity, this actually works out to a fair amount of log per hour, and I can easily verify one "merge and push" per log per dataSource.  I can also point the realtime log4j log file at a more convenient directory, so it's easier to find than drilling down through /tmp.  There is still a real-time log file under the task directory on /tmp, but it appears to be topping out at 1395 bytes.
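[Editor's note: for context, a fuller sketch of what such a realtime.log4j.properties might contain, using the stock log4j 1.x DailyRollingFileAppender. Only the DatePattern line comes from the post above; the file path, logger level, and layout pattern are assumptions.]

```properties
log4j.rootLogger=INFO, RTLOG

# Hourly rollover is driven by the date pattern on the appender
log4j.appender.RTLOG=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RTLOG.File=/a/apps/druid/logs/realtime.log
log4j.appender.RTLOG.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.RTLOG.layout=org.apache.log4j.PatternLayout
log4j.appender.RTLOG.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c - %m%n
```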

Thanks -- Wayne

Fangjin Yang

Feb 25, 2014, 10:30:09 PM
to druid-de...@googlegroups.com
Hi Wayne,

This does seem very reasonable. We were also thinking of the same approach for managing logs.

Thanks!
FJ

Wayne Adams

Feb 26, 2014, 3:18:26 PM
to druid-de...@googlegroups.com
I should mention that if you have more than one Overlord child process, total chaos ensues as they all write to the same log file.  Things get especially interesting at roll time.  :)  But in my case I'm just trying to keep down the size of the live log file, so it's working fine for me.  If there were a way to direct each task's logs to a different location that would be cool.  -- Wayne

Jerome liu

Sep 19, 2016, 1:49:28 AM
to Druid Development

Hi, Wayne:

I have a similar problem (https://groups.google.com/forum/#!topic/druid-user/DJp-myKKBCM) and I don't know how to solve it. I hope you can give me some advice.

On Saturday, February 22, 2014 at 6:35:18 AM UTC+8, Wayne Adams wrote:

Jerome liu

Sep 19, 2016, 3:27:08 AM
to Druid Development

I got it! Thanks :)


On Monday, September 19, 2016 at 1:49:28 PM UTC+8, Jerome liu wrote: