Query timeout


Sudhir Rama Rao

Dec 29, 2014, 2:29:25 PM
to druid-de...@googlegroups.com
I have a single-node Druid server working. It runs the following:

1 - overlord
1 - coordinator
1 - middlemanager
1 - broker
1 - historical

I have successfully indexed 1000 records into Druid using the local firehose. When I try to query it, it only works intermittently...

Other times I see one of the following happen:

1. The query times out.
2. An empty array ([]) is returned quickly.

I am not sure where I should begin debugging this problem. Please let me know.

regards
#sudhir

Fangjin Yang

Dec 29, 2014, 3:18:56 PM
to druid-de...@googlegroups.com
Sudhir, how long are your queries taking such that queries time out? Have you had a chance to compare your query performance against our published benchmarks?


If your queries are taking more than 1 second, can you please share more details about your setup?

--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/b24d2b02-3c88-49d3-8359-c2a1ea884810%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sudhir Rama Rao

Dec 29, 2014, 4:48:58 PM
to druid-de...@googlegroups.com
The queries are taking more than 50 seconds and still not completing, for a datasource of only 1000 records.

I have one machine running the above-mentioned servers. Its configuration is as follows:

Disk : 120 GB
RAM : 60 GB
OS : RHEL 6.x


Broker Configuration

# Change these to make Druid faster

druid.processing.buffer.sizeBytes=100000000
druid.processing.numThreads=1
druid.host=localhost
druid.service=broker
druid.port=8084
druid.zk.service.host=titan3,titan2,titan1
druid.db.connector.connectURI=jdbc:mysql://db1:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=druid

druid.coordinator.startDelay=PT70s
druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error


Historical Configuration

druid.host=localhost
druid.service=historical
druid.port=8081
druid.zk.service.host=titan3,titan2,titan1
druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.147"]

druid.server.maxSize=10000000000

# Change these to make Druid faster
druid.processing.buffer.sizeBytes=100000000
druid.processing.numThreads=1

druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize": 10000000000}]

druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error


Coordinator Configuration

druid.host=localhost
druid.service=coordinator
druid.port=8082
druid.zk.service.host=titan3,titan2,titan1
druid.db.connector.connectURI=jdbc:mysql://db1:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=druid

druid.coordinator.startDelay=PT70s
druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error




regards
#sudhir

Fangjin Yang

Dec 29, 2014, 4:53:42 PM
to druid-de...@googlegroups.com
Sudhir, there is something very broken with your setup but we need more info to find out what it is.

How many segments do you have? I notice you've only set a single processing thread to scan segments so you have no parallelization whatsoever. What query are you using? How long does a timeseries query take? What are your heap sizes? What about direct memory sizes?

I also strongly suggest using an actual stable version of Druid. The most recent is 0.6.160.

Eric Tschetter

Dec 29, 2014, 8:06:52 PM
to druid-de...@googlegroups.com
Sudhir,

The general place to start debugging is to

1) Look at the cluster status on the coordinator console and ensure
that a segment (I'm assuming it's only one because you said you are
only ingesting 1000 rows) is being served by your historical node.
2) Take your query and issue it directly to the historical node (so,
localhost:8081 instead of localhost:8084)
3) If that is still timing out, check the logs for the historical node
and see if there is anything telling in them (exceptions or other
interesting things).
4) Try the query again on the broker, watch the broker's logs while
this is happening

One of these steps should break and give us more information. Please
let us know which step that is.

Also, if you can provide the query that you are issuing, that might be
helpful (though, it shouldn't really be query-dependent given what you
have stated so far).
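Steps 2 and 4 above can be sketched in a few lines of Python (a sketch, not the thread's actual commands: the timeBoundary query, the /druid/v2/ endpoint path, and the localhost ports are assumptions based on the configs posted here; substitute your own query):

```python
import json
import urllib.error
import urllib.request

def post_query(base_url, query, timeout=60):
    """POST a Druid query to base_url's /druid/v2/ endpoint; return body or error text."""
    req = urllib.request.Request(
        base_url + "/druid/v2/?pretty",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except (urllib.error.URLError, OSError) as e:
        return f"query to {base_url} failed: {e}"

# Example query; replace with the query you are debugging.
query = {"queryType": "timeBoundary", "dataSource": "pci"}

# Step 2: hit the historical directly; step 4: the same query via the broker.
for base in ("http://localhost:8081", "http://localhost:8084"):
    print(post_query(base, query))
```

Comparing the two responses (and watching each node's log while the request is in flight) narrows down which hop is hanging.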

--Eric

Sudhir Rama Rao

Dec 30, 2014, 1:53:52 PM
to druid-de...@googlegroups.com
I made some changes to the configs:

druid.processing.numThreads=5

druid.processing.buffer.sizeBytes=1142857142


Also set -XX:MaxDirectMemorySize=8g for the broker and historical.
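As a sanity check on that setting: a commonly cited rule of thumb for this era of Druid (my assumption, not something stated in this thread) is roughly one off-heap buffer per processing thread plus one for merging, so direct memory should cover (numThreads + 1) * buffer size:

```python
# Hedged sketch: estimated direct-memory need vs. the 8g budget above.
num_threads = 5                      # druid.processing.numThreads
buffer_size = 1_142_857_142          # druid.processing.buffer.sizeBytes
needed_bytes = (num_threads + 1) * buffer_size
budget_bytes = 8 * 1024**3           # -XX:MaxDirectMemorySize=8g
print(f"need ~{needed_bytes / 1024**3:.1f} GiB of direct memory; budget is 8 GiB")
```

With these numbers the 8g budget comfortably covers the estimated need.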

Querying the historical node directly works; all the query types (select, search, topN, and timeseries) take < 1 sec.

Here is the strangest thing: after querying the historical node, querying the broker also started to work just fine.

Eric Tschetter

Dec 30, 2014, 2:05:55 PM
to druid-de...@googlegroups.com
Hrm, ok, I'm not sure what was going on, but my best guess is that it
is the time lag between indexing being completed and the segment
being loaded.

That is, you can generally expect a few minutes (1-3) between
batch-style indexing completing and the segment actually being loaded.
If you query in this time frame, getting a "[]" would be expected.
Hanging forever wouldn't be, however, so I'm not sure what happened
there.

If you are ever able to reproduce, it would be very interesting to
understand what the logs in the broker and historical were doing at
that point in time.

--Eric

Sudhir Rama Rao

Dec 30, 2014, 2:16:43 PM
to druid-de...@googlegroups.com
Eric, if I encounter it again, I will save the logs and post them here.

Thanks for all your help

Sudhir Rama Rao

Dec 30, 2014, 2:36:11 PM
to druid-de...@googlegroups.com

So, to extend my POC, I indexed 100K records, and I see the following in the coordinator log:

Coordinator Logs

2014-12-30 19:32:13,061 INFO [main-EventThread] io.druid.server.coordinator.LoadQueuePeon - Server[/druid/loadQueue/localhost:8083] done processing [/druid/loadQueue/localhost:8083/pci_2012-01-01T00:00:00.000Z_2012-01-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:32:33,288 INFO [DatabaseRuleManager-Exec--0] io.druid.db.DatabaseRuleManager - Polled and found rules for 1 datasource(s)

2014-12-30 19:32:33,923 INFO [DatabaseSegmentManager-Exec--0] io.druid.db.DatabaseSegmentManager - Polled and found 735 segments in the database


But when I query the broker, the query takes a really long time, and I see the following in the logs (the historical runs on 8083, the broker on 8084):



Historical Logs

2014-12-30 19:34:48,045 INFO [qtp649630909-18] io.druid.query.GroupByParallelQueryRunner - Query timeout, cancelling pending results for query id [7d6984b1-94ed-4457-a68d-9c5362b55687]


Broker Logs

2014-12-30 19:30:43,480 INFO [qtp1374432753-42] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://localhost:8083

2014-12-30 19:31:09,816 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-12-24T00:00:00.000Z_2013-12-25T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,456 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-02-16T00:00:00.000Z_2013-02-17T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,488 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-02-08T00:00:00.000Z_2013-02-09T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,588 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-01-01T00:00:00.000Z_2013-01-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,635 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-31T00:00:00.000Z_2013-01-01T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,700 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-13T00:00:00.000Z_2012-12-14T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,747 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-05T00:00:00.000Z_2012-12-06T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,787 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-11-12T00:00:00.000Z_2012-11-13T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,849 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-11-03T00:00:00.000Z_2012-11-04T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,948 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-09-24T00:00:00.000Z_2012-09-25T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,987 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-08-27T00:00:00.000Z_2012-08-28T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,052 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-08-09T00:00:00.000Z_2012-08-10T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,067 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-07-18T00:00:00.000Z_2012-07-19T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,101 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-06-20T00:00:00.000Z_2012-06-21T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,126 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-06-01T00:00:00.000Z_2012-06-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,179 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-05-15T00:00:00.000Z_2012-05-16T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,208 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-04-29T00:00:00.000Z_2012-04-30T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,240 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-03-26T00:00:00.000Z_2012-03-27T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,282 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-29T00:00:00.000Z_2012-03-01T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,309 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-21T00:00:00.000Z_2012-02-22T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,323 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] removed segment[pci_2012-12-01T00:00:00.000Z_2012-12-02T00:00:00.000Z_2014-12-29T22:43:01.376Z]

2014-12-30 19:31:11,398 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-09T00:00:00.000Z_2012-02-10T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,424 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-01-26T00:00:00.000Z_2012-01-27T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,466 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-01-10T00:00:00.000Z_2012-01-11T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.539Z","service":"broker","host":"localhost:8084","metric":"cache/delta/numEntries","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/sizeBytes","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/hits","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/misses","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/evictions","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/hitRate","value":0.0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/averageBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/timeouts","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/errors","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/numEntries","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/sizeBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/hits","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/misses","value":40}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/evictions","value":40}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/hitRate","value":0.0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/averageBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/timeouts","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/errors","value":0}]

Eric Tschetter

Dec 30, 2014, 7:20:54 PM
to druid-de...@googlegroups.com
Sudhir,

You've got a *lot* of segments there; also, it looks like your broker
cache isn't actually doing much (it has evictions but no bytes
stored).

If you are only working with 100,000 rows of data, try adjusting your
ingestion spec to use "year" for segmentGranularity. This should
reduce the number of segments generated in the system and ultimately
make things nice and happy for you.

Just to explain a bit more: as a general rule, it's good to target ~5
million rows in a single segment. The primary knob for achieving
that on smaller data sets, like yours, is to increase the
"segmentGranularity"; "year" is the largest value it can be.

--Eric

Sudhir Rama Rao

Dec 30, 2014, 7:59:24 PM
to druid-de...@googlegroups.com
Eric, I did what you recommended:

1. The number of segments fell to 2
2. Loading the data was super fast

But the query still times out.

On the other hand, I see another issue here: even though I have 3 historical servers running, the data is always loaded onto only 1, the one from which I run the indexing.

The other historical nodes are visible in localhost:8082/cluster.html, but show 0 for server currSize and 0 for server percentUsed.

How can I distribute the data onto multiple historical nodes? Could this impact the performance of the queries?

Eric Tschetter

Dec 30, 2014, 8:40:18 PM
to druid-de...@googlegroups.com
Sudhir,

When you say you have three historicals running, are they running on
the same machine or different machines? I'm assuming that when you
said you are using the local indexer to load your data, that means you
are also using local "deep storage", is that true?

Local deep storage doesn't actually allow other historical nodes to
see the segments (i.e. it cannot actually distribute the data to other
nodes in the "cluster") without setting up NFS or something like that.
I would recommend using S3 or HDFS if you want to test out clustered
operation.

Also, if you have 3 historicals on multiple different servers, the
"host" property is going to need to be the actual IP_ADDR:PORT
combination, not "localhost:PORT" like you currently have.
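Concretely, each historical's runtime.properties would advertise its own reachable address instead of localhost; a sketch for node2 (hostname from this thread; the port is an assumption based on the configs posted earlier):

```
# node2's historical runtime.properties (sketch; the address must be
# resolvable and reachable from the broker, e.g. a hostname or IP)
druid.host=node2
druid.port=8081
druid.service=historical
```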

--Eric

Sudhir Rama Rao

Dec 30, 2014, 8:53:23 PM
to druid-de...@googlegroups.com
Yes, I am running a 3-node cluster: node1, node2, node3

node1 runs : broker, historical, coordinator,overlord, zookeeper, mysql-server
node2 runs : broker, historical, coordinator
node3 runs : broker, historical, coordinator

I run the indexing from node1:

curl -X 'POST' -H 'Content-Type:application/json' -d @index.json http://localhost:8080/druid/indexer/v1/task

The runtime.properties of the historical nodes have this setup:

druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.160"]

druid.storage.storageDirectory=hdfs://nn1/data/druid/pci

druid.storage.type=hdfs

I expect the data segments to be stored in HDFS, which is not happening; all the segments are stored on node1.

I am sorry about the localhost mention; I am actually looking at the coordinator console at http://node1:8082/cluster.html


regards
#sudhir

Fangjin Yang

Dec 30, 2014, 8:58:41 PM
to druid-de...@googlegroups.com
Hi Sudhir, did you set the deep storage properties on the overlord (the node doing the indexing) as well?
...

Sudhir Rama Rao

Dec 30, 2014, 9:17:55 PM
to druid-de...@googlegroups.com
FY, I had not... so I put the same configs in the overlord's runtime.properties and tried to index the data again.

This is what I see:

Updated config:


druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.160"]

druid.storage.storageDirectory=hdfs://nn1:8020/data/druid/pci

druid.storage.type=hdfs

Indexing logs :

2014-12-31 02:15:31,838 WARN [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Failed to merge and upload
java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:75)
	at io.druid.indexing.common.task.IndexTask$2.push(IndexTask.java:390)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:179)
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:444)
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:198)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:218)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:197)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2014-12-31 02:15:31,842 INFO [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Del

...

Fangjin Yang

Dec 30, 2014, 9:21:36 PM
to druid-de...@googlegroups.com
Hi Sudhir, did you include the hadoop configuration files on the classpath of your middle manager?
...

Fangjin Yang

Dec 30, 2014, 9:24:53 PM
to druid-de...@googlegroups.com
Ah, you are running a single overlord instance, so the hadoop config files need to be on the cp of the overlord.
...

Sudhir Rama Rao

Dec 30, 2014, 9:46:37 PM
to druid-de...@googlegroups.com
Alright, thanks so much FY and Eric. HDFS deep storage works.

-rw-r--r--   3 sudhrao hdfs        537 2014-12-30 19:42 /data/druid/pci/20120401T000000.000Z_20120501T000000.000Z/2014-12-31T02_41_55.040Z/0/descriptor.json

-rw-r--r--   3 sudhrao hdfs     285372 2014-12-30 19:42 /data/druid/pci/20120401T000000.000Z_20120501T000000.000Z/2014-12-31T02_41_55.040Z/0/index.zip

-rw-r--r--   3 sudhrao hdfs        537 2014-12-30 19:42 /data/druid/pci/20120501T000000.000Z_20120601T000000.000Z/2014-12-31T02_41_55.040Z/0/descriptor.json

-rw-r--r--   3 sudhrao hdfs     244960 2014-12-30 19:42 /data/druid/pci/20120501T000000.000Z_20120601T000000.000Z/2014-12-31T02_41_55.040Z/0/index.zip

...

....

However, when I look at the coordinator's cluster.html page, I see only one server having data. Is this expected due to my data size? Why don't the other historical nodes load the segments from HDFS?

PS: see the screenshot in the attachment.


...
Screen Shot 2014-12-30 at 6.45.08 PM.png

Fangjin Yang

Dec 30, 2014, 9:51:50 PM
to druid-de...@googlegroups.com
By default, two copies of a segment should be loaded on the historical cluster. This means you should see another node with a non-zero currSize. Do you see any exceptions in your coordinator node logs or on the other historicals?
...

Sudhir Rama Rao

Dec 30, 2014, 9:55:12 PM
to druid-de...@googlegroups.com
FY, my bad; I think it could be the same hadoop-config-not-on-the-classpath issue. I am looking into it and will report back in a few minutes.
...

Sudhir Rama Rao

Dec 30, 2014, 10:01:32 PM
to druid-de...@googlegroups.com
Ok, I think I now have the following working:

1. Deep storage into HDFS - check
2. Two historical nodes have loaded the segments - check
3. Queries - whether hitting the historical directly or going through the broker - not working :(

I still cannot understand why. Here is my indexing spec:

{
    "type": "index",
    "dataSource": "pci",
    "granularitySpec": {
        "type": "uniform",
        "gran": "month",
        "intervals": ["2000/2016"]
    },
    "aggregators": [
        {"type": "doubleSum", "name": "count", "fieldName": "count"},
        {"type": "doubleSum", "name": "amount", "fieldName": "amount"}
    ],
    "config": {
        "maxRowsInMemory": 500000000,
        "intermediatePersistPeriod": "PT10m"
    },
    "firehose": {
        "type": "local",
        "baseDir": "/home/sudhrao",
        "filter": "100k.json",
        "parser": {
            "timestampSpec": {
                "column": "timestamp"
            },
            "data": {
                "format": "json",
                "columns": ["timestamp","account_number","tcode","amount","currency","country","timezone","count"],
                "dimensions": ["account_number","tcode","curreny","country","timezone"]
            }
        }
    },
    "plumber": {
        "type": "realtime",
        "windowPeriod": "PT10m",
        "segmentGranularity": "day",
        "basePersistDirectory": "/tmp/realtime/basePersist"
    }
}


Here is the groupBy query:

{
    "queryType": "groupBy",
    "dataSource": "pci",
    "granularity": "none",
    "limitSpec": {"type": "default", "limit": 500000000, "columns": []},
    "dimensions": ["account_number","tcode","timezone"],
    "intervals": ["2012-01-01T00:00:00.000/2014-10-01T00:00:00.000"],
    "aggregations": [
        {"type": "longSum", "name": "count", "fieldName": "count"},
        {"type": "longSum", "name": "amount", "fieldName": "amount"}
    ]
}

...

Fangjin Yang

Dec 30, 2014, 10:03:38 PM
to druid-de...@googlegroups.com
Hi Sudhir, can you try a simple timeBoundary query and a timeseries query? Do either of those work? As for the groupBy query, are there any exceptions in the logs of the historicals or brokers?
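For reference, a minimal timeseries query against the datasource in this thread might look like the following (a sketch; the metric name is taken from the spec posted above):

```json
{
  "queryType": "timeseries",
  "dataSource": "pci",
  "granularity": "all",
  "intervals": ["2012-01-01/2014-10-01"],
  "aggregations": [
    {"type": "longSum", "name": "count", "fieldName": "count"}
  ]
}
```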


Sudhir Rama Rao

Dec 30, 2014, 10:18:40 PM
to druid-de...@googlegroups.com
No, even a simple timeBoundary query is timing out.

I am attaching the configs I am using.

I run the servers as follows

Historical :

[sudhrao@dtee1-druid-poc-001-6055 ~]$ cat /opt/druid/run_historical.sh

#!/usr/bin/env bash
echo "This will run a historical of Druid"

set +u
shopt -s xpg_echo
shopt -s expand_aliases

# start process
JAVA_ARGS="-Xmx1g -Duser.timezone=UTC -Dfile.encoding=UTF-8"

# For the kit
DRUID_CP=/opt/druid/lib/*:/opt/druid/config/historical

echo "Running command:"
(set -x; java -XX:MaxDirectMemorySize=8g ${JAVA_ARGS} -classpath ${DRUID_CP} -Ddruid.emitter=logging -Ddruid.emitter.logging.logLevel=error io.druid.cli.Main server historical)


Similarly for the other servers, with -Xmx set to 1g-4g.



...
config.zip

Sudhir Rama Rao

Dec 30, 2014, 10:36:34 PM
to druid-de...@googlegroups.com
Ok, the timeBoundary query now works when I directly hit the historical node; however, when I hit the broker, it times out.

Query : 

{
    "queryType": "timeBoundary",
    "dataSource": "pci"
}


Result :

[ {
  "timestamp" : "2012-01-01T00:12:37.000Z",
  "result" : {
    "minTime" : "2012-01-01T00:12:37.000Z",
    "maxTime" : "2013-09-30T21:58:55.000Z"
  }
} ]


...

Fangjin Yang

Dec 30, 2014, 11:02:05 PM
to druid-de...@googlegroups.com
Hi Sudhir, this is 0.6.160 right?

Can you try to ping the historical node from your broker in your cluster? If that succeeds, can you add the following configs to the broker:

druid.broker.http.numConnections=50
druid.server.http.numThreads=100

And the following config on the historical:
druid.server.http.numThreads=50

I'm actually not sure if that's the problem, but I do know that the old default setting for the number of jetty threads has caused problems.

...

Sudhir Rama Rao

Dec 30, 2014, 11:16:34 PM
to druid-de...@googlegroups.com
Yes, I am using 0.6.160. I will try those configs and let you know.
...

Eric Tschetter

Dec 30, 2014, 11:37:58 PM
to druid-de...@googlegroups.com
It looks like you have the host set to "node1". Are you certain that
the broker node you are hitting can resolve "node1", "node2", and
"node3" to IP addresses, and that the ports are open between them?

--Eric