Query timeout


Sudhir Rama Rao

Dec 29, 2014, 2:29:25 PM
to druid-de...@googlegroups.com
I have a single-node Druid server working. It runs the following:

1 - overlord
1 - coordinator
1 - middlemanager
1 - broker
1 - historical

I have successfully indexed 1000 records into Druid using the local firehose. When I try to query it, it only works intermittently...

Other times I see one of the following happen:

1. The query times out.
2. An empty array ([]) is returned quickly.

I am not sure where I should begin debugging this problem. Please let me know.

regards
#sudhir

Fangjin Yang

Dec 29, 2014, 3:18:56 PM
to druid-de...@googlegroups.com
Sudhir, how long are your queries taking such that queries time out? Have you had a chance to compare your query performance against our published benchmarks?


If your queries are taking more than 1 second, can you please share more details about your setup?

--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/b24d2b02-3c88-49d3-8359-c2a1ea884810%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sudhir Rama Rao

Dec 29, 2014, 4:48:58 PM
to druid-de...@googlegroups.com
The queries are taking more than 50 seconds and still not completing, for a datasource of only 1000 records.

I have one machine running the above-mentioned servers. Its configuration is as follows:

Disk : 120 GB
RAM : 60 GB
OS : RHEL 6.x


Broker Configuration

# Change these to make Druid faster

druid.processing.buffer.sizeBytes=100000000
druid.processing.numThreads=1
druid.host=localhost
druid.service=broker
druid.port=8084
druid.zk.service.host=titan3,titan2,titan1
druid.db.connector.connectURI=jdbc:mysql://db1:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=druid

druid.coordinator.startDelay=PT70s
druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error


Historical Configuration

druid.host=localhost
druid.service=historical
druid.port=8081
druid.zk.service.host=titan3,titan2,titan1
druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.147"]

druid.server.maxSize=10000000000

# Change these to make Druid faster
druid.processing.buffer.sizeBytes=100000000
druid.processing.numThreads=1

druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize": 10000000000}]

druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error


Coordinator Configuration

druid.host=localhost
druid.service=coordinator
druid.port=8082
druid.zk.service.host=titan3,titan2,titan1
druid.db.connector.connectURI=jdbc:mysql://db1:3306/druid
druid.db.connector.user=druid
druid.db.connector.password=druid

druid.coordinator.startDelay=PT70s
druid.indexer.logs.directory=/tmp/druid/log
druid.request.logging.dir=/tmp/druid/log
com.metamx.emitter.logging.level=error
druid.emitter.logging.logLevel=error




regards
#sudhir

Fangjin Yang

Dec 29, 2014, 4:53:42 PM
to druid-de...@googlegroups.com
Sudhir, there is something very broken with your setup but we need more info to find out what it is.

How many segments do you have? I notice you've only set a single processing thread to scan segments so you have no parallelization whatsoever. What query are you using? How long does a timeseries query take? What are your heap sizes? What about direct memory sizes?

I also strongly suggest using an actual stable version of Druid. The most recent is 0.6.160.

Eric Tschetter

Dec 29, 2014, 8:06:52 PM
to druid-de...@googlegroups.com
Sudhir,

The general place to start debugging is to

1) Look at the cluster status on the coordinator console and ensure
that a segment (I'm assuming it's only one because you said you are
only ingesting 1000 rows) is being served by your historical node.
2) Take your query and issue it directly to the historical node (so,
localhost:8081 instead of localhost:8084)
3) If that is still timing out, check the logs for the historical node
and see if there is anything telling in them (exceptions or other
interesting things).
4) Try the query again on the broker, watch the broker's logs while
this is happening

One of these steps should break and give us more information. Please
let us know which step that is.

Also, if you can provide the query that you are issuing, that might be
helpful (though, it shouldn't really be query-dependent given what you
have stated so far).
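Steps 2 and 4 above can be sketched in a few lines of Python (a sketch, not the thread's actual commands: the timeBoundary query, the /druid/v2/ endpoint path, and the localhost ports are assumptions based on the configs posted here; substitute your own query):

```python
import json
import urllib.error
import urllib.request

def post_query(base_url, query, timeout=60):
    """POST a Druid query to base_url's /druid/v2/ endpoint; return body or error text."""
    req = urllib.request.Request(
        base_url + "/druid/v2/?pretty",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except (urllib.error.URLError, OSError) as e:
        return f"query to {base_url} failed: {e}"

# Example query; replace with the query you are debugging.
query = {"queryType": "timeBoundary", "dataSource": "pci"}

# Step 2: hit the historical directly; step 4: the same query via the broker.
for base in ("http://localhost:8081", "http://localhost:8084"):
    print(post_query(base, query))
```

Comparing the two responses (and watching each node's log while the request is in flight) narrows down which hop is hanging.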

--Eric

Sudhir Rama Rao

Dec 30, 2014, 1:53:52 PM
to druid-de...@googlegroups.com
I made some changes to the configs:

druid.processing.numThreads=5

druid.processing.buffer.sizeBytes=1142857142


Also set -XX:MaxDirectMemorySize=8g for the broker and historical.
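As a sanity check on that setting: a commonly cited rule of thumb for this era of Druid (my assumption, not something stated in this thread) is roughly one off-heap buffer per processing thread plus one for merging, so direct memory should cover (numThreads + 1) * buffer size:

```python
# Hedged sketch: estimated direct-memory need vs. the 8g budget above.
num_threads = 5                      # druid.processing.numThreads
buffer_size = 1_142_857_142          # druid.processing.buffer.sizeBytes
needed_bytes = (num_threads + 1) * buffer_size
budget_bytes = 8 * 1024**3           # -XX:MaxDirectMemorySize=8g
print(f"need ~{needed_bytes / 1024**3:.1f} GiB of direct memory; budget is 8 GiB")
```

With these numbers the 8g budget comfortably covers the estimated need.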

Querying the historical node directly works; all the query types (select, search, topN, and timeseries) take < 1 sec.

Here is the strangest thing: after querying the historical node, querying the broker also started to work just fine.

Eric Tschetter

Dec 30, 2014, 2:05:55 PM
to druid-de...@googlegroups.com
Hrm, ok, I'm not sure what was going on, but my best guess is that it
is the time lag between indexing being completed and the segment
being loaded.

That is, you can generally expect a few minutes (1-3) between
batch-style indexing completing and the segment actually being loaded.
If you query in this time frame, getting a "[]" would be expected.
Hanging forever wouldn't be, however, so I'm not sure what happened
there.

If you are ever able to reproduce, it would be very interesting to
understand what the logs in the broker and historical were doing at
that point in time.

--Eric

Sudhir Rama Rao

Dec 30, 2014, 2:16:43 PM
to druid-de...@googlegroups.com
Eric, if I encounter it again, I will save the logs and post them here.

Thanks for all your help

Sudhir Rama Rao

Dec 30, 2014, 2:36:11 PM
to druid-de...@googlegroups.com

So, to extend my POC, I indexed 100K records, and I see the following in the coordinator log:

Coordinator Logs

2014-12-30 19:32:13,061 INFO [main-EventThread] io.druid.server.coordinator.LoadQueuePeon - Server[/druid/loadQueue/localhost:8083] done processing [/druid/loadQueue/localhost:8083/pci_2012-01-01T00:00:00.000Z_2012-01-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:32:33,288 INFO [DatabaseRuleManager-Exec--0] io.druid.db.DatabaseRuleManager - Polled and found rules for 1 datasource(s)

2014-12-30 19:32:33,923 INFO [DatabaseSegmentManager-Exec--0] io.druid.db.DatabaseSegmentManager - Polled and found 735 segments in the database


But when I query the broker, the query takes a really long time, and I see the following in the logs (the historical runs on 8083, the broker on 8084):



Historical Logs

2014-12-30 19:34:48,045 INFO [qtp649630909-18] io.druid.query.GroupByParallelQueryRunner - Query timeout, cancelling pending results for query id [7d6984b1-94ed-4457-a68d-9c5362b55687]


Broker Logs

2014-12-30 19:30:43,480 INFO [qtp1374432753-42] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://localhost:8083

2014-12-30 19:31:09,816 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-12-24T00:00:00.000Z_2013-12-25T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,456 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-02-16T00:00:00.000Z_2013-02-17T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,488 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-02-08T00:00:00.000Z_2013-02-09T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,588 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2013-01-01T00:00:00.000Z_2013-01-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,635 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-31T00:00:00.000Z_2013-01-01T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,700 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-13T00:00:00.000Z_2012-12-14T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,747 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-12-05T00:00:00.000Z_2012-12-06T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,787 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-11-12T00:00:00.000Z_2012-11-13T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,849 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-11-03T00:00:00.000Z_2012-11-04T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,948 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-09-24T00:00:00.000Z_2012-09-25T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:10,987 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-08-27T00:00:00.000Z_2012-08-28T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,052 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-08-09T00:00:00.000Z_2012-08-10T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,067 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-07-18T00:00:00.000Z_2012-07-19T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,101 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-06-20T00:00:00.000Z_2012-06-21T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,126 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-06-01T00:00:00.000Z_2012-06-02T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,179 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-05-15T00:00:00.000Z_2012-05-16T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,208 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-04-29T00:00:00.000Z_2012-04-30T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,240 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-03-26T00:00:00.000Z_2012-03-27T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,282 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-29T00:00:00.000Z_2012-03-01T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,309 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-21T00:00:00.000Z_2012-02-22T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,323 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] removed segment[pci_2012-12-01T00:00:00.000Z_2012-12-02T00:00:00.000Z_2014-12-29T22:43:01.376Z]

2014-12-30 19:31:11,398 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-02-09T00:00:00.000Z_2012-02-10T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,424 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-01-26T00:00:00.000Z_2012-01-27T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:11,466 INFO [ServerInventoryView-0] io.druid.client.SingleServerInventoryView - Server[localhost:8083] added segment[pci_2012-01-10T00:00:00.000Z_2012-01-11T00:00:00.000Z_2014-12-30T19:03:31.590Z]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.539Z","service":"broker","host":"localhost:8084","metric":"cache/delta/numEntries","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/sizeBytes","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/hits","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/misses","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/evictions","value":0}]

2014-12-30 19:31:12,540 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.540Z","service":"broker","host":"localhost:8084","metric":"cache/delta/hitRate","value":0.0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/averageBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/timeouts","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/delta/errors","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/numEntries","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/sizeBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/hits","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/misses","value":40}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/evictions","value":40}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/hitRate","value":0.0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/averageBytes","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/timeouts","value":0}]

2014-12-30 19:31:12,541 ERROR [MonitorScheduler-0] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-12-30T19:31:12.541Z","service":"broker","host":"localhost:8084","metric":"cache/total/errors","value":0}]

Eric Tschetter

Dec 30, 2014, 7:20:54 PM
to druid-de...@googlegroups.com
Sudhir,

You've got a *lot* of segments there; also, it looks like your broker
cache isn't actually doing much (it has evictions but no bytes
stored).

If you are only working with 100,000 rows of data, try adjusting your
ingestion spec to use "year" for segmentGranularity. This should
reduce the number of segments generated in the system and ultimately
make things nice and happy for you.

Just to explain a bit more: as a general rule, it's good to target ~5
million rows in a single segment. The primary knob for achieving
that on smaller data sets, like yours, is to increase the
"segmentGranularity"; "year" is the largest value it can be.

--Eric

Sudhir Rama Rao

Dec 30, 2014, 7:59:24 PM
to druid-de...@googlegroups.com
Eric, I did what you recommended:

1. The number of segments fell to 2
2. Loading the data was super fast

But the query still times out.

On the other hand, I see another issue here: even though I have 3 historical servers running, the data is always loaded onto only 1, the one from which I run the indexing.

The other historical nodes are visible in localhost:8082/cluster.html, but show 0 for server currSize and 0 for server percentUsed.

How can I distribute the data onto multiple historical nodes? Could this impact the performance of the queries?

Eric Tschetter

Dec 30, 2014, 8:40:18 PM
to druid-de...@googlegroups.com
Sudhir,

When you say you have three historicals running, are they running on
the same machine or different machines? I'm assuming that when you
said you are using the local indexer to load your data, that means you
are also using local "deep storage", is that true?

Local deep storage doesn't actually allow other historical nodes to
see the segments (i.e. it cannot actually distribute the data to other
nodes in the "cluster") without setting up NFS or something like that.
I would recommend using S3 or HDFS if you want to test out clustered
operation.

Also, if you have 3 historicals on multiple different servers, the
"host" property is going to need to be the actual IP_ADDR:PORT
combination, not "localhost:PORT" like you currently have.
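Concretely, each historical's runtime.properties would advertise its own reachable address instead of localhost; a sketch for node2 (hostname from this thread; the port is an assumption based on the configs posted earlier):

```
# node2's historical runtime.properties (sketch; the address must be
# resolvable and reachable from the broker, e.g. a hostname or IP)
druid.host=node2
druid.port=8081
druid.service=historical
```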

--Eric

Sudhir Rama Rao

Dec 30, 2014, 8:53:23 PM
to druid-de...@googlegroups.com
Yes, I am running a 3-node cluster: node1, node2, node3

node1 runs : broker, historical, coordinator,overlord, zookeeper, mysql-server
node2 runs : broker, historical, coordinator
node3 runs : broker, historical, coordinator

I run the indexing from node1:

curl -X 'POST' -H 'Content-Type:application/json' -d @index.json http://localhost:8080/druid/indexer/v1/task

The runtime.properties of the historical nodes have this setup:

druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.160"]

druid.storage.storageDirectory=hdfs://nn1/data/druid/pci

druid.storage.type=hdfs

I expect the data segments to be stored in HDFS, which is not happening; all the segments are stored on node1.

I am sorry about the localhost mention; I am actually looking at the coordinator console at http://node1:8082/cluster.html


regards
#sudhir

Fangjin Yang

Dec 30, 2014, 8:58:41 PM
to druid-de...@googlegroups.com
Hi Sudhir, did you set the deep storage properties on the overlord (the node doing the indexing) as well?
...

Sudhir Rama Rao

Dec 30, 2014, 9:17:55 PM
to druid-de...@googlegroups.com
FY, I had not... so I put the same configs in the overlord's runtime.properties and tried to index the data again.

This is what I see:

Updated config:


druid.extensions.coordinates=["io.druid.extensions:druid-hdfs-storage:0.6.160"]

druid.storage.storageDirectory=hdfs://nn1:8020/data/druid/pci

druid.storage.type=hdfs

Indexing logs :

2014-12-31 02:15:31,838 WARN [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Failed to merge and upload
java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:75)
	at io.druid.indexing.common.task.IndexTask$2.push(IndexTask.java:390)
	at io.druid.indexing.common.index.YeOldePlumberSchool$1.finishJob(YeOldePlumberSchool.java:179)
	at io.druid.indexing.common.task.IndexTask.generateSegment(IndexTask.java:444)
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:198)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:218)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:197)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2014-12-31 02:15:31,842 INFO [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Del

...

Fangjin Yang

Dec 30, 2014, 9:21:36 PM
to druid-de...@googlegroups.com
Hi Sudhir, did you include the hadoop configuration files on the classpath of your middle manager?
...

Fangjin Yang

Dec 30, 2014, 9:24:53 PM
to druid-de...@googlegroups.com
Ah, you are running a single overlord instance, so the hadoop config files need to be on the cp of the overlord.
...

Sudhir Rama Rao

Dec 30, 2014, 9:46:37 PM
to druid-de...@googlegroups.com
Alright, thanks so much FY and Eric. HDFS deep storage works.

-rw-r--r--   3 sudhrao hdfs        537 2014-12-30 19:42 /data/druid/pci/20120401T000000.000Z_20120501T000000.000Z/2014-12-31T02_41_55.040Z/0/descriptor.json

-rw-r--r--   3 sudhrao hdfs     285372 2014-12-30 19:42 /data/druid/pci/20120401T000000.000Z_20120501T000000.000Z/2014-12-31T02_41_55.040Z/0/index.zip

-rw-r--r--   3 sudhrao hdfs        537 2014-12-30 19:42 /data/druid/pci/20120501T000000.000Z_20120601T000000.000Z/2014-12-31T02_41_55.040Z/0/descriptor.json

-rw-r--r--   3 sudhrao hdfs     244960 2014-12-30 19:42 /data/druid/pci/20120501T000000.000Z_20120601T000000.000Z/2014-12-31T02_41_55.040Z/0/index.zip

...

....

However, when I look at the coordinator's cluster.html page, I see only one server having data. Is this expected due to my data size? Why don't the other historical nodes load the segments from HDFS?

PS: see the screenshot in the attachment.


...
Screen Shot 2014-12-30 at 6.45.08 PM.png

Fangjin Yang

Dec 30, 2014, 9:51:50 PM
to druid-de...@googlegroups.com
By default, two copies of a segment should be loaded on the historical cluster. This means you should see another node with a non-zero currSize. Do you see any exceptions in your coordinator node logs or on the other historicals?
...

Sudhir Rama Rao

Dec 30, 2014, 9:55:12 PM
to druid-de...@googlegroups.com
FY, my bad; I think it could be the same hadoop-config-not-on-the-classpath issue. I am looking into it and will report back in a few minutes.
...

Sudhir Rama Rao

Dec 30, 2014, 10:01:32 PM
to druid-de...@googlegroups.com
Ok, I think I now have the following working:

1. Deep storage into HDFS - check
2. Two historical nodes have loaded the segments - check
3. Queries - whether hitting the historical directly or going through the broker - not working :(

I still cannot understand why. Here is my indexing spec:

{
    "type": "index",
    "dataSource": "pci",
    "granularitySpec": {
        "type": "uniform",
        "gran": "month",
        "intervals": ["2000/2016"]
    },
    "aggregators": [
        {"type": "doubleSum", "name": "count", "fieldName": "count"},
        {"type": "doubleSum", "name": "amount", "fieldName": "amount"}
    ],
    "config": {
        "maxRowsInMemory": 500000000,
        "intermediatePersistPeriod": "PT10m"
    },
    "firehose": {
        "type": "local",
        "baseDir": "/home/sudhrao",
        "filter": "100k.json",
        "parser": {
            "timestampSpec": {
                "column": "timestamp"
            },
            "data": {
                "format": "json",
                "columns": ["timestamp","account_number","tcode","amount","currency","country","timezone","count"],
                "dimensions": ["account_number","tcode","curreny","country","timezone"]
            }
        }
    },
    "plumber": {
        "type": "realtime",
        "windowPeriod": "PT10m",
        "segmentGranularity": "day",
        "basePersistDirectory": "/tmp/realtime/basePersist"
    }
}


Here is the groupBy query:

{
    "queryType": "groupBy",
    "dataSource": "pci",
    "granularity": "none",
    "limitSpec": {"type": "default", "limit": 500000000, "columns": []},
    "dimensions": ["account_number","tcode","timezone"],
    "intervals": ["2012-01-01T00:00:00.000/2014-10-01T00:00:00.000"],
    "aggregations": [
        {"type": "longSum", "name": "count", "fieldName": "count"},
        {"type": "longSum", "name": "amount", "fieldName": "amount"}
    ]
}

...

Fangjin Yang

Dec 30, 2014, 10:03:38 PM
to druid-de...@googlegroups.com
Hi Sudhir, can you try a simple timeBoundary query and a timeseries query? Do either of those work? As for the groupBy query, are there any exceptions in the logs of the historicals or brokers?
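For reference, a minimal timeseries query against the datasource in this thread might look like the following (a sketch; the metric name is taken from the spec posted above):

```json
{
  "queryType": "timeseries",
  "dataSource": "pci",
  "granularity": "all",
  "intervals": ["2012-01-01/2014-10-01"],
  "aggregations": [
    {"type": "longSum", "name": "count", "fieldName": "count"}
  ]
}
```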


Sudhir Rama Rao

Dec 30, 2014, 10:18:40 PM
to druid-de...@googlegroups.com
No, even a simple timeBoundary query is timing out.

I am attaching the configs I am using.

I run the servers as follows

Historical :

[sudhrao@dtee1-druid-poc-001-6055 ~]$ cat /opt/druid/run_historical.sh

#!/usr/bin/env bash
echo "This will run a historical of Druid"

set +u
shopt -s xpg_echo
shopt -s expand_aliases

# start process
JAVA_ARGS="-Xmx1g -Duser.timezone=UTC -Dfile.encoding=UTF-8"

# For the kit
DRUID_CP=/opt/druid/lib/*:/opt/druid/config/historical

echo "Running command:"
(set -x; java -XX:MaxDirectMemorySize=8g ${JAVA_ARGS} -classpath ${DRUID_CP} -Ddruid.emitter=logging -Ddruid.emitter.logging.logLevel=error io.druid.cli.Main server historical)


Similarly for the other servers, with -Xmx set to 1g-4g.



...
config.zip

Sudhir Rama Rao

Dec 30, 2014, 10:36:34 PM
to druid-de...@googlegroups.com
Ok, the timeBoundary query now works when I directly hit the historical node; however, when I hit the broker, it times out.

Query : 

{
    "queryType": "timeBoundary",
    "dataSource": "pci"
}


Result :

[ {
  "timestamp" : "2012-01-01T00:12:37.000Z",
  "result" : {
    "minTime" : "2012-01-01T00:12:37.000Z",
    "maxTime" : "2013-09-30T21:58:55.000Z"
  }
} ]


...

Fangjin Yang

Dec 30, 2014, 11:02:05 PM
to druid-de...@googlegroups.com
Hi Sudhir, this is 0.6.160 right?

Can you try to ping the historical node from your broker in your cluster? If that succeeds, can you add the following configs to the broker:

druid.broker.http.numConnections=50
druid.server.http.numThreads=100

And the following config on the historical:
druid.server.http.numThreads=50

I'm actually not sure if that's the problem, but I do know that the old default setting for the number of jetty threads has caused problems.

...

Sudhir Rama Rao

Dec 30, 2014, 11:16:34 PM
to druid-de...@googlegroups.com
Yes, I am using 0.6.160. I will try those configs and let you know.
...

Eric Tschetter

Dec 30, 2014, 11:37:58 PM
to druid-de...@googlegroups.com
It looks like you have the host set to "node1". Are you certain that
the broker node you are hitting can resolve "node1", "node2", and
"node3" to IP addresses, and that the ports are open between them?

--Eric