All Druid queries return an empty list


Craig Finch

Nov 17, 2015, 7:18:53 PM
to Druid User
I have a very puzzling situation with a Druid cluster (0.8.0). This cluster was installed a few months back, and I'm trying to get it running.

I have successfully ingested and indexed a data file (I think). According to the web interface of the coordinator node, the data source is fully available, and I can see that all the segments are stored on node cluster-c. I can retrieve the dimensions and metrics of the data set with the command:

curl 'cluster-a:8082/druid/v2/datasources/opportunity_histogram_1M?interval=2015-01-01/2015-12-31'

I have tried simple queries such as the following (I have also tried more complex queries, with the same result):

curl -X POST -H 'content-type: application/json' -d @query-metadata.json cluster-a:8082/druid/v2/


{
    "queryType": "dataSourceMetadata",
    "dataSource": "opportunity_histogram_1M",
    "intervals": ["2015-01-01/2015-12-31"]
}


curl -X POST -H 'Content-Type:application/json' -d @query-timeboundary.json cluster-a:8082/druid/v2/


{
        "queryType" : "timeBoundary",
        "dataSource" : "opportunity_histogram_1M"
}


I enabled request logging on both the historical node that happens to host these segments (cluster-c), and on the broker node (cluster-a). I can see the requests coming into both nodes.


On the broker:

2015-11-18T00:09:06.951Z        10.0.10.246     {"queryType":"timeBoundary","dataSource":{"type":"table","name":"opportunity_histogram_1M"},"intervals":{"type":"intervals","intervals":["0000-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]},"bound":"","context":{"queryId":"3aa88e03-5e7d-4dc0-929d-e54b16dd2e6d","timeout":300000}}        {"query/time":18,"success":true}


On the historical node:

2015-11-18T00:18:06.399Z        10.0.10.248     {"queryType":"timeBoundary","dataSource":{"type":"table","name":"opportunity_histogram_1M"},"intervals":{"type":"segments","segments":[{"itvl":"2015-02-16T00:00:00.000Z/2015-02-17T00:00:00.000Z","ver":"2015-11-10T23:03:30.838Z","part":0},{"itvl":"2015-03-18T00:00:00.000Z/2015-03-19T00:00:00.000Z","ver":"2015-11-10T23:03:30.838Z","part":0}]},"bound":"","context":{"finalize":false,"queryId":"3aa88e03-5e7d-4dc0-929d-e54b16dd2e6d","timeout":300000}} {"query/time":11,"success":true}


Both nodes seem to be reporting success, but no data is returned. No errors or warnings are displayed on the standard output or standard error of any of the cluster nodes.


Any suggestions on how to troubleshoot this problem?
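
For readers comparing broker and historical request logs like the ones quoted above, the following is a minimal Python sketch that splits one log line into its fields and lists the intervals or segments the query covered. It assumes the layout seen in this post (timestamp, client address, query JSON, result JSON, separated by whitespace). The function names `parse_request_log_line` and `queried_intervals` are made up for illustration and are not part of Druid.

```python
import json

# Broker log line from the post above. The tab separators are an
# assumption; the parser below accepts any run of whitespace.
SAMPLE_LINE = (
    '2015-11-18T00:09:06.951Z\t10.0.10.246\t'
    '{"queryType":"timeBoundary",'
    '"dataSource":{"type":"table","name":"opportunity_histogram_1M"},'
    '"intervals":{"type":"intervals",'
    '"intervals":["0000-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]},'
    '"bound":"","context":{"queryId":"3aa88e03-5e7d-4dc0-929d-e54b16dd2e6d",'
    '"timeout":300000}}\t'
    '{"query/time":18,"success":true}'
)


def parse_request_log_line(line):
    """Split a request log line into timestamp, address, query and stats."""
    timestamp, remote_addr, rest = line.strip().split(None, 2)
    decoder = json.JSONDecoder()
    # raw_decode parses one JSON value and reports where it ended, so the
    # query and the trailing stats object can be read one after the other.
    query, end = decoder.raw_decode(rest)
    stats, _ = decoder.raw_decode(rest[end:].lstrip())
    return {
        "timestamp": timestamp,
        "remote_addr": remote_addr,
        "query": query,
        "stats": stats,
    }


def queried_intervals(query):
    """Return the intervals a logged query covered, as a list of strings."""
    spec = query.get("intervals", {})
    if isinstance(spec, list):
        # Plain list form, as written in a hand-made query file.
        return list(spec)
    if spec.get("type") == "segments":
        # Historical-side form: one entry per segment the broker asked for.
        return [segment["itvl"] for segment in spec.get("segments", [])]
    return list(spec.get("intervals", []))


parsed = parse_request_log_line(SAMPLE_LINE)
print(parsed["query"]["queryType"], parsed["stats"])
print(queried_intervals(parsed["query"]))
```

Running the same function over the historical node's line would list the per-segment intervals the broker fanned out to, which is one way to check that both nodes agree on which segments exist. Note that `"success": true` only says the request completed without an error, not that any rows were returned.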


Fangjin

Nov 17, 2015, 7:19:56 PM
to Druid User
Hi Craig, can you post a screenshot of the coordinator console main page?


Craig Finch

Nov 17, 2015, 7:23:16 PM
to Druid User

Fangjin

Nov 17, 2015, 7:25:22 PM
to Druid User
Hi Craig, can you post the full command and response of the timeBoundary query?

Craig Finch

Nov 17, 2015, 7:28:01 PM
to Druid User
Thank you for a quick response! The full command, JSON content, and response logs of the time boundary query are in my original post. The query returns [].

   Craig

Fangjin

Nov 17, 2015, 7:28:44 PM
to Druid User
Does it make a difference if you use the other datasource?

Craig Finch

Nov 17, 2015, 7:34:11 PM
to Druid User
Both data sources return the same result: []. I should note that the two data sources are actually ingested from the same data file, but with a different number of lines imported (ten thousand and one million). 

Fangjin

Nov 17, 2015, 7:37:23 PM
to Druid User
Any exceptions in the broker logs?

Craig Finch

Nov 17, 2015, 7:41:40 PM
to Druid User
I don't see any exceptions. A request log entry from the broker is shown in my original post. The broker node is printing output to the console like this:


2015-11-18T00:08:04,076 INFO [ServerInventoryView-0] io.druid.client.BatchServerInventoryView - New Server[DruidServerMetadata{name='cluster-c:8081', host='cluster-c:8081', maxSize=300000000000, tier='_default_tier', type='historical', priority='0'}]


2015-11-18T00:26:08,334 INFO [qtp640275932-40] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://cluster-c:8081


That is the only output after the broker service is started. I don't see any other logs from the broker. Is there a way to configure more verbose logging?


    Craig

Fangjin Yang

Nov 18, 2015, 1:28:29 PM
to Druid User
Hmmm, very odd. Are there any interesting logs for your historical? If you issue a timeBoundary query to your historical directly, do any results return?

Craig Finch

Nov 22, 2015, 5:29:35 PM
to Druid User
Fangjin,
Queries directly to the historical node return the same result. cluster-c is the historical node that contains my data:

curl -X POST -H 'Content-Type:application/json' -d @query-timeboundary.json cluster-c:8081/druid/v2/


I see the request logged on cluster-c:

2015-11-22T22:26:29.997Z        10.0.10.246     {"queryType":"timeBoundary","dataSource":{"type":"table","name":"opportunity_histogram_10k"},"intervals":{"type":"intervals","intervals":["0000-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]},"bound":"","context":{"queryId":"5ad7bf12-daf4-4469-a01c-96c84b7e8533","timeout":300000}}       {"query/time":16,"success":true}


An empty list [] is returned to the querying node.

We are considering "tearing down" this cluster and starting from scratch, rather than spending a lot more time finding obscure errors in our setup. Do you have any further suggestions before we start over?

   Craig

Fangjin Yang

Nov 28, 2015, 5:17:52 PM
to Druid User
I have never seen this problem before. Can you dump the output of the curl command in verbose mode? I don't think tearing down the cluster will fix this. I suspect the problem has something to do with the environment in which things are set up.
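
Where curl's verbose mode is not convenient, a rough equivalent can be sketched with Python's standard library: send the same JSON query and print the status, the response headers, and the raw body, so that a proxy reply or an unexpected content type becomes visible. This is only a sketch. The helper names `build_request` and `dump_response` are made up for illustration, and the URL in the usage note simply reuses the host and port quoted earlier in the thread.

```python
import json
import urllib.error
import urllib.request


def build_request(url, query):
    """Build a POST request that carries a Druid query as a JSON body."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dump_response(url, query, timeout=30):
    """Send the query and print the status, headers, and raw body."""
    request = build_request(url, query)
    try:
        response = urllib.request.urlopen(request, timeout=timeout)
    except urllib.error.HTTPError as err:
        # For 4xx/5xx replies urllib raises, but the exception object
        # still exposes the status, headers, and body of the reply.
        response = err
    try:
        print(response.getcode(), response.reason)
        for name, value in response.headers.items():
            print(f"{name}: {value}")
        print()
        print(response.read().decode("utf-8", errors="replace"))
    finally:
        response.close()
```

A call such as `dump_response("http://cluster-c:8081/druid/v2/", {"queryType": "timeBoundary", "dataSource": "opportunity_histogram_1M"})` would then show whether the `[]` really comes from the historical node with a 200 status and a JSON content type.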

pratik dhamanekar

Dec 10, 2015, 8:17:01 AM
to Druid User
Hi All,

I am also facing a similar issue.

When I run the example at the link below, I get output.


But when I use the spec file at the location below, I get an empty result set.


Is there anything else that we have to change at the configuration level?

Thanks and Regards,
Pratik

Fangjin Yang

Dec 13, 2015, 5:06:00 PM
to Druid User
Pratik, are you sure you have data in the cluster?

Can you copy and paste your ingest/* metrics?

pratik dhamanekar

Dec 18, 2015, 7:28:36 AM
to Druid User
I was wrong. I went through the ingest metrics and found that events from my new data were being thrown away. I caught it in the 'ingest/events/thrownaway' metric. It was because of the timestamps I was ingesting.

I just changed tuningConfig -> rejectionPolicy -> type from "serverTime" to "messageTime", and it worked for me.

Thank you Fangjin.
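
To illustrate why the rejection policy mattered here, the following is a simplified Python model of the two policies, based on my understanding of the Druid realtime ingestion documentation rather than on Druid's source code. With "serverTime", an event is kept only if its timestamp lies within windowPeriod of the server's current clock. With "messageTime", it is kept if it is no more than windowPeriod behind the latest event timestamp seen so far. The 10-minute window, the sample timestamps, and the names `server_time_accepts` and `MessageTimePolicy` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed windowPeriod; the value used in the thread is not stated.
WINDOW = timedelta(minutes=10)


def server_time_accepts(event_time, now, window=WINDOW):
    """Simplified 'serverTime' check: event must be near the wall clock."""
    return now - window <= event_time <= now + window


class MessageTimePolicy:
    """Simplified 'messageTime' check: event must be near the newest event."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.latest = None  # newest event timestamp seen so far

    def accepts(self, event_time):
        if self.latest is None or event_time > self.latest:
            self.latest = event_time
        return event_time >= self.latest - self.window


# Hypothetical scenario: historical events replayed months later.
now = datetime(2015, 12, 18, 12, 0, tzinfo=timezone.utc)
events = [datetime(2015, 2, 16, hour, tzinfo=timezone.utc) for hour in range(3)]

policy = MessageTimePolicy()
for event in events:
    print(
        event.isoformat(),
        "serverTime:", server_time_accepts(event, now),
        "messageTime:", policy.accepts(event),
    )

# The change described in the post, written as the nested key it names.
tuning_config_fragment = {"rejectionPolicy": {"type": "messageTime"}}
```

In this model, events that are months old relative to the server clock are all rejected under "serverTime" but accepted under "messageTime" as long as they arrive roughly in order, which matches the 'ingest/events/thrownaway' symptom described above.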