Result of all metrics is 0 after running index_hadoop task

santosh sahoo

Mar 3, 2017, 2:16:01 AM
to Druid User
Hi,

I am trying to run an index_hadoop task on one of my datasources for reindexing. The job succeeds and the segment is created, but when I check the result with a select query, all metric values are 0. Could you please help me with this? Below are the details I used to run the task.

index_hadoop task:

{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "index_hadoop_test20",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "format": "auto",
            "column": "timestamp"
          },
          "columns": [
            "timestamp",
            "dim1",
            "metric1_sum"
          ],
          "dimensionsSpec": {
            "dimensions": []
          }
        }
      },
      "metricsSpec": [
        {
          "name": "metric1_sum",
          "type": "doubleSum",
          "fieldName": "metric1"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "intervals": ["2017-03-01T02:00:00/2017-03-02T02:00:00"]
      }
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "dataSource",
        "ingestionSpec": {
          "dataSource": "arpan1",
          "intervals": ["2017-03-01T02:00:00/2017-03-02T02:00:00"]
        }
      }
    },
    "tuningConfig": {
      "type": "hadoop",
      "partitionsSpec": {
        "type": "hashed",
        "targetPartitionSize": 5000000
      },
      "jobProperties": {}
    }
  }
}

Select query:

 {
   "queryType": "select",
   "dataSource": "index_hadoop_test20",
   "descending": "true",
   "dimensions":[],
   "metrics":[],
   "granularity": "all",
   "intervals": ["2017-01-01/2017-03-31"],
   "pagingSpec":{"pagingIdentifiers": {}, "threshold":300}
 }

Result:


[
  {
    "timestamp": "2017-03-01T07:00:00.000Z",
    "result": {
      "pagingIdentifiers": {
        "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z": -11
      },
      "dimensions": [
        "dim4",
        "dim3",
        "dim2",
        "dim1"
      ],
      "metrics": [
        "metric1_sum"
      ],
      "events": [
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -1,
          "event": {
            "timestamp": "2017-03-01T09:45:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -2,
          "event": {
            "timestamp": "2017-03-01T09:30:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -3,
          "event": {
            "timestamp": "2017-03-01T09:15:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -4,
          "event": {
            "timestamp": "2017-03-01T09:00:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -5,
          "event": {
            "timestamp": "2017-03-01T08:45:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -6,
          "event": {
            "timestamp": "2017-03-01T08:30:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -7,
          "event": {
            "timestamp": "2017-03-01T08:00:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -8,
          "event": {
            "timestamp": "2017-03-01T07:45:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -9,
          "event": {
            "timestamp": "2017-03-01T07:30:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -10,
          "event": {
            "timestamp": "2017-03-01T07:15:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        },
        {
          "segmentId": "index_hadoop_test20_2017-03-01T00:00:00.000Z_2017-03-02T00:00:00.000Z_2017-03-03T07:04:25.068Z",
          "offset": -11,
          "event": {
            "timestamp": "2017-03-01T07:00:00.000Z",
            "dim4": "dim4",
            "dim3": "dim3",
            "dim2": "dim2",
            "dim1": "dim1",
            "metric1_sum": 0
          }
        }
      ]
    }
  }
]

Could you please suggest a solution?

Gian Merlino

Mar 6, 2017, 4:35:59 AM
to druid...@googlegroups.com
Hey Santosh,

What version of Druid is this and what did the ingestion spec for the original load (not reindexing) look like?

Gian


santosh sahoo

Mar 6, 2017, 4:56:49 AM
to Druid User
Hi Gian,

I have tested it on druid-0.9.2, druid-0.10.0-SNAPSHOT, and druid-0.10.0-rc1. Below is the original ingestion spec I used to load the data.

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "arpan1",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "timestamp",
          "format": "ddMMyyyyHHmmss"
        },
        "dimensionsSpec": {
          "dimensions": [
          ]
        }
      }
    },
    "metricsSpec": [
      {
        "name": "metric1",
        "fieldName": "metric1",
        "type": "doubleSum"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "NONE",
      "rollup": false
    }
  },
  "ioConfig": {
    "topic": "testdemo",
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    }
  }
}


Thanks,
Santosh


Gian Merlino

Mar 6, 2017, 5:18:58 AM
to druid...@googlegroups.com
Does it work if you add metric1 to the dataSource ingestionSpec metrics list when you do reindexing? Like this:

"ioConfig": {
  "type": "hadoop",
  "inputSpec": {
  "type":"dataSource",
     "ingestionSpec": {
        "dataSource":"arpan1",
        "intervals" : ["2017-03-01T02:00:00/2017-03-02T02:00:00"],
        "metrics" : ["metric1"]
      }
   }
 }

Gian

santosh sahoo

Mar 6, 2017, 5:42:49 AM
to Druid User
Hi Gian,

I made the modification to the ingestionSpec as per your suggestion, but got the same result.

Please have a look again.

Thanks in advance.


Gian Merlino

Mar 6, 2017, 6:40:30 AM
to druid...@googlegroups.com
Ah, what's going on is that the Hadoop reindexing mechanism is being sneaky. It doesn't apply your metricsSpec to the segments as-is; it applies the aggregators in their "combining" form. This is nice, I guess, since it lets you use the same metricsSpec while reindexing as you would on your raw data. But it also means you can't use the metricsSpec to define new aggregators. That would be useful, but it would be a new feature. The docs could also use some clarification.
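
Concretely, if I remember right, the combining form of a doubleSum reads its input from the column named after the aggregator itself rather than from "fieldName". So your reindexing aggregator goes looking for a column called "metric1_sum", which doesn't exist in the arpan1 segments, and it sums nothing. As a workaround you could try keeping the output name the same as the existing metric, with a metricsSpec along these lines (untested sketch, names taken from your specs above):

"metricsSpec": [
  {
    "name": "metric1",
    "type": "doubleSum",
    "fieldName": "metric1"
  }
]

If you want the column to appear as metric1_sum, you could give the aggregator that output name at query time instead.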

Gian


Vadim

Jan 19, 2018, 1:55:32 AM
to Druid User
Hi Gian, I was wondering how you managed to solve this issue? I am facing the same problem right now: https://groups.google.com/d/msg/druid-user/4I9lUreb60k/JdWIyE1xAAAJ

On Monday, March 6, 2017 at 13:40:30 UTC+2, Gian Merlino wrote:

Gian
