Data not showing up at all after ingestion


Amy Troschinetz

Oct 17, 2014, 10:24:24 AM
to druid-de...@googlegroups.com
The index task log is 1.1MB, so instead of attaching it, I've uploaded it to my personal webspace: http://lexicalunit.com/shares/spilling.log

Some relevant excerpts:

[...]
2014-10-16 16:45:51,095 INFO [main] io.druid.indexing.worker.executor.ExecutorLifecycle - Running with task: {
  "type" : "index",
  "id" : "index_click_conversion_2014-10-16T16:45:43.002Z",
  "schema" : {
    "dataSchema" : {
      "dataSource" : "click_conversion",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "tsv",
          "timestampSpec" : {
            "column" : "timestamp",
            "format" : "yyyy-MM-dd HH:mm:ss"
          },
          "dimensionsSpec" : {
            "dimensions" : [ "browser", "city", "country", "coupon", "device", "channel", "region", "site", "property", "outclick_type" ],
            "dimensionExclusions" : [ ],
            "spatialDimensions" : [ ]
          },
          "delimiter" : "\t",
          "columns" : [ "browser", "city", "timestamp", "country", "coupon", "device", "channel", "region", "site", "property", "outclick_type", "commissions", "sales", "orders" ]
        }
      },
      "metricsSpec" : [ {
        "type" : "count",
        "name" : "count"
      }, {
        "type" : "doubleSum",
        "name" : "commissions",
        "fieldName" : "commissions"
      }, {
        "type" : "doubleSum",
        "name" : "sales",
        "fieldName" : "sales"
      }, {
        "type" : "doubleSum",
        "name" : "orders",
        "fieldName" : "orders"
      } ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : {
          "type" : "duration",
          "duration" : 1000,
          "origin" : "1970-01-01T00:00:00.000Z"
        },
        "intervals" : [ "2014-08-01T00:00:00.000Z/2014-08-15T00:00:00.000Z" ]
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "static-s3",
        "parser" : {
          "type" : "string",
          "parseSpec" : {
            "format" : "tsv",
            "timestampSpec" : {
              "column" : "timestamp",
              "format" : "yyyy-MM-dd HH:mm:ss"
            },
            "dimensionsSpec" : {
              "dimensions" : [ "browser", "city", "country", "coupon", "device", "channel", "region", "site", "property", "outclick_type" ],
              "dimensionExclusions" : [ ],
              "spatialDimensions" : [ ]
            },
            "delimiter" : "\t",
            "columns" : [ "browser", "city", "timestamp", "country", "coupon", "device", "channel", "region", "site", "property", "outclick_type", "commissions", "sales", "orders" ]
          }
        },
      }
    },
    "tuningConfig" : {
      "type" : "index",
      "targetPartitionSize" : 0,
      "rowFlushBoundary" : 0
    }
  },
  "dataSource" : "click_conversion",
  "groupId" : "index_click_conversion_2014-10-16T16:45:43.002Z",
  "interval" : "2014-08-01T00:00:00.000Z/2014-08-15T00:00:00.000Z",
  "resource" : {
    "availabilityGroup" : "index_click_conversion_2014-10-16T16:45:43.002Z",
    "requiredCapacity" : 1
  }
}
[...]
2014-10-16 17:29:29,934 INFO [task-runner-0] io.druid.indexing.common.index.YeOldePlumberSchool - Spilling index[4] with rows[130037] to: /tmp/persistent/task/index_click_conversion_2014-10-16T16:45:43.002Z/work/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z_0/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z/spill4
2014-10-16 17:29:29,936 INFO [task-runner-0] io.druid.segment.IndexMerger - Starting persist for interval[2014-08-06T00:00:00.000Z/2014-08-07T00:00:00.000Z], rows[130,037]
2014-10-16 17:29:30,237 INFO [task-runner-0] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_click_conversion_2014-10-16T16:45:43.002Z/work/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z_0/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z/spill4/v8-tmp] completed index.drd in 1 millis.
2014-10-16 17:29:30,342 INFO [task-runner-0] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_click_conversion_2014-10-16T16:45:43.002Z/work/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z_0/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z/spill4/v8-tmp] completed dim conversions in 105 millis.
2014-10-16 17:29:31,715 INFO [task-runner-0] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_click_conversion_2014-10-16T16:45:43.002Z/work/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z_0/click_conversion_2014-08-06T00:00:00.000Z_2014-08-07T00:00:00.000Z_2014-10-16T16:45:43.003Z/spill4/v8-tmp] completed walk through of 130,037 rows in 1,372 millis.
[...]
2014-10-16 18:23:24,687 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_click_conversion_2014-10-16T16:45:43.002Z",
  "status" : "SUCCESS",
  "duration" : 5853424
}
[...]

Given the above log, I would expect to see data for the date range [2014-08-01, 2014-08-15). However, when I run a timeseries query I get no results:

$ cat series.json 
{
    "queryType": "timeseries",
    "dataSource": "click_conversion",
    "granularity": "day",
    "aggregations": [
        {
            "type": "longSum",
            "fieldName": "count",
            "name": "events"
        }
    ], 
    "intervals": ["2014-08-01/2014-08-15"]
}


curl --silent --show-error -d @series.json -H 'content-type: application/json' 'http://broker-ip:8080/druid/v2/' --data-urlencode 'pretty' | python -mjson.tool | pygmentize -l json -f terminal256

[]

real 0m0.090s
user 0m0.003s
sys 0m0.002s

I have no idea what's going on here. Any help?
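
One sanity check that might help narrow things down (a sketch only; the /druid/v2/datasources endpoint and the port are assumptions based on the broker URL above, not verified against this Druid version) is to ask the broker which dataSources it is actually serving:

$ curl --silent 'http://broker-ip:8080/druid/v2/datasources' | python -mjson.tool
# If "click_conversion" is missing from the list, the broker has no queryable segments for it yet.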

Data Software Engineer

Gian Merlino

Oct 17, 2014, 10:34:23 AM
to druid-de...@googlegroups.com
It does look like the indexing went through properly, so possibly something is wrong with the coordinator/historical data loading dance. Are there any logs on the coordinator that look like it's trying to assign these segments to historical nodes (and succeeding)? Are there any logs on your historical nodes indicating they're trying to download these segments (and succeeding)?
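
If it helps, something like the following can check both sides from the command line (a rough sketch: the loadstatus endpoint comes from the docs of this era, and the ports and log file names are placeholders for your setup, not exact commands):

# Fraction of each dataSource's segments that historicals report as loaded (100.0 = fully available):
$ curl --silent 'http://coordinator-ip:8080/druid/coordinator/v1/loadstatus' | python -mjson.tool

# Assignment activity in the coordinator log, and load/download activity (or failures) on the historicals:
$ grep -i 'click_conversion' coordinator.log | grep -i assign
$ grep -i 'click_conversion' historical.log | grep -iE 'load|download|fail'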

Amy Troschinetz

Oct 17, 2014, 12:51:23 PM
to druid-de...@googlegroups.com
Sure enough, there are some exceptions in the historical node logs:

2014-10-17 16:46:43,191 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[click_conversion_2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z_2014-10-16T16:59:39.454Z], segment=DataSegment{size=68530163, shardSpec=NoneShardSpec, metrics=[count, commissions, sales, orders], dimensions=[browser, channel, city, country, coupon, device, outclick_type, property, region, site], version='2014-10-16T16:59:39.454Z', loadSpec={type=s3_zip, bucket=s3-int-std-agg-deep-storage, key=click_conversion/2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z/2014-10-16T16:59:39.454Z/0/index.zip}, interval=2014-08-31T00:00:00.000Z/2014-09-01T00:00:00.000Z, dataSource='click_conversion', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[click_conversion_2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z_2014-10-16T16:59:39.454Z]
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:129)
        at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44)
        at io.druid.server.coordination.BaseZkCoordinator$1.childEvent(BaseZkCoordinator.java:113)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:494)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:488)
        at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
        at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:485)
        at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$11.run(PathChildrenCache.java:755)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: io.druid.segment.loading.SegmentLoadingException: Problem decompressing object[S3Object [key=click_conversion/2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z/2014-10-16T16:59:39.454Z/0/index.zip, bucket=s3-int-std-agg-deep-storage, lastModified=Thu Oct 16 19:07:53 UTC 2014, dataInputStream=org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream@42ff28c0, Metadata={ETag="369cff46932e170a9cce1aa6e8df3019", Date=Fri Oct 17 16:46:38 UTC 2014, Content-Length=34937426, id-2=K53EBUsDk+HZxt+c8OazPSM43ZJ9i/1elkoty7V+m+Doc3kOQYXfC757/Gg1C2HKIItcYfnUM+Y=, request-id=1FDA28445F60E5F9, Last-Modified=Thu Oct 16 19:07:53 UTC 2014, md5-hash=369cff46932e170a9cce1aa6e8df3019, Content-Type=application/zip}]]
        at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:138)
        at io.druid.segment.loading.OmniSegmentLoader.getSegmentFiles(OmniSegmentLoader.java:125)
        at io.druid.segment.loading.OmniSegmentLoader.getSegment(OmniSegmentLoader.java:93)
        at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:146)
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:125)
        ... 17 more
Caused by: java.io.IOException: Problem decompressing object[S3Object [key=click_conversion/2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z/2014-10-16T16:59:39.454Z/0/index.zip, bucket=s3-int-std-agg-deep-storage, lastModified=Thu Oct 16 19:07:53 UTC 2014, dataInputStream=org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream@42ff28c0, Metadata={ETag="369cff46932e170a9cce1aa6e8df3019", Date=Fri Oct 17 16:46:38 UTC 2014, Content-Length=34937426, id-2=K53EBUsDk+HZxt+c8OazPSM43ZJ9i/1elkoty7V+m+Doc3kOQYXfC757/Gg1C2HKIItcYfnUM+Y=, request-id=1FDA28445F60E5F9, Last-Modified=Thu Oct 16 19:07:53 UTC 2014, md5-hash=369cff46932e170a9cce1aa6e8df3019, Content-Type=application/zip}]]
        at io.druid.storage.s3.S3DataSegmentPuller$1.call(S3DataSegmentPuller.java:114)
        at io.druid.storage.s3.S3DataSegmentPuller$1.call(S3DataSegmentPuller.java:88)
        at com.metamx.common.RetryUtils.retry(RetryUtils.java:22)
        at io.druid.storage.s3.S3Utils.retryS3Operation(S3Utils.java:79)
        at io.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:86)
        ... 21 more
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at com.google.common.io.ByteStreams.copy(ByteStreams.java:211)
        at io.druid.utils.CompressionUtils.unzip(CompressionUtils.java:104)
        at io.druid.storage.s3.S3DataSegmentPuller$1.call(S3DataSegmentPuller.java:103)
        ... 25 more
2014-10-17 16:46:43,192 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - /druid/loadQueue/10.101.187.175:8080/click_conversion_2014-08-31T00:00:00.000Z_2014-09-01T00:00:00.000Z_2014-10-16T16:59:39.454Z was removed

It looks like the root filesystem is full due to the indexCache in /tmp. I checked the config for the historical nodes and sure enough, the specified max size for the segmentCache is set too high and needs to be lowered. Could this be what's causing these exceptions?
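
For reference, confirming the disk pressure on the historical host is quick (paths here are illustrative; adjust to the segmentCache location in your config):

$ df -h /
$ du -sh /tmp/druid/indexCache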

On Friday, October 17, 2014 9:34:23 AM UTC-5, Gian Merlino wrote:
It does look like the indexing went through properly, so possibly something is wrong with the coordinator/historical data loading dance. Are there any logs on the coordinator that look like it's trying to assign these segments to historical nodes (and succeeding)? Are there any logs on your historical nodes indicating they're trying to download these segments (and succeeding)?

Nishant Bangarwa

Oct 17, 2014, 1:51:58 PM
to druid-de...@googlegroups.com
Hi Amy, 

Yeah, this is related to the disk being full, which means the historical node is not able to load new segments.
Lowering the max size on the historical should fix this; you may also need to provision more historical nodes in order to hold your data.


Fangjin Yang

Oct 17, 2014, 1:53:12 PM
to druid-de...@googlegroups.com
I think Nishant means increasing maxSize on the historical.

Fangjin Yang

Oct 17, 2014, 2:00:41 PM
to druid-de...@googlegroups.com
Okay, I'm dumb :P, the "no space left on device" error means reducing the maxSize.

Amy Troschinetz

Oct 17, 2014, 2:19:18 PM
to druid-de...@googlegroups.com
I reduced the maxSize but now I get this error on my historical nodes:

2014-10-17 18:12:53,531 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[click_conversion_2014-04-06T00:00:00.000Z_2014-04-07T00:00:00.000Z_2014-10-15T16:40:58.252Z], segment=DataSegment{size=61869449, shardSpec=NoneShardSpec, metrics=[count, commissions, sales, orders], dimensions=[browser, channel, city, country, coupon, device, outclick_type, property, region, site], version='2014-10-15T16:40:58.252Z', loadSpec={type=s3_zip, bucket=s3-int-std-agg-deep-storage, key=click_conversion/2014-04-06T00:00:00.000Z_2014-04-07T00:00:00.000Z/2014-10-15T16:40:58.252Z/0/index.zip}, interval=2014-04-06T00:00:00.000Z/2014-04-07T00:00:00.000Z, dataSource='click_conversion', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[click_conversion_2014-04-06T00:00:00.000Z_2014-04-07T00:00:00.000Z_2014-10-15T16:40:58.252Z]
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:129)
        at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44)
        at io.druid.server.coordination.BaseZkCoordinator$1.childEvent(BaseZkCoordinator.java:113)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:494)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:488)
        at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
        at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:83)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:485)
        at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35)
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$11.run(PathChildrenCache.java:755)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.metamx.common.ISE: Segment[click_conversion_2014-04-06T00:00:00.000Z_2014-04-07T00:00:00.000Z_2014-10-15T16:40:58.252Z:61,869,449] too large for storage[/tmp/druid/indexCache:11,488,719].
        at io.druid.segment.loading.OmniSegmentLoader.getSegmentFiles(OmniSegmentLoader.java:114)
        at io.druid.segment.loading.OmniSegmentLoader.getSegment(OmniSegmentLoader.java:93)
        at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:146)
        at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:125)
        ... 17 more
2014-10-17 18:12:53,531 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - /druid/loadQueue/10.33.174.9:8080/click_conversion_2014-04-06T00:00:00.000Z_2014-04-07T00:00:00.000Z_2014-10-15T16:40:58.252Z was removed

I thought druid stored data in memory, not on the filesystem. I'm using machines with 30 GB of main memory, but only 6GB of disk space on /. Since we're talking about the segment cache here (keyword being cache) I figured that old data would be flushed from disk as new data flowed in, and that the segment cache didn't need to be that big.

You're saying druid stores the entire dataset both in memory and on disk at the same time, and that my segment cache needs to be at least as big as (or bigger than?) the machine's main memory?

Confused :(

Data Software Engineer


Fangjin Yang

Oct 19, 2014, 4:55:59 PM
to druid-de...@googlegroups.com
Hi Amy, please see inline.
Every historical node must first download a segment locally before it can run queries over it. Druid memory maps segments by default. This error is saying you've told Druid to download segments to /tmp/indexCache, but the maxSize is too small to accommodate the segment.
To set the maximum size your node can hold, please set these two configs (example sizes provided):
druid.server.maxSize=1550000000000
druid.segmentCache.locations=[{"path": "/mnt/persistent/zk_druid", "maxSize": 1550000000000}]

To understand how to configure historical nodes for your hardware, I'd recommend reading:


I thought druid stored data in memory, not on the filesystem. I'm using machines with 30 GB of main memory, but only 6GB of disk space on /. Since we're talking about the segment cache here (keyword being cache) I figured that old data would be flushed from disk as new data flowed in, and that the segment cache didn't need to be that big.

We rely on the OS to page columns and segments in and out of memory. Data that is queried often is held in memory and data that is not queried is paged out as more queries come in. 

You're saying druid stores the entire dataset both in memory and on disk at the same time, and that my segment cache needs to be at least as big as (or bigger than?) the machine's main memory?

Confused :(

If you provide your hardware, we can make suggestions about how to configure things for your particular hardware. We are also happy to do a follow up call to better discuss Druid configuration.
 


Amy Troschinetz

Oct 20, 2014, 1:41:29 PM
to druid-de...@googlegroups.com
On Oct 19, 2014, at 3:55 PM, Fangjin Yang <fangj...@gmail.com> wrote:

Every historical node must first download a segment locally before it can run queries over it. Druid memory maps segments by default. This error is saying you've told Druid to download segments to /tmp/indexCache, but the maxSize is too small to accommodate the segment.
To set the maximum size your node can hold, please set these two configs (example sizes provided):
druid.server.maxSize=1550000000000
druid.segmentCache.locations=[{"path": "/mnt/persistent/zk_druid", "maxSize": 1550000000000}]

I generate my Druid node configuration in a bash shell script that calculates maxSize (among other things) based on inspections of the machine's hardware. For the druid.segmentCache.locations size I use the following:

  # Index cache is 50% of available space at index cache path
  INDEX_CACHE_PATH="/tmp/druid/indexCache"
  mkdir -p "$INDEX_CACHE_PATH"
  INDEX_CACHE_SIZE="$(echo "(($(df $INDEX_CACHE_PATH -k --output=size | tail -n 1 | awk '{print $1}') * 1024 * .50) + 0.5) / 1" | bc)"

Which for my machine, comes out to be 4160450560 bytes, or approximately 4.2 GB. Here's the output from df for reference:

$ df -h /tmp/druid/indexCache
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.8G  6.3G  1.2G  85% /

As you can see, I've configured Druid to use an appropriate amount of space, avoiding filling up the drive all the way and running into out of memory errors.

As for the druid.server.maxSize, I use the following:

  # Max size is 90% of memory
  SERVER_MAX_SIZE="$(echo "(($(grep MemTotal /proc/meminfo | awk '{print $2}') * 1024 * .90) + 0.5) / 1" | bc)"

That comes out to 28407667507 bytes, or approximately 28 GB. Inspecting the /proc filesystem for reference:

$ grep MemTotal /proc/meminfo
MemTotal:       30824292 kB

Which shows that my machine has approximately 30 GB of main memory.

So just to summarize, this is my current configuration for maxSize for my historical nodes:

# 90% of main memory, 28 GB (out of 30 GB total)
druid.server.maxSize=28407667507

# 50% of disk space at the given path, 4.2 GB (out of 7.8 GB total)
druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize": 4160450560}]

If I set my segmentCache's maxSize any higher I get out of memory errors when the /tmp filesystem gets completely full. I'm not anywhere near running out of main memory based on the output of top:

top - 14:19:52 up 2 days, 20:39,  1 user,  load average: 0.00, 0.01, 0.05
Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  30824292 total,  7657208 used, 23167084 free,   155812 buffers
KiB Swap:        0 total,        0 used,        0 free.  5988252 cached Mem

So I should be able to accommodate another 23 GB of data in the main memory on my historical node.

To understand how to configure historical nodes for your hardware, I'd recommend reading:

Quoting from this link, I'm trying to understand exactly what this means:

"Historical nodes use off-heap memory to store intermediate results, and by default, all segments are memory mapped before they can be queried."

Does this mean that I need to have as much disk space for my segmentCache as I have for the druid server maxSize? That seems to be what you're implying here, unless I'm mistaken.

We rely on the OS to page columns and segments in and out of memory. Data that is queried often is held in memory and data that is not queried is paged out as more queries come in.

That doesn't seem to be happening here in this case. I have a lot more memory available to hold data that's not getting used because the disk space seems to be the real limiting factor. If you always have the same amount of disk space as main memory, you would never be able to get into a situation where the OS would need to page data out of memory. So I can't figure out how the scenario you've mentioned would ever actually happen.

If you provide your hardware, we can make suggestions about how to configure things for your particular hardware. We are also happy to do a follow up call to better discuss Druid configuration.

Hopefully the above details provide enough information to resolve this issue. In the meantime I'll talk to my boss about setting up a call, we might have some more people that would want to sit on that call :)

Data Software Engineer


Fangjin Yang

Oct 20, 2014, 4:07:09 PM
to druid-de...@googlegroups.com
Hi, see inline.


On Monday, October 20, 2014 12:43:19 PM UTC-7, Amy Troschinetz wrote:
On Oct 19, 2014, at 3:55 PM, Fangjin Yang <fangj...@gmail.com> wrote:

Every historical node must first download a segment locally before it can run queries over it. Druid memory maps segments by default. This error is saying you've told Druid to download segments to /tmp/indexCache, but the maxSize is too small to accommodate the segment.
To set the maximum size your node can hold, please set these two configs (example sizes provided):
druid.server.maxSize=1550000000000
druid.segmentCache.locations=[{"path": "/mnt/persistent/zk_druid", "maxSize": 1550000000000}]

I generate my Druid node configuration in a bash shell script that calculates maxSize (among other things) based on inspections of the machine's hardware. For the druid.segmentCache.locations size I use the following:

  # Index cache is 50% of available space at index cache path
  INDEX_CACHE_PATH="/tmp/druid/indexCache"
  mkdir -p "$INDEX_CACHE_PATH"
  INDEX_CACHE_SIZE="$(echo "(($(df $INDEX_CACHE_PATH -k --output=size | tail -n 1 | awk '{print $1}') * 1024 * .50) + 0.5) / 1" | bc)"

Which for my machine, comes out to be 4160450560 bytes, or approximately 4.2 GB. Here's the output from df for reference:

$ df -h /tmp/druid/indexCache
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.8G  6.3G  1.2G  85% /

As you can see, I've configured Druid to use an appropriate amount of space, avoiding filling up the drive all the way and running into out of memory errors.

As for the druid.server.maxSize, I use the following:

  # Max size is 90% of memory
  SERVER_MAX_SIZE="$(echo "(($(grep MemTotal /proc/meminfo | awk '{print $2}') * 1024 * .90) + 0.5) / 1" | bc)"

That comes out to 28407667507 bytes, or approximately 28 GB. Inspecting the /proc filesystem for reference:

Because of https://github.com/metamx/druid/issues/798, druid.server.maxSize and druid.segmentCache.locations are effectively the same config and should have the same maxSize value.
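
Concretely, and only as a sketch that echoes numbers already in this thread (not a sizing recommendation), that means either capping both settings at what the 7.8 GB root volume can spare, or attaching a larger volume and pointing both at it:

# Option A: both limited to the ~4.2 GB that / can spare
druid.server.maxSize=4160450560
druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize": 4160450560}]

# Option B (hypothetical larger mount): both sized to the 28 GB figure from above
druid.server.maxSize=28407667507
druid.segmentCache.locations=[{"path": "/mnt/persistent/druid/indexCache", "maxSize": 28407667507}]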

$ grep MemTotal /proc/meminfo
MemTotal:       30824292 kB

Which shows that my machine has approximately 30 GB of main memory.

So just to summarize, this is my current configuration for maxSize for my historical nodes:

# 90% of main memory, 28 GB (out of 30 GB total)
druid.server.maxSize=28407667507

# 50% of disk space at the given path, 4.2 GB (out of 7.8 GB total)
druid.segmentCache.locations=[{"path": "/tmp/druid/indexCache", "maxSize": 4160450560}]

If I set my segmentCache's maxSize any higher I get out of memory errors when the /tmp filesystem gets completely full. I'm not anywhere near running out of main memory based on the output of top:

top - 14:19:52 up 2 days, 20:39,  1 user,  load average: 0.00, 0.01, 0.05
Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  30824292 total,  7657208 used, 23167084 free,   155812 buffers
KiB Swap:        0 total,        0 used,        0 free.  5988252 cached Mem

So I should be able to accommodate another 23 GB of data in the main memory on my historical node.


What are you setting your heap and direct memory size to? 
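
For context, those are the -Xmx and -XX:MaxDirectMemorySize flags on the historical JVM command line. A sketch of where they live (sizes and paths here are made up, not a recommendation):

$ java -server -Xmx4g -XX:MaxDirectMemorySize=10g \
    -classpath /path/to/druid/lib/*:/path/to/config/historical \
    io.druid.cli.Main server historical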
To understand how to configure historical nodes for your hardware, I'd recommend reading:

Quoting from this link, I'm trying to understand exactly what this means:

"Historical nodes use off-heap memory to store intermediate results, and by default, all segments are memory mapped before they can be queried."

Does this mean that I need to have as much disk space for my segmentCache as I have for the druid server maxSize? That seems to be what you're implying here, unless I'm mistaken.

Yup. 

We rely on the OS to page columns and segments in and out of memory. Data that is queried often is held in memory and data that is not queried is paged out as more queries come in.

That doesn't seem to be happening here in this case. I have a lot more memory available to hold data that's not getting used because the disk space seems to be the real limiting factor. If you always have the same amount of disk space as main memory, you would never be able to get into a situation where the OS would need to page data out of memory. So I can't figure out how the scenario you've mentioned would ever actually happen.

If you provide your hardware, we can make suggestions about how to configure things for your particular hardware. We are also happy to do a follow up call to better discuss Druid configuration.

Hopefully the above details provide enough information to resolve this issue. In the meantime I'll talk to my boss about setting up a call, we might have some more people that would want to sit on that call :)

Ah cool, I sent some private messages before seeing this :) 