Segment handoff failed with "too large for storage" exception


GunWoo Kim

Jul 28, 2016, 11:44:56 PM
to Druid Development
Hi guys, I got an error during indexing task segment handoff.

I checked the historical node log and found a 'Failed to load segment' error message.

The log messages are as follows:

2016-07-27 06:35:39,398  INFO [io.druid.server.coordination.ZkCoordinator] New request[LOAD: stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z] with zNode[/druid/loadQueue/ndap03.ndap.com:8083/stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z].
2016-07-27 06:35:39,398  INFO [io.druid.server.coordination.ZkCoordinator] Loading segment stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z
2016-07-27 06:35:39,398  WARN [io.druid.server.coordination.BatchDataSegmentAnnouncer] No path to unannounce segment[stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z]
2016-07-27 06:35:39,398  INFO [io.druid.server.coordination.ZkCoordinator] Completely removing [stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z] in [30,000] millis
2016-07-27 06:35:39,399  INFO [io.druid.server.coordination.ZkCoordinator] Completed request [LOAD: stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z]
2016-07-27 06:35:39,399 ERROR [io.druid.server.coordination.ZkCoordinator] Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z], segment=DataSegment{size=4033, shardSpec=LinearShardSpec{partitionNum=0}, metrics=[count], dimensions=[action, cause, dvc_type, scn, topology], version='2016-07-27T05:03:01.960Z', loadSpec={type=hdfs, path=hdfs://ndap09.ndap.com:8020/user/root/druid/segments/stb/20160727T050000.000Z_20160727T060000.000Z/2016-07-27T05_03_01.960Z/0/index.zip}, interval=2016-07-27T05:00:00.000Z/2016-07-27T06:00:00.000Z, dataSource='stb', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z]
Caused by: com.metamx.common.ISE: Segment[stb_2016-07-27T05:00:00.000Z_2016-07-27T06:00:00.000Z_2016-07-27T05:03:01.960Z:4,033] too large for storage[var/druid/segment-cache:-555,499,020].


It is really strange that the segment-cache size shown in the exception cause message is a negative number:
too large for storage[var/druid/segment-cache:-555,499,020]


The historical node can't load the segment, so the indexing task can't complete.
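
If I read the numbers right, the negative value looks like maxSize minus whatever is already sitting in the cache directory: 15,000,000,000 - 15,555,499,020 = -555,499,020, i.e. the segment-cache already holds more bytes than maxSize allows. A rough sketch of the arithmetic I assume is happening (hypothetical names, not Druid's actual code):

class SegmentCacheCheckSketch {
    public static void main(String[] args) {
        long maxSize = 15_000_000_000L;            // maxSize from druid.segmentCache.locations
        long bytesAlreadyCached = 15_555_499_020L; // assumed: bytes already in var/druid/segment-cache
        long available = maxSize - bytesAlreadyCached;
        long segmentSize = 4_033L;                 // size of the segment being loaded

        System.out.printf("available = %,d%n", available); // prints -555,499,020
        if (segmentSize > available) {
            // would match the message: too large for storage[var/druid/segment-cache:-555,499,020]
            throw new IllegalStateException(String.format(
                "Segment[%,d] too large for storage[var/druid/segment-cache:%,d]",
                segmentSize, available));
        }
    }
}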


My test environment is as follows:
- CentOS 6.5
- Druid 0.9.0
- historical node config:
# HTTP server threads
druid.server.http.numThreads=50
druid.server.maxSize=300000000000

# Processing threads and buffers
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numThreads=12

# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:15000000000}]


Any reply is appreciated. =)

Thank you.


Fangjin Yang

Jul 29, 2016, 6:38:31 PM
to Druid Development
Hi GunWoo, how much disk space does your machine have? The maxSize should correlate to the actual available disk space. The default values are just some examples values.

GunWoo Kim

Aug 1, 2016, 6:54:46 AM
to Druid Development
Hi Fangjin Yang, the disk space for Druid was set to about 100 GB, but in the Druid documentation I saw this description of the druid.server.maxSize property:
The maximum number of bytes-worth of segments that the node wants assigned to it. This is not a limit that Historical nodes actually enforces, just a value published to the Coordinator node so it can plan accordingly.

and its default value is 0 (from the Druid documentation), so I did not pay attention to it.

The segment storage property was set as below:
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:15000000000}]

and 66 GB of disk is available now for the historical node.



Thank you for your reply.

Fangjin Yang

Aug 15, 2016, 6:01:39 PM
to Druid Development
druid.server.maxSize must be set to a non-zero value on historicals, otherwise no segments will get downloaded. The documentation on that config should be improved. Druid definitely enforces this limit, but it is the coordinator that enforces the maxSize limit, and the coordinator is what tells historicals to download segments.
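
As a rough illustration (values are made up, adjust them to your actual free disk), the two settings should agree with each other and with the real disk, e.g.:

druid.server.maxSize=50000000000
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":50000000000}]

Here druid.server.maxSize matches the total segmentCache maxSize, and both stay below the ~66 GB of actually available disk, so the coordinator never assigns more than the historical can store.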

Saravana Soundararajan

Apr 4, 2018, 2:35:34 PM
to Druid Development
After seeing a similar error in the Druid historical:
Caused by: com.metamx.common.ISE: Segment[timeseries_dogstatsd_counter_2018-04-04T16:00:00.000Z_2018-04-04T17:00:00.000Z_2018-04-04T16:00:00.000Z_1528210995:152,770,889] too large for storage[/var/tmp/druid/indexCache:22,010].

we noticed that the historical node stops loading new segments from realtime, and the realtime nodes start accumulating segments.

Our maxSize settings are as follows, and we had enough free disk space:

druid.server.maxSize=882159184076
druid.segmentCache.locations=[{"path":"/var/tmp/druid/indexCache","maxSize":882159184076}]

Restarting the Druid historical fixes the issue. We suspect that something is going wrong with how Druid calculates the available size, i.e., the 22,010 bytes reported above.
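
Our working theory, sketched with hypothetical names (just our mental model, not Druid's actual classes): the cache location seems to keep an in-memory used-bytes counter, and if that counter is ever incremented without a matching decrement (e.g. a failed or cleaned-up load that is not released), available space drifts toward zero even though the disk is mostly free. A restart would rebuild the counter from what is actually on disk, which would explain why restarting fixes it.

// Hypothetical sketch of the suspected bookkeeping drift (not Druid's actual code)
class CacheLocationSketch {
    private final long maxSize;
    private long bytesUsed; // in-memory counter, rebuilt from disk contents on restart

    CacheLocationSketch(long maxSize) { this.maxSize = maxSize; }

    long available() { return maxSize - bytesUsed; }

    boolean reserve(long segmentSize) {
        if (segmentSize > available()) {
            return false; // would surface as "too large for storage[...:22,010]"
        }
        bytesUsed += segmentSize;
        return true;
    }

    void release(long segmentSize) {
        // if this is ever skipped for a segment that was reserved,
        // bytesUsed only grows and available() shrinks until a restart
        bytesUsed -= segmentSize;
    }
}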
