Task logs disappear


Fede

Aug 9, 2016, 7:14:17 AM
to Druid User
Hi all,

I'm having a lot of trouble understanding why some tasks fail, because when I try to view the log file I get this: "No log was found for this task. The task may not exist, or it may not have begun running yet.".

This happens even when the task has SUCCESS status!

I don't know what to do. Could you please help me? It's getting very frustrating and I can't move on with the job.

In the overlord logs I see this:

2016-08-09T11:02:32,560 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] location changed to [TaskLocation{host='localhost', port=8106}].
2016-08-09T11:02:36,572 INFO [qtp377356799-87] io.druid.indexing.common.actions.LocalTaskActionClient - Performing action for task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]: LockTryAcquireAction{interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z}
2016-08-09T11:02:36,572 INFO [qtp377356799-87] io.druid.indexing.overlord.TaskLockbox - Task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] already present in TaskLock[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]
2016-08-09T11:02:37,589 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Worker[localhost:8091] wrote FAILED status for task [index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] on [TaskLocation{host='localhost', port=8106}]
2016-08-09T11:02:37,589 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Worker[localhost:8091] completed task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] with status[FAILED]
2016-08-09T11:02:37,590 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskQueue - Received FAILED status for task: index_hadoop_datasource-test_2016-08-09T11:02:32.531Z
2016-08-09T11:02:37,590 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Cleaning up task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] on worker[localhost:8091]
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - Removing task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] from activeTasks
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - Removing task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] from TaskLock[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - TaskLock is now empty: TaskLock{groupId=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, dataSource=datasource-test, interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z, version=2016-08-09T11:02:32.534Z}
2016-08-09T11:02:37,594 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.MetadataTaskStorage - Deleting TaskLock with id[1525]: TaskLock{groupId=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, dataSource=datasource-test, interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z, version=2016-08-09T11:02:32.534Z}
2016-08-09T11:02:37,596 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.MetadataTaskStorage - Updating task index_hadoop_datasource-test_2016-08-09T11:02:32.531Z to status: TaskStatus{id=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, status=FAILED, duration=5033}
2016-08-09T11:02:37,598 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskQueue - Task done: HadoopIndexTask{id=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, type=index_hadoop, dataSource=datasource-test}

Thank you. 

Nishant Bangarwa

Aug 9, 2016, 7:25:51 AM
to Druid User
Hi,
You need to configure task logging to store your task logs.
Please refer to the Task Logging section here - http://druid.io/docs/latest/configuration/indexing-service.html
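For example, a minimal sketch of what that might look like in the overlord/middle manager runtime.properties, assuming you want logs pushed to HDFS (the directory path here is just a placeholder - use whatever shared location suits your cluster):

```properties
# Push completed task logs to HDFS; other supported types include "file" and "s3"
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs
```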


Fede

Aug 9, 2016, 7:40:41 AM
to Druid User
Thank you Nishant, I'm already on it! 

Yogesh Agrawal

Mar 15, 2017, 2:04:33 PM
to Druid User
Nishant,
I have configured task logging with type=hdfs and druid.indexer.logs.directory=hdfs_path.
Still, logs sometimes go missing. It's odd that logs aren't missing in all cases but only in certain ones (more often for failed tasks than for successful ones). Is any other configuration required, or are there any checks I can do?

Thanks

Davor Poldrugo

Oct 24, 2018, 12:04:00 PM
to Druid User
I'm having the same issue. The logs show up fine while the peon task is running, via this indexer console HTTP endpoint: /druid/indexer/v1/task/index_kafka_<my_task_id>/log
At that point in time the log is read from the middle manager's local dir: /tmp/druid/task/index_kafka_<my_task_id>/log

After the tasks finish, their logs are transferred to the configured shared folder: ${druid.indexer.logs.directory}
From then on, the HTTP endpoint starts returning this in the response: No log was found for this task. The task may not exist, or it may not have begun running yet.
This happens even though the log file actually exists in the shared deep storage folder, named like this: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>.log

This looks to me like a bug in the overlord, because maybe it's expecting the log file in deep storage at this path instead: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>/log

Imply team, can you please check that? Thanks.

Frank Zhang

Oct 30, 2018, 7:57:47 AM
to druid...@googlegroups.com
I have the same issue. Did you fix it?


Jonathan Wei

Oct 30, 2018, 5:31:45 PM
to druid...@googlegroups.com
> This to me seems a bug in overlord, because maybe it's expecting the log file in deep storage on this path: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>/log

If you're using local file task logs, this is how the log file path is constructed; note that it doesn't create a separate directory per task:

```
private File fileForTask(final String taskid, String filename)
{
  return new File(config.getDirectory(), StringUtils.format("%s.%s", taskid, filename));
}
```

Likewise for HDFS task logs:

```
/**
 * Due to https://issues.apache.org/jira/browse/HDFS-13 ":" are not allowed in
 * path names. So we format paths differently for HDFS.
 */
private Path getTaskLogFileFromId(String taskId)
{
  return new Path(mergePaths(config.getDirectory(), taskId.replaceAll(":", "_")));
}
```
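Put differently, here is a small Python sketch (my own illustration, not Druid code, and assuming paths are joined with "/") of the filenames those two methods produce - i.e. what you should look for in the log directory:

```python
def local_task_log_path(directory: str, task_id: str, filename: str = "log") -> str:
    """Local file task logs: "<taskid>.<filename>" directly under the
    configured directory -- there is no per-task subdirectory."""
    return f"{directory}/{task_id}.{filename}"


def hdfs_task_log_path(directory: str, task_id: str) -> str:
    """HDFS task logs: ":" is not allowed in HDFS path names (HDFS-13),
    so colons in the task id are replaced with underscores; note that
    no ".log" suffix is appended in this case."""
    return f"{directory}/{task_id.replace(':', '_')}"


print(local_task_log_path("/druid/indexing-logs", "index_kafka_mytask"))
# -> /druid/indexing-logs/index_kafka_mytask.log
print(hdfs_task_log_path("/druid/indexing-logs",
                         "index_hadoop_datasource-test_2016-08-09T11:02:32.531Z"))
# -> /druid/indexing-logs/index_hadoop_datasource-test_2016-08-09T11_02_32.531Z
```

So for a task id containing colons (as Hadoop index task ids do), the HDFS log file name will not match the raw task id, which is worth keeping in mind when checking the directory by hand.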

I would double check that `druid.indexer.logs.directory` is set consistently on overlords and MMs, and also check that the directory is accessible from your overlord.



