Alert or Metric for Failed Tasks


cay...@apollographql.com

Dec 14, 2018, 7:43:28 PM
to Druid User
Hi there! 

Are there any metrics or alerts for failed tasks? I want to create Datadog monitoring over Druid tasks to see when they have failed or succeeded but I don't currently see anything in the docs. Can anyone point me to the right resources? :) 

Jonathan Wei

Dec 18, 2018, 6:00:33 PM
to druid...@googlegroups.com
I don't think there is currently a general metric or alert that's generated upon task failure. I think there are alerts for some kinds of task failure modes, which you can find by searching the codebase for EmittingLogger.makeAlert() calls, but not a general alert.

One option could be to periodically fetch the list of completed tasks using `http://overlord:overlordPort/druid/indexer/v1/completeTasks` and check for task failures that way. In 0.12.0 and later there is an "n" parameter that allows you to fetch the most recent N task statuses.

Caydie Tran

Dec 19, 2018, 2:55:00 PM
to druid...@googlegroups.com
Thanks Jonathan for the response!

Looks like we'll have to just go with the second option!

Caydie Tran

Dec 19, 2018, 6:40:49 PM
to druid...@googlegroups.com
Actually, one more quick question: how do I grab the N most recent task statuses through that endpoint? I couldn't find anything in the API docs about how to pass n to this endpoint :(

Eyal Yurman

Dec 19, 2018, 7:11:25 PM
to druid...@googlegroups.com
I believe this is an internal API (undocumented), so I would trust Jon if he suggests using it, but perhaps you could file a GitHub issue to request documentation, so any future change would be announced properly.

Eyal Yurman

Dec 19, 2018, 7:13:45 PM
to druid...@googlegroups.com
So the URL would currently be: http://overlord:overlordPort/druid/indexer/v1/completeTasks?n=maxTaskStatuses

Where maxTaskStatuses is a number.
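
For anyone wiring this into a monitor, here is a rough sketch of polling that URL in Python using only the standard library. The helper names, the host and port, and the response shape (a JSON list of objects with an "id" and a "statusCode" field whose value is "FAILED" on failure) are assumptions on my part, since the endpoint is undocumented, so check a real response from your Overlord before relying on it.

```python
import json
import urllib.parse
import urllib.request


def complete_tasks_url(overlord, n):
    """Build the completeTasks URL with the "n" parameter from this thread."""
    query = urllib.parse.urlencode({"n": n})
    return f"http://{overlord}/druid/indexer/v1/completeTasks?{query}"


def failed_task_ids(tasks, status_field="statusCode", failed_value="FAILED"):
    """Return the ids of tasks whose status field equals the failed value.

    ASSUMPTION: each task is a JSON object with an "id" and a status field.
    The field name and value are guesses, so both are parameters.
    """
    return [t.get("id") for t in tasks if t.get(status_field) == failed_value]


def fetch_complete_tasks(overlord, n=50, timeout=10):
    """Fetch the most recent n completed task statuses from the Overlord."""
    url = complete_tasks_url(overlord, n)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Something like `failed_task_ids(fetch_complete_tasks("overlord:8090"))` (placeholder host and port) run on a schedule would give a list whose length could be reported as a custom metric.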

Jonathan Wei

Dec 19, 2018, 7:20:52 PM
to druid...@googlegroups.com
Thanks Eyal, it should look like that URL.

The "completeTasks" API (and related APIs like "runningTasks") should be documented IMO. I don't believe they're meant to be hidden; they're missing because of a documentation oversight.

Also worth mentioning, in 0.13.0-incubating, task information is also exposed via Druid SQL: http://druid.io/docs/0.13.0-incubating/querying/sql.html#system-schema
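
As a hedged illustration of the SQL route, the sketch below posts a query against the `sys.tasks` table to the Druid SQL HTTP endpoint. The Broker host and port and the helper names are placeholders, and the `task_id` and `status` columns and the 'FAILED' value are taken from my reading of the linked system schema docs, so verify them against your version.

```python
import json
import urllib.request

# ASSUMPTION: sys.tasks exposes task_id and status columns (0.13.0-incubating docs).
FAILED_TASKS_SQL = "SELECT task_id, status FROM sys.tasks WHERE status = 'FAILED'"


def build_sql_request(broker, sql=FAILED_TASKS_SQL):
    """Build a POST request for the Druid SQL HTTP endpoint on the Broker."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"http://{broker}/druid/v2/sql/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_failed_tasks(broker, timeout=10):
    """Run the failed-tasks query and return the parsed JSON rows."""
    with urllib.request.urlopen(build_sql_request(broker), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `query_failed_tasks("broker:8082")` (placeholder address) should return one row per failed task if the assumptions above hold.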
