Kafka connect task identity

Skrzypek, Jonathan

Nov 30, 2016, 7:32:05 AM
to confluent...@googlegroups.com

Hi,

Is there a way to get the task ID inside a SinkTask?

I need to differentiate tasks when logging specific actions, so I need access to a clientId or a taskId in the start() or put() method of my SinkTask.

I couldn’t find any public field for it. I only found a private KafkaConsumer field in the task’s context (SinkTaskContext), which in turn has a private clientId field.

I could probably use reflection to dig out the clientId string, but that’s a bit ugly.

Any ideas?


Shikhar Bhushan

Nov 30, 2016, 4:58:40 PM
to confluent...@googlegroups.com
Hi Jonathan,

The task ID is not currently available through the interfaces exposed by the Connect API, though exposing it seems like a simple, reasonable enhancement, so that tasks' logging could also be correlated with framework actions on the tasks.

If you just want a way to distinguish between different tasks for connector-specific logging, you could pass in a property for it when generating your tasks' config in `Connector.taskConfigs()`.

[ Maybe not worth relying on, but task IDs follow the format "$connectorName-$index", so you could in theory just pass that through. ]
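
To make the suggestion above concrete, here is a minimal sketch of building per-task configs that each carry a label. It is plain Java with no Connect dependency; the list of maps it returns has the same shape as the return value of `Connector.taskConfigs(int maxTasks)`. The class name `TaskLabelConfigs` and the property key `task.label` are made up for illustration and are not part of the Connect API, and the connector name is assumed to be known to the connector (for example from its own config).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: TaskLabelConfigs and "task.label" are hypothetical names,
// not something the Connect API defines.
final class TaskLabelConfigs {
    static final String TASK_LABEL = "task.label"; // hypothetical property key

    // Builds one config map per task. A connector could return this list
    // from its taskConfigs(int maxTasks) implementation.
    static List<Map<String, String>> build(String connectorName,
                                           Map<String, String> connectorProps,
                                           int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>(maxTasks);
        for (int i = 0; i < maxTasks; i++) {
            // Copy the shared properties so each task gets its own map.
            Map<String, String> taskProps = new HashMap<>(connectorProps);
            // Mirrors the "$connectorName-$index" format; this is our own label,
            // not the framework's task ID, and relies on list order being kept.
            taskProps.put(TASK_LABEL, connectorName + "-" + i);
            configs.add(taskProps);
        }
        return configs;
    }
}
```

Each task would then read `task.label` from the properties it receives in `start()` and include it in its log messages.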

Best,

Shikhar

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/F6A8DEE6B40A30419B89834C7CFF43F40357D994%40gsdgeup01env2.firmwide.corp.gs.com.
For more options, visit https://groups.google.com/d/optout.

Skrzypek, Jonathan

Dec 1, 2016, 5:52:20 AM
to confluent...@googlegroups.com

Hi,


Yeah, I thought of that, but it would be an arbitrary naming, uncorrelated with the Kafka consumer threads.

Whilst this gives you a way of differentiating tasks, it doesn’t let you pinpoint which task is working on which partitions.

Ideally, the Connect API would give access to this information (it’s in there somewhere, it’s just neither public nor protected), which would let you correlate things: taskX is logging ‘foobar messages’, taskX is assigned to partitions 1/2/3, and its consumer lag is 123456.
Ultimately, you could argue that the task could log information from TopicPartition etc. itself, but other tools give you that anyway.

Ewen Cheslack-Postava

Dec 1, 2016, 5:37:07 PM
to Confluent Platform
It's a bit of a hack, but you can have your connector pass this info through the configs it generates. Currently Connect will maintain the ordering of the configs you pass (and I don't see any reason we'd change that).
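
As a rough sketch of the task side of that hack: the helper below reads a label from the task's properties and uses it to tag log lines, including one that lists the assigned partitions so the two can be correlated. It is plain Java with no Connect dependency. `TaskLogContext` and the `task.label` key are hypothetical names, and the sketch assumes the connector put that key into each task config. In a real sink task, `fromProps` would be called from `start(Map<String, String> props)` and `describeAssignment` from `open(Collection<TopicPartition> partitions)`, with the partitions rendered as strings.

```java
import java.util.Collection;
import java.util.Map;
import java.util.TreeSet;

// Sketch only: TaskLogContext and "task.label" are hypothetical, and the
// connector is assumed to have added "task.label" to every task config.
final class TaskLogContext {
    private final String label;

    private TaskLogContext(String label) {
        this.label = label;
    }

    // Reads the label from the task properties, with a fallback if it is missing.
    static TaskLogContext fromProps(Map<String, String> props) {
        return new TaskLogContext(props.getOrDefault("task.label", "unknown-task"));
    }

    // Formats a line tying the label to the partitions this task was assigned.
    // Partitions are sorted so the output is stable across calls.
    String describeAssignment(Collection<String> partitions) {
        return "[" + label + "] assigned partitions: " + new TreeSet<>(partitions);
    }

    // Prefixes an ordinary log message with the label.
    String prefix(String message) {
        return "[" + label + "] " + message;
    }
}
```

Logging the assignment line whenever partitions are opened gives a way to line up a label with the partitions (and, via other tools, the consumer lag) without reaching into private fields.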

For logging specifically, we have https://issues.apache.org/jira/browse/KAFKA-3816 filed which would probably give you enough context such that the task itself wouldn't need to have the ID (and saves you the trouble of having to include it everywhere).

-Ewen

--
Thanks,
Ewen