Kafka-Connect tasks.max


Tushar Sudhakar Jee

Jun 28, 2017, 12:33:15 PM
to Confluent Platform
Hi,
I am integrating Kafka Connect with a sink to get the connector in line with Confluent best practices.
I wanted to be clear regarding the meaning of tasks.max.

Is my understanding correct that it amounts to spawning threads?
If so, is there a difference between setting tasks.max and using newFixedThreadPool(numberOfThreads) to do the same?

-
Thanks,
Tushar

Konstantine Karantasis

Jun 28, 2017, 2:52:57 PM
to confluent...@googlegroups.com
Tushar, 

tasks.max is a cluster-wide property. It sets the maximum number of tasks the connector may run in total across all workers of a Connect cluster running in distributed mode.

Also, it's used by the framework and not the Connector code itself. Is there something specific you are trying to achieve with respect to this property?

Konstantine
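
To make the cap concrete, here is a minimal sketch in plain Java. It is not the Kafka Connect API: TaskPlanSketch and plan() are hypothetical names, and the round-robin split is an assumption chosen for illustration, since the real assignment of partitions to sink tasks is handled by the framework. The sketch only shows that a limit such as tasks.max bounds how many tasks exist in total, and that a sink connector gains nothing from more tasks than there are topic partitions to consume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Kafka Connect API. It only illustrates
// how a cap such as tasks.max bounds the number of task groups.
class TaskPlanSketch {

    // Split the topic partitions into at most maxTasks groups, round-robin.
    static List<List<Integer>> plan(List<Integer> partitions, int maxTasks) {
        if (maxTasks < 1) {
            throw new IllegalArgumentException("maxTasks must be at least 1");
        }
        // Never create more groups than there are partitions to hand out.
        int groups = Math.min(maxTasks, partitions.size());
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < partitions.size(); i++) {
            result.get(i % groups).add(partitions.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Eight partitions, cap of three tasks.
        System.out.println(plan(Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7), 3));
        // prints [[0, 3, 6], [1, 4, 7], [2, 5]]
    }
}
```

With only two partitions and the same cap of 3, the sketch yields two groups, which mirrors why raising tasks.max past the partition count does not add parallelism for a sink.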

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsub...@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/edde2c45-b82f-4266-a6fe-5767fa5be96b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tushar Sudhakar Jee

Jun 28, 2017, 4:16:15 PM
to Confluent Platform
Konstantine,
I am trying to make the code that pushes data into my ByteBuffer multithreaded.
I am doing this in the put() method of my class that extends SinkTask.

Earlier, when I wrote a Kafka Consumer for the same setup, I made 10 consumer instances and assigned 8 threads to each instance.
Each of these threads was filling up the ByteBuffer. This is how I achieved parallelism.

But now, while writing the class that extends SinkTask, I am trying to achieve parallelism by increasing tasks.max to 3 (that gives me good throughput, but not as high as before).
Should I be doing something differently to make the parallelism work?
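
For reference, the thread-pool approach described above could look roughly like the sketch below. It is plain Java under stated assumptions, not the Kafka Connect API: ParallelBufferFill and putBatch() are hypothetical names, putBatch() stands in for SinkTask.put(), and plain strings stand in for record values. Each worker fills its own ByteBuffer because ByteBuffer is not thread-safe, and the method returns only once every worker has finished, so no record is still in flight when the caller moves on.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: putBatch stands in for SinkTask.put(), and strings
// stand in for record values. None of these names come from Kafka Connect.
class ParallelBufferFill {
    private final int threads;
    private final ExecutorService pool;

    ParallelBufferFill(int threads) {
        this.threads = threads;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Gives every worker an interleaved slice of the batch, fills one buffer
    // per worker, and returns only after all workers are done.
    List<ByteBuffer> putBatch(List<String> records) {
        List<Future<ByteBuffer>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int first = t;
            Callable<ByteBuffer> work = () -> {
                List<byte[]> encoded = new ArrayList<>();
                int size = 0;
                for (int i = first; i < records.size(); i += threads) {
                    byte[] bytes = records.get(i).getBytes(StandardCharsets.UTF_8);
                    encoded.add(bytes);
                    size += bytes.length;
                }
                // ByteBuffer is not thread-safe, so each worker owns its buffer.
                ByteBuffer buffer = ByteBuffer.allocate(size);
                for (byte[] bytes : encoded) {
                    buffer.put(bytes);
                }
                buffer.flip(); // switch the buffer from writing to reading
                return buffer;
            };
            futures.add(pool.submit(work));
        }
        List<ByteBuffer> filled = new ArrayList<>();
        try {
            for (Future<ByteBuffer> future : futures) {
                filled.add(future.get()); // block until this worker is done
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
        return filled;
    }

    // Stops the worker threads; call once when the task shuts down.
    void close() {
        pool.shutdown();
    }
}
```

Waiting for every future before returning is a deliberate choice in this sketch: in a real sink task, offsets can be committed for records already handed to put(), so work passed to a pool should be complete by the time put(), or at the latest flush(), returns.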