Kafka-connect k8s deployment


g...@rollout.io

Apr 24, 2017, 8:38:34 AM
to Confluent Platform
Hi,
I'm running the Kafka Connect S3 connector from the Spredfast repo (https://github.com/spredfast/kafka-connect-s3) on Kubernetes (k8s).

The connector writes files to a temp dir and periodically (according to its configuration) flushes them to S3.

Now I'm wondering how Kafka Connect handles a new deployment or a crash.
Kubernetes will create another pod running kafka-connect, but all the data in the temp dir is already gone.

I thought of solving it with another topic that stores a pointer to the last flushed offset, or by saving the events to a persistent volume.

Any ideas?

Thanks.


Konstantine Karantasis

Apr 24, 2017, 1:17:48 PM
to confluent...@googlegroups.com
Hi, 

From what you describe, you are seeing one of the unwanted side effects of writing temporary files to local disk before uploading them to S3.

This was one of the issues, among many others, that led us to design a new S3 Sink Connector from scratch.
You may find this connector here: 


What you describe above serves as a great example to highlight how depending on ephemeral resources, especially in a containerized environment, is the wrong direction to follow. My suggestion is, instead of trying to patch the above scenario with a complicated and fragile workaround, you should get rid of this dependency on local storage altogether. 

A Kafka Connect cluster is deployed exactly to connect Kafka with data sources and data sinks. It's not meant to replace Kafka. Therefore, the way Connectors are designed should allow for a Connect cluster to scale up or down in size, move transparently (e.g. by using an underlying resource manager such as Kubernetes) and operate in a fault tolerant way.
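
To make the point concrete, here is a minimal, self-contained sketch of the general idea. It is not the code of either connector mentioned in this thread. All names (`FakeLog`, `FakeObjectStore`, `SinkWorker`, `run_with_crash`) are hypothetical, and the log and object store are in-memory stand-ins for a Kafka partition and an S3 bucket. The assumption being illustrated: if the offset is committed only after a batch has been durably uploaded, a replacement pod can start from the last committed offset and re-read whatever was still buffered, so nothing on local disk needs to survive.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLog:
    """Stand-in for one Kafka topic partition plus its committed offset."""
    records: list
    committed: int = 0  # next offset a restarted worker should read


@dataclass
class FakeObjectStore:
    """Stand-in for an S3 bucket: each key holds one uploaded batch."""
    objects: dict = field(default_factory=dict)

    def put(self, key, batch):
        self.objects[key] = list(batch)


class SinkWorker:
    """Hypothetical worker: buffers in memory, uploads every `flush_size` records."""

    def __init__(self, log, store, flush_size):
        self.log = log
        self.store = store
        self.flush_size = flush_size
        self.position = log.committed  # resume from the last committed offset
        self.buffer = []

    def poll_once(self):
        """Read one record; return False when the log is exhausted."""
        if self.position >= len(self.log.records):
            return False
        self.buffer.append(self.log.records[self.position])
        self.position += 1
        if len(self.buffer) >= self.flush_size:
            self._flush()
        return True

    def _flush(self):
        start = self.position - len(self.buffer)
        # The key is derived from the batch's starting offset, so a retried
        # upload of the same batch overwrites the same object.
        self.store.put(f"batch-{start:010d}", self.buffer)
        # Commit only after the upload has succeeded.
        self.log.committed = self.position
        self.buffer = []


def run_with_crash():
    log = FakeLog(records=[f"event-{i}" for i in range(10)])
    store = FakeObjectStore()

    # First worker uploads one batch of 4, buffers 2 more, then "crashes".
    first = SinkWorker(log, store, flush_size=4)
    for _ in range(6):
        first.poll_once()
    del first  # the in-memory buffer is lost, like a pod's temp dir

    # Replacement worker knows nothing about the first one except the
    # committed offset, and simply re-reads from there.
    second = SinkWorker(log, store, flush_size=4)
    while second.poll_once():
        pass
    return log, store
```

In this sketch the two records buffered at the time of the crash are re-read by the replacement worker and end up in the second batch. The last two records are not yet uploaded when the run ends, but they are not lost either: the committed offset still points at them, so they would be picked up by the next flush.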

Konstantine
