no DNS and producer hanging forever


Sergio Jimenez

Jul 1, 2014, 6:27:57 PM
to sarama...@googlegroups.com
Hi,

First of all, I'd like to make it clear that I'm pretty new to both Go and Kafka :-)

I've been spending a bit of time testing sarama and I found that a producer (I haven't tried a consumer yet) will hang forever if the Kafka server's host name cannot be resolved.

After some digging, I was able to enable the driver's logging from my producer and this is the output:

[Sarama] 2014/07/02 00:02:06 Initializing new client
[Sarama] 2014/07/02 00:02:06 Fetching metadata from broker 10.0.3.213:9092
[Sarama] 2014/07/02 00:02:06 Connected to broker 10.0.3.213:9092
[Sarama] 2014/07/02 00:02:06 Registered new broker #0 at kafka01:9092
[Sarama] 2014/07/02 00:02:06 Successfully initialized new client
[Sarama] 2014/07/02 00:02:06 Failed to connect to broker kafka01:9092
[Sarama] 2014/07/02 00:02:06 dial tcp: lookup kafka01: no such host

And here it hangs; nothing else happens. I'm connecting to the Kafka server by IP address, but the producer gets the hostname from Kafka and then retries the connection against that hostname. I'm guessing this is how Kafka works, because the Ruby driver does the same, except that it raises the exception produced by the Socket call. I might be wrong; I haven't checked the Kafka documentation closely enough.

Anyway, what I was wondering is whether this is the expected behavior. IMHO I'd expect it to panic; I don't really see the point of it hanging there.
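
For anyone hitting the same symptom, a quick way to confirm the cause is to check from the client machine whether the address the broker advertises can be resolved at all. The sketch below uses only the Go standard library, not sarama; `checkResolvable` is a hypothetical helper name, and the two addresses are simply the ones from the log above.

```go
package main

import (
	"fmt"
	"net"
)

// checkResolvable splits a "host:port" address and reports whether the
// host part can be resolved from this machine. IP literals resolve
// trivially; hostnames go through the local resolver (/etc/hosts, DNS).
func checkResolvable(addr string) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("bad address %q: %w", addr, err)
	}
	if _, err := net.LookupHost(host); err != nil {
		return fmt.Errorf("cannot resolve %q: %w", host, err)
	}
	return nil
}

func main() {
	// The bootstrap address and the advertised address from the log above.
	for _, addr := range []string{"10.0.3.213:9092", "kafka01:9092"} {
		if err := checkResolvable(addr); err != nil {
			fmt.Println("FAIL", addr, "-", err)
			continue
		}
		fmt.Println("ok  ", addr)
	}
}
```

If the second address fails here, the client will not be able to reach the broker under its advertised name, whatever address was used to bootstrap.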

Thanks!

Evan Huus

Jul 1, 2014, 7:38:06 PM
to sarama...@googlegroups.com
Not expected behaviour, you should be getting an error or something. Can you post the code you're using to trigger this?

(For what it's worth, Go will detect "real" deadlocks and dump a stack trace, so something somewhere is still spinning.)
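
To see what is still running in a hung Go program, one option is to dump the stacks of all goroutines. On Unix-like systems, sending SIGQUIT to the process (Ctrl-\ in a terminal) makes the Go runtime print them and exit. The same information can be collected from inside the program with `runtime.Stack`, as in this rough sketch; `allStacks` is a hypothetical helper name and the blocked goroutine is only a stand-in for whatever is stuck.

```go
package main

import (
	"fmt"
	"runtime"
)

// allStacks returns the stack traces of every goroutine in the process,
// growing the buffer until the whole dump fits.
func allStacks() string {
	buf := make([]byte, 1<<16)
	for {
		n := runtime.Stack(buf, true) // true = include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	block := make(chan struct{})
	go func() { <-block }() // stand-in for a goroutine stuck waiting
	runtime.Gosched()       // give the goroutine a chance to start

	fmt.Print(allStacks())
}
```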

Sergio Jimenez

Jul 2, 2014, 3:01:59 AM
to sarama...@googlegroups.com
Hi Evan,

here is the code, just the producer example slightly modified:


When I have an entry in /etc/hosts with the Kafka server's IP address, this is the output:

[Sarama] 2014/07/02 08:54:01 Initializing new client
[Sarama] 2014/07/02 08:54:01 Fetching metadata from broker localhost:9092
[Sarama] 2014/07/02 08:54:01 Connected to broker localhost:9092
[Sarama] 2014/07/02 08:54:01 Registered new broker #0 at dd04009be310:9092
[Sarama] 2014/07/02 08:54:01 Successfully initialized new client
> connected
[Sarama] 2014/07/02 08:54:01 Connected to broker dd04009be310:9092
 > message sent
[Sarama] 2014/07/02 08:54:01 Closing Client


When I comment out the entry in hosts file:

[Sarama] 2014/07/02 08:54:24 Initializing new client
[Sarama] 2014/07/02 08:54:24 Fetching metadata from broker localhost:9092
[Sarama] 2014/07/02 08:54:24 Connected to broker localhost:9092
[Sarama] 2014/07/02 08:54:24 Registered new broker #0 at dd04009be310:9092
[Sarama] 2014/07/02 08:54:24 Successfully initialized new client
> connected
[Sarama] 2014/07/02 08:54:24 Failed to connect to broker dd04009be310:9092
[Sarama] 2014/07/02 08:54:24 dial tcp: lookup dd04009be310: no such host


It hangs there and nothing else happens.


Thanks!

Evan Huus

Jul 2, 2014, 9:36:43 AM
to sarama...@googlegroups.com
Interesting, I can't seem to reproduce this in the simple set-up I have. Is the hostname resolvable to the brokers but not to the clients, or is it unresolvable to either?

Evan Huus

Jul 2, 2014, 9:50:02 AM
to sarama...@googlegroups.com
Ah, OK, I can reproduce if the hostname is resolvable to the brokers themselves but not to the client (previously it was unresolvable to either). Forcing it to dump stack at that point indicates it's the same issue as https://github.com/Shopify/sarama/issues/65

Evan

Sergio Jimenez

Jul 2, 2014, 6:11:26 PM
to sarama...@googlegroups.com
Yep, that is the case...sorry for the confusion :-)

I thought about that issue already but wasn't 100% sure...

I will test again with the fix and let you know if I find something else regarding this issue.

Thanks a lot!

Sergio Jimenez

Jul 3, 2014, 6:09:18 AM
to sarama...@googlegroups.com
Hi,

I just wanted to let you know the producer now panics when it cannot connect :-)

Thanks a lot for the quick fix.