WARN DiscoveryService:67: DNS lookup for serviceDns '<service-prefix>.<service>.svc.cluster.local' failed: name not found


merkury

Aug 5, 2019, 11:15:13 AM
to Hazelcast
Hi,

I just have a naive question.
I get this warning:
WARN DiscoveryService:67: DNS lookup for serviceDns '<service-prefix>.<service>.svc.cluster.local' failed: name not found

What does this warning mean?
Does it mean that the cache is not working properly?

Thanks!



Here is my configuration: 
I use hazelcast-kubernetes:1.5.1.

val retVal = Config(HZ_INSTANCE_NAME)
    .setProperty("hazelcast.logging.type", "slf4j")
    .setProperty("hazelcast.jmx", "true")
...

//retVal.addMapConfig(createDefaultConfig(name)) ...
val multiCastConfig = retVal.networkConfig.join.multicastConfig

...

with(retVal.networkConfig.join) {
    awsConfig.isEnabled = false
    kubernetesConfig.setEnabled(true)
        .setProperty("service-dns", serviceDnsValue())
    tcpIpConfig.isEnabled = false
    multicastConfig.isEnabled = false
}
retVal.partitionGroupConfig.setEnabled(false)
// set retVal.groupConfig.name
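
The value passed as service-dns should be the DNS name of the headless Service, which in Kubernetes has the form <service-name>.<namespace>.svc.<cluster-domain>. The thread never shows serviceDnsValue(), so the following is only a minimal sketch of what such a helper might look like; the function name serviceDns and its parameters are hypothetical.

```kotlin
// Hypothetical helper, not taken from the original configuration.
// Builds the fully qualified DNS name of a headless Service:
// <service-name>.<namespace>.svc.<cluster-domain>
fun serviceDns(
    serviceName: String,
    namespace: String,
    clusterDomain: String = "cluster.local" // common default; a cluster may be set up differently
): String {
    require(serviceName.isNotBlank()) { "serviceName must not be blank" }
    require(namespace.isNotBlank()) { "namespace must not be blank" }
    return "$serviceName.$namespace.svc.$clusterDomain"
}
```

For example, serviceDns("my-service", "my-namespace") yields my-service.my-namespace.svc.cluster.local, where both names are placeholders.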


Service.yml:
--
apiVersion: v1
kind: Service
metadata:
  name: <service-prefix>
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: <service>
  ports:
  - name: hazelcast
    port: 5701
--

Rafal Leszko

Aug 6, 2019, 2:39:00 AM
to haze...@googlegroups.com
Hi,

If you use the DNS Lookup mode with a Deployment (not a StatefulSet), this warning may appear at startup, and your cluster may even start in a Split Brain state (you can check the logs to see whether that happened). Hazelcast members find each other using the DNS records of the headless ClusterIP service, and Kubernetes DNS propagation is slow, so at startup they may not be able to find each other.

For that reason, we generally recommend using DNS Lookup mode only with StatefulSets (you'd still see that log message in the first member started, but you should never encounter the Split Brain issue).
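
To see what a member would get back from DNS at a given moment, the lookup can be reproduced from inside a pod. The sketch below is a plain JDK lookup, not the plugin's own implementation, and the function name resolveMembers is hypothetical. For a headless Service the returned addresses are the IPs of the pods behind it; an empty result corresponds to the "name not found" warning.

```kotlin
import java.net.InetAddress
import java.net.UnknownHostException

// Resolves a DNS name to all of its addresses using the JDK resolver.
// Returns an empty list when the name cannot be resolved.
fun resolveMembers(serviceDns: String): List<String> =
    try {
        InetAddress.getAllByName(serviceDns).map { it.hostAddress }
    } catch (e: UnknownHostException) {
        emptyList()
    }
```

Calling resolveMembers with the configured service-dns value shortly after startup and again a little later shows whether the records simply arrive late or never appear at all.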

Cheers,
Rafał

--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hazelcast/f144f8de-2e9b-4764-9963-c70fd5675c7c%40googlegroups.com.


--
Rafał Leszko
Software Engineer
@RafalLeszko

David Obermann

Aug 6, 2019, 6:25:21 AM
to haze...@googlegroups.com
Hi,
 
Thanks for your prompt reply. We use a StatefulSet. I have now added a network policy, allow-from-same-project:
--
kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: allow-from-same-project
spec:
  podSelector:
  ingress:
  - from:
    - podSelector: {}
--

But I still get the same error. I guess the ZONE_AWARE feature is disabled because of retVal.partitionGroupConfig.setEnabled(false).

I don't think I actually need this feature. Is it possible that the hostname must be different in OpenShift, that is, without .cluster.local?

Here is the error again:

Cannot fetch the current zone, ZONE_AWARE feature is disabled

2019-08-06 10:21:01.375  WARN DiscoveryService:67 [main] [] - [10.81.4.122]:5701 [ordersheet] [3.11.2] DNS lookup for serviceDns 'ordersheet-tst-cache.ordersheet.svc.cluster.local' failed: name not found

Best regards,
 
David

Rafal Leszko

Aug 6, 2019, 6:36:20 AM
to haze...@googlegroups.com
Hi David,

Two questions:
1. Do you see this warning in all pods?
2. Do your cluster members find each other? If so, do you experience the Split Brain issue when starting the cluster?

Cheers,
Rafał


david.o...@web.de

Aug 6, 2019, 7:08:36 AM
to Hazelcast
Hi,

I still use only one pod. But I started another one, which logged the same warnings.
How can I check your second question? There was nothing about Split Brain in the logs.

David



Rafal Leszko

Aug 6, 2019, 8:53:42 AM
to haze...@googlegroups.com
In the logs you should see something like: 
$ kubectl logs pod/hazelcast-embedded-57f84c545b-jjhcs
 ...
 Members {size:2, ver:4} [
         Member [10.16.2.6]:5701 - 33076b61-e99d-46f2-b5c1-35e0e75f2311
         Member [10.16.2.8]:5701 - 9ba9bb61-6e34-460a-9208-c5a644490107 this
 ]
 ...
That would mean that your two members formed a cluster (no Split Brain).
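
The same check can be scripted instead of read by eye. The sketch below assumes the member-list header has exactly the shape shown in the excerpt above ("Members {size:2, ver:4} ["); the function name lastClusterSize is hypothetical.

```kotlin
// Matches the header of a member list, e.g. "Members {size:2, ver:4} [".
private val MEMBERS_HEADER = Regex("""Members \{size:(\d+), ver:(\d+)\}""")

// Returns the size reported by the last member list in the log text,
// or null if the log contains no member list.
fun lastClusterSize(log: String): Int? =
    MEMBERS_HEADER.findAll(log).lastOrNull()?.groupValues?.get(1)?.toInt()
```

If two pods are running and each pod's log ends with a size of 1, the members have not found each other; a size of 2 in both logs means they formed one cluster.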


david.o...@idealo.de

Aug 6, 2019, 10:32:34 AM
to Hazelcast
I think I found the solution. The problem was here:


Service.yml:
--
apiVersion: v1
kind: Service
metadata:
  name: <service-prefix>
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: <service>
  ports:
  - name: hazelcast
    port: 5701
--

 


I changed
  selector:
    app: <service>

to
  selector:
    name: <service>

and it seems to work now.
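
A likely explanation for why this helps: a headless Service only gets DNS records for pods whose labels match its selector, so a selector on app finds nothing if the pods carry only a name label. The pod labels are an assumption here, since the StatefulSet template is not shown in this thread. A minimal sketch of the matching rule, with a hypothetical function name:

```kotlin
// An equality-based selector matches a pod when every selector entry is
// present in the pod's labels with the same value.
// Note: an empty selector is a special case in Kubernetes and is not modeled here.
fun selectorMatches(selector: Map<String, String>, podLabels: Map<String, String>): Boolean =
    selector.all { (key, value) -> podLabels[key] == value }
```

With pod labels of mapOf("name" to "my-service"), the selector mapOf("app" to "my-service") does not match, while mapOf("name" to "my-service") does.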

Thanks for all your replies!!!

Rafal Leszko

Aug 6, 2019, 10:39:04 AM
to haze...@googlegroups.com
Happy you solved the issue.

Cheers,
Rafał


david.o...@web.de

Aug 7, 2019, 2:41:55 AM
to Hazelcast
Hi,

Just a hint: you might want to change this in the service template in your Readme.md as well (section: creating headless service... I don't know whether the other configuration, with the Kubernetes API, is affected too). I did not investigate further, but I think it worked with "app" before. Maybe something changed in Kubernetes or OpenShift? (We use OpenShift 3.9.)

regards,
David