TLS handshake issue with SSL-enabled Hazelcast Enterprise


Shivi Garg

Nov 25, 2019, 3:35:14 AM
to Hazelcast
Hi,

We are facing an issue with Hazelcast Enterprise in embedded mode with TLS enabled. When we start a single node, we get the exception below:


2019-11-25T07:58:44.606+0100 [hz._hzInstance_1_LockManager.IO.thread-in-1] WARN StandardLoggerFactory$StandardLogger:49 log | 311 - com.hazelcast.enterprise - 3.12.1 | [10.223.53.76]:14701 [LockManager] [3.12.1] Connection[id=4, /10.223.53.76:14702->/10.223.53.76:14702, qualifier=null, endpoint=[10.223.53.76]:14702, alive=false, type=NONE] closed. Reason: Exception in Connection[id=4, /10.223.53.76:14702->/10.223.53.76:14702, qualifier=null, endpoint=[10.223.53.76]:14702, alive=true, type=NONE], thread=hz._hzInstance_1_LockManager.IO.thread-in-1
javax.net.ssl.SSLProtocolException: Handshake message sequence violation, 1
at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1530)
at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:528)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:802)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:766)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at com.hazelcast.nio.ssl.TLSHandshakeDecoder.onRead(TLSHandshakeDecoder.java:87)
at com.hazelcast.internal.networking.nio.NioInboundPipeline.process(NioInboundPipeline.java:135)
at com.hazelcast.internal.networking.nio.NioPipeline.run(NioPipeline.java:227)
at com.hazelcast.internal.networking.nio.NioInboundPipeline$1.run0(NioInboundPipeline.java:273)
at com.hazelcast.internal.networking.nio.NioPipelineTask.run(NioPipelineTask.java:47)
at com.hazelcast.internal.networking.nio.NioThread.processTaskQueue(NioThread.java:341)
at com.hazelcast.internal.networking.nio.NioThread.selectLoop(NioThread.java:276)
at com.hazelcast.internal.networking.nio.NioThread.run(NioThread.java:235)
Caused by: javax.net.ssl.SSLProtocolException: Handshake message sequence violation, 1
at sun.security.ssl.HandshakeStateManager.check(HandshakeStateManager.java:362)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:196)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:970)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:967)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1459)
at com.hazelcast.nio.ssl.TLSExecutor$HandshakeTask.run(TLSExecutor.java:73)
at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)


Here the node is trying to connect to itself over TLS and failing.
Please help us understand this behavior.

Josef Cacek

Nov 25, 2019, 3:55:44 AM
to haze...@googlegroups.com
Hello,

could you share your Hazelcast member configuration? It would help with the investigation.
Thank you,
-- Josef

--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hazelcast/fd3a1375-b774-4e34-9a67-e3376debb2b3%40googlegroups.com.

Josef Cacek

Nov 25, 2019, 8:18:19 AM
to haze...@googlegroups.com
I think the Hazelcast member is misconfigured.

The log entry shows a connection established from outbound port 14702 to the same port on the same address, i.e. to itself. As a result, the TLS client side talks to itself instead of to another member.

A normal TLS handshake starts like this:
* TLS_client sends a ClientHello message to TLS_server.
* TLS_server replies to TLS_client with a ServerHello message.

In your case, on the other hand, we see the following:
* TLS_client sends a ClientHello message to itself (i.e. to TLS_client).
* The TLS processing logic throws the SSLProtocolException because TLS_client expects a ServerHello message but receives a ClientHello message.

I guess your configuration is similar to this one:

Config config = new Config()
        .setProperty("hazelcast.socket.bind.any", "false");
NetworkConfig networkConfig = config.getNetworkConfig()
        .setPort(14701)
        .addOutboundPort(14702)
        .setSSLConfig(new SSLConfig().setEnabled(true))
        .setPublicAddress("10.223.53.76");
JoinConfig joinConfig = networkConfig.getJoin();
joinConfig.getMulticastConfig().setEnabled(false);
joinConfig.getTcpIpConfig()
    .setEnabled(true)
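    .addMember("10.223.53.76");

If you don't specify port numbers in the TCP-IP member list, three ports are tried automatically for each address (port, port+1, port+2). Because your outbound port number equals port+1, the member also tries to connect to port 14702 while using 14702 as its own local port, and so it ends up connected to itself.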

You have the following options to stop the TLS client from talking to itself:
* don't specify the outbound port number
* explicitly specify ports in the list of members, e.g. ...addMember("10.223.53.76:14701");
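
To make the overlap concrete, here is a minimal, illustrative sketch in plain Java. It is not Hazelcast code, and the class and method names are hypothetical. It assumes, as described above, that a member entry without a port expands to consecutive ports starting at the configured port, and it checks whether any of those ports coincides with a configured outbound port.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class JoinPortOverlap {

    // Ports tried for a member entry that has no explicit port
    // (assumption: basePort, basePort+1, ... for tryCount ports).
    static List<Integer> candidatePorts(int basePort, int tryCount) {
        List<Integer> ports = new ArrayList<>();
        for (int i = 0; i < tryCount; i++) {
            ports.add(basePort + i);
        }
        return ports;
    }

    // Candidate ports that are also configured as outbound ports.
    static List<Integer> overlap(List<Integer> candidates, Set<Integer> outboundPorts) {
        List<Integer> clash = new ArrayList<>();
        for (int port : candidates) {
            if (outboundPorts.contains(port)) {
                clash.add(port);
            }
        }
        return clash;
    }

    public static void main(String[] args) {
        List<Integer> candidates = candidatePorts(14701, 3);
        System.out.println(candidates);                                         // [14701, 14702, 14703]
        System.out.println(overlap(candidates, Collections.singleton(14702)));  // [14702]
        System.out.println(overlap(candidates, Collections.singleton(14710)));  // []
    }
}
```

An empty result means the outbound port cannot collide with any address the join tries, which is the goal of both options listed above.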

Regards,
-- Josef

Shivi Garg

Nov 25, 2019, 8:54:36 AM
to Hazelcast

Yes, you are right. Our configuration is quite similar.

However, we have set port auto-increment to false.

So I have two questions about your reply:

If you don't specify port numbers in the TCP config member list, 3 ports are tried automatically (port, port+1, port+2). And because the outbound port number is port+1, the bind to it is also tried.
[Shivi: If auto-increment is false, why are port+1 and port+2 tried?]

You have the following options, how to get rid of the TLS client talking to itself:
* don't specify the outbound port number
[Shivi: What if we set the outbound port to a number greater than port+2?]


Regards,
Shivi Garg

Josef Cacek

Nov 25, 2019, 9:15:47 AM
to haze...@googlegroups.com
Hello,

auto-increment is only used when assigning the listening (accepting) port. It is not used by the TCP-IP join method, where the number of ports tried is controlled by the Hazelcast config property "hazelcast.tcp.join.port.try.count".
So another option in your case is to set:
config.setProperty("hazelcast.tcp.join.port.try.count", "1");

Defining an outbound port number that doesn't overlap with the ports tried by the TCP-IP join is also a solution.

Regards,
-- Josef


Shivi Garg

Nov 26, 2019, 1:18:54 AM
to Hazelcast

Hi Josef,


Thank you! After separating the TCP-IP join ports from the outbound port range, the issue was resolved.

However, I am still not clear on how the property "hazelcast.tcp.join.port.try.count" relates to NetworkConfig.setPortCount and NetworkConfig.setPortAutoIncrement.

Regards,
Shivi Garg



Josef Cacek

Nov 26, 2019, 2:59:57 AM
to haze...@googlegroups.com
Hello,

let me give you some examples. Setting portAutoIncrement to false is nearly the same as leaving it true and setting portCount to 1. Both ways allow only the explicitly configured port number to be used by the Hazelcast member.

Example 1:

config.getNetworkConfig()
    .setPort(5000)
    .setPortAutoIncrement(true)
    .setPortCount(1);

This configuration only allows Hazelcast to bind to port 5000. If the port is already used, then the following exception is thrown:
com.hazelcast.core.HazelcastException: Cannot bind to a given address: /192.168.1.1. Hazelcast cannot start. Config-port: 5000, latest-port: 5000

Example 2:
config.getNetworkConfig()
    .setPort(5000)
    .setPortAutoIncrement(false);

This behaves the same way; only the exception message is slightly different:
com.hazelcast.core.HazelcastException: Cannot bind to a given address: /192.168.1.1. Hazelcast cannot start. Port [5000] is already in use and auto-increment is disabled.

Example 3:
config.getNetworkConfig()
    .setPort(5000)
    .setPortAutoIncrement(true)
    .setPortCount(3);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);

If ports 5000-5003 were free before running the code, the first Hazelcast instance binds to port 5000, the second to 5001, the third to 5002, and the fourth fails to start because no free port is left in the range [port .. port+portCount-1]. The following exception is thrown:
com.hazelcast.core.HazelcastException: Cannot bind to a given address: /192.168.1.1. Hazelcast cannot start. Config-port: 5000, latest-port: 5002
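
The three examples can be summarized with a small simulation. This is only a sketch of the rule described above, not Hazelcast's actual implementation; pickPort and the inUse set are hypothetical names, and a plain set of integers stands in for ports that are already taken.

```java
import java.util.HashSet;
import java.util.Set;

public class BindPortSketch {

    // Returns the first free port in [port .. port+tries-1], where tries is
    // portCount when auto-increment is on and 1 when it is off.
    static int pickPort(int port, boolean autoIncrement, int portCount, Set<Integer> inUse) {
        int tries = autoIncrement ? portCount : 1;
        for (int candidate = port; candidate < port + tries; candidate++) {
            if (!inUse.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException(
                "No free port in [" + port + " .. " + (port + tries - 1) + "]");
    }

    public static void main(String[] args) {
        // Mirrors Example 3: port 5000, auto-increment on, portCount 3, four instances.
        Set<Integer> inUse = new HashSet<>();
        for (int i = 1; i <= 4; i++) {
            try {
                int bound = pickPort(5000, true, 3, inUse);
                inUse.add(bound);
                System.out.println("instance " + i + " bound to " + bound);
            } catch (IllegalStateException e) {
                System.out.println("instance " + i + " failed: " + e.getMessage());
            }
        }
    }
}
```

Calling pickPort(5000, false, 3, inUse) or pickPort(5000, true, 1, inUse) considers only port 5000, which matches Examples 1 and 2.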

I hope it's clearer now.
-- Josef


Shivi Garg

Nov 26, 2019, 3:53:20 AM
to Hazelcast
Hi Josef,

Thanks for the detailed examples. What will the behavior be in the following scenario?

// setProperty is a method of Config, not of NetworkConfig
config.setProperty("hazelcast.tcp.join.port.try.count", "2");
config.getNetworkConfig()
    .setPort(5000)
    .setPortAutoIncrement(true)
    .setPortCount(4);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);
Hazelcast.newHazelcastInstance(config);


Regards,
Shivi


Josef Cacek

Nov 28, 2019, 2:35:12 AM
to haze...@googlegroups.com
Hello,

by default, the multicast join is used. Setting the property "hazelcast.tcp.join.port.try.count" has no effect in this case.
Result: 4 members will start on ports 5000-5003 and form the cluster. (The remaining two instances cannot bind, because no port is left in the configured range.)

If you enable the TCP-IP join method and provide a member address, the situation is similar. After a new member successfully connects to any member that is already part of the cluster, it receives the list of members already in the cluster, so it knows all their addresses.
config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
config.getNetworkConfig().getJoin().getTcpIpConfig().setEnabled(true).addMember("127.0.0.1");
Result: 4 members will start on ports 5000-5003 and form the cluster.

Regards,
-- Josef




Shivi Garg

Dec 4, 2019, 6:04:50 AM
to Hazelcast
Hi Josef,

Thanks for answering so many queries.

I probably have one last query:

Why do we need outbound ports when Hazelcast is used in embedded mode? We have a cluster of multiple Hazelcast member nodes, and we have to specify outbound ports due to firewall restrictions. We would like to understand why they are needed.

Thanks,
Shivi 



Josef Cacek

Dec 4, 2019, 8:10:07 AM
to haze...@googlegroups.com
Hi,

Hazelcast members need to talk to each other.
So, when a member starts, it opens a port and listens for incoming connections on it. Members try to discover other members according to their join configuration. Once they find a member available, they establish a new connection to the "listening" port and start communication over the connection.
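
For context on what an outbound port is at the socket level: every TCP connection has a local port on the connecting side in addition to the listening port on the accepting side. By default the operating system picks an ephemeral local port; configuring outbound ports restricts which local ports may be used, which is what firewall rules like yours typically require. Below is a minimal plain-Java sketch on the loopback interface. It is not Hazelcast code, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class OutboundPortSketch {

    // Opens a listener, connects to it from the given local port (0 = let the OS choose),
    // and returns { listening port, client's local (outbound) port, remote port seen by the listener }.
    static int[] connectOnce(int requestedLocalPort) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Binding before connecting is what fixes the outbound (local) port.
            client.bind(new InetSocketAddress(loopback, requestedLocalPort));
            client.connect(new InetSocketAddress(loopback, listener.getLocalPort()));
            try (Socket accepted = listener.accept()) {
                return new int[] {listener.getLocalPort(), client.getLocalPort(), accepted.getPort()};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Passing 0 keeps the demo portable; a firewall-restricted setup would pass a fixed port instead.
        int[] ports = connectOnce(0);
        System.out.println("listening port: " + ports[0]);
        System.out.println("outbound port:  " + ports[1]);
        System.out.println("seen by server: " + ports[2]);
    }
}
```

The outbound port printed by the client is the same port the accepting side sees as the remote port, and it differs from the listening port. The problem earlier in this thread arose when the two were allowed to coincide.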

Regards,

-- Josef
