JGroups Cluster Communication Issue


Martin Tauber

Apr 2, 2026, 5:31:07 AM
to jgroups-raft

I am currently experiencing an unusual issue in my JGroups-based cluster, which consists of three nodes: Node1 (acting as the leader) and Node2/Node3 (acting as followers). All three nodes host my application’s UI, allowing me to access cluster functionality from any node. This requires seamless communication between the nodes.

Issue Description
  • When I log in to Node1, it performs remote calls to Node2 and Node3 to retrieve their statuses.

  • Most of the time, this works as expected.

  • Occasionally, however, Node1 fails to retrieve the status from Node2 (while still successfully retrieving it from Node3). This makes it appear as though Node2 is unresponsive.

  • If I log in to Node3, I can successfully retrieve the status from all nodes (including Node2).

  • If I attempt to log in to Node2 directly, the login fails entirely.

Technical Context
  • The status retrieval is implemented using RpcDispatcher.callRemoteMethodWithFuture().

  • The cluster also uses JGroups-Raft for consensus and data replication. During the initial login, user preferences are fetched via Raft.

  • While I suspect the issue might be related to inter-node communication rather than Raft itself, I cannot rule out any potential interactions.
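
For context, such a status call against the JGroups RpcDispatcher API is roughly of the following shape (a sketch, not the actual application code; ServerObject, serverObject, and node2Addr are placeholder names):

```java
// Sketch only: assumes an existing JChannel `channel` and a server object exposing getStatus().
RpcDispatcher dispatcher = new RpcDispatcher(channel, serverObject);

MethodCall call = new MethodCall(ServerObject.class.getMethod("getStatus"));
// A finite timeout makes an unresponsive node fail fast instead of blocking the caller forever.
RequestOptions opts = new RequestOptions(ResponseMode.GET_ALL, 5_000);

CompletableFuture<String> node2Status =
        dispatcher.callRemoteMethodWithFuture(node2Addr, call, opts);
```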

Observations
  • The behavior suggests that Node2 is not responding to messages from Node1, even though it remains accessible from Node3.

  • This inconsistency leads me to believe there may be an underlying issue with JGroups communication between Node1 and Node2.

Request for Assistance

I would greatly appreciate any insights or suggestions regarding:

  • Potential causes for this selective communication failure.

  • Debugging steps to identify whether the issue lies in JGroups, Raft, or network-level problems.

  • Best practices for diagnosing and resolving such issues in a JGroups cluster.

Thank you in advance for your help!

Jose Bolina

Apr 6, 2026, 8:13:24 AM
to Martin Tauber, jgroups-raft

Hey, Martin


Thanks for the details.


> If I attempt to log in to Node2 directly, the login fails entirely.

Do you mean even SSH into Node2 fails completely?


Some potential causes and hints for debugging this:

* A firewall blocking a port, e.g. the socket used for failure detection. You could list the ports in use on Node2 and compare which processes are using them against what your JGroups stack expects.

* You could run netcat as a listener on Node2 and connect to it with telnet from the other nodes to check whether the connection succeeds.

* You can use `probe` [1] to investigate at the JGroups level. This lets you inspect the stack and see message drops, views, etc.
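
The netcat/telnet check above can also be scripted with just the JDK. The following compiles standalone; a local listener stands in for Node2's JGroups bind_port, which is what you would actually target:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean reachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Local listener stands in for Node2's JGroups port; replace with the real host/port.
        try (ServerSocket listener = new ServerSocket(0)) {
            System.out.println("listening port: " + reachable("127.0.0.1", listener.getLocalPort(), 1000));
        }
        System.out.println("closed port: " + reachable("127.0.0.1", 1, 500));
    }
}
```

Run it from each node against Node2; a firewall that drops traffic from Node1 only would show up as reachable from Node3 but not from Node1.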


Hope this helps,

Cheers


[1] http://jgroups.org/manual5/index.html#Probe 

--
You received this message because you are subscribed to the Google Groups "jgroups-raft" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jgroups-raft...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/jgroups-raft/4a27d42f-eb72-439f-80a5-868705568917n%40googlegroups.com.

Martin Tauber

Apr 6, 2026, 11:49:51 AM
to jgroups-raft

Hi,

Thank you for your insights and suggestions. I’ve made some progress in narrowing down the issue.

Observations and Suspicions
  1. Remote Call Behavior:
    The issue seems related to a remote call using dispatcher.callRemoteMethodWithFuture(). When connecting to Node1, this node performs remote calls to Node2 and Node3 to retrieve their statuses. While this usually works, occasional failures occur when the call returns large amounts of data.

  2. Potential Blocking:
    I suspect that large data transfers might be causing Node2 to block or become unresponsive. This aligns with the observation that Node2 appears unresponsive to Node1 but remains accessible from Node3.

  3. JGroups Configuration:
    I’ve been reading about max_credits and the FRAG protocol in JGroups. While the FRAG protocol is designed to handle large messages by fragmenting them, I’m still unclear about how max_credits might impact communication. Could misconfigured max_credits or fragmentation settings lead to blocked communication when large messages are sent?

  4. Workaround:
    I modified the application logic to avoid large remote calls. I still need to test it, but I hope this addresses the issue.

Questions
  • Could large remote calls indeed block communication between nodes, particularly if max_credits or fragmentation settings are not optimized?
  • Are there best practices for configuring max_credits or the FRAG protocol to handle large messages in JGroups?

Next Steps

I’ll continue investigating the max_credits and FRAG settings to see if adjusting them resolves the issue. If you have any specific recommendations or debugging tips for these parameters, I’d greatly appreciate it.
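
For what it's worth, in a programmatic stack both knobs would be set roughly as follows. This is an untested fragment, and the setter names (setMaxCredits, setFragSize) and values are assumptions to verify against your JGroups version:

```java
// Untested config fragment; verify setter names against your JGroups version.
new UFC().setMaxCredits(4_000_000),  // bytes in flight per receiver before the sender blocks (unicast)
new MFC().setMaxCredits(4_000_000),  // same, for multicast
new FRAG4().setFragSize(60_000),     // bytes per fragment; keep well below max_credits
```

A sender that exhausts its credits blocks until the receiver returns credits, so an undersized max_credits combined with very large responses is exactly the kind of setting that can make a node look hung.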

Thank you again for your help!

Kind regards,
Martin

Bela Ban

Apr 8, 2026, 2:50:59 AM
to jgroup...@googlegroups.com
Hi Martin

1:
* You mentioned you use jgroups-raft *and* an RpcDispatcher. Does this mean you use 2 JChannels/application, or do you reuse the JChannel used by jgroups-raft?
* How do you invoke callMethodWithFuture()? I assume you're calling the method twice, once for Node2 and once for Node3?
* A reproducer would help, so I can take a look at the above questions

2:
* That doesn't make sense: if the failure criterion were the size of a message, then both nodes would be affected. Again, a simple reproducer would help here...

3:
* Yes, this could happen, but JGroups would issue a warning message (misconfiguration) at startup
* Which JGroups version do you use? Can you post the configuration? FRAG is quite old; there are newer protocols, up to FRAG4

Cheers

Martin Tauber

Apr 8, 2026, 4:13:02 AM
to jgroups-raft
Good Morning Bela,

(1.a) I am using one channel for JGroups and jgroups-raft.
(1.b) I invoke callMethodWithFuture() regularly. I have one method that is responsible for all the remote (JGroups) calls. I am not sure what you are looking for. Yes, it could happen that two calls run at the same time.
(1.c) It is very difficult to reproduce the problem, because the system can run for days without any problem ... I am trying to get closer to the cause ...
(2) OK ... now I wonder whether multiple concurrent callMethodWithFuture() invocations could cause the issue, and whether making them respond faster just made it less likely that two run at the same time ...
(3.a) OK, I did not see a warning.
(3.b) I am using JGroups 5.4.5.Final and jgroups-raft 1.0.14.Final.

Since I am configuring the stack via Java and not via XML, it is not easy to share the configuration. Here are some code snippets:

return new Protocol[] {
    configureTransportProtocol(),  // Transport protocol: TCP or UDP (configured)
    configurePing(),               // Discovery protocol (configured)
    configureMERGE3(),             // Heals cluster splits (configured)
    configureFDSock(),             // Failure detection via sockets to neighbors (configured)
    new FD_ALL3(),                 // Failure detection based on heartbeats

    new VERIFY_SUSPECT(),          // Double-checks whether a suspected node is really gone
    new NAKACK2(),                 // Reliable multicast with FIFO ordering
    new UNICAST3(),                // Reliable unicast with FIFO ordering
    new STABLE(),                  // Garbage-collects delivered messages
    new NO_DUPES(),                // Prevents duplicate members from joining the cluster
    configureGMS(),                // Group membership (configured)
    new UFC(),                     // Unicast flow control
    new MFC(),                     // Multicast flow control
    new FRAG4(),                   // Fragments large messages
    new ELECTION(),                // Leader election (jgroups-raft)
    new STATE_TRANSFER(),          // State transfer
    configureRaft(),               // RAFT consensus protocol (configured)
    new REDIRECT()                 // Redirects requests to the RAFT leader (jgroups-raft)
    // new CLIENT()                // only needed if client.sh is used
};


private TP configureTransportProtocol() {
    log.info("Using cluster protocol '{}'.", clusterConfiguration.getProtocol());

    TP tp;

    if (clusterConfiguration.getProtocol().equals("tcp")) {
        log.debug("Configuring TCP transport protocol");
        tp = new TCP();
        tp.setBindPort(clusterConfiguration.getPort());
        log.info("TCP: bind_port = {}", clusterConfiguration.getPort());
    } else {
        log.debug("Configuring UDP transport protocol");
        tp = new UDP();
    }

    if (clusterConfiguration.getBindAddress() != null &&
            !clusterConfiguration.getBindAddress().isEmpty()) {
        log.info("Using external address '{}'.", clusterConfiguration.getBindAddress());

        try {
            tp.setExternalAddr(InetAddress.getByName(clusterConfiguration.getBindAddress()));
            log.info("Transport: external_addr = {}", clusterConfiguration.getBindAddress());
        } catch (UnknownHostException e) {
            log.error("External address '{}' was not found.", clusterConfiguration.getBindAddress(), e);
        }
    }

    return tp;
}

/**
 * Configures and returns the appropriate discovery protocol (PING or TCPPING).
 * <p>
 * This method creates either a PING or TCPPING protocol instance depending on the
 * configured cluster protocol (UDP or TCP).
 * </p>
 * <ul>
 *   <li><b>TCP mode</b>: Uses TCPPING with initial_hosts list and port_range for static member discovery</li>
 *   <li><b>UDP mode</b>: Uses PING for multicast-based discovery</li>
 * </ul>
 * <p>
 * For TCP mode, the method:
 * <ul>
 *   <li>Parses the bootstrap servers into a list of IpAddress objects</li>
 *   <li>Configures TCPPING with initial_hosts and port_range</li>
 *   <li>Logs all contact addresses for troubleshooting</li>
 * </ul>
 * </p>
 *
 * @return configured PING or TCPPING protocol instance
 */
private Protocol configurePing() {
    Protocol ping;

    if (clusterConfiguration.getProtocol().equals("tcp")) {
        log.debug("Configuring TCPPING (TCP discovery) protocol");
        log.info("Using port '{}'.", clusterConfiguration.getPort());
        log.info("Using port range '{}'.", clusterConfiguration.getPortRange());
        log.info("Using bootstrap server '{}'.", bootstrapServers);

        List<IpAddress> initialHosts = getInitialHosts(bootstrapServers);

        for (IpAddress ipAddress : initialHosts) {
            log.info("Contacting node using address '{}'.", ipAddress.printIpAddress());
        }

        ping = new TCPPING()
                .setValue("initial_hosts", initialHosts)
                .setValue("port_range", clusterConfiguration.getPortRange());

        log.info("TCPPING: initial_hosts = {} hosts, port_range = {}", initialHosts.size(), clusterConfiguration.getPortRange());
    } else {
        log.debug("Configuring PING (UDP multicast discovery) protocol");
        ping = new PING();
    }

    return ping;
}

/**
 * Configures and returns a MERGE3 protocol instance.
 * <p>
 * This method creates a new MERGE3 instance and configures it using Spring @Value properties.
 * MERGE3 is responsible for detecting and healing cluster splits (network partitions).
 * Only properties that are explicitly set (not null) will be applied to the configuration.
 * This allows for selective configuration while using JGroups defaults for unspecified values.
 * </p>
 * <p>
 * Configurable MERGE3 properties:
 * <ul>
 *   <li><b>min_interval</b>: Minimum time (ms) between merge attempts</li>
 *   <li><b>max_interval</b>: Maximum time (ms) between merge attempts</li>
 *   <li><b>check_interval</b>: Interval (ms) to check for split clusters</li>
 * </ul>
 * </p>
 * <p>
 * Example configuration in application.yml:
 * <pre>
 * uvuyo:
 *   cluster:
 *     merge3:
 *       min-interval: 10000
 *       max-interval: 30000
 *       check-interval: 5000
 * </pre>
 * </p>
 *
 * @return configured MERGE3 protocol instance
 */
private MERGE3 configureMERGE3() {
    MERGE3 merge3 = new MERGE3();
    log.info("Configuring MERGE3 (Cluster Merge) protocol");

    if (clusterConfiguration.getMerge3MinInterval() != null) {
        merge3.setMinInterval(clusterConfiguration.getMerge3MinInterval());
        log.info("MERGE3: min_interval = {}ms", clusterConfiguration.getMerge3MinInterval());
    }

    if (clusterConfiguration.getMerge3MaxInterval() != null) {
        merge3.setMaxInterval(clusterConfiguration.getMerge3MaxInterval());
        log.info("MERGE3: max_interval = {}ms", clusterConfiguration.getMerge3MaxInterval());
    }

    if (clusterConfiguration.getMerge3CheckInterval() != null) {
        merge3.setCheckInterval(clusterConfiguration.getMerge3CheckInterval());
        log.info("MERGE3: check_interval = {}ms", clusterConfiguration.getMerge3CheckInterval());
    }

    return merge3;
}

/**
 * Configures and returns a FD_SOCK (Failure Detection Socket) protocol instance.
 * <p>
 * This method creates a new FD_SOCK instance and configures it using Spring @Value properties.
 * FD_SOCK is a failure detection protocol that uses TCP connections to monitor cluster members.
 * Only properties that are explicitly set (not null) will be applied to the configuration.
 * This allows for selective configuration while using JGroups defaults for unspecified values.
 * </p>
 * <p>
 * Configurable FD_SOCK properties:
 * <ul>
 *   <li><b>bind_addr</b>: The bind address for the FD_SOCK server socket</li>
 *   <li><b>client_bind_port</b>: The port for the client socket (0 = any port)</li>
 *   <li><b>port_range</b>: The range of ports to try when binding</li>
 *   <li><b>suspect_msg_interval</b>: Interval for sending suspect messages (ms)</li>
 * </ul>
 * </p>
 * <p>
 * Example configuration in application.yml:
 * <pre>
 * uvuyo:
 *   cluster:
 *     fd-sock:
 *       client-bind-port: 0
 *       port-range: 10
 *       suspect-msg-interval: 3000
 * </pre>
 * </p>
 *
 * @return configured FD_SOCK protocol instance
 */
private FD_SOCK configureFDSock() {
    FD_SOCK fdSock = new FD_SOCK();
    log.info("Configuring FD_SOCK (Failure Detection Socket) protocol");

    if (clusterConfiguration.getFdSockSuspectMsgInterval() != null) {
        fdSock.setSuspectMsgInterval(clusterConfiguration.getFdSockSuspectMsgInterval());
        log.info("FD_SOCK: suspect_msg_interval = {}ms", clusterConfiguration.getFdSockSuspectMsgInterval());
    }

    return fdSock;
}

/**
 * Configures and returns a RAFT protocol instance.
 * <p>
 * This method creates a new RAFT instance and configures it with the required cluster settings.
 * RAFT is used for distributed consensus and leader election in the cluster.
 * </p>
 *
 * @return configured RAFT protocol instance
 */
private RAFT configureRaft() {
    log.info("Configuring RAFT protocol");

    List<String> membersList;
    if (clusterConfiguration.getMembers() == null || clusterConfiguration.getMembers().isEmpty()) {
        membersList = List.of(nodeId);
    } else {
        membersList = List.of(clusterConfiguration.getMembers().split(","));
    }

    RAFT raft = new RAFT() // Raft protocol
            .members(membersList)
            .setValue("raft_id", nodeId)
            .setValue("log_dir", uvuyoHome + File.separator + "data")
            .setValue("log_prefix", "memdb." + nodeId)
            .setValue("max_log_size", clusterConfiguration.getMaxLogSize())
            .setValue("log_class", "org.jgroups.protocols.raft.FileBasedLog");

    log.info("RAFT: raft_id = {}, log_dir = {}, log_prefix = memdb.{}, max_log_size = {}",
            nodeId,
            uvuyoHome + File.separator + "data",
            nodeId,
            clusterConfiguration.getMaxLogSize());
    log.info("RAFT: members = {}", membersList);

    return raft;
}

/**
 * Configures and returns a GMS (Group Membership Service) protocol instance.
 * <p>
 * This method creates a new GMS instance and configures it using Spring @Value properties.
 * Only properties that are explicitly set (not null) will be applied to the GMS configuration.
 * This allows for selective configuration while using JGroups defaults for unspecified values.
 * </p>
 * <p>
 * Configurable GMS properties:
 * <ul>
 *   <li><b>join_timeout</b>: How long a joining member will wait for the join response</li>
 *   <li><b>leave_timeout</b>: How long to wait for a response to a leave request</li>
 *   <li><b>merge_timeout</b>: Timeout for merging subgroups</li>
 *   <li><b>view_ack_collection_timeout</b>: Timeout for collecting view acknowledgements</li>
 *   <li><b>max_join_attempts</b>: Maximum number of attempts to join the cluster</li>
 * </ul>
 * </p>
 * <p>
 * Example configuration in application.yml:
 * <pre>
 * uvuyo:
 *   cluster:
 *     gms:
 *       join-timeout: 10000
 *       leave-timeout: 3000
 *       merge-timeout: 30000
 *       view-ack-collection-timeout: 5000
 *       max-join-attempts: 5
 * </pre>
 * </p>
 *
 * @return configured GMS instance
 */
private GMS configureGMS() {
    GMS gms = new GMS();

    log.info("Configuring GMS (Group Membership Service) protocol");

    // Configure join timeout if set
    if (clusterConfiguration.getGmsJoinTimeout() != null) {
        gms.setJoinTimeout(clusterConfiguration.getGmsJoinTimeout());
        log.info("GMS: join_timeout = {}ms", clusterConfiguration.getGmsJoinTimeout());
    }

    // Configure leave timeout if set
    if (clusterConfiguration.getGmsLeaveTimeout() != null) {
        gms.setLeaveTimeout(clusterConfiguration.getGmsLeaveTimeout());
        log.info("GMS: leave_timeout = {}ms", clusterConfiguration.getGmsLeaveTimeout());
    }

    // Configure merge timeout if set
    if (clusterConfiguration.getGmsMergeTimeout() != null) {
        gms.setMergeTimeout(clusterConfiguration.getGmsMergeTimeout());
        log.info("GMS: merge_timeout = {}ms", clusterConfiguration.getGmsMergeTimeout());
    }

    // Configure view ack collection timeout if set
    if (clusterConfiguration.getGmsViewAckCollectionTimeout() != null) {
        gms.setViewAckCollectionTimeout(clusterConfiguration.getGmsViewAckCollectionTimeout());
        log.info("GMS: view_ack_collection_timeout = {}ms", clusterConfiguration.getGmsViewAckCollectionTimeout());
    }

    // Configure max join attempts if set
    if (clusterConfiguration.getGmsMaxJoinAttempts() != null) {
        gms.setMaxJoinAttempts(clusterConfiguration.getGmsMaxJoinAttempts());
        log.info("GMS: max_join_attempts = {}", clusterConfiguration.getGmsMaxJoinAttempts());
    }

    log.debug("GMS configuration complete");

    return gms;
}

Bela Ban

Apr 8, 2026, 5:07:36 AM
to jgroup...@googlegroups.com
Hi Martin


On 08.04.2026 10:13, Martin Tauber wrote:
> Good Morning Bela,
>
> (1.a) I am using one channel for JGroups and jgroups-raft.


OK. Do you perform your own mux-demuxing? I.e., do you call JChannel.setReceiver() in the channel and multiplex/demultiplex your messages? The reason for me asking this is that I want to double-check that your applications are not 'eating' each other's messages.

Again, a reproducer (fully compilable, not just code snippets) would be helpful. You could modify it to, for example, send tons of large messages to 2 out of 3 nodes, to hopefully trigger the error sooner.



> (1.b) I invoke callMethodWithFuture() regularly. I have one method that is responsible for all the remote (JGroups) calls. I am not sure what you are looking for. Yes, it could happen that two calls run at the same time.

That shouldn't be a problem, as the calls are asynchronous... at least from JGroups' perspective.

So IIRC your main issue was blocking? If this happens, can you get a stack trace from all 3 nodes? Note that, if you use virtual threads, you should use jcmd (Thread.dump_to_file) to trigger the thread dump.


> (1.c) It is very difficult to reproduce the problem, because the system can run for days without any problem ... I am trying to get closer to the cause ...

I suggest writing a stress test with really big messages; then perhaps you can reproduce the error much sooner.
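
A minimal shape for such a stress test might be the following. This is an untested sketch against the RpcDispatcher API; `dispatcher`, `node2Addr`, `node3Addr`, and an `echo(byte[])` method on the server object are assumptions:

```java
// Untested sketch: floods two of the three nodes with large synchronous calls.
byte[] payload = new byte[10 * 1024 * 1024]; // ~10 MB per call

MethodCall call = new MethodCall(
        ServerObject.class.getMethod("echo", byte[].class), (Object) payload);
RequestOptions opts = new RequestOptions(ResponseMode.GET_ALL, 10_000); // 10 s timeout

for (int i = 0; i < 1_000; i++) {
    CompletableFuture<byte[]> f2 = dispatcher.callRemoteMethodWithFuture(node2Addr, call, opts);
    CompletableFuture<byte[]> f3 = dispatcher.callRemoteMethodWithFuture(node3Addr, call, opts);
    f2.join(); // a node that stops responding should surface here as a timeout
    f3.join();
}
```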



> (2) OK ... now I wonder whether multiple concurrent callMethodWithFuture() invocations could cause the issue, and whether making them respond faster just made it less likely that two run at the same time ...
> (3.a) OK, I did not see a warning.
> (3.b) I am using JGroups 5.4.5.Final and jgroups-raft 1.0.14.Final.

OK

> Since I am configuring the stack via Java and not via XML, it is not easy to share the configuration. Here are some code snippets:


I'm not interested in the code snippets below. If you send a standalone compilable reproducer, or just a demo, I'll take a look. If I know what the demo does, I can modify it to become a perftest.
Cheers


