Sep-17 15:31:57.029 [tcp-disco-srvr-#3%nextflow%] INFO o.a.i.s.d.tcp.TcpDiscoverySpi - TCP discovery accepted incoming connection [rmtAddr=/10.10.3.217, rmtPort=34421]
Sep-17 15:31:57.037 [tcp-disco-srvr-#3%nextflow%] INFO o.a.i.s.d.tcp.TcpDiscoverySpi - TCP discovery spawning a new thread for connection [rmtAddr=/10.10.3.217, rmtPort=34421]
Sep-17 15:31:57.038 [tcp-disco-sock-reader-#5%nextflow%] INFO o.a.i.s.d.tcp.TcpDiscoverySpi - Started serving remote node connection [rmtAddr=/10.10.3.217:34421, rmtPort=34421]
Soon after, the .node-nextflow.log on the worker node reports that the message times out:
Sep-17 15:32:27.085 [main] WARN o.a.i.s.d.tcp.TcpDiscoverySpi - Timed out waiting for message delivery receipt (most probably, the reason is in long GC pauses on remote node; consider tuning GC and increasing 'ackTimeout' configuration property). Will retry to send message with increased timeout [currentTimeout=30000, rmtAddr=/10.10.1.210:47500, rmtPort=47500]
I've already increased cluster.ackTimeout to 30000 in ~/.nextflow/config on both the main and the worker node.
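To rule out a plain network problem, a small check like the one below could be run from each node against the other node's discovery port. It is only a minimal sketch: `can_connect` is a hypothetical helper name, not part of Nextflow or Ignite, and the address and port are simply the ones from the warning above. It tests only whether a TCP connection can be opened, not whether the Ignite discovery handshake completes.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Address and port taken from the warning above; adjust for your own cluster.
    print(can_connect("10.10.1.210", 47500))
```

If this prints False in either direction, a firewall or routing rule would be worth checking before tuning timeouts any further.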
Does anybody have any hints on how I might go about getting the worker node to join the cluster?
Thanks!
-Rob