akka.remote {
  transport-failure-detector {
    heartbeat-interval = 30 s # default 4s
    acceptable-heartbeat-pause = 5 s # default 10s
  }
}
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Patrik Nordwall
Typesafe - Reactive apps on the JVM
Twitter: @patriknw
Hi Caoyuan,

Do you see the same thing with Akka version 2.3.4 and with the transport-failure-detector settings changed back to their defaults?
akka.remote {
  transport-failure-detector {
    heartbeat-interval = 30 s # default 4s
    acceptable-heartbeat-pause = 10 s # default 10s
  }
}
We have an Akka cluster with 10 nodes. It works almost smoothly, except that it periodically logs "Disassociated" warnings, and the association does not seem to recover. The log records follow:
2014-08-10 00:00:09,253 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2014-08-10 00:00:44,292 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2014-08-10 00:01:49,332 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2014-08-10 00:02:24,373 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2014-08-10 00:02:59,412 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2014-08-10 00:03:34,452 WARN a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
Hi Caoyuan,

It is usually dangerous to set acceptable-heartbeat-pause to a value lower than the heartbeat-interval itself. If a single heartbeat gets lost, the next heartbeat will definitely not make the deadline. I recommend setting it to a larger value. Also, I would go with a lower heartbeat-interval setting; 10 s seems more appropriate if you want low heartbeat traffic.

-Endre
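As a rough illustration of the advice above, the settings could look something like the sketch below. The 10 s interval is the value suggested in the message; the 30 s pause is only an assumed example of "larger than the interval" and has not been tested against this cluster, so treat it as a starting point rather than a recommendation.

```hocon
akka.remote {
  transport-failure-detector {
    # Suggested in the reply above: lower than 30 s, still low heartbeat traffic.
    heartbeat-interval = 10 s

    # Assumed example value, not from the thread: chosen only so that the pause
    # is larger than the interval and a single lost heartbeat can be tolerated.
    acceptable-heartbeat-pause = 30 s
  }
}
```

Whatever values are chosen, the point from the thread is that acceptable-heartbeat-pause should stay above heartbeat-interval.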
On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
> Hi Caoyuan,
>
> It is usually dangerous to set acceptable-heartbeat-pause to a value lower than the heartbeat-interval itself. If a single heartbeat gets lost, the next heartbeat will definitely not make the deadline. I recommend setting it to a larger value. Also, I would go with a lower heartbeat-interval setting; 10 s seems more appropriate if you want low heartbeat traffic.
>
> -Endre
Got it now. Thanks.

BTW, our cluster has run for 15 days with 1 million long-lived connections, stable and consistent.
Hi Roland,

The cluster is based on https://github.com/wandoulabs/spray-socketio. We, Wandou Labs (http://www.snappea.com/), are going to use it for 10 million or more persistent connections from mobile devices to our service. These mobile devices can then share status, push messages, fire real-time events, virtually connect to each other, etc.