I typically cope with this by dispatching the message on the thread that enqueued it.
This gets a bit nasty, though, because it can end up stealing way too much time from the original handler, so typically I keep a thread-local queue per dispatcher thread and just enqueue these direct-dispatch requests there.
When my handler is done processing the message from the global queue or from its own SPSC queue, it checks the thread-local one and dispatches all of those before polling from any other.
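A minimal sketch of that arrangement, assuming java.util.concurrent queues; the Message type and queue choices are illustrative stand-ins, not the actual system:

import java.util.ArrayDeque;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class Message {} // hypothetical message type

final class Dispatcher implements Runnable {
    // Bounded inbound queue shared with other threads.
    private final BlockingQueue<Message> globalQueue = new ArrayBlockingQueue<>(1024);
    // Thread-local overflow: only this thread touches it, so a plain deque suffices.
    private final ArrayDeque<Message> localQueue = new ArrayDeque<>();

    // Called by a handler running on THIS thread that wants to dispatch to itself:
    // rather than re-entering handlers or blocking on a full queue, park it locally.
    void dispatchLocal(Message m) { localQueue.addLast(m); }

    private void handle(Message m) { /* application logic; may call dispatchLocal */ }

    @Override public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            Message m = globalQueue.poll();
            if (m != null) handle(m);
            // Drain the thread-local queue fully before polling anywhere else.
            Message local;
            while ((local = localQueue.pollFirst()) != null) handle(local);
            // (A real loop would back off when idle instead of spinning.)
        }
    }
}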
That's definitely the best answer if you are allowed to drop messages.
It doesn't have to be circular for this to happen; it suffices to have one component which is substantially slower than the others: its queue will grow and end up blocking. Still, that is a problem that can be mitigated once the system is designed to accommodate those kinds of components.
On Jan 13, 2014 7:37 AM, "Jeff Hodges" <je...@somethingsimilar.com> wrote:
If your system is overwhelmed, you drop the message as close as you can to where the client input it, send back an overload error message to the client, and increment a stat saying you lost it. This is the most common form of back-pressure. Most everything else will just cause your system to become unresponsive in ways you can't expect, intend, or manage well.

But you're talking about this in the context of circular designs. Circular communication patterns are a, uh, anti-pattern. What are you trying to do?
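A sketch of that drop-at-the-edge policy, assuming a hypothetical Request type; the queue size, error reply, and counter are illustrative:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.LongAdder;

interface Request { void replyOverloaded(); } // hypothetical client request

final class Ingress {
    private final BlockingQueue<Request> pipeline = new ArrayBlockingQueue<>(4096);
    private final LongAdder dropped = new LongAdder(); // stat: messages shed under load

    // Runs on the thread that reads from the client socket.
    void onRequest(Request r) {
        // Non-blocking offer: if the pipeline is full, shed load right here at
        // the edge instead of letting queues back up deep inside the system.
        if (!pipeline.offer(r)) {
            dropped.increment();
            r.replyOverloaded(); // tell the client it was dropped so it can back off
        }
    }
}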
On Sun, Jan 12, 2014 at 11:20 PM, Rajiv Kurian <geet...@gmail.com> wrote:
In producer-consumer topologies connected by bounded queues, how does one inherently avoid deadlocks? In systems with circular communication patterns (thread 1 -> thread 2 -> thread 1) deadlock could occur if the queues connecting the threads get full.

One argument is that these buffers/queues should never be getting full if tested with realistic traffic patterns. Though the increased performance, predictability and back-pressure mechanisms of bounded queues are a great upside, the danger of deadlock in complex topologies makes them a tad risky.

For threads that are both consumers and producers, if they are blocked for a significant number of cycles trying to enqueue, they could decide to fail the enqueue request and go back to consuming from their inbound queues. Is there a list of best practices to handle or avoid deadlocks in these systems?
I am currently testing an implementation of the actor model on a "CSP" base. In this model all queues are bounded (e.g. 10,000 elements). The price is that one needs one thread per actor (even callbacks get their own thread). If the JVM supported continuations, one could simply reschedule a "blocking" actor call. Since the OS schedules threads and balances load, the CSP-based actors (using non-blocking queues) do not perform that badly. In fact, in my tests they do pretty well unless one uses a lot more actors than there are cores available (e.g. 60 actors on 4 cores).
In the end, for many cases this performs better than actors (especially with many events and little processing per event), because allocation of huge queues can be costly (it's easy to end up with queues of more than 500k elements if programming naively).
(Note that Akka performance is similar to the blue line on Xeon processors.) This test is from a 2-socket Opteron with 16 cores per socket (= 8 real cores per socket, because pairs of cores share an FP unit).
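For reference, the skeleton of such a thread-per-actor setup is tiny; this sketch uses plain java.util.concurrent parts and is not the library under test:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// One thread per actor; the bounded mailbox gives blocking back-pressure.
final class Actor<T> {
    private final BlockingQueue<T> mailbox = new ArrayBlockingQueue<>(10_000);

    Actor(Consumer<T> behavior) {
        Thread t = new Thread(() -> {
            try {
                while (true) behavior.accept(mailbox.take()); // the OS schedules and balances
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // shutdown signal
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Blocks the SENDER when the mailbox is full -- the CSP-style back-pressure
    // (and, in cyclic graphs, the deadlock risk) comes from exactly this call.
    void send(T msg) throws InterruptedException { mailbox.put(msg); }
}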
On Monday, January 13, 2014 09:00:24 UTC+1, Rajiv Kurian wrote:
All good points and I agree with what you are saying. Ideally you want a generic framework which can cater for a wide range of applications or even changing requirements.
This flexibility comes at a cost to performance and to your ability to understand the system. If you are forced to understand your system from the start and specialise, at the cost of flexibility, you can have a faster, more robust system (assuming your structure doesn't have to change too much or too often; otherwise you risk getting the worst of both scenarios).
In producer-consumer topologies connected by bounded queues, how does one inherently avoid deadlocks? In systems with circular communication patterns (thread 1 -> thread 2 -> thread 1) deadlock could occur if the queues connecting the threads get full.
One argument is that these buffers/queues should never be getting full if tested with realistic traffic patterns. Though the increased performance, predictability and back-pressure mechanisms of bounded queues are a great upside, the danger of deadlock in complex topologies makes them a tad risky.
For threads that are both consumers and producers, if they are blocked for a significant number of cycles trying to enqueue, they could decide to fail the enqueue request and go back to consuming from their inbound queues. Is there a list of best practices to handle or avoid deadlocks in these systems?
I found some papers addressing this. An actor system using (blocking) bounded queues basically is a CSP system (a pure CSP system has queue size = 1; in practice one uses short bounded queues).

In order to avoid deadlocks, CSP systems introduce "non-determinism", which means that when they can't send (the receiver's queue is full), they don't block but just try to receive a message instead. Technically this means (if continuations are not available) one uses the stack as a queue. Additionally, this has some implications regarding the state of an actor/process, as each time a message is sent, the sender's state may be altered by receiving a message while the receiver blocks. Like:

while (!receiver.offer(msg))
    myqueue.pollAndProcessMsg();

So you end up with the options: a) unbounded queues with actors, risking out-of-memory; b) deadlock; c) stack overflow; d) manual handling of queue overflow as pointed out above. Still, I'd expect CSP to work better, as imbalance in circular message flow is inherently reduced (so max stack size depends on the amount of imbalance), while an unbounded actor system has no inherent mechanism due to the absence of back-pressure.
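Expanded into a runnable shape (a sketch only: receiver, myqueue, and the handler are hypothetical stand-ins):

import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

final class CspProcess {
    private final Queue<Object> receiver = new ArrayBlockingQueue<>(5); // peer's inbound queue
    private final Queue<Object> myqueue  = new ArrayBlockingQueue<>(5); // our own inbound queue

    void send(Object msg) {
        // Non-deterministic choice: while the peer's queue is full, process our
        // own inbound messages instead of blocking. process() may itself send,
        // so the call stack acts as a queue -- that is the stack-overflow risk
        // (option c) and the state-mutation-during-send effect described above.
        while (!receiver.offer(msg)) {
            Object inbound = myqueue.poll();
            if (inbound != null) process(inbound);
            // (If both queues stay full/empty this spins; a real system would
            // also park or time out.)
        }
    }

    private void process(Object msg) { /* may mutate state, may send */ }
}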
Since you can't control producers in a system, how would you make such a guarantee? With unbounded queues all you are doing is increasing latency, which kicks in timeouts and clogs the system with garbage. It also puts pressure on your memory subsystem, increasing paging and weakening latency guarantees, so a bounded queue with timeouts will work well. IMHO you generally want a request waiting on the client's outbound queue rather than sitting in a queue on the server side, so you can react faster to any change in the system that would make the request unnecessary or cause it to be augmented.
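A sketch of the bounded-queue-with-timeout approach using a standard java.util.concurrent queue; the capacity and timeout are illustrative:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class TimedProducer {
    private final BlockingQueue<Object> outbound = new ArrayBlockingQueue<>(1024);

    // Returns false if the consumer couldn't drain in time; the caller then
    // drops, retries, or reports overload instead of blocking forever.
    boolean trySend(Object msg) throws InterruptedException {
        return outbound.offer(msg, 10, TimeUnit.MILLISECONDS);
    }
}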
I would solve this issue using global signals. So you have two static atomic integers, let's call them the QueueWarning and QueueFull signals. As soon as any inbound queue fills to 3/5 of its capacity, it increments QueueWarning; as soon as it fills to 4/5, it increments QueueFull. If the queue falls back below 3/5 it decrements QueueFull, and if it falls below 2/5 it decrements QueueWarning, so the hysteresis bands are 2/5 to 3/5 for Warning and 3/5 to 4/5 for Full.
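A sketch of those signals with hysteresis, assuming each queue's size changes are observed from a single thread; all names are illustrative:

import java.util.concurrent.atomic.AtomicInteger;

final class QueueSignals {
    // Global signals shared by all queues; a value > 0 means at least one
    // queue is currently in that state.
    static final AtomicInteger QUEUE_WARNING = new AtomicInteger();
    static final AtomicInteger QUEUE_FULL    = new AtomicInteger();
}

final class MonitoredQueue {
    private final int capacity;
    private boolean warning, full; // per-queue state, updated from one thread

    MonitoredQueue(int capacity) { this.capacity = capacity; }

    void onSizeChanged(int size) {
        // Warning band: raise at 3/5 of capacity, clear below 2/5.
        if (!warning && size * 5 >= capacity * 3) { warning = true;  QueueSignals.QUEUE_WARNING.incrementAndGet(); }
        else if (warning && size * 5 < capacity * 2) { warning = false; QueueSignals.QUEUE_WARNING.decrementAndGet(); }
        // Full band: raise at 4/5, clear below 3/5. The gap between raise and
        // clear thresholds (hysteresis) stops the signals from flapping.
        if (!full && size * 5 >= capacity * 4) { full = true;  QueueSignals.QUEUE_FULL.incrementAndGet(); }
        else if (full && size * 5 < capacity * 3) { full = false; QueueSignals.QUEUE_FULL.decrementAndGet(); }
    }
}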
(Blocking) back-pressure with bounded queues always carries a risk of deadlock in actor graphs containing cycles. The simplest form: A sends 10 messages to B, and B replies with 10 messages to each message received from A. If the queues are bounded to, say, 5, the two will deadlock as soon as B's incoming queue reaches size 5: A is blocked waiting on B's queue, so it no longer accepts messages from B, and B in turn cannot send to A anymore. It's not about external sources of input, but about deadlocks occurring in complex circular actor graphs. Unbounded queues trade deadlock risk against out-of-memory risk.
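A minimal repro of that cycle; run it and both threads block permanently (queue sizes and message counts follow the example above):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public final class CycleDeadlock {
    public static void main(String[] args) {
        BlockingQueue<Integer> toA = new ArrayBlockingQueue<>(5);
        BlockingQueue<Integer> toB = new ArrayBlockingQueue<>(5);

        Thread a = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) toB.put(i); // blocks once toB holds 5
                while (true) toA.take();                 // never reached
            } catch (InterruptedException ignored) { }
        });

        Thread b = new Thread(() -> {
            try {
                while (true) {
                    int msg = toB.take();
                    for (int i = 0; i < 10; i++) toA.put(msg); // fills toA, then blocks
                }
            } catch (InterruptedException ignored) { }
        });

        a.start(); b.start(); // A stuck in toB.put, B stuck in toA.put: deadlock
    }
}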
3.5 The Problems of Queues
Queues typically use either linked-lists or arrays for the underlying storage of elements. If an in-memory queue is allowed to be unbounded then for many classes of problem it can grow unchecked until it reaches the point of catastrophic failure by exhausting memory. This happens when producers outpace the consumers. Unbounded queues can be useful in systems where the producers are guaranteed not to outpace the consumers and memory is a precious resource, but there is always a risk if this assumption doesn't hold and the queue grows without limit. To avoid this catastrophic outcome, queues are commonly constrained in size (bounded). Keeping a queue bounded requires that it is either array-backed or that the size is actively tracked.
Queue implementations tend to have write contention on the head, tail, and size variables. When in use, queues are typically always close to full or close to empty due to the differences in pace between consumers and producers. They very rarely operate in a balanced middle ground where the rate of production and consumption is evenly matched. This propensity to be always full or always empty results in high levels of contention and/or expensive cache coherence. The problem is that even when the head and tail mechanisms are separated using different concurrent objects such as locks or CAS variables, they generally occupy the same cache-line.
The concerns of managing producers claiming the head of a queue, consumers claiming the tail, and the storage of nodes in between make the designs of concurrent implementations very complex to manage beyond using a single large-grain lock on the queue. Large grain locks on the whole queue for put and take operations are simple to implement but represent a significant bottleneck to throughput. If the concurrent concerns are teased apart within the semantics of a queue then the implementations become very complex for anything other than a single producer – single consumer implementation.
In Java there is a further problem with the use of queues, as they are significant sources of garbage. Firstly, objects have to be allocated and placed in the queue. Secondly, if linked-list backed, objects have to be allocated representing the nodes of the list. When no longer referenced, all these objects allocated to support the queue implementation need to be reclaimed.
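To make the excerpt concrete, here is a toy array-backed SPSC (single-producer/single-consumer) ring buffer in the spirit it describes -- a sketch, not the Disruptor itself: slots are pre-allocated so no per-element nodes become garbage, and the head/tail counters need no CAS because each is written by only one thread. (A real implementation would additionally pad the counters, e.g. with @Contended, so they don't share a cache line.)

import java.util.concurrent.atomic.AtomicLong;

// Toy SPSC ring buffer: exactly one producer thread and one consumer thread.
final class SpscRing<T> {
    private final Object[] slots;                     // pre-allocated backing array
    private final int mask;                           // capacity must be a power of two
    private final AtomicLong head = new AtomicLong(); // consumer counter
    private final AtomicLong tail = new AtomicLong(); // producer counter

    SpscRing(int capacityPow2) {
        slots = new Object[capacityPow2];
        mask = capacityPow2 - 1;
    }

    boolean offer(T value) {                          // producer thread only
        long t = tail.get();
        if (t - head.get() == slots.length) return false; // full: bounded back-pressure
        slots[(int) (t & mask)] = value;
        tail.lazySet(t + 1);                          // ordered store; no CAS needed
        return true;
    }

    @SuppressWarnings("unchecked")
    T poll() {                                        // consumer thread only
        long h = head.get();
        if (h == tail.get()) return null;             // empty
        T value = (T) slots[(int) (h & mask)];
        slots[(int) (h & mask)] = null;               // allow the element to be reclaimed
        head.lazySet(h + 1);
        return value;
    }
}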