Thank you for the appreciation, it's nice to know Thespian has been useful.
The messages you are seeing indicate that the outbound message threshold is being reached. This threshold is designed to provide some backpressure-based flow control within the system, although it is a somewhat optimistic approach.
To give some more background: each Actor object is the main thread of execution for a process (when using one of the multiproc___ bases, which you are). The core algorithm for a Thespian Actor looks roughly like this:
```
Actor starts
Open local inbound socket, set non-blocking mode
while select(all_open_sockets):
    if socket_data_available and len(outbound_queue) < threshold:
        read_data
        if read_is_complete_message:
            self.receiveMessage()   # <-- your code gets called here
    if socket_outbound_can_do_work:
        non-blocking send data
        if all_data_sent:
            close_outbound_socket

self.send():
    place send message on outgoing queue
    open outbound socket to target, set non-blocking mode
```
The general concept here is that the receiveMessage() call into your code should not spend an overly long time before returning to that core while loop, where the sockets can get serviced. The backpressure flow-control concept (via `len(outbound_queue) < threshold`) is that if your actor has a large number of outbound messages queued, that backlog is the result of incoming requests to do work, so incoming work is paused and the actor is put into tx-only mode to flush the output; once the queue drops below a lower threshold, receives are resumed. (The pseudo-algorithm above is a simplification: there are actually separate upper and lower thresholds marking the points where the actor becomes tx-only and full tx-rx, respectively, to provide some hysteresis for transitions between these modes.)
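To illustrate just the hysteresis idea, here is a minimal sketch (the watermark names and values are purely illustrative, not Thespian's actual internal constants or logic):

```python
# Illustrative sketch only: the watermark names and values are hypothetical,
# not Thespian's actual internal constants.
UPPER_WATERMARK = 950   # become tx-only once the outbound queue reaches this depth
LOWER_WATERMARK = 780   # resume full tx-rx once the queue drains below this depth

def tx_only_next(outbound_queue_len, currently_tx_only):
    """Return True if the actor should be in tx-only mode (no inbound reads)."""
    if currently_tx_only:
        # Stay tx-only until the outbound queue drains below the lower watermark.
        return outbound_queue_len >= LOWER_WATERMARK
    # Otherwise switch to tx-only only once the upper watermark is reached.
    return outbound_queue_len >= UPPER_WATERMARK
```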
The messages you are seeing are produced when that threshold test (`len(outbound_queue) < threshold` in the pseudo-algorithm above) fails. There are a couple of reasons why this can occur, including:
1 - your receiveMessage() is calling send() enough times to reach the threshold
2 - item 1 is caused either by *receiving* a large number of messages, or by a multiplier effect where one incoming message results in a large number of outgoing messages (see the fan-out sketch after this list)
3 - your receiveMessage() is busy-waiting/blocking and not returning to the surrounding scheduling loop to allow outbound non-blocking processing to run
4 - your actor is receiving a large number of requests, and the time spent processing those requests in receiveMessage() dominates the actor's runtime, leaving little "idle" time to process outbound transmits in a non-blocking manner.
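To show what items 1 and 2 can look like, here is a hypothetical fan-out actor (the class names are placeholders, not from your code; createActor() and send() are the standard Thespian calls):

```python
from thespian.actors import Actor

class Worker(Actor):
    def receiveMessage(self, message, sender):
        pass  # placeholder: does some downstream work

class FanOutActor(Actor):
    def receiveMessage(self, message, sender):
        if isinstance(message, list):
            worker = self.createActor(Worker)
            # One inbound message generates one outbound transmit per work
            # item; with a large enough list, the outbound queue reaches the
            # threshold even though each individual send() is cheap.
            for work_item in message:
                self.send(worker, work_item)
```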
In the log messages you provided, "p120979" indicates that the Actor in question is running as process number 120979. You can examine that process to see which Actor it is (e.g. `ps -leafyw | grep 120979`), which will help you determine which Actor is being stressed.
My recommendations would be to:
A. review the implementation of that Actor to check for issues 1 or 2. If that Actor is generating very large numbers of sends, you may want to re-evaluate your design; if you decide this is appropriate for your design needs, then the threshold can be raised. Unfortunately, the threshold is hard-coded right now, so you would have to make a custom Thespian installation to effect this change (`MAX_QUEUED_TRANSMITS` on line 38 of thespian/system/transport/asyncTransportBase.py); let me know if this is what you end up doing so I know if this needs to be a user-configurable threshold in the future.
B. review the implementation of that Actor to check for issue 3. You mentioned "every 10 seconds": I don't know how you've implemented those timers, but I would advise using self.wakeupAfter() --> WakeupMessage instead of time.sleep() inside your receiveMessage (a minimal sketch of that pattern follows below).
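For reference, here is a minimal sketch of that pattern (the actor name, the 'start' trigger, and do_periodic_work() are placeholders for illustration; wakeupAfter() and WakeupMessage are the actual Thespian API):

```python
from datetime import timedelta
from thespian.actors import Actor, WakeupMessage

class PeriodicActor(Actor):
    def receiveMessage(self, message, sender):
        if message == 'start':
            # Schedule the first wakeup and return immediately, so the
            # core loop can keep servicing the sockets.
            self.wakeupAfter(timedelta(seconds=10))
        elif isinstance(message, WakeupMessage):
            self.do_periodic_work()                   # placeholder for your periodic task
            self.wakeupAfter(timedelta(seconds=10))   # re-arm the timer

    def do_periodic_work(self):
        pass
```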
Let me know if this helps!
-Kevin