Broadcaster on docker

ARNAB CHANDA

May 11, 2021, 11:59:14 PM
to thespian.py
Greetings,

Does anyone have a working example of a broadcaster implementation with thespian actor system running within docker containers?

I am fairly new to thespian, and somehow I am unable to get this working.

import os

from thespian.actors import (Actor, ActorSystem, ActorTypeDispatcher,
                             requireCapability)


# Can only be created in an ActorSystem that has the "Primary" capability.
@requireCapability("Primary")
class Broadcaster(ActorTypeDispatcher):
    def receiveMessage(self, message, sender):
        print("{ Primary")
        print(message)
        print("}")


# Can only be created in an ActorSystem that has the "Secondary" capability.
@requireCapability("Secondary")
class Sender(ActorTypeDispatcher):
    def receiveMessage(self, message, sender):
        print("{ **Sender")
        print(message)
        print("}")
        print("Registering...")
        Secondary_actr = self.createActor(Secondary)
        # Look up (or create) the single Broadcaster by its global name.
        registry = self.createActor(Broadcaster, globalName='Broadcaster')
        # Register the new Secondary actor's address with the Broadcaster.
        self.send(registry, Secondary_actr)
        print("Registry obtained")


@requireCapability("Secondary")
class Secondary(Actor):
    def receiveMessage(self, message, sender):
        print("{ Secondary")
        print(message)
        print("}")


if __name__ == "__main__":
    # Each container sets ACTOR_NAME; "leader" marks the convention leader.
    global_name = os.environ['ACTOR_NAME']
    if global_name == "leader":
        ActorSystem("multiprocTCPBase", {"Convention Address.IPv4": ('leader', 1900), "Primary": True})
        curr_actr = ActorSystem().createActor(Broadcaster, globalName="Broadcaster")
        #ActorSystem().tell(curr_actr, curr_actr)
    else:
        ActorSystem("multiprocTCPBase", {"Convention Address.IPv4": ('leader', 1900), "Secondary": True})
        curr_actr = ActorSystem().createActor(Sender)
        ActorSystem().tell(curr_actr, "Start")

When I see the thespian logs, I see print("Registering...") getting printed, but it never gives me back the Secondary_actr instance. I have tried this with 3.7.2, 3.82 as well as 3.10.3

What am I missing? Any sample code/lead would be appreciated.

Thanks in advance.
Arnab

ARNAB CHANDA

May 12, 2021, 3:03:14 AM
to thespian.py
"When I see the thespian logs, I see print("Registering...") getting printed, but it never gives me back the Secondary_actr instance. I have tried this with 3.7.2, 3.82 as well as 3.10.3"

should read

"When I see the thespian logs, I see print("Registering...") getting printed, but it never gives me back the registry instance. I have tried this with 3.7.2, 3.82 as well as 3.10.3"

Kevin Quick

May 12, 2021, 3:17:14 AM
to ARNAB CHANDA, thespian.py
Hi Arnab,

I haven't had time to run your sample yet, so I will probably do so tomorrow for more explicit verification, but my initial suspicion is that your problem is in the ActorSystem startup.  I assume you have two terminal sessions running and you run "python test.py leader" in one and then "python test.py other" in the other terminal window.  However, the ActorSystem call in the `else:` clause for the latter doesn't specify a different 'Admin Port' under the capabilities dictionary (see https://thespianpy.com/doc/using.html#hH-9d33a877-b4f0-4012-9510-442d81b0837c); it will default to the same port 1900 and the startup code will first attempt to connect to that port to see if there is an ActorSystem already running before starting a new one.  Since the "leader" is already running there, it simply connects to that ActorSystem and the capabilities you provided are ignored.  This means that no ActorSystem will have the `"Secondary": True` capability and so there is no Actor System where the Secondary Actor can be created.

If you specify an alternate 'Admin Port' for the second ActorSystem, this may help resolve the issue.  Other things to note:
* You specify the 'Convention Address.IPv4' with a hostname of 'leader', so your DNS lookup will need to resolve "leader" to the IP address of the host where you ran `python test.py leader`.
* An ActorSystem will continue running, even after the application exits, unless you use the `ActorSystem.shutdown()` command.  If the ActorSystem is still running, all of the ActorSystem calls will see that it is running and simply connect to the old instance.  This can sometimes be surprising if you (for example) change the Capabilities and then run the new code and it doesn't act as if anything has changed (because it's still running the previous version).
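The startup change described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `build_capabilities` and `start_system`, the role string "leader", and the secondary port 1901 are made up for the example, and it assumes both systems run where their admin ports can collide (for instance, two terminals on one host). 'Admin Port' and 'Convention Address.IPv4' are the capability keys used by the multiprocTCPBase.

```python
LEADER_ADDR = ("leader", 1900)  # "leader" must resolve to the leader's IP


def build_capabilities(role, admin_port=None):
    """Return the capabilities for one ActorSystem (hypothetical helper)."""
    is_leader = role == "leader"
    if admin_port is None:
        # Assumed port choice: the leader keeps 1900, everyone else uses 1901
        # so that a second system does not just attach to the leader's admin.
        admin_port = 1900 if is_leader else 1901
    return {
        "Admin Port": admin_port,
        "Convention Address.IPv4": LEADER_ADDR,
        "Primary" if is_leader else "Secondary": True,
    }


def start_system(role):
    # Imported here so build_capabilities() can be checked without Thespian.
    from thespian.actors import ActorSystem
    return ActorSystem("multiprocTCPBase", build_capabilities(role))
```

A script would call `asys = start_system(os.environ['ACTOR_NAME'])` at startup and `asys.shutdown()` once the system is no longer needed, so that a later run with changed capabilities does not silently attach to the old instance.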

Regards,
  Kevin



ARNAB CHANDA

May 13, 2021, 9:14:45 AM
to thespian.py
Kevin,

Thanks for your kind off-line responses with working (terminal-based) code. However, after debugging this issue for the last 2-3 days, I have observed that whenever we have actor systems running in multiple Docker containers, the look-up via globalName does not seem to work.

1. Consider creating 3 containers, one primary/convention leader and two secondaries.
2. When Secondaries spin up, create primary actors on each and then send their addresses to the broadcast actor which would be on the primary.
3. I have tried this in various ways, and the messages from secondary containers do not reach broadcaster on time.

I do not have enough understanding of the source code at this time to figure out how thespian performs a look-up by globalName, but my questions are

1. If we can provide some kind of a time-out when using the look-up/createActor with globalName
2. What if the same global actor is called simultaneously from more than one container, creating a race condition?
3. How do we effectively debug the calls going between multiple Docker containers?

Please let me know. Thanks for all your help and support so far.

Arnab

Kevin Quick

May 13, 2021, 11:23:47 AM
to ARNAB CHANDA, thespian.py
Hi Arnab,

> 1. If we can provide some kind of a time-out when using the look-up/createActor with globalName

The createActor() call is asynchronous; this allows the requesting actor to continue to run holding a proxy address for the final actor creation.  Because this call has already completed, there is no mechanism for signalling an error directly to that call.  It might be possible to send some sort of ActorCreateFailed message, but this is not currently implemented.   Normally the createActor proceeds on a best effort and only fails if the Actor requested cannot actively be started, or if there is a communications issue preventing sending of the creation messages.
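Since createActor() itself gives no failure signal, one application-level workaround is to probe the new actor from outside with `ActorSystem.ask()`, which returns None when its timeout expires. The sketch below is not a Thespian feature: the helper name `ask_with_timeout` and the retry count are assumptions, and it only works if the target actor actually replies to the probe message (the Broadcaster posted earlier in this thread does not).

```python
from datetime import timedelta


def ask_with_timeout(asys, actor_addr, message, seconds=5.0, retries=3):
    """Ask an actor, retrying on timeout (hypothetical helper).

    `asys` is an ActorSystem, or any object with a compatible ask().
    Returns the first reply received, or None if every attempt timed out.
    """
    for _ in range(retries):
        # ask() blocks for at most the given timeout and returns None
        # if no reply arrived in that time.
        reply = asys.ask(actor_addr, message, timedelta(seconds=seconds))
        if reply is not None:
            return reply
    return None
```

A None result then means "no answer within `retries * seconds`", which the caller can treat as a failed creation or a communications problem and handle accordingly.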

> 2. What if the same global actor is called simultaneously from more than one container, creating a race condition?

These should be handled by Thespian and only a single globalName actor should be created (in the target ActorSystem).  There is a small race condition that was just reported (https://github.com/kquick/Thespian/issues/75) and which I have fixed on master, but not yet released.  If the race occurs, two globalName actors will be created, but only one will actually receive traffic.

> 3. How do we effectively debug the calls going between multiple Docker containers?

The internal Thespian log files (THESPLOG_FILE and THESPLOG_THRESHOLD=DEBUG) will be the best way to diagnose this.
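A minimal sketch of turning that logging on from Python, assuming the variables are set in the environment of the process that starts the ActorSystem and before it starts. The helper names and the log path in the usage note are illustrative; in a container the same two variables could instead be set in the image or compose file.

```python
import os


def thesplog_env(log_path, threshold="DEBUG"):
    """Build the Thespian internal-log settings (hypothetical helper)."""
    return {"THESPLOG_FILE": log_path, "THESPLOG_THRESHOLD": threshold}


def enable_thesplog(log_path, threshold="DEBUG", environ=os.environ):
    # Call this before creating the ActorSystem so the settings are already
    # in the environment when the system starts.
    environ.update(thesplog_env(log_path, threshold))
    return environ
```

For example, `enable_thesplog("/tmp/thespian.log")` at the top of the script, with a separate log path per container, keeps the logs from each ActorSystem apart for comparison.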

Regards,
  Kevin




