[Maya-Python] Service-oriented task distribution


Marcus Ottosson

unread,
May 30, 2014, 7:57:54 AM5/30/14
to python_in...@googlegroups.com

Hi all,

I’m developing an example program to illustrate the benefits and disadvantages of a service-oriented approach to task distribution; it mimics the instant-message approach of routing messages one-to-many via a central broker. It’s not quite at the point where the pros and cons start showing their true colours, but it’s getting there.


First question - Routing multiple inputs to a single output

peer.py represents a running client capable of making requests to the server, swarm.py. Both peer.py and swarm.py handle incoming requests/replies via rather long-winded if/else methods:

# Incoming messages
if type == 'letter':
    # do stuff

elif type == 'service':
    # do stuff

elif type == 'receipt':
    ...

What would be a better/neater approach to routing multiple inputs to a single output, in cases where there’d be a large number (50+) of branches in logic?

Go to method

Best,
Marcus

--
Marcus Ottosson
konstr...@gmail.com

Justin Israel

unread,
May 30, 2014, 8:04:07 AM5/30/14
to python_in...@googlegroups.com

You could just register all your different endpoints or types in a dict, where the key is the type and the value is the handler function. Each handler function could then do the custom logic that transforms the message into a conformed message for the output.
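A minimal sketch of that dict-based dispatch (handler names and the message format here are invented for illustration):

```python
# Each handler transforms one message type into a conformed
# output message, replacing a long if/elif chain.
def handle_letter(message):
    return {'output': 'letter', 'body': message}

def handle_service(message):
    return {'output': 'service', 'body': message}

handlers = {
    'letter': handle_letter,
    'service': handle_service,
}

def route(type, message):
    """Dispatch on message type via the registry."""
    try:
        handler = handlers[type]
    except KeyError:
        raise ValueError("Unhandled message type: %r" % type)
    return handler(message)
```

Adding a new branch is then just one more entry in the dict, which scales much better to 50+ types.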

--
You received this message because you are subscribed to the Google Groups "Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python_inside_m...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/python_inside_maya/CAFRtmODUGwFEJDmR0qYEm1YHptPR-%2Ba-4dP_cFDEOWcfF__n3A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Marcus Ottosson

unread,
May 30, 2014, 8:12:26 AM5/30/14
to python_in...@googlegroups.com

Tony Barbieri

unread,
May 30, 2014, 8:57:37 AM5/30/14
to python_in...@googlegroups.com
Are you looking to write a chat like program or are you interested in having tasks executed by workers on the other side of the "swarm"?  Have you looked at celery?  It deals with sending tasks to a broker, routing them using exchanges + queues and having the tasks executed by workers and returning results through a promise.  It's very neat, but good luck trying to reverse engineer what's going on in the code :).




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 9:15:02 AM5/30/14
to python_in...@googlegroups.com

Both. Tasks are to be scheduled via a chat-like interface. Humans send tasks via a terminal whereas workers build messages programmatically.

For example, the command..

$ order coffee latte --no-milk

..should send a task to the swarm, which will in turn delegate the task to workers capable of executing it.

On the other hand, tasks can be delegated to a group of workers:

$ peer barista order coffee latte --no-milk

In this case the swarm will still distribute the task, but to a pre-determined group of workers, who also provide an interface to available services:

$ peer barista --list-services

Barista services:
   Take order (order)
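The terminal commands above could be parsed into task messages with something like argparse; this is a hypothetical sketch, and the message format is made up:

```python
import argparse

# Hypothetical sketch: turning "order coffee latte --no-milk" into a
# task message for the swarm to route to capable workers.
parser = argparse.ArgumentParser(prog='order')
parser.add_argument('item')                     # e.g. "coffee"
parser.add_argument('variant')                  # e.g. "latte"
parser.add_argument('--no-milk', dest='milk',
                    action='store_false')       # milk defaults to True

def make_task(argv):
    """Parse a command line into a task message."""
    args = parser.parse_args(argv)
    return {'type': 'task',
            'service': 'order',
            'payload': {'item': args.item,
                        'variant': args.variant,
                        'milk': args.milk}}
```

make_task(['coffee', 'latte', '--no-milk']) would then yield a message the swarm could hand to any worker advertising the order service.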

I’ve had a brief look at Celery; do you have any experience with it? How come it’d be difficult to reverse-engineer? Poor docs? Complicated behaviour?

Thanks




--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 9:24:01 AM5/30/14
to python_in...@googlegroups.com
I've used Celery pretty extensively at this point.  We use it primarily for running various tasks that users don't have permissions to run locally.  I basically wrapped it all up into a library that makes it super simple to add new methods, with optional tasks associated with them, and that offers both a JSON-RPC interface running in a Tornado server as well as direct Celery access.

I only mention it's difficult to reverse-engineer because there is a lot of indirection and dynamic inheritance going on in the internal code.  Celery also relies on a few different libraries written by the same developer, so you have to follow the order of operations through quite a few places.  The code is actually really interesting but takes a while to wrap your head around, and there is a lot of it.  However, the public-facing API is quite nice and it's very well documented.  I've found it to be really nice to work with and it's also pretty robust at this point.  There have been quite a few years put into it with a ton of feedback, and it seems to be pretty widely used.




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 9:44:05 AM5/30/14
to python_in...@googlegroups.com
That is excellent, Tony, thanks! I'll give that a go, look through the code, and hopefully that takes care of the routing and task-creation questions I had.

To be continued.




--
Marcus Ottosson
konstr...@gmail.com

Marcus Ottosson

unread,
May 30, 2014, 10:09:25 AM5/30/14
to python_in...@googlegroups.com
There was actually one more thing, Tony.

Would you mind having a quick scan through the set of requirements I set out for Chat, and letting me know if you find any that would be tricky or excellent with Celery?

--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 10:30:09 AM5/30/14
to python_in...@googlegroups.com
I guess if you are writing a true "chat" client it may not be the way to go, but if you are looking to run distributed tasks then it should fit the bill.  You may need to combine a few technologies together rather than just relying on Celery.  I'm not sure you'll have the control over sending a task to a specific "peer" unless you've set up the individual workers using specific routing.  It's really good at concurrency but I believe the concurrency is defined when the worker starts up, not on the fly.  I could be wrong about that as it isn't a requirement I've had so far.




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 10:41:13 AM5/30/14
to python_in...@googlegroups.com

I guess if you are writing a true “chat” client it may not be the way to go

How come?

The purpose of this experiment is to find differences between how a chat application deals with message-passing and how a cloud of workers is assigned tasks, and so far I haven’t encountered any differences; only similarities.

For example, in an instant message application you’ve got:

  1. A peer sending messages to one or more peers. (e.g. distribution of tasks/events)
  2. A peer receiving messages from one or more peers. (e.g. receiving work)
  3. A peer monitoring aliveness of one or more peers. (e.g. renderfarm monitoring)
  4. The human(s) on the receiving end processes the request (e.g. “what is the root of 59?”)
  5. The human(s) on the other side notifies you when complete.

I could go on, but I think you see my point.

I’m not sure you’ll have the control over sending a task to a specific “peer” unless you’ve set up the individual workers using specific routing

With this, I’m imagining something similar to a render-farm overview of available workers, where you could send a task to either the entire farm or a single worker.

This would involve routing tasks to groups of workers and ideally individual workers.

Are you working with a single cloud of uniform workers, each request being sent to all or any worker? Or do you have specialised groups, e.g. some dealing with image conversion, others with file-writes etc?

Best,
Marcus




--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 11:09:27 AM5/30/14
to python_in...@googlegroups.com
I do see your point; what I meant was that I'm not sure Celery is going to give you the granularity you may require.  I wasn't suggesting there aren't similarities conceptually.

I am working with both specialized groups and a single cloud.  Some workers will process anything, while other workers have been set up to specifically listen for certain "types" of work.

Celery could do the trick if you manage the routing and manage the workers through the API.  I haven't gotten that detailed though, so I can't say for sure.




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 11:22:36 AM5/30/14
to python_in...@googlegroups.com

I’m not sure Celery is going to give you the granularity you may require

Like what?

I am working with both specialized groups and a single cloud. Some workers will process anything, while other workers have been set up to specifically listen for certain “types” of work.

Excellent, thanks for that!




--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 11:30:00 AM5/30/14
to python_in...@googlegroups.com
I'm not sure I have a specific example.  I did run into issues in the past when the number of queues got quite high, and increasing communication granularity will most likely increase the number of queues.  Honestly, the issues I ran into back then could easily have had nothing to do with Celery or RabbitMQ; in fact they most likely didn't, so Celery could fit the bill nicely.  You will have to choose a broker; we're currently using RabbitMQ.




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 11:53:33 AM5/30/14
to python_in...@googlegroups.com

Yes, I’m looking into it now, and it seems RabbitMQ would be the default. It clashes somewhat with my use of ZeroMQ for messaging, which assumes you’re writing your own broker. ZeroMQ overall seems better equipped for small messages, which is my main requirement (e.g. file reads/writes and directory listings).

I’m really interested in Celery’s use of promises for return values though. How are you making use of promises in your code?

Something like this?

def func():
    promise = async_task('long_calculation')
    # do something else
    promise.join()
    # return

I’m thinking promises are good for in-process asynchronism and less so for the distributed kind, due to the overhead of making a remote request.

Are you using it mainly for RPC?




--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 12:04:02 PM5/30/14
to python_in...@googlegroups.com
It looks like you can use ZeroMQ with celery as the transport and something else to handle the results (Redis or MongoDB?).

How I use promises depends.  I might wait for the result and block the code (folder creation), I might periodically check if the task has completed, or I might not care and just continue on my way.  Promises should work fine for distributed as well, you don't think they would be performant do the remote request for the status of the task?




--
-tony

Marcus Ottosson

unread,
May 30, 2014, 12:42:46 PM5/30/14
to python_in...@googlegroups.com

It looks like you can use ZeroMQ with celery as the transport and something else to handle the results (Redis or MongoDB?)

I think I’ll have to wrap my head around how Celery works before I can digest this one (a separate MQ to handle return values?)

you don’t think they would be performant do the remote request for the status of the task?

Sorry, could you rephrase that?

Marcus Ottosson

unread,
May 30, 2014, 12:47:14 PM5/30/14
to python_in...@googlegroups.com
I just noticed the initial link to the method in question was wrong; here is the actual line with the if/else statements:
--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
May 30, 2014, 1:01:48 PM5/30/14
to python_in...@googlegroups.com
Yes, it seems that Celery does not support using ZeroMQ for both the transport (sending the task over the pipe) and the results from said task.  A different result backend would need to be used (RabbitMQ, Redis, MongoDB).  There is a list of supported backends here: http://celery.readthedocs.org/en/latest/whatsnew-3.1.html#new-rpc-result-backend

Sorry, I mangled that second part pretty good.  What I meant was: you don't think that retrieving or querying the status of a distributed task from a remote backend would be performant?  I am referring to when you said:

I’m thinking promises are good for in-process asynchronism and less so for the distributed kind, due to the overhead of making a remote request.




--
-tony

Chad Dombrova

unread,
May 30, 2014, 1:17:08 PM5/30/14
to Maya Python Group
Hey tony,
how do you manage and monitor all of your celery workers?  for me that has been the main roadblock for celery adoption.  we started working on some modifications to flower, celery's web-based monitor app, to give it the ability to view task logs.  without that, it seems like the only option is to search the workers' local logs, which seems like a real PITA.

chad.

Tony Barbieri

unread,
May 30, 2014, 1:59:19 PM5/30/14
to python_in...@googlegroups.com
Hey Chad,

Yea, that is a big issue.  I am using flower for monitoring.  I only set that up recently; before that I was also using the logs, which I agree is a huge PITA.  Flower works pretty well, although if you shut down the web server, you lose the task history.  I'm sure it could be extended to save the history into a db somewhere if necessary.  That level of history hasn't been necessary for us at this point.

I was already running a Tornado JSON-RPC server to allow access to our API so it was fairly easy to shove flower in there.  I did end up having to modify some of the flower code to better support prefixes on the URL, but other than that it works pretty well.  Again the only issue is it won't retain historical data forever, only while it's running.  

The reason we run a JSON-RPC server is so our API can be invoked either client-side, using our Python API which in turn calls the Celery API directly, or via JSON-RPC calls which in turn call our Python API on the server.  I've written a wrapper around Celery and a Tornado JSON-RPC library that allows a clean, abstract interface to our API.  To make a method on a "route" a Celery task, a developer only has to use a decorator.  The JSON-RPC server also allows us to make remote calls to our API in other offices, which have their own RabbitMQ brokers set up.  So from NY I can invoke a folder-creation call via JSON-RPC to our LA office, which in turn calls a Celery task that gets transported through the LA RabbitMQ broker and handled by workers running on VMs in LA.

I've also configured Celery in such a way that each time a worker picks up work, a fresh Python instance is started which sets its context to the same project context that the call was originally invoked from.  The Celery tasks themselves are very thin wrappers around another API where all of the actual logic is kept, which means we very rarely have to restart the Celery workers for API updates.  The only time we have to restart the Celery workers/server is if we need to add additional routes or methods to existing routes.  The abstract API interface has also been written so that we can have different versions available.  Each version has its own queues set up, and when I start a worker up I can specify which versions of the API and which queues (or routes, as I call them in the API) the worker should handle.

Our interface to the API looks something like (this code is trimmed, it's only an example):


import psyapi
from psyop.api import filesystem

...

fsr = filesystem.FileSystemRequest()
fsr.add_action("folder", path=project_root_path)

fsr.add_action("create_file", path=pc_path, contents=data, overwrite=True)

fsr.add_action("folder",
               path=source_path,
               symlink_path=target_path)

fsr.add_action("copy",
               source_path=env.get_project_branch_path(),
               target_path=new_env.get_project_branch_path(),
               ignore_patterns=ignore_patterns)

# get an api instance for version 1
api_instance = psyapi.get_api_version("v1")

# get a json client for version 1 of the api in the la office
json_client = psyapi.get_json_client("v1", host="lam")

# the api_instance and the client instance both have the same interface
api_instance.filesystem.execute_filesystem_request(filesystem_request=fsr.encode("json"))
json_client.filesystem.execute_filesystem_request(filesystem_request=fsr.encode("json"))

FileSystemRequest is a custom class that you can add file system actions to, which will be executed on the server; execute_filesystem_request is a Celery task.  When called using the api_instance, it will invoke the Celery task client-side, whereas when called using json_client, the Celery task is invoked by the JSON-RPC server.  I'm using a RabbitMQ exchange of type topic; the Celery routing key looks something like psyapi.v1.filesystem.# and the RabbitMQ queue that's bound to this key is named something like psyapi.v1.filesystem.
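As an aside, AMQP-style topic matching (where * matches exactly one dot-separated word and # matches zero or more) can be sketched in plain Python to show how a binding like psyapi.v1.filesystem.# catches its routing keys; this is only an approximation of the real broker behaviour:

```python
def topic_matches(binding, routing_key):
    """AMQP-style topic match: '*' = one word, '#' = zero or more words."""
    def match(bwords, kwords):
        if not bwords:
            return not kwords
        head, rest = bwords[0], bwords[1:]
        if head == '#':
            # '#' may consume zero or more words.
            return any(match(rest, kwords[i:]) for i in range(len(kwords) + 1))
        if not kwords:
            return False
        if head == '*' or head == kwords[0]:
            return match(rest, kwords[1:])
        return False
    return match(binding.split('.'), routing_key.split('.'))
```

So a queue bound with psyapi.v1.filesystem.# picks up any task routed under that prefix, while psyapi.v2 traffic passes it by.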

Technically for a filesystem request you wouldn't have to access the psyapi module directly.  FileSystemRequest uses psyapi internally and has an execute method that you can pass an argument to telling it to run locally or on the server:


from psyop.api import filesystem

fsr = filesystem.FileSystemRequest()
fsr.add_action("folder", path=project_root_path)

fsr.execute(local=False)

This is way more information than you were asking for...but once I started writing a reply I figured I'd just explain kind of how we have everything set up :).  Hopefully it makes some sense!







--
-tony

Chad Dombrova

unread,
May 30, 2014, 2:31:13 PM5/30/14
to Maya Python Group
cool stuff, tony.  do you use celery to do multi-worker distributed disk crawls, like finding all the files below a given directory?   this is another use case that I'm very interested in.  

also, btw, there is an option in flower to persist the tasks states to disk, but there's a simple bug that needs to be fixed to get it to work.

chad.


Tony Barbieri

unread,
May 30, 2014, 2:43:13 PM5/30/14
to python_in...@googlegroups.com
Thanks, chad!

We haven't used it for that, although it's a very interesting idea.  We've just started writing some tools for crawling hierarchies, and using Celery to distribute that work sounds very appealing...

Also good to know about the persistence option; I'll give that a look if we end up needing it.  For now, keeping long-term historical data for the types of tasks we're running hasn't been necessary.  Typically the historical data is only used for fixing immediate issues that may crop up.





--
-tony

Justin Israel

unread,
May 30, 2014, 6:02:23 PM5/30/14
to python_in...@googlegroups.com
Hey Tony,

I haven't used celery for more than a super simple addition as a local background task queue to a Django server, so I thought I would ask specifically related to the render farm concept that Marcus mentioned, and celery's granularity. Does celery give you the control to define the scheduler? In the case where you want to direct tasks to the best fitting workers, does celery take into account available resources on the workers vs requested resources of the task? In situations where you want to direct to a subset of workers or a specific worker, does that equal a new queue? Do workers have to be preconfigured with "slots" and do tasks consume N available slots?

I can see how well suited celery would be for generalized task distribution, but I am curious how much support is included vs how much has to be developed as another layer to handle the type of profile of a render queue, where the focus is trying to reach and maintain maximum resource utilization and task throughput, and the least amount of task re-queues. And also with the need for priority sorting. But again maybe that is something that has to be a scheduler that sits in front of it actually going into the celery queues. 

Just curious what your thoughts are on the celery architecture for this kind of application, from your experience with it? 

 


Marcus Ottosson

unread,
May 31, 2014, 5:22:25 AM5/31/14
to python_in...@googlegroups.com

you don’t think that retrieving or querying the status from a remote backend for the status of a distributed task would be performant?

I think it’d be about one billion times slower than querying anything local. :)

What I’m referring to, though, is the design aspect of RPC. I’ve been reading some rather discouraging information about it lately (this one sums it up rather well, and this one goes through their differences) and have been staying clear of it for the sake of finding out exactly what can be gained by doing something else; in this case, messaging.

The argument basically boils down to the fact that making a local call is faster (by the billions) than making a remote one, and that dressing a remote call up to look local encourages bad design. I’ll try to illustrate, although I’m still looking to find exactly what those pros and cons are:

import studiox

def rpc_publish(asset):
    """Example of RPC hiding slow calls. Which are local and which are remote?"""
    path = studiox.publisher.get_path(asset)
    variant = studiox.path.dirname(path)

    # Perform quality checks
    assert studiox.qna(variant)
    assert not studiox.islocked(asset)

    studiox.commit(asset)
    studiox.push(asset)

    # Notify subscribers (database, peers)
    studiox.publish(asset)

Compared to a message-based one, where each function - or “service” - is de-coupled, including handling of errors and distributed logging:

import studiox

def soa_publish(asset):
    path = studiox.publisher.get_path(asset)  # Local

    studiox.messaging.Request(service='dirname', payload=path).send()  # Remote 
    variant = studiox.messaging.recv()  # Blocking

    studiox.messaging.Push(service='asset.qna',
                           payload=asset, 
                           reply_to='islocked').send()  # Asynchronous

def soa_islocked(asset):
    result = studiox.islocked(asset)
    if result is not None: 
        studiox.messaging.Push(service='asset.commit',
                               payload=asset,
                               reply_to='push').send()
    else: 
        studiox.messaging.Push(service='asset.error',
                               payload='%s is locked' % asset).send()

def soa_commit(asset): 
    studiox.commit(asset)
    studiox.push(asset)

    studiox.messaging.Publish(service='log.published',
                              payload=asset).send()

Clearly more verbose, and this is where I suspect convenience may influence a design, potentially for the worse.

how do you mange and monitor all of your celery workers?

At first, this question struck me as odd. But from what I gather, RabbitMQ acts as a broker, in which case you’re relying on an existing implementation for features such as logging and monitoring.

Ultimately, RabbitMQ (and others) are higher-level than ZeroMQ, and in this particular example swarm.py is playing the role of RabbitMQ’s “server” application.

So, the reason I found it odd was that, having written swarm.py, logging is merely an additional call from the broker to another worker; a logging worker. Monitoring is yet another call, and so forth. At this point, both of those are rudimentary and align with existing functionality.

“broker” of swarm.py

Simple, unless there’s something I’m missing.

Tony Barbieri

unread,
May 31, 2014, 4:27:11 PM5/31/14
to python_in...@googlegroups.com
Hey Justin,

This is basically what I was hinting at above about granularity.  I'm really not sure if Celery would be the right technology for this type of application, at least on its own.  I haven't had to deal with much granularity at this point, other than configuring which queues workers will receive work from, so I haven't put much thought into using Celery for an application like this.  Originally I just wanted to point out its existence to Marcus and see if it could help with what he is trying to do :).

Does celery give you the control to define the scheduler?

You can define how tasks are routed both by default and on the fly.  There are quite a few options for dealing with this: http://celery.readthedocs.org/en/latest/userguide/routing.html#id2

In the case where you want to direct tasks to the best fitting workers, does celery take into account available resources on the workers vs requested resources of the task?

Not by default, I don't believe it does.  This would most likely have to be written in the routing logic.

In situations where you want to direct to a subset of workers or a specific worker, does that equal a new queue?

Yes, I believe it does.  A combination of Queues, Exchange types and routing keys would need to be configured to determine which workers/consumers should pick up the tasks.
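As a rough sketch of what such static routing can look like in Celery's configuration (task and queue names here are invented, and this is only one of the routing options in the docs):

```python
# Rough sketch of Celery-style static routing configuration.
task_routes = {
    # General work lands on a shared queue any worker consumes from...
    'tasks.convert_image': {'queue': 'general'},
    # ...while specialised work is pinned to a queue that only the
    # matching group of workers listens on.
    'tasks.write_file': {'queue': 'filesystem',
                         'routing_key': 'work.filesystem'},
}
```

A worker started with `celery worker -Q filesystem` would then only pick up the pinned tasks, which is how a subset of workers ends up corresponding to a queue.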

Do workers have to be preconfigured with "slots" and do tasks consume N available slots?

Depends.  You can let celery decide what kind of concurrency a worker should have when the worker starts up or you can configure it in the celery "app" settings.  I believe you can also communicate with the consumers after they have already started and shrink/grow their process pools.


In the end I think you would wrap your own setup around celery.  I believe this extra layer would be necessary for some components.  Also the way the exchanges, queues and routes interact would have to be designed based on all of the various needs.  

Celery is pretty nice to work with once you understand how it all works; it's flexible and the developer of it is very active.  It would be really interesting to see how far Celery could get you writing an application like this.





--
-tony

Tony Barbieri

unread,
May 31, 2014, 4:39:17 PM5/31/14
to python_in...@googlegroups.com
I think it’d be about one billion times slower than querying anything local. :)

Lol, yes, it is definitely slower than querying locally...but when working with distributed tasks I'm not sure how else you can track progress/failures/results without using some sort of promise system :).

When you originally asked whether I'm mainly using it for RPC, did you mean am I using Celery for RPC, or promises?  I'm a bit confused, I guess, as to the original question.  Celery is a messaging system much like the latter example you listed above.  When you send an asynchronous message, a promise is returned.  You can decide what you want to do with it: ignore it, wait for a result, periodically check the status, etc.  I did wrap up some of the messaging features into a simpler interface, but it's purely optional to use it that way.  The core messaging features of Celery are still available for developers to use.
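Those three ways of treating a promise map neatly onto Python's concurrent.futures, used here as a local stand-in for a distributed promise (the task itself is a made-up placeholder):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def long_calculation():
    # Placeholder for remote work.
    time.sleep(0.05)
    return 42

pool = ThreadPoolExecutor(max_workers=1)

# 1. Block until the result arrives (e.g. folder creation).
promise = pool.submit(long_calculation)
result = promise.result()

# 2. Periodically check whether the task has completed.
promise = pool.submit(long_calculation)
while not promise.done():
    time.sleep(0.01)  # do other work in the meantime

# 3. Fire and forget: never touch the promise again.
pool.submit(long_calculation)

pool.shutdown(wait=True)
```

The distributed case has the same shape, only the promise resolves over the network instead of in-process.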

The JSON-RPC interface is another optional interface to using our API.  This is more for interacting with additional web services and communicating with our other offices.  The "direct" api interface could also be configured to work with our other offices if necessary.  It's basically just changing the RabbitMQ broker url :).

I do see how writing code to appear as if it's running locally could lead to confusion or ignorance as to what is actually happening.
 





--
-tony

Justin Israel

unread,
May 31, 2014, 4:46:42 PM5/31/14
to python_in...@googlegroups.com
On Sun, Jun 1, 2014 at 8:39 AM, Tony Barbieri <grea...@gmail.com> wrote:
I think it’d be about one billion times slower than querying anything local. :)

Lol, yes it is definitely slower than querying local...but when working with distributed tasks I'm not sure how else you can track progress/failures/results without using some sort of promise system :).

When you asked me originally am I mainly using it for RPC, did you mean am I using Celery for RPC or promises?  I'm a bit confused I guess as to the original question.  Celery is a messaging system much like the later example you listed above.  When you send an asynchronous message, a promise is returned.  You can decide what you want to do with it.  Ignore it, wait for a result, periodically check the status, etc.  I did wrap up some of the messaging features into a simpler interface, but it's purely optional to use it that way.  The core messaging features of celery are still available for the developers to use.  

The JSON-RPC interface is another optional interface to using our API.  This is more for interacting with additional web services and communicating with our other offices.  The "direct" api interface could also be configured to work with our other offices if necessary.  It's basically just changing the RabbitMQ broker url :).

I do see how writing code to appear as if it's running locally could lead to confusion or ignorance as to what is actually happening.
 

Ya, that is also why I disagreed a bit with that blog post that Marcus referenced. Someone in the comments made a similar statement: that it is the fault of the implementation doing the hiding, rather than just saying RPC is bad. If it is very explicit, in the implementation, that you are doing a remote call, and not trying to mix it into behavior behind other functionality, then I don't really understand the problem with it. RPC doesn't necessarily imply that you must wait on the return value, at least not at the implementation level. Thrift, for instance, has oneway services where you fire and forget. The only thing you wait on is the act of placing the network call, so it would only raise an exception if your connection was dead. Once the message is sent, your client no longer cares. You can have oneway calls on each side of the RPC endpoints, like sendMessage() and receiveResult()
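A toy sketch of those oneway semantics, with a local queue standing in for the network transport (the message format and names are hypothetical):

```python
import queue

# A bounded local queue stands in for the network transport.
transport = queue.Queue(maxsize=10)

def send_message(payload):
    """Oneway call: raises only if the message cannot be placed
    on the 'wire'; it never waits for a return value."""
    try:
        transport.put_nowait(payload)
    except queue.Full:
        raise ConnectionError("could not place the call")

# Fire and forget; the caller moves on immediately.
send_message({'task': 'sqrt', 'value': 59})
```

Once put_nowait succeeds the caller is done; any result would arrive later via a separate oneway call in the other direction.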

 

Justin Israel

unread,
May 31, 2014, 4:59:58 PM5/31/14
to python_in...@googlegroups.com
On Sun, Jun 1, 2014 at 8:27 AM, Tony Barbieri <grea...@gmail.com> wrote:
You can define how tasks are routed both by default and on the fly.  There are quite a few options for dealing with this: http://celery.readthedocs.org/en/latest/userguide/routing.html#id2

In the case where you want to direct tasks to the best fitting workers, does celery take into account available resources on the workers vs requested resources of the task?

Not by default, I don't believe it does.  This would most likely have to be written in the routing logic.

In situations where you want to direct to a subset of workers or a specific worker, does that equal a new queue?

Yes, I believe it does.  A combination of Queues, Exchange types and routing keys would need to be configured to determine which workers/consumers should pick up the tasks.
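For illustration, that queue/routing-key idea could be sketched in plain Python (the queue names and binding patterns here are made up, and `fnmatch` wildcards are only a loose analogy for AMQP's `*`/`#` semantics):

```python
import fnmatch

# Hypothetical sketch of an AMQP-style topic exchange: each queue is
# bound to the exchange with a pattern, and a message's routing key
# decides which queues (and therefore which workers) receive it.
bindings = {
    'renders.all':    'render.*',       # broadcast to every render worker
    'renders.node42': 'render.node42',  # pin work to one specific worker
}

def route(routing_key):
    """Return every queue whose binding pattern matches the key."""
    return sorted(queue for queue, pattern in bindings.items()
                  if fnmatch.fnmatch(routing_key, pattern))

route('render.node42')  # matches both the broadcast and the node queue
route('render.node07')  # matches only the broadcast queue
```

In a real broker the matching happens server-side, but the principle is the same: routing is configuration, not code in the sender.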

Do workers have to be preconfigured with "slots" and do tasks consume N available slots?

Depends.  You can let celery decide what kind of concurrency a worker should have when the worker starts up or you can configure it in the celery "app" settings.  I believe you can also communicate with the consumers after they have already started and shrink/grow their process pools.


In the end I think you would wrap your own setup around celery.  I believe this extra layer would be necessary for some components.  Also the way the exchanges, queues and routes interact would have to be designed based on all of the various needs.  

Celery is pretty nice to work with once you understand how it all works, it's flexible and the developer of it is very active.  It would be really interesting to see how far celery could get you writing an application like this.



Right ya, so it sounds like it would ultimately be based on having 1 queue per worker in order to have the ability to dispatch a task to what the queue manager (broker) determines to be the best fit for the task (right amount of ram and cpu free). So if the system scales well to, say, more than 1000 queues then it might be feasible. Otherwise being dependent on the RabbitMQ queues as the transport mechanism for a render farm application may not be scalable. Not saying you ever claimed it was, but I was just interested in vetting the concept of it being sustainable as a scalable render farm platform.
The broadcast thing sounds interesting though, as you could technically send a message to the entire subgroup of workers, and whichever worker has the matching node id of the message does the actual work. But that sounds like unnecessary chatter anyway.

I think that wraps up my question. Celery may be usable as a small render queue for maybe a farm with less than 100 nodes. And it would require writing a bit of routing logic for the broker, and logic for the celery app, in order to support direct worker assignments and accounting for the available resources at the broker level for all the workers.
 

Marcus Ottosson

unread,
May 31, 2014, 5:10:39 PM5/31/14
to python_in...@googlegroups.com
Well, I take it you are both familiar with working with RPC, but are you also familiar with working without it?

I think to make a fair judgement, one would have to at least try both to an equal degree. I've had a hard time finding any benefits of using it other than convenience, and I'm not quite convinced.

Justin Israel

unread,
May 31, 2014, 5:24:27 PM5/31/14
to python_in...@googlegroups.com
I've done RPC, and I have done message passing. I don't really think about it as one being better than the other. I just think about it as the right one for the job. But like I said, depending on the implementation, RPC can really be just message passing with syntactic sugar that makes the interface more like a function call as opposed to a send operation with a data structure. If you aren't waiting on the result, how different are the two really? I get that it can be more flexible to not have predefined interfaces for all your RPC functions, whereas you can just work with a single send() and pass data structures in message passing. But then again, if you made an RPC function called send() which is oneway and takes some data structure, again what is the difference between the two? The RPC version just has a predefined contract about what send() is and what it accepts. 

If my system is primarily concerned with just sending oneway messages all the time to be queued/routed/dispatched, then I would probably lean more towards a pure zeromq lower-level approach. But I can't talk about it in a generalized conversation, with a generalized application, about which is best. I think both ways work great. 
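To make the "syntactic sugar" point concrete, here is a small sketch (the in-memory `wire` and the method names are hypothetical) where an RPC-style proxy and a plain send() put an equivalent message on the wire:

```python
import json

# Hypothetical in-memory "wire": in a real system this would be a
# socket or a broker queue.
wire = []

def send_message(msg):
    """Plain message passing: ship an arbitrary data structure."""
    wire.append(json.dumps(msg))

class RpcProxy:
    """RPC as sugar over the same send: a method call becomes a message."""
    def __getattr__(self, name):
        def call(*args, **kwargs):
            send_message({'method': name, 'args': list(args), 'kwargs': kwargs})
        return call

# Both lines below serialize an equivalent message onto the wire;
# only the call site differs.
send_message({'method': 'sendMessage', 'args': ['hello'], 'kwargs': {}})
RpcProxy().sendMessage('hello')
```

If neither side waits on a computed result, the two are the same operation wearing different interfaces.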





--
You received this message because you are subscribed to the Google Groups "Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python_inside_m...@googlegroups.com.

Tony Barbieri

unread,
May 31, 2014, 5:34:20 PM5/31/14
to python_in...@googlegroups.com
Well, I take it you are both familiar with working with RPC, but are you also familiar with working without it?

Yes I am familiar with RPC and I have worked without it.  Celery is written using a message-based approach like you described and I have written API's that use a messaging approach as well.

How I end up designing some piece of code depends on the problem I'm trying to solve, the environment the code is meant to run within and the user base that may end up interacting with the codebase.  

Are you referring to the use of promises as RPC?

Best,






--
-tony

Marcus Ottosson

unread,
May 31, 2014, 5:39:45 PM5/31/14
to python_in...@googlegroups.com

I think both ways work great

Aw, that’s no good. :) I’m looking for actual cases where one is more appropriate than the other, not which one is the silver bullet of computing.

It’s a discussion on distributing work via a chat-like interface, not very generalised I’d think, but if you’d like let’s throw in some numbers:

In the conversation, there’d be around:

  • 500 peers in total
  • 50 of them being active within any given second
  • within which 2 tasks are being distributed continually
  • Tasks are at the size of “hello world”, “create directory”, “list directory”, “write metadata”, “add 1 to 1” etc..
  • ..each taking up a maximum of 1 second each.

Pros RPC:

  • Familiar, little initial learning curve

Cons RPC:

  • Hockey-stick complexity (easy at first, difficult later (e.g. debugging once routes extend past point-to-point))

But then again, if you made an RPC function called send() which is oneway and takes some data structure, again what is the difference between the two?

Yes, precisely. What is the difference? That’s what I’m looking to find out. :)







--
Marcus Ottosson
konstr...@gmail.com

Marcus Ottosson

unread,
May 31, 2014, 5:42:11 PM5/31/14
to python_in...@googlegroups.com

Are you referring to the use of promises as RPC?

Not sure I understand. :S Calling a promise as a remote procedure call?

--
Marcus Ottosson
konstr...@gmail.com

Justin Israel

unread,
May 31, 2014, 5:49:04 PM5/31/14
to python_in...@googlegroups.com
On Sun, Jun 1, 2014 at 9:42 AM, Marcus Ottosson <konstr...@gmail.com> wrote:

Pros RPC:

  • Familiar, little initial learning curve
Not sure what the learning curve is, unless you are referring to specific products that come with different degrees of setup to implement the RPC.
 

Cons RPC:

  • Hockey-stick complexity (easy at first, difficult later (e.g. debugging once routes extend past point-to-point))
Why is this hard to debug? It would be the same as a message being sent. It goes to some other endpoint. All that is different is the interface through which the call is made.
 

But then again, if you made an RPC function called send() which is oneway and takes some data structure, again what is the difference between the two?

Yes, precisely. What is the difference? That’s what I’m looking to find out. :)

The question was rhetorical. I don't see the difference other than some sugar on the call site. 
 

Tony Barbieri

unread,
May 31, 2014, 5:52:24 PM5/31/14
to python_in...@googlegroups.com
Not sure I understand. :S Calling a promise as a remote procedure call?

Ah I was referring to your question above that sparked the RPC discussion:

I’m thinking promises are good for in-process asynchronism and less so for the distributed kind, due to the overhead of making a remote request.
 
Are you using it mainly for RPC?



 






--
-tony

Justin Israel

unread,
May 31, 2014, 5:58:37 PM5/31/14
to python_in...@googlegroups.com
On Sun, Jun 1, 2014 at 9:39 AM, Marcus Ottosson <konstr...@gmail.com> wrote:

I think both ways work great

Aw, that’s no good. :) I’m looking for actual cases where one is more appropriate than the other, not which one is the silver bullet of computing.

It’s a discussion on distributing work via a chat-like interface, not very generalised I’d think, but if you’d like let’s throw in some numbers:

In the conversation, there’d be around:

  • 500 peers in total
  • 50 of them being active within any given second
  • within which 2 tasks are being distributed continually
  • Tasks are at the size of “hello world”, “create directory”, “list directory”, “write metadata”, “add 1 to 1” etc..
  • ..each taking up a maximum of 1 second each.
Personally I would prefer not to talk in terms of "patterns" as that sounds very java-minded (command pattern, actor pattern, ...), and boxing you into thinking about what you can and cannot do. I see RPC as just a formalized layer of message passing. Under the hood you have a socket sending a message, and someone on the other side receiving the message, and sending a reply. The difference is that RPC puts you firmly into a request-reply situation, where the reply may not even be for the computed answer. The reply could just be an id which the caller could use as a promise to then poll for the computed result at a later date. 
Using a pure message passing framework like ZeroMQ, as you already know, gives you the tools to implement more communication types like push-pull, and pub-sub. If these types of communication are important to your application, then RPC is probably not the single solution. It can definitely be used for a client to talk to a server, and then a server can use features of zmq to talk to workers. But in terms of the client talking to the server, I would see RPC or lower level message passing being pretty much the same camp. Either you are directly sending the structured message, or you are using a predefined interface that will send your message based on parameters. Either you want to wait for the answer or you don't. 
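That "reply is just an id to poll later" flavour might look something like this sketch (single-process stand-in; in a real system a worker would pick the task up asynchronously, and all names here are hypothetical):

```python
import uuid

results = {}  # ticket id -> computed result (None until a worker finishes)

def submit(task):
    """Request-reply where the reply is a promise-like ticket, not the answer."""
    ticket = str(uuid.uuid4())
    results[ticket] = None
    # Stand-in for a worker picking the task up later and storing the result:
    results[ticket] = task['a'] + task['b']
    return ticket

def poll(ticket):
    """The caller comes back at a later date to collect the computed value."""
    return results.get(ticket)

ticket = submit({'a': 1, 'b': 1})  # returns immediately with an id
poll(ticket)                       # the computed answer, once the work is done
```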



Marcus Ottosson

unread,
Jun 1, 2014, 4:33:46 AM6/1/14
to python_in...@googlegroups.com

Ah I was referring to your question above that sparked the RPC discussion - Tony

Hmmmmmm. :) Ok, for this, let’s try and define what we mean with RPC. Here’s what I mean:

RPC call, where proxy represents a remote machine

# Local
>>> proxy.log.info('hello world')

# Remote
>>> log.info('hello world')

Here, log.info is the name of the function called on the other side. If the function does not exist, you get an AttributeError. There is a 1-1 correspondence between caller and receiver, like we would expect from a local call in traditional, imperative programming languages such as Python.

Tony, when you say you’ve worked with messaging without RPC, how does something like that look? And Justin, how does it look for you?

What do you guys think about this for differences?
http://www.inspirel.com/articles/RPC_vs_Messaging.html

Personally I would prefer not to talk in terms of “patterns” as that sounds very java-minded (command pattern, actor pattern, …), and boxing you into thinking about what you can and cannot do. - Justin

I’m not sure you’ve got the right idea here. Patterns have little to do with languages, nor about what you can or cannot do. A pattern, as far as I can tell, is a description of a scenario, coupled with pros and cons and a name so that we can refer to it in general conversation.

Are we talking about something like this?
http://www.amazon.co.uk/Design-patterns-elements-reusable-object-oriented/dp/0201633612

Also, where did “patterns” enter into the discussion? Do you consider RPC a pattern? On the contrary, patterns can be used to implement RPCs, like the Proxy Pattern and the Abstract Factory Pattern.






--
Marcus Ottosson
konstr...@gmail.com

Justin Israel

unread,
Jun 1, 2014, 5:10:44 PM6/1/14
to python_in...@googlegroups.com
On Sun, Jun 1, 2014 at 8:33 PM, Marcus Ottosson <konstr...@gmail.com> wrote:

Ah I was referring to your question above that sparked the RPC discussion - Tony

Hmmmmmm. :) Ok, for this, let’s try and define what we mean with RPC. Here’s what I mean:

RPC call, where proxy represents a remote machine

# Local
>>> proxy.log.info('hello world')

# Remote
>>> log.info('hello world')

Here, log.info is the name of the function called on the other side. If the function does not exist, you get an AttributeError. There is a 1-1 correspondence between caller and receiver, like we would expect from a local call in traditional, imperative programming languages such as Python.

That could be one version of an RPC function call, where the call completely blocks on the final computed result, and the RPC implementation translates exceptions across the wire and raises the equivalent exception on the other side, to make it look 100% local. But it doesn't have to be this way, and if you compare them specifically as this being the definition, then I agree you are more limited with the RPC approach and it does hide the fact that it is a remote call.
 

Tony, when you say you’ve worked with messaging without RPC, how does something like that look? And Justin, how does it look for you?

Maybe Tony wants to give example of this, but I feel it is a redundant question because we have both worked with zmq and know what it looks like to send an arbitrary python object that represents a 'message' or a 'command' across the wire to be evaluated on the other end. It could go into a queue and be picked up by workers, it could be broadcasted out to a number of peers, and the sender could either directly wait on a reply, or receive an async reply on another socket/channel. 
 

What do you guys think about this for differences?
http://www.inspirel.com/articles/RPC_vs_Messaging.html

I agree with the parts about it potentially hiding remote calls to look local, but again I feel it is an implementation detail and not the sole definition of RPC. As I have described before, you can have an RPC call that is very explicitly presented as being a remote call, and it can:
  • Return the final computed value immediately, blocking
  • Return a promise type immediately, allowing the caller to poll for the result later
  • Return nothing, immediately
  • Translate exceptions from the remote side to the local side, in any format it wants. Even exceptions called RemoteException() (or not raise exceptions)
But to me, the RPC aspect is that it presents a predefined interface. A function with a signature. This signature is validated as part of the RPC implementation before it goes onto the wire. I'm not really a fan either of trying to hide and mix in RPC calls with other objects to make it ambiguous. 
 

Personally I would prefer not to talk in terms of “patterns” as that sounds very java-minded (command pattern, actor pattern, …), and boxing you into thinking about what you can and cannot do. - Justin

I’m not sure you’ve got the right idea here. Patterns have little to do with languages, nor about what you can or cannot do. A pattern, as far as I can tell, is a description of a scenario, coupled with pros and cons and a name so that we can refer to it in general conversation.

Are we talking about something like this?
http://www.amazon.co.uk/Design-patterns-elements-reusable-object-oriented/dp/0201633612

Also, where did “patterns” enter into the discussion? Do you consider RPC a pattern? On the contrary, patterns can be used to implement RPCs, like the Proxy Pattern and the Abstract Factory Pattern.


Yea that one. What I mean is that usually when you hear talks of "Abstract Factory Pattern", "Command Pattern", etc, it is from people with Java backgrounds. That is all I meant. Not that patterns are a language-specific thing. I'm not saying you specifically mentioned patterns in this particular conversation, but I am just pointing out that I feel it is a bit limiting to think of the elements of this topic in concrete definitions. i.e. "This article defines RPC as being X, and message passing as being Y, and this is why one is better than the other". I would take it with a grain of salt (as I am sure you are already doing by asking opinions here). My point that I keep going back to is that I don't think RPC, to me, is exactly and solely what you have represented in your articles. I think it can overlap with message passing and is basically a representation of the same message passing principles under the hood. 
 





--
Marcus Ottosson
konstr...@gmail.com


Colas Fiszman

unread,
Jun 2, 2014, 12:59:15 AM6/2/14
to python_in...@googlegroups.com
I've also configured celery in such a way that each time a worker picks up work a fresh python instance is started and sets its context to the same project context that the call was originally invoked from.

Hi Tony,
Can you give more info on how you are doing that?
Thanks,
Colas

Marcus Ottosson

unread,
Jun 2, 2014, 2:10:21 AM6/2/14
to python_in...@googlegroups.com

I feel it is an implementation detail and not the sole definition of RPC

I think this is where we went off the rails.

It enables a system to make calls to programs such as NFS across the network transparently, enabling each system to interpret the calls as if they were local. - Definition of RPC

From now on, let’s refer to RPC as being this, ok? :)

Tony Barbieri

unread,
Jun 2, 2014, 9:29:02 AM6/2/14
to python_in...@googlegroups.com
Hi Colas,

In order to get Celery to not reuse an existing python instance you just have to set the following celery setting to 1:

celery.conf.CELERYD_MAX_TASKS_PER_CHILD = 1

By setting that it tells Celery to only ever run a single task in a child before shutting that child down.  It may impact performance, but I wanted to ensure the environment was always clean and set to the correct project context.

Hopefully that makes sense!  If you need more details just ask.

Best,






--
-tony

Tony Barbieri

unread,
Jun 2, 2014, 9:44:05 AM6/2/14
to python_in...@googlegroups.com
It enables a system to make calls to programs such as NFS across the network transparently, enabling each system to interpret the calls as if they were local. - Definition of RPC

Not to continue the debate but I am using messaging to perform tasks much like that definition ;).  For example, when a worker picks up a task it will execute the code on the machine the worker is running on which may end up invoking the procedure call remotely if the worker is running on a remote machine.

I think the biggest difference between RPC and messaging is what Justin outlined above:

But to me, the RPC aspect is that it presents a predefined interface. A function with a signature. This signature is validated as part of the RPC implementation before it goes onto the wire.

With messaging, no validation happens in regards to the message contents.  The message is sent off to the broker and if there is a worker/consumer present that can deal with that message it does so.  If there is not a consumer present that knows how to deal with the message then nothing happens unless either a timeout occurs or a consumer comes online that can deal with it.  With RPC you have an explicit contract and at the time of invoking the RPC call, the call signature will be validated either client side or server side.  If a contract was never created for the call signature an exception will be raised right then.

These are what I consider the major differences between RPC and messaging...
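That contract-vs-no-contract difference can be sketched in a few lines (function names are hypothetical; `inspect.signature().bind()` stands in for whatever validation a real RPC framework performs):

```python
import inspect

def make_directory(path, mode=0o777):
    """Server-side procedure that defines the RPC contract."""
    return path

def rpc_call(func, *args, **kwargs):
    # RPC-style: the call signature is checked *before* anything goes
    # on the wire; a call that breaks the contract fails immediately.
    inspect.signature(func).bind(*args, **kwargs)  # raises TypeError if invalid
    return {'method': func.__name__, 'args': args, 'kwargs': kwargs}

def send(message):
    # Messaging-style: no validation; the broker accepts anything, and
    # whether a consumer can deal with it is decided later (or never).
    return message

rpc_call(make_directory, '/tmp/example')           # conforms, gets sent
send({'do': 'make_directory', 'toasted?': 'yes'})  # accepted regardless
```

Calling `rpc_call(make_directory, banana=True)` would raise a TypeError before any message exists, whereas `send()` happily forwards the same nonsense to the broker.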
 





--
-tony

Marcus Ottosson

unread,
Jun 2, 2014, 10:26:35 AM6/2/14
to python_in...@googlegroups.com

Cool, thanks Tony. I think it’s perfectly fine to have your own definitions, but for the sake of this conversation, it would be really helpful if we all referred to the same thing.

Sounds like we’ve got two definitions going on, let’s find a more appropriate wording for them:

  1. Signature defined prior to sending a message across the wire, pre-validated
  2. Signature defined after being sent, post-validated.

pre-validated

Where the signature must match the Python stdlib “json” module

>>> proxy.json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)

post-validated

Where the signature may differ, so as to be fit for multiple languages, even those without the ability to sort the keys. The receiver decides what to use.


>>> proxy.send(
    {'address': 'json',
     'payload': {"c": 0, "b": 0, "a": 0},
     'sortKeys': True}
)

It’s an amazingly interesting topic, as it relates to where to put the responsibility: on the user, or the recipient.

Do the examples make sense, is this what we’re referring to?







--
Marcus Ottosson
konstr...@gmail.com

Marcus Ottosson

unread,
Jun 2, 2014, 10:38:30 AM6/2/14
to python_in...@googlegroups.com

Dynamic Registration versus if/else

A few posts back, we spoke about how to simplify long (50+) if/else clauses. We had two approaches:

if/else

if something == 'this':
    do_that()
elif something == 'this_here':
    do_this_other_thing()

hashmap

As suggested by Justin (hope I understood you correctly)

handlers = {
    'this': do_that,
    'this_here': do_this_other_thing,
}

handlers[something]()

dynamic registration

I tested a third alternative, involving metaclasses. I’m generally not a fan of metaclasses and tend to stay clear, but in this case the gain may outweight the hassle.

  1. Use

  2. Baseclass

  3. Implementation

In a nutshell, each handler is a subclass of Factory which, upon subclassing, registers said subclass and provides an interface for it. At this point, there is no additional if/else statement, no hashmap to update, just subclass the Factory, and it’s handled. Including logging, error handling and anything surrounding it. Basically resulting in leaner code, at the expense of being more difficult to understand (you’ll need to understand metaclasses, for starters).

Let me know what you think.

--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
Jun 2, 2014, 10:43:32 AM6/2/14
to python_in...@googlegroups.com
I think the difference between RPC and a message based system is not whether pre or post validation occurs, but rather if validation ever occurs.  

When using messaging, the message will go to the broker no matter what.  What happens to that message once it lands in the broker is up to the broker settings and if there are any consumers subscribed to the broker in such a way that they would be notified of the message.  pre/post validation does not come into play in that scenario.  

It does come into play when using RPC in that either the client will validate if the server can handle the request or the server will figure out if it can handle the request.  If either determines the request can not be handled, an exception will be raised immediately.

In a messaging system the client is required to care if the message was handled or not either via timeouts or by pinging for the result/status of the message.






--
-tony

Marcus Ottosson

unread,
Jun 2, 2014, 11:00:47 AM6/2/14
to python_in...@googlegroups.com
Ok, how about this.

At one point or another, a message is translated into a procedure (let's assume the procedure exists). In the case of RPC, the translation isn't dealt with until the signature is validated by the client. I.e. the client must be aware of the signature, prior to allowing anything to be sent across the wire.

In the other case, a message is sent regardless and the client has no prior knowledge of what is valid and what is not, but will be informed of it after the fact. In this scenario, a message is always sent across the wire, and the client may or may not get a return value or exception.

Does that sum it up?






--
Marcus Ottosson
konstr...@gmail.com

Marcus Ottosson

unread,
Jun 2, 2014, 11:07:25 AM6/2/14
to python_in...@googlegroups.com
Sorry, I shouldn't have used the word RPC. :) I get that you're using Promises in your message handling, which blurs the line between what is messaging and what is pure RPC. According to Google, an RPC looks *exactly* like a local call. But, in the case of getting a promise in return, this is completely unique to the messaging framework (Celery, in this case) and is not what you would get if the call were local (if we stay simple, I know there are uses for promises in local calls too).

Does that make sense tho?
--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
Jun 2, 2014, 11:12:15 AM6/2/14
to python_in...@googlegroups.com
When using RPC the server may also validate it when the call comes in.  Your example above of pre/post validation basically refers to RPC and only RPC.  If the client can validate a RPC call before sending it will (in the case of some python RPC frameworks, this is the case).  If the client can't validate the call then the server will validate it when it receives the request.

In the case of messaging there is never any validation.  A consumer MAY throw an exception of some kind, but first a consumer would have to pick up the message and do something with it.  Messages may not even be tasks to be executed.  In the case of Celery they are tasks that have methods/functions that will be executed on a worker, but Celery is leveraging message based systems to work that way and it is not inherent to a message based system.







--
-tony

Marcus Ottosson

unread,
Jun 2, 2014, 11:19:12 AM6/2/14
to python_in...@googlegroups.com

Hmm. Do you include “validation of messages” when you mean “validation”?

Consider this:

>>> proxy.send({'do': 'make me a sandwich', 'with': 'mustard, and tomatoes', 'toasted?': 'yes'})

Here, a message is being sent, but not to a procedure. Could we consider this a non-RPC call?

In this case, there will be a receiver (the router) and the router MAY forward the call to a worker. But let’s say it doesn’t.

Still, the message will have to be interpreted. The message is routed based on the router’s interpretation, no? This interpretation is what I’d consider a form of validation.

A router may be able to accept any message, but it will always try and make sense of it, before discarding it.

It feels like we’re talking about the same thing, but if you’d rather not call it “validation”, what would you call it?







--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
Jun 2, 2014, 11:24:44 AM6/2/14
to python_in...@googlegroups.com
Yes, what you wrote makes sense.  Promises are part of the Celery framework and not part of the specifications of messaging.

Promises are there, but they don't have to be used.  Celery could be thought of as a type of RPC I suppose in that each celery task does refer to a procedure that will most likely be executed by some remote worker.  With standard RPC, the calls are typically validated either client-side or server-side and then executed on the machine running the RPC server if found valid.  This is not necessarily the case with messaging in that there is no concept of validation and the message just goes to the broker which will decide where it should go.

The promises used by celery are a convenience that may be used to react to some result of a task being processed.  They can just as easily be dropped on the floor and completely forgotten about.  They can be used in cases where you need to perform a synchronous call and wait on a result (folder creation is an example).  Another use case is to use them in a thread where you want to display to the user some result of a message by polling the promises periodically to check if there is a result/error.  You could also use a callback to immediately be notified when a resolution has occurred.
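All three usage patterns described above — a synchronous wait, periodic polling, and an immediate callback — can be sketched in a few lines of plain Python (a toy with hypothetical names; Celery's real result objects are far more involved):

```python
import threading


class Promise(object):
    """Toy promise: a worker resolves it; a client may wait, poll, subscribe, or ignore it."""

    def __init__(self):
        self._event = threading.Event()
        self._result = None
        self._callbacks = []

    def resolve(self, result):
        """Called by the worker once the task has produced a result."""
        self._result = result
        self._event.set()
        for callback in self._callbacks:
            callback(result)

    def then(self, callback):
        """Register a callback to be notified as soon as a resolution occurs."""
        if self._event.is_set():
            callback(self._result)  # already resolved: notify immediately
        else:
            self._callbacks.append(callback)
        return self

    def ready(self):
        """Poll periodically, e.g. from a UI thread."""
        return self._event.is_set()

    def get(self, timeout=None):
        """Block for a result, e.g. for a synchronous call like folder creation."""
        if not self._event.wait(timeout):
            raise RuntimeError('timed out waiting for result')
        return self._result
```

And, as noted, a promise like this can just as easily be dropped on the floor: nothing forces the client to ever call `get`.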

Promises offer an extra layer to be used as you will, but you are correct in saying they are part of the Celery framework in this instance.






--
-tony

Tony Barbieri

unread,
Jun 2, 2014, 11:37:53 AM6/2/14
to python_in...@googlegroups.com
I suppose when I think of validation, I think of something as either being correct or not.  With RPC it's immediately clear when a client call does not fulfill the contract with the server. 

In the case of messaging, a message with "incorrect routing information" would just not reach a destination.  Even saying the routing information is incorrect is not really... correct.  I would say the message routing is parsed and it either gets directed to a destination or it doesn't.  It's not valid or invalid, really; it just doesn't meet any criteria to reach a destination.  It's not going to raise an error on the broker; it's just going to be ignored, or may end up in a default destination if one has been configured.  It's a subtle difference, but it's part of the "forgiving" nature of messaging.
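The contrast can be sketched in toy code (not any particular broker's or RPC library's API): an RPC stub fails loudly on a call outside its contract, while a router quietly drops, or defaults, anything it cannot route.

```python
class RpcServer(object):
    """RPC-style: the set of procedures is the contract."""

    def do(self, what):
        return 'doing %s' % what


class Router(object):
    """Messaging-style: parse the routing key and deliver, default, or drop."""

    def __init__(self, default=None):
        self.routes = {}
        self.default = default

    def send(self, message):
        handler = self.routes.get(message.get('do'), self.default)
        if handler is not None:
            return handler(message)
        # No matching route and no default: silently ignored, no error raised.


rpc_server = RpcServer()
try:
    rpc_server.doo('make me a sandwich')  # not in the contract
except AttributeError:
    pass  # rejected immediately, client-side

router = Router()
router.routes['make me a sandwich'] = lambda message: 'routed'
router.send({'do': 'mke me a sandwich'})  # misspelt: dropped, no error
```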






--
-tony

Marcus Ottosson

unread,
Jun 2, 2014, 11:52:41 AM6/2/14
to python_in...@googlegroups.com

Ok, how about this.

>>> proxy.send({'do': 'make me a sandwich', 'with': 'mustard, and tomatoes', 'toasted?': 'yes'})

For me to know that ‘do’ is a valid key to send to my router, wouldn’t I first have to know about it? That there is a key called ‘do’? And wouldn’t I also have to know what can be stored as a value for that key?

If I misspelt ‘do’, wouldn’t my message be “incorrect”?

What I’m trying to get at, is that, regardless of how forgiving messaging is, or your router, you would at some point need to know what you can send to retrieve the results or have the actions performed that you are looking for.

At some point, you will have to type a carefully formatted message somewhere. And, like with physical mail, you can forget to put the postcode in, and you can forget to put a stamp on it. In this case, wouldn’t the message be “incorrect”? As you knew where you wanted it to go, but it won’t get there.

This, what would you call this?







--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
Jun 2, 2014, 12:02:56 PM6/2/14
to python_in...@googlegroups.com
This is a pretty cool implementation.  Another option if you are supporting python 2.6 or greater would be to use class decorators.  They may make it more apparent that something special is happening with that class rather than having to know the details of metaclasses.

class Factory(object):

    registry = dict()

    @classmethod
    def register_handler(cls, handler_cls):
        # Prefer an explicit `key` attribute on the handler class;
        # fall back to its lowercased class name.
        key = getattr(handler_cls, 'key', handler_cls.__name__.lower())
        cls.registry[key] = handler_cls
        # Return the class, so the decorator doesn't replace it with None.
        return handler_cls

@Factory.register_handler
class Letter(object):

    def execute(self, receiver, envelope):
        pass
Just another option :).
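Either way, the payoff is on the dispatch side, where a single registry lookup replaces the 50+ branch if/elif chain. A self-contained sketch using function handlers (names hypothetical):

```python
registry = {}


def register(key):
    """Decorator that registers a handler function under a message type."""
    def wrap(handler):
        registry[key] = handler
        return handler
    return wrap


@register('letter')
def handle_letter(message):
    return 'letter: %s' % message['body']


@register('receipt')
def handle_receipt(message):
    return 'receipt: %s' % message['body']


def dispatch(message):
    """Route an incoming message to its handler by type."""
    handler = registry.get(message.get('type'))
    if handler is None:
        return None  # unknown type: quietly dropped, in the forgiving-messaging spirit
    return handler(message)
```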






--
-tony

Tony Barbieri

unread,
Jun 2, 2014, 12:08:53 PM6/2/14
to python_in...@googlegroups.com
If I misspelt ‘do’, wouldn’t my message be “incorrect”?

In the eyes of your framework?  Sure.  In the eyes of a traditional message broker?  Nope.  It just wouldn't reach a destination but the broker isn't going to consider it invalid.  In the case of RPC if you tried to do:

rpc_server.doo(...)
That would be considered against contract (as doo doesn't exist) and throw an exception right away.  Again, this is all semantics!  However you end up doing it to fit the needs of your project is your call.  At this point I'm not even sure what we're talking about any longer :).

This, what would you call this?

From the perspective of your framework this malformed call would be considered invalid.







--
-tony

Marcus Ottosson

unread,
Jun 2, 2014, 12:27:14 PM6/2/14
to python_in...@googlegroups.com

Another option if you are supporting python 2.6 or greater would be to use class decorators. They may make it more apparent that something special is happening with that class rather than having to know the details of metaclasses.

Yes! That’s a good point. Relieves me from having to expose people to the horrors of metaclasses. Thanks.

At this point I’m not even sure what we’re talking about any longer :)

I feel the same way.. I’m trying to get some wording going so I can ask questions about certain things, but it isn’t going too well! At this point, RPC is the same as messaging, no message is invalid, and RPC has a contract while messages do not (even though they do with protobuf), etc.

Let’s skip that, thanks for sticking with me anyways. :)

ps. and I like your code-theme. didn't know you could customize it like that. ;) (monokai sublime ftw)






--
Marcus Ottosson
konstr...@gmail.com

Marcus Ottosson

unread,
Jun 3, 2014, 4:07:01 PM6/3/14
to python_in...@googlegroups.com
Hey Tony, I just saw your presentation here.

Good stuff :)
--
Marcus Ottosson
konstr...@gmail.com

Tony Barbieri

unread,
Jun 3, 2014, 4:14:04 PM6/3/14
to python_in...@googlegroups.com
Thanks!  A lot of it is already outdated :).






--
-tony