ZeroRPC for heavy machine learning tasks


Guy Pikachu

May 1, 2014, 10:34:51 PM5/1/14
to zer...@googlegroups.com
Hi everyone!

I am using ZeroRPC as the communication path for submitting heavy machine learning analytics jobs.
I am quite new to scaling applications. My concern is whether the code below will scale to many
job processes and many servers.

More specifically: in the second script below, a call to client.invoke will take a very long time
to process (since it is a heavy machine learning task). If someone calls client.invoke a thousand times,
my concern is that my application will slow to a crawl, because there will be 1000 hanging requests waiting for responses
from the server.

Thanks for any help! Let me know if I should clarify anything.


import zerorpc
from sklearn import this_takes_really_long_to_process

class HelloRPC(object):
    def cluster(self, data):
        # Some crazy analytics will be performed, then the results will be sent to Node!
        return this_takes_really_long_to_process(data)

s = zerorpc.Server(HelloRPC())
s.bind("tcp://0.0.0.0:4242")
s.run()
var zerorpc = require("zerorpc");

var client = new zerorpc.Client();
client.connect("tcp://127.0.0.1:4242");

//Hmm, what if the code below gets executed a thousand times? Will there be performance issues due to many hanging requests?
client.invoke("cluster", "RPC", function(error, res, more) {
    console.log(res);
});

François-Xavier Bourlet

May 2, 2014, 3:00:38 PM5/2/14
to zer...@googlegroups.com
I assume your heavy processing is CPU bound (the opposite, IO bound,
would spend its time waiting, and so would be free to process more
tasks side by side).

- Every request will have to wait for the others (which makes sense,
after all). If you want truly concurrent execution, you will have to
throw more CPU cores at it: that is, run more instances of your
server.

- On the Python server, watch out for blocking the gevent loop for
too long. If you do heavy processing that never cooperates with the
gevent loop, zerorpc won't get a chance to run, which means lost
heartbeats.
Depending on your project, you could cooperatively yield via
gevent.sleep(0). But most often this is not practical and breaks
your code's encapsulation. You could disable zerorpc heartbeats, but
that would be a shame. I will propose a more scalable solution below.
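To illustrate the cooperative-yield idea, here is a minimal, library-free sketch. In a real zerorpc service, `yield_now` would be `gevent.sleep` (called with 0); here it is injected as a parameter so the sketch stands on its own. The function name, the `every` interval, and the injected callback are all illustrative assumptions.

```python
# Sketch: periodically yield control from a long CPU-bound loop so the
# gevent loop can run heartbeats. `yield_now` stands in for gevent.sleep.
def heavy_loop(items, process, yield_now, every=1000):
    out = []
    for i, item in enumerate(items):
        out.append(process(item))
        if i % every == 0:
            # Give the event loop a chance to service heartbeats.
            yield_now(0)
    return out
```

As noted above, sprinkling yields like this through real analytics code is often impractical, which is why the multi-process design below is preferable.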

problem: Since you want to be able to compute tons of requests
simultaneously, besides using faster algorithms and caching as best
you can, you need to run many instances of your server.

possible solution:

On each of your server machines you have:
- A manager service:
  - with a zerorpc server exposing your service on the network,
  - and a zerorpc client that binds a local unix socket with
heartbeat disabled.

This proxy simply forwards every request to the zerorpc client, thus
effectively spending all its time waiting for I/O. So heartbeats work
wonderfully. You can cap the number of requests/s as you wish, do any
assertions and I/O bound operations there directly, etc.

You can spawn and babysit all your CPU bound processes directly from
the manager (think gevent.subprocess). Spawn as many workers as you
want, and restart them if they crash (hey, don't forget to log what
happened too ;)). Congrats, now you have a multi-process Python
service!

- The heavy task process:
  - with a zerorpc server connected to the local unix socket of the
manager, heartbeat disabled as well.

It receives a request from the local manager, executes it, and
returns the result. It's your current server code, except that
instead of binding to a tcp port, it connects to a unix socket.
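The forwarding part of the manager can be sketched as below. The `Forwarder` class is testable on its own; the zerorpc wiring in `serve()` is an untested sketch, and the socket addresses, the `heartbeat=None` arguments, and the reversed bind/connect topology (the manager's client binds, workers connect) follow the description above but should be checked against your zerorpc version.

```python
class Forwarder(object):
    """Forward any method call to an inner client object.

    The manager only waits on I/O, so the gevent loop keeps running
    and zerorpc heartbeats stay healthy even while workers crunch.
    """

    def __init__(self, client):
        self._client = client

    def __getattr__(self, name):
        def call(*args):
            # Delegate the RPC to a local worker over the unix socket.
            return getattr(self._client, name)(*args)
        return call


def serve():
    # Untested wiring sketch: the manager's client *binds* the local
    # unix socket, each worker process *connects* to it, and heartbeats
    # are disabled on that local leg.
    import zerorpc
    worker_leg = zerorpc.Client(heartbeat=None)
    worker_leg.bind("ipc:///tmp/workers.sock")

    public = zerorpc.Server(Forwarder(worker_leg))
    public.bind("tcp://0.0.0.0:4242")
    public.run()
```

The worker side is then your existing `HelloRPC` server, connecting to `ipc:///tmp/workers.sock` instead of binding a tcp port.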

- On the client (or even an intermediate broker):
  - you manage one zerorpc client per manager service. A simple
round-robin should do the trick. Using an intermediate broker or some
naming service can help manage the list of managers to connect to.

- You have to implement the round-robin/balancing between all the
managers yourself because of a limitation of zmq<4.1. (Maybe one day
zerorpc will be updated to use the new features of zmq>=4.1.)
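The client-side round-robin can be sketched with `itertools.cycle`. The class name and `invoke` method here are illustrative; in real code each entry in `clients` would be a `zerorpc.Client` connected to one manager's tcp endpoint.

```python
import itertools

class RoundRobinPool(object):
    """Rotate requests across a fixed list of clients."""

    def __init__(self, clients):
        self._cycle = itertools.cycle(clients)

    def invoke(self, method, *args):
        # Pick the next manager in rotation and forward the call to it.
        client = next(self._cycle)
        return getattr(client, method)(*args)
```

Note this is the simplest possible balancing: it ignores how loaded each manager is, which is one reason a naming service or broker can be worth adding later.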

There you have it: some opinionated ideas for scaling your app.

Depending on whether you need a high-level RPC or not, using one of
the zmq patterns directly might be easier/more scalable:
http://zguide.zeromq.org/page:all#toc86

Best,
fx
> --
> You received this message because you are subscribed to the Google Groups
> "zerorpc" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to zerorpc+u...@googlegroups.com.
> To post to this group, send email to zer...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/zerorpc/f76bccca-9546-4b6c-bea8-3608d8c50afb%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Andreu Correa Casablanca

Jun 17, 2014, 10:40:23 AM6/17/14
to zer...@googlegroups.com
Hi,

I'm also using ZeroRPC to create a machine learning cluster with an RPC API.

The way I handle it is to create one worker (an independent process) per CPU core and to use
greenlets (which are tightly integrated with gevent, the library used inside zerorpc for Python).

The way I launch the workers is described in my last question; the way I use greenlets is something
like:

# Method exposed in the RPC service
def gameplay_end(self, subcontent_uuid, player_uuid, os_uuid):
    """
    Handles the notification of the end of a gameplay session.
    """
    subcontent_uuid = _check_uuid(subcontent_uuid)  # Don't care about this
    player_uuid = _check_uuid(player_uuid)          # Don't care about this
    os_uuid = _check_uuid(os_uuid)                  # Don't care about this

    # We compute new recommendations in a different NON-BLOCKING greenlet :D
    NBCRA1(subcontent_uuid, player_uuid, os_uuid).start()

    return  # I don't return anything, but you can return what you want

The Greenlet object that I use to compute things in the background is something like...

class NBCRA1(Greenlet):
    """
    This greenlet handles the computation of the
    Naive Baseline Content Recommendation Algorithm 1.
    """

    # Local cache for an immutable mongodb document
    _indicator_clusters = None

    def __init__(self, subcontent_uuid, player_uuid, os_uuid):
        # Variable initialization
        self.subcontent_uuid = subcontent_uuid
        # and so on...

        Greenlet.__init__(self)  # It's important to call the Greenlet constructor

    def _run(self):
        # Here you should put the "heavy" tasks, which have access to the
        # properties you initialized in the __init__ method.
        pass

I hope this is useful to you :) .

François-Xavier Bourlet

Jun 17, 2014, 2:30:57 PM6/17/14
to zer...@googlegroups.com
I think you should keep your objects independent of the intrinsics of
greenlets. NBCRA1 doesn't have to inherit from Greenlet.

Then you simply do gevent.spawn(NBCRA1_instance.compute)

which will run the method in a gevent greenlet (not a "simple"
greenlet) and makes it obvious that you are deferring the completion
of the method. Note that if the code being deferred is CPU bound, this
doesn't give you any advantage at all. It just makes your code harder
to reason about (because now you have to take async execution into
account).
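One detail worth spelling out: spawn-style APIs take the *callable* (plus its arguments), not the result of calling it. The sketch below uses the stdlib threading module as a stand-in for gevent.spawn, since the pass-`fn`-not-`fn()` semantics are the same.

```python
import threading

results = []

def compute(x):
    results.append(x * x)

# Right: pass the function and its arguments; the call is deferred
# until the background thread (or greenlet) actually runs.
t = threading.Thread(target=compute, args=(3,))
t.start()
t.join()

# Wrong (don't do this): compute(4) would run *now*, synchronously in
# the caller, and the thread would be handed its return value (None).
# t = threading.Thread(target=compute(4))
```

With gevent, the equivalent is `gevent.spawn(compute, 3)`; writing `gevent.spawn(compute(3))` runs the heavy work before the greenlet even exists.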