Using pysage for production?


Glazner

Sep 21, 2010, 10:15:18 AM
to pysage
Hi,

Do you think pysage is production ready?
I need to rewrite software that distributes (CPU bound) computations
across computers.
Does anyone have an example of this (e.g. prime calculation)?

Many Thanks,

Yoav

John Yang

Sep 21, 2010, 6:35:12 PM
to pys...@googlegroups.com
Hi Yoav:

I personally use pysage in production and I know pysage is used in
production sites like crowdspring.com.

From the networking perspective, pysage is known to perform well in
production with 100+ concurrent clients with low memory and CPU
footprint. I believe it can handle much more, but I haven't heard of it
being used in that kind of environment yet.

Whether pysage will be production ready for your specific case depends
on your requirements. For prime calculation, I would assume you would
use reliable transports and a central pysage node that dispatches
reliable messages to all your clients and gathers the results back
reliably. That's very easily done with pysage.

Pysage makes parallel computing with multi-core machines very easy.
For example, in your server, you can spin up different pysage groups
for networking messaging, task distribution, and result processing.
All three groups will operate in their own independent OS process and
can easily communicate with each other. In your client, you could
spin up multiple worker groups to do the computing (if you have
multiple cores, or if your computation blocks on certain IO).

What do you expect the size of the messages to be? Do you need any kind of
security built on top of the networking?

Tell me a bit more about your requirements and I'll be happy to provide an example.

John


yoav glazner

Sep 22, 2010, 2:49:43 AM
to pys...@googlegroups.com
Hi!

On Wed, Sep 22, 2010 at 12:35 AM, John Yang <bigj...@gmail.com> wrote:
> Whether pysage will be production ready for your specific case depends
> on your requirements.  For prime calculation, I would assume you would
> use reliable transports, a central pysage node that dispatches
> reliable messages to all your clients and gathers result back
> reliably.  That's very easily done with pysage.
 
Yeah, reliability is most important


> What do you expect the size of messages be?  Do you need any kind of
> security built on top of the networking?
>
> Tell me a bit more about your requirement, I'll be happy to provide an example.

Thanks for the quick reply!
I'm producing crypto keys in a private network with ~50 machines (each with one core).
I don't need to send big data to the "workers", and currently the workers will write the results to a network-shared hard disk.

Do you have a closer example? The network example on the site is so simple (no result gathering). I want to see how remote nodes know who the sender of a message is, so they can reply.

Yoav

John Yang

Sep 23, 2010, 3:25:18 PM
to pys...@googlegroups.com
> I don't need to send big data to the "workers", and currectly the workers
> will write the results to a network shared hardisk.
> Do you have any close example? the network example on the site is so
> simple... (no result gathering...) I want to see how remote nodes knows who
> is the sender of a message so they can reply.

OK. I'm working on extending the network example in the doc.
Meanwhile, there is some example code in the unit tests that may help
a bit, particularly:

http://code.google.com/p/pysage/source/browse/trunk/tests/test_network.py

In general, your clients should know which server to connect to ahead
of time. The server, upon receiving a request from a client to
act as a computing "slave", will distinguish the clients by their
"sender/address" info. I will try to have an example for you on this
tonight.

John

yoav glazner

Sep 23, 2010, 4:56:41 PM
to pys...@googlegroups.com
Thanks! I would love to see it!
One more thing: what about reliability in the sense of a dead "slave" (computer exploded or something)?

John Yang

Sep 24, 2010, 12:54:06 AM
to pys...@googlegroups.com
> one more thing, what about reliability in the sense of a dead
> "slave"?(computer exploded or something)

The "reliability" used here is strictly in the sense of a network
protocol. Reliable messaging usually means that when you send a
series of messages to another node, you can expect them all to arrive
and to arrive in order.

In your server code, you can consider a slave "dead" if it hasn't
responded within some time limit (1 hour, for example) or if it has
closed its connection without sending back the result. You can then
re-distribute that piece of work to some other "slave".
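
A minimal sketch of that bookkeeping, in plain Python rather than pysage's own API. The class and method names are made up for illustration; the idea is that the server's message handlers would call them when a slave connects, reports a result, or disconnects, and that a periodic check requeues work from slaves that have been silent for too long.

```python
import time


class WorkTracker:
    """Tracks which worker holds which task and when it was handed out."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.pending = []    # tasks waiting for a worker
        self.assigned = {}   # worker address -> (task, time assigned)

    def add_task(self, task):
        self.pending.append(task)

    def assign(self, worker, now=None):
        """Give the next pending task to `worker`; return it, or None."""
        if not self.pending:
            return None
        now = time.monotonic() if now is None else now
        task = self.pending.pop(0)
        self.assigned[worker] = (task, now)
        return task

    def complete(self, worker):
        """The worker sent back its result, so its task is finished."""
        self.assigned.pop(worker, None)

    def connection_closed(self, worker):
        """The worker went away; requeue its task if it never reported."""
        entry = self.assigned.pop(worker, None)
        if entry is not None:
            self.pending.append(entry[0])

    def reap_timed_out(self, now=None):
        """Requeue tasks whose workers have been silent for too long."""
        now = time.monotonic() if now is None else now
        dead = [worker for worker, (_, started) in self.assigned.items()
                if now - started > self.timeout_seconds]
        for worker in dead:
            self.connection_closed(worker)
        return dead
```

The `now` parameter is only there so the timeout logic can be exercised without waiting; in real use it would be left out and the monotonic clock used instead.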

HTH,

John

John Yang

Sep 24, 2010, 2:00:09 AM
to pys...@googlegroups.com
Hi Yoav:

As promised, here is an example illustrating how you can distribute
computation using pysage:

http://code.google.com/p/pysage/source/browse/#svn/trunk/example/cluster

There are three modules:

server, client, and common

They are very short: they add up the integers below 1000,
distributing half of the work to each slave and combining the results.
But I believe they illustrate in the simplest way what you are trying
to achieve.

Run 1 instance of the server and 2 instances of the client on the same machine:

Result:

* got a new slave from ('127.0.0.1', 46776)
* got a new slave from ('127.0.0.1', 46777)
* got two slaves, now distributing work
* slave ('127.0.0.1', 46776) got done with his work: 374750
* slave ('127.0.0.1', 46777) got done with his work: 124750
* work is all done, the final result is: 499500
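
The numbers in this output are consistent with summing range(0, 1000) split at 500. Here is a small sketch of just that arithmetic, in plain Python with no networking and no pysage calls; the function names are hypothetical, and the "workers" are ordinary function calls standing in for the two slaves.

```python
def split_range(start, stop, parts):
    """Split range(start, stop) into `parts` contiguous (lo, hi) chunks."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def worker_sum(chunk):
    """What each worker computes: the sum of its own half-open chunk."""
    lo, hi = chunk
    return sum(range(lo, hi))


chunks = split_range(0, 1000, 2)              # [(0, 500), (500, 1000)]
partials = [worker_sum(c) for c in chunks]    # one partial sum per worker
total = sum(partials)                         # the server combines them
print(partials, total)                        # [124750, 374750] 499500
```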

HTH,

John

John Yang

Sep 24, 2010, 2:01:06 AM
to pys...@googlegroups.com
When I get the chance, I will add this to the documentation as well.
But for now, just play around with the code.

yoav glazner

Sep 24, 2010, 2:42:32 AM
to pys...@googlegroups.com
Cool! It looks good. I will speak to my team when I get back to work; meanwhile, I'll play with this.

Many thanks,

Yoav. 

yoav glazner

Oct 3, 2010, 7:53:25 AM
to pys...@googlegroups.com
Hi,

Your example seems a bit odd to me: the clients are the slaves. I would make the servers the slaves, so that any client can use their processing power.

One more thing: do you use threads to keep connections open? (That could affect scalability.)

 

John Yang

Oct 3, 2010, 12:33:42 PM
to pys...@googlegroups.com
Hi Yoav:

> Your example is a bit weird to me, the clients are the slaves, I would make
> the servers slaves so any client can use their processing power.

Yea, you can easily do the reverse. You may still need to implement a
way to keep track of which servers are under load, so your client
knows which other servers to send the work to.

> One more thing, do you use threads to keep connections? (that could
> affect scalability)

Pysage doesn't use threads, period. There hasn't been a need to. By
default, pysage is single threaded and the default TCP transport is
using async sockets. This is ideal for processing fast tasks. For
processing things that block, you can easily fire up a separate pysage
group that exclusively handles "computing". So you will have one
process doing IO and another process for computing. Internally, the
two OS processes communicate with each other via a domain socket.
The groups never step on each other's toes unless you design them
to.
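
To illustrate only the process split, here is a rough sketch in plain Python that hands work to a second OS process and gets answers back as messages. It does not use pysage at all: it uses the standard subprocess module with pipes as a stand-in for pysage groups and their domain socket, the function names are hypothetical, and for brevity the caller waits for each answer instead of running an async IO loop.

```python
import subprocess
import sys

# Program run by the separate "compute" OS process: read one integer per
# line from stdin and answer with its square on stdout.
COMPUTE_LOOP = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(int(line) ** 2, flush=True)\n"
)


def start_compute_process():
    """Start the compute process with pipes for requests and replies."""
    return subprocess.Popen(
        [sys.executable, "-c", COMPUTE_LOOP],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def ask(proc, n):
    """Send one request and wait for the matching reply."""
    proc.stdin.write(f"{n}\n")
    proc.stdin.flush()
    return int(proc.stdout.readline())


def stop(proc):
    """Close the request pipe so the compute loop ends, then clean up."""
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()


if __name__ == "__main__":
    compute = start_compute_process()
    print(ask(compute, 7))
    stop(compute)
```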

yoav glazner

Oct 5, 2010, 2:22:50 AM
to pys...@googlegroups.com
Hi,

On Sun, Oct 3, 2010 at 6:33 PM, John Yang <bigj...@gmail.com> wrote:
>> Your example is a bit weird to me, the clients are the slaves, I would make
>> the servers slaves so any client can use their processing power.
>
> Yea, you can easily do the reverse.  You may still need to implement a
> way to keep track of which servers are under load, so your client
> knows which other servers to send the work to.

I was thinking of a pull-mode approach, in which the slave asks for work. But it seems like I'll need to keep an eye on the slave in case of a crash :(
So I guess the "supervisor" should keep an eye on the slaves that are registered for my client (does this make sense?)

Does pysage have an autodiscovery feature, or will I have to know which servers (slaves) are available?
 
>> One more thing, do you use threads to keep connections? (that could
>> affect scalability)
>
> Pysage doesn't use threads, period.  There hasn't been a need to.  By
> default, pysage is single threaded and the default TCP transport is
> using async sockets.  This is ideal for processing fast tasks.  For
> processing things that block, you can easily fire up a separate pysage
> group that exclusively handles "computing".  So you will have one
> process doing IO and another process for computing.  Internally, the
> two OS processes communicate with each other via a domain socket.
> Nothing is ever stepping on each other's toes if you don't design them
> to.

That's great, that means one client could have 100,000* slaves with no hassle :)

Many Thanks!
  Yoav

*totally made up this number

John Yang

Oct 7, 2010, 12:54:38 AM
to pys...@googlegroups.com
On Tue, Oct 5, 2010 at 1:22 AM, yoav glazner <yoavg...@gmail.com> wrote:
> Hi,
> On Sun, Oct 3, 2010 at 6:33 PM, John Yang <bigj...@gmail.com> wrote:
>>
>> Hi Yoav:
>>
>> > Your example is a bit weird to me, the clients are the slaves, I would
>> > make
>> > the servers slaves so any client can use their processing power.
>>
>> Yea, you can easily do the reverse.  You may still need to implement a
>> way to keep track of which servers are under load, so your client
>> knows which other servers to send the work to.
>>
> I was thinking of a pull mode approach such as the slave asks for work. But
> it seems like I'll need to keep an eye for the slave in case of a crash :(
> So I guess the "supervisor" should keep an eye for slaves that are
> registered for my client(this make sense?)
> Do pySage have an autodicovery feature, or will I have to know who are the
> possible Servers(slaves)?
>

I think I can make a case for either approach. The important thing is
that you need at least one master server with a permanent "address"
that other clients or servers connect to. The master server
can then introduce these "dynamic" nodes to each other.

Pysage does not provide an autodiscovery mechanism. However,
depending on what you need, such a feature isn't hard to implement.
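
As a rough sketch of the bookkeeping such a master could keep, here is a tiny registry in plain Python. It is not part of pysage, and the class and method names are made up for illustration; in practice the master's message handlers would call it when nodes connect and disconnect.

```python
class Registry:
    """Kept by the master: maps a node name to its (host, port) address."""

    def __init__(self):
        self.nodes = {}

    def register(self, name, address):
        """Record a node and return the peers that were already known."""
        peers = dict(self.nodes)
        self.nodes[name] = address
        return peers

    def unregister(self, name):
        """Forget a node, for example after its connection closes."""
        self.nodes.pop(name, None)

    def lookup(self, name):
        """Return the address of a node, or None if it is unknown."""
        return self.nodes.get(name)
```

A node that registers gets back the addresses of everyone who registered before it, which is one simple way for the master to "introduce" nodes to each other.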

>>
>> > One more thing, do you use threads to keep connections? (that could
>> > affect scalability)
>>
>> Pysage doesn't use threads, period.  There hasn't been a need to.  By
>> default, pysage is single threaded and the default TCP transport is
>> using async sockets.  This is ideal for processing fast tasks.  For
>> processing things that block, you can easily fire up a separate pysage
>> group that exclusively handles "computing".  So you will have one
>> process doing IO and another process for computing.  Internally, the
>> two OS processes communicate with each other via a domain socket.
>> Nothing is ever stepping on each other's toes if you don't design them
>> to.
>
> Thats great, that means one client could have 100,000* slaves  with
> no hassle :)

That number will mostly depend on how you implement your actors. If
you don't block, pysage won't block for you :).


yoav glazner

Oct 7, 2010, 1:08:31 AM
to pys...@googlegroups.com
>> I was thinking of a pull mode approach such as the slave asks for work. But
>> it seems like I'll need to keep an eye for the slave in case of a crash :(
>> So I guess the "supervisor" should keep an eye for slaves that are
>> registered for my client(this make sense?)
>> Do pySage have an autodicovery feature, or will I have to know who are the
>> possible Servers(slaves)?
>
> I think I can make a case for either approach.  The important thing is
> that you at least need to have one master server that has a permanent
> "address" that other client or servers connect to.  The master server
> can then introduce these "dynamic" nodes to each other.
>
> Pysage does not provide an autodiscovery mechanism.  Although,
> depending on what you need, such a feature isn't hard to implement.

Yes, I see. Such a naming-service actor isn't hard to implement.
 
>> Thats great, that means one client could have 100,000* slaves  with
>> no hassle :)
>
> That number will mostly depend on how you implement your actors.  If
> you don't block, pysage won't block for you :).

Cool!
Thanks a lot for all your answers. I hope this thread will be useful for others who are looking into this.