Hi Yoav:
I personally use pysage in production, and I know pysage is used on
production sites like crowdspring.com.
From the networking perspective, pysage is known to perform well in
production with 100+ concurrent clients and a low memory and CPU
footprint. I believe it can handle much more, but I haven't heard of it
being used in that kind of environment yet.
Whether pysage will be production ready for your specific case depends
on your requirements. For prime calculation, I would assume you would
use reliable transports, with a central pysage node that dispatches
reliable messages to all your clients and gathers the results back
reliably. That's very easily done with pysage.
Pysage makes parallel computing with multi-core machines very easy.
For example, in your server, you can spin up different pysage groups
for network messaging, task distribution, and result processing.
Each of the three groups runs in its own independent OS process, and
they can easily communicate with each other. In your client, you could
spin up multiple worker groups to do the computing (if you have
multiple cores, or if your computation blocks on certain IO).
What do you expect the size of the messages to be? Do you need any kind of
security built on top of the networking?
Tell me a bit more about your requirements and I'll be happy to provide an example.
John
OK. I'm working on extending the network example in the doc.
Meanwhile, there is some example code in the unit tests that may help
a bit, particularly:
http://code.google.com/p/pysage/source/browse/trunk/tests/test_network.py
In general, your clients should know which server to connect to ahead
of time. The server, upon receiving a request from a client to
act as a computing "slave", will distinguish the clients by their
"sender/address" info. I will try to have an example for you on this
tonight.
John
The "reliability" used here is strictly in the sense of a network
protocol. Reliable messaging usually means that when you send a
series of messages to another node, you can expect them to arrive
and to arrive in order.
In your server code, you can treat a slave as "dead" if it hasn't
responded within a timeout (say, 1 hour) or if it has closed its
connection without sending back its result. You can then
re-distribute that piece of work to some other "slave".
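To make that bookkeeping concrete, here is a rough sketch in plain Python. It is not pysage API: the `WorkTracker` class, its method names, and the one-hour default are all made up for illustration. It only records who holds which piece of work and hands back anything whose slave timed out or disconnected, so the server can give it to another slave.

```python
import time


class WorkTracker:
    """Tracks which slave holds which work item (hypothetical helper, not part of pysage)."""

    def __init__(self, timeout_seconds=3600, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the timeout is easy to test
        self.assigned = {}      # slave address -> (work item, time assigned)

    def assign(self, slave, work):
        self.assigned[slave] = (work, self.clock())

    def complete(self, slave):
        # The slave sent its result back, so stop tracking it.
        self.assigned.pop(slave, None)

    def disconnected(self, slave):
        # Connection closed before a result arrived: return the orphaned work.
        entry = self.assigned.pop(slave, None)
        return None if entry is None else entry[0]

    def reap_timed_out(self):
        # Return the work items of every slave that exceeded the timeout.
        now = self.clock()
        dead = [slave for slave, (_, started) in self.assigned.items()
                if now - started > self.timeout_seconds]
        return [self.assigned.pop(slave)[0] for slave in dead]
```

The server would call `reap_timed_out()` periodically and `disconnected()` from whatever hook fires when a connection closes, then reassign whatever comes back.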
HTH,
John
As promised, here is an example illustrating how you can distribute
computation using pysage:
http://code.google.com/p/pysage/source/browse/#svn/trunk/example/cluster
There are three modules:
server, client, and common
They are very short: they simply add up the numbers from 1 to 1000,
distributing half of the work to each slave and combining the results.
But I believe they illustrate in the simplest way what you are trying
to achieve.
Run 1 instance of the server and 2 instances of the client on the same machine:
Result:
* got a new slave from ('127.0.0.1', 46776)
* got a new slave from ('127.0.0.1', 46777)
* got two slaves, now distributing work
* slave ('127.0.0.1', 46776) got done with his work: 374750
* slave ('127.0.0.1', 46777) got done with his work: 124750
* work is all done, the final result is: 499500
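The numbers above can be checked without any networking. The sketch below is plain Python, not the code from the example directory, and `split_range` is a hypothetical helper. The printed totals are consistent with summing the integers in `range(1000)` split at 500; that split is inferred from the output, not taken from the example source.

```python
def split_range(start, stop, parts):
    """Split range(start, stop) into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append(range(lo, hi))
        lo = hi
    return chunks


chunks = split_range(0, 1000, 2)              # one chunk per slave
partials = [sum(chunk) for chunk in chunks]   # what each slave would compute
total = sum(partials)                         # what the server combines
print(partials, total)                        # [124750, 374750] 499500
```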
HTH,
John
> Your example is a bit weird to me, the clients are the slaves, I would make
> the servers slaves so any client can use their processing power.
Yea, you can easily do the reverse. You may still need to implement a
way to keep track of which servers are under load, so your client
knows which other servers to send the work to.
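One simple way to do that tracking, sketched in plain Python (the `LoadTable` class and its methods are hypothetical, not something pysage provides): the client counts outstanding jobs per server and always picks the one with the fewest.

```python
class LoadTable:
    """Client-side count of outstanding jobs per server (illustrative only)."""

    def __init__(self, servers):
        self.pending = {server: 0 for server in servers}

    def pick(self):
        # Choose the server with the fewest outstanding jobs.
        server = min(self.pending, key=self.pending.get)
        self.pending[server] += 1
        return server

    def done(self, server):
        # Call this when a result comes back from `server`.
        self.pending[server] -= 1
```

This only sees load generated by this one client. If several clients share the servers, the servers would have to report their own load instead.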
> One more thing, do you use threads to keep connections? (that could
> affect scalability)
Pysage doesn't use threads, period. There hasn't been a need to. By
default, pysage is single-threaded and the default TCP transport
uses async sockets. This is ideal for processing fast tasks. For
processing things that block, you can easily fire up a separate pysage
group that exclusively handles "computing". So you will have one
process doing IO and another process for computing. Internally, the
two OS processes communicate with each other via a domain socket.
Nothing is ever stepping on each other's toes if you don't design them
to.
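For anyone unfamiliar with the single-threaded async style, here is a small illustration using only the standard library's `selectors` module. It shows the general technique, not pysage's actual transport code; `serve_once` and the client names are made up, and `socket.socketpair()` stands in for real TCP connections so the snippet runs on its own.

```python
import selectors
import socket


def serve_once(connections, timeout=1.0):
    """Read one message from each connection using a single thread."""
    sel = selectors.DefaultSelector()
    for name, sock in connections.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)
    received = {}
    while len(received) < len(connections):
        events = sel.select(timeout)
        if not events:
            break  # nothing became readable in time
        for key, _ in events:
            # Only sockets reported as readable are touched, so nothing blocks.
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
    sel.close()
    return received


# Two fake "clients", each connected to the server by a socket pair.
pairs = {name: socket.socketpair() for name in ("client-a", "client-b")}
for name, (client_end, _) in pairs.items():
    client_end.sendall(name.encode() + b" says hi")

received = serve_once({name: server_end for name, (_, server_end) in pairs.items()})

for client_end, server_end in pairs.values():
    client_end.close()
    server_end.close()
print(received)
```

The loop never waits on any one client, which is why a slow handler (rather than the number of connections) is what usually limits this design.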
> I was thinking of a pull mode approach such as the slave asks for work. But
> it seems like I'll need to keep an eye for the slave in case of a crash :(
> So I guess the "supervisor" should keep an eye for slaves that are
> registered for my client (does this make sense?)
> Does pysage have an autodiscovery feature, or will I have to know who are the
> possible Servers (slaves)?
I think I can make a case for either approach. The important thing is
that you need at least one master server that has a permanent
"address" that other clients or servers connect to. The master server
can then introduce these "dynamic" nodes to each other.
Pysage does not provide an autodiscovery mechanism, although,
depending on what you need, such a feature isn't hard to implement.
> That's great, that means one client could have 100,000* slaves with
> no hassle :)
That number will mostly depend on how you implement your actors. If
you don't block, pysage won't block for you :).
> Many Thanks!
> Yoav
> *totally made up this number
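As a rough idea of what the master's "introduction" job could look like, here is a plain-Python sketch. None of it is pysage API: the `Master` class, the node names, and the addresses are all invented for illustration. Each node registers with the master at its well-known address and gets back the nodes that are already known.

```python
class Master:
    """Hypothetical registry living at a well-known address (not a pysage class)."""

    def __init__(self):
        self.nodes = {}   # node name -> (host, port)

    def register(self, name, address):
        # Tell the newcomer about everyone already registered, then add it.
        peers = dict(self.nodes)
        self.nodes[name] = address
        return peers

    def unregister(self, name):
        # Called when a node leaves or is found to be dead.
        self.nodes.pop(name, None)


master = Master()
master.register("slave-1", ("10.0.0.5", 9000))
peers = master.register("client-1", ("10.0.0.9", 9100))
print(peers)   # {'slave-1': ('10.0.0.5', 9000)}
```

In a real setup the `register` call would arrive as a network message, and the master would also need to notify existing nodes about newcomers if they have to talk in both directions.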