
distributed computing implementations


robin

Apr 2, 2003, 11:29:15 AM
I am attempting to summarise the distributed computing implementations
available to Python programmers. My conclusions so far are as follows:

If you want the biggest, boldest approach and don't care about
overhead, use CORBA.

If you want a simpler approach but with performance and implementation
difficulties, use SOAP or XML-RPC.

If you want the ultimate in simplicity and are willing to forsake
multi-language support, use Dopy, Pyro, or Twisted Spread.

Comments? Specifically, what are the relative advantages and
disadvantages of the last three mentioned products?

-- robin

Andrew Dalke

Apr 2, 2003, 12:25:20 PM
robin:

> If you want the biggest, boldest approach and don't care about
> overhead, use CORBA.

What overhead would this be? From what I see of omniORB,
there isn't really that much. Also, CORBA is the most complete;
e.g., it allows callbacks and passing around object references,
which the others don't have.

I've wanted to do something in CORBA for years. I've
never gotten there. One problem is that I'm used to Python,
where I don't need to describe the interface beforehand.
CORBA wants that IDL, and a change in the object's interface
must be reflected in the IDL. That just seems tedious to
me now.

I also do nearly everything in Python, so don't need the
ability for different languages to interoperate. I just pass
around Python objects.

I keep hoping that one of the component systems (like
in GNOME or KDE) takes off, so that "scripting" a la
COM takes off in unix systems, but I've been hoping
for that for the last 5 years.

> If you want a simpler approach but with performance and
> implementation difficulties, use SOAP or XML-RPC.

I wouldn't put SOAP as simple, and I've had problems with
interoperability between various packages. If I needed to pass
simple data around, I would use XML-RPC.
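
A minimal sketch of what "passing simple data around" with XML-RPC can look like. It uses the current standard-library module names (xmlrpc.server and xmlrpc.client; in 2003 these were SimpleXMLRPCServer and xmlrpclib). The add function, the loopback address, and the helper names start_server and call_add are assumptions made up for illustration, not part of any real service.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical remote procedure: only simple data goes in and out."""
    return a + b


def start_server():
    # Port 0 asks the OS for any free port; logRequests=False keeps stderr quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_add(port, a, b):
    # The proxy turns attribute access into an XML-RPC call carried over HTTP.
    with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
        return proxy.add(a, b)


if __name__ == "__main__":
    server = start_server()
    print(call_add(server.server_address[1], 2, 3))
    server.shutdown()
    server.server_close()
```

No interface description is needed on either side, which is the appeal; the cost is that only the basic XML-RPC types (numbers, strings, lists, structs and so on) can cross the wire.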

> If you want the ultimate in simplicity and are willing to foresake
> multi-language support, use Dopy, Pyro, or Twisted Spread.

One reason I'm looking at Twisted is because it handles other
interfaces as well. I need to talk to SQL databases, SOAP and
XML-RPC servers, straight HTTP, and spawned off external
processes.

There are also older interfaces, like PVM and MPI from the
high-performance computing world, and Linda, and more,
but I haven't looked into them for years.

Andrew
da...@dalkescientific.com


Cameron Laird

Apr 2, 2003, 1:00:36 PM
In article <b6f66v$28k$1...@slb3.atl.mindspring.net>,

Andrew Dalke <ada...@mindspring.com> wrote:
>robin:
>> If you want the biggest, boldest approach and don't care about
>> overhead, use CORBA.
>
>What overhead would this be? From what I see of omniORB,
>there isn't really that much. Also, CORBA is the most complete,
>eg, it allows callbacks and passing around object references
>the others don't have.
At a technical level, CORBA's as easy for an application
developer as, say, SOAP. It has a lot of overhead, though,
in the muddy thoughts of managers.
.
.

.
>I wouldn't put SOAP as simple, and I've had problems with
>interoperability between various packages. If I needed to pass
>simple data around, I would use XML-RPC.
My first instinct is to write my own tiny protocol, and
pass strings around. Often enough, that's all I need--
and look! we're back to language-neutrality.
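
As a rough illustration of the "tiny protocol, pass strings around" idea, here is a sketch of a newline-delimited protocol over TCP: one UTF-8 string per line in, one per line out. The upper-casing service, the class name UpperHandler, and the helpers start_server and ask are all hypothetical; any language that can read and write lines on a socket could talk to it.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Hypothetical service: one request per line, one reply per line."""

    def handle(self):
        for raw in self.rfile:                      # each iteration is one newline-terminated request
            text = raw.decode("utf-8").rstrip("\n")
            self.wfile.write((text.upper() + "\n").encode("utf-8"))


def start_server():
    # Port 0 lets the OS pick a free port for the sketch.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), UpperHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(port, lines):
    """Send each string on its own line and collect the replies in order."""
    replies = []
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("rwb") as stream:
            for line in lines:
                stream.write((line + "\n").encode("utf-8"))
                stream.flush()
                replies.append(stream.readline().decode("utf-8").rstrip("\n"))
    return replies
```

The obvious limitation is that the strings themselves must not contain newlines; a real protocol would need some framing or escaping rule for that.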

>
>> If you want the ultimate in simplicity and are willing to foresake
>> multi-language support, use Dopy, Pyro, or Twisted Spread.
>
>One reason I'm looking at Twisted is because it handles other
>interfaces as well. I need to talk to SQL databases, SOAP and
>XML-RPC servers, straight HTTP, and spawned off external
>processes.
>
>There is also older interfaces, like PVM and MPI from the
>high-performance computing world, and Linda, and more,
>but I haven't looked into them for years.
Tuple space and parallel architectures can be FAR more
"efficient" for scientific and engineering problems.
SOAP and such are just concessions to commercial
misunderstandings about what business needs.

I write this as a SOAP specialist and advocate.
.
.
.

--

Cameron Laird <Cam...@Lairds.com>
Business: http://www.Phaseit.net
Personal: http://phaseit.net/claird/home.html

Jp Calderone

Apr 2, 2003, 12:51:56 PM
On Wed, Apr 02, 2003 at 10:25:20AM -0700, Andrew Dalke wrote:
> robin:
> > If you want the biggest, boldest approach and don't care about
> > overhead, use CORBA.
>
> What overhead would this be? From what I see of omniORB,
> there isn't really that much. Also, CORBA is the most complete,
> eg, it allows callbacks and passing around object references
> the others don't have.
>

I'm not sure about XML-RPC, SOAP, or Dopy, but I know Twisted Spread can
pass references around like this (and you can tell it just -how- you want
them passed), and I think Pyro can too.

> I've wanted to do something in CORBA for years. I've
> never gotten there. One problem is that I'm used to Python,
> where I don't need to describe the interface beforhand.
> CORBA wants that IDL, and a change in the object's interface
> must be reflected in the IDL. That just seems tedious to
> me now.
>

Right. I think this is at least part of the overhead Robin was talking
about. Several of the other schemes don't require this.

> I also do nearly everything in Python, so don't need the
> ability for different langauges to interoperate. I just pass
> around Python objects.
>
> I keep hoping that one of the component systems (like
> in GNOME or KDE) takes off, so that "scripting" a la
> COM takes off in unix systems, but I've been hoping
> for that for the last 5 years.

That'd be nice.

>
> > If you want a simpler approach but with performance and
> > implementation difficulties, use SOAP or XML-RPC.
>
> I wouldn't put SOAP as simple, and I've had problems with
> interoperability between various packages. If I needed to pass
> simple data around, I would use XML-RPC.

Personally, I think SOAP is worth ignoring. ;) XML-RPC is what I would
choose if I were going for that sort of solution, too.

>
> > If you want the ultimate in simplicity and are willing to foresake
> > multi-language support, use Dopy, Pyro, or Twisted Spread.
>
> One reason I'm looking at Twisted is because it handles other
> interfaces as well. I need to talk to SQL databases, SOAP and
> XML-RPC servers, straight HTTP, and spawned off external
> processes.

This is definitely a benefit. (Hooray, integration).

From what you've said here and in your original post, I think Spread will
probably be a pretty good fit for you.

HTH,

Jp

--
"The problem is, of course, that not only is economics bankrupt but it has
always been nothing more than politics in disguise ... economics is a form
of brain damage." -- Hazel Henderson
--
up 13 days, 14:00, 7 users, load average: 0.15, 0.14, 0.07

Martin v. Löwis

Apr 2, 2003, 4:42:14 PM
escala...@yahoo.com (robin) writes:

> If you want the biggest, boldest approach and don't care about
> overhead, use CORBA.

I can't agree with that analysis: CORBA has, of all your alternatives,
the least network bandwidth requirements.

Regards,
Martin

Graham Dumpleton

Apr 2, 2003, 10:16:59 PM
escala...@yahoo.com (robin) wrote in message news:<a626bbd7.03040...@posting.google.com>...

You might want to consider looking at OSE as well. It isn't pure Python
but then that can be a good thing depending on what you are doing.

The web site for OSE is:

http://ose.sourceforge.net

Have a look through the "Python Manual" link on the web site.

Tim Hoffman

Apr 3, 2003, 6:29:02 AM
For simple cases you might like to look at ZODB

Tim

robin

Apr 3, 2003, 2:46:39 PM
I wrote:
> If you want the biggest, boldest approach and don't care about
> overhead, use CORBA.

mar...@v.loewis.de (Martin v. Löwis) wrote:
> I can't agree with that analysis: CORBA has, of all your alternatives,
> the least network bandwidth requirements.

I was referring to overhead like:
a) writing the interface definition and dealing with ID strings
b) more code
c) size of the "library"

I am quite sure the other implementations of DC are lighter in these
areas, though for some that may not be an issue. I was not considering
network bandwidth, but am glad to hear CORBA is efficient in that
respect.

-- robin

robin

Apr 3, 2003, 2:49:05 PM
cla...@lairds.com (Cameron Laird) wrote:

> SOAP and such are just concessions to commercial misunder-
> standings about what business needs.

I would like to know more about what you mean by this.

Jp Calderone <exa...@intarweb.us> wrote:

> Personally, I think SOAP is worth ignoring. ;) XML-RPC is what I would
> choose if I were going for that sort of solution, too.

I'd like more details here too. Why ignore SOAP?

-- robin

Martin v. Löwis

Apr 3, 2003, 5:57:25 PM
escala...@yahoo.com (robin) writes:

> mar...@v.loewis.de (Martin v. Löwis) wrote:
> > I can't agree with that analysis: CORBA has, of all your alternatives,
> > the least network bandwidth requirements.
>
> I was referring to overhead like:
> a) writing the interface definition and dealing with ID strings

It is true that you have to write interface definitions. However,
considering that you are doing distributed computing, I can't really
see this as "overhead". Overhead compared to what?

What are ID strings, and why do you need them in CORBA?

> b) more code

Code written by yourself? Compared to what? A CORBA client in Python
is really short. A CORBA server is larger, but then, the servers
for other distributed computing infrastructures are also larger.

> c) size of the "library"

That depends on the implementation you use. For a client, it is
certainly true that XML-RPC libraries are significantly (by a factor
of 10) smaller than the Fnorb libraries.

Regards,
Martin

Cameron Laird

Apr 5, 2003, 9:01:53 AM
In article <a626bbd7.03040...@posting.google.com>,

robin <escala...@yahoo.com> wrote:
>cla...@lairds.com (Cameron Laird) wrote:
>
>> SOAP and such are just concessions to commercial misunder-
>> standings about what business needs.
>
>I would like to know more about what you mean by this.
.
.
.
I'll do this in an abbreviated form.

SOAP's s'posed to be the "Simple Object Access Protocol".
Its defining document begins, "SOAP is a lightweight
protocol ..."

It's a bad sign that it's fiction from the start. SOAP
isn't lightweight or simple, and it doesn't particularly
access objects.

SOAP is an RPC implementation. I'm fine with RPC, and I
like SOAP--'hope I get more jobs to do it during the next
year. However, I think commercial experience has demonstrated
adequately that RPC isn't safe in the hands of the programming
fraternity at large. It's something mediocre programmers do
wrong.

Businesses *think* they want their development crews to
standardize on an RPC, and XML is a good thing, isn't it?,
but they're wrong. RPC across organizational boundaries
turns out to be somewhere between difficult and a disaster.

Businesses that are happy with SOAP are actually using it
as a messaging service for asynchronous transmission of
XMLified documents with business content.

I repeat: for a mixture of correct and incorrect reasons,
XML, RPC, and so on are believed to be good things for
business. People conclude that SOAP must be a super-technology,
solving whole layers of issues at once. It's not. It's OK,
and, with enough support from Microsoft, IBM, Oracle, and a
few others, it certainly can dominate. In truth, though, it
answers the wrong question.

Python's own Paul Prescod has plenty to say about SOAP's
technical flaws. Check out
<URL: http://mail.python.org/pipermail/xml-sig/2002-February/007183.html >
and other references available through <URL: http://prescod.com >.

Irmen de Jong

Apr 5, 2003, 2:35:19 PM
Jp Calderone wrote:
> On Wed, Apr 02, 2003 at 10:25:20AM -0700, Andrew Dalke wrote:
>
>>robin:
>>
>>>If you want the biggest, boldest approach and don't care about
>>>overhead, use CORBA.
>>
>>What overhead would this be? From what I see of omniORB,
>>there isn't really that much. Also, CORBA is the most complete,
>>eg, it allows callbacks and passing around object references
>>the others don't have.
>>
>
>
> I'm not sure about XML-RPC, SOAP, or Dopy, but I know Twisted Spread can
> pass references around like this (and you can tell it just -how- you want
> them passed), and I think Pyro can too.

Certainly. You can pass around the actual Pyro proxy objects, without
bothering about object location, ID, etc, or pass around the object's UID.

Pyro also allows callbacks, although you have to do a little bit of extra
coding to enable them.


>>I've wanted to do something in CORBA for years. I've
>>never gotten there. One problem is that I'm used to Python,
>>where I don't need to describe the interface beforhand.
>>CORBA wants that IDL, and a change in the object's interface
>> must be reflected in the IDL. That just seems tedious to
>>me now.
>>
>
>
> Right. I think this is at least part of the overhead Robin was talking
> about. Several of the other schemes don't require this.

>
>
>>I also do nearly everything in Python, so don't need the
>>ability for different langauges to interoperate. I just pass
>>around Python objects.

I'd say: use Pyro (but I'm biased, of course ;=)
You won't have to specify an interface other than your regular
Python class, and it's designed for a Python-only environment.

--Irmen de Jong.

robin

Apr 5, 2003, 3:33:03 PM
cla...@lairds.com (Cameron Laird) wrote:

>I'll do this in an abbreviated form.

Thank you. That was very informative. At least it confirms my
suspicions.

I suppose now I'll be looking closer at Dopy, Pyro, and Twisted
Spread, so anything that will help me distinguish between them is
welcome.

-- robin

Duncan Grisby

Apr 6, 2003, 4:34:36 PM

>I am attempting to summarise the distributed computing implementations
>available to Python programmers. My conclusions so far are as follows:
>
>If you want the biggest, boldest approach and don't care about
>overhead, use CORBA.

You might want to read the slides to the presentation I gave at the UK
Python conference last week. The title is "CORBA? Isn't that
Obsolete?". You can get it here:

http://www.grisby.org/presentations/accu2003.pdf

Cheers,

Duncan.

--
-- Duncan Grisby --
-- dun...@grisby.org --
-- http://www.grisby.org --

Duncan Grisby

Apr 6, 2003, 4:37:03 PM
In article <m3of3n6...@mira.informatik.hu-berlin.de>,

Martin v. Löwis <mar...@v.loewis.de> wrote:

>> c) size of the "library"
>
>That depends on the implementation you use. For a client, it is
>certainly true that XML-RPC libraries are significantly (factor 10)
>smaller than the Fnorb libraries.

It's also worth pointing out that, although Fnorb and omniORB are both
a megabyte or two in size, that's still significantly smaller than
Python itself, so size is not an issue for the vast majority of
applications written in Python.

Uche Ogbuji

Apr 10, 2003, 11:04:40 AM
mar...@v.loewis.de (Martin v. Löwis) wrote in message news:<m3r88kk...@mira.informatik.hu-berlin.de>...

True, and Mike Olson did some pretty thorough analysis on this. See:

http://www-106.ibm.com/developerworks/library/ws-pyth9/


--Uche
http://uche.ogbuji.net

Robin Becker

Apr 10, 2003, 2:21:32 PM
In article <d116fbae.03041...@posting.google.com>, Uche
Ogbuji <uc...@ogbuji.net> writes
>mar...@v.loewis.de (Martin v. Löwis) wrote in message news:<m3r88kk24p.fsf@mira.

>informatik.hu-berlin.de>...
>> escala...@yahoo.com (robin) writes:
>>
>> > If you want the biggest, boldest approach and don't care about
>> > overhead, use CORBA.
>>
>> I can't agree with that analysis: CORBA has, of all your alternatives,
>> the least network bandwidth requirements.
>
>True, and Mike Olson did some pretty thorough analysis on this. See:
>
>http://www-106.ibm.com/developerworks/library/ws-pyth9/
>
>
>--Uche
>http://uche.ogbuji.net

I'm interested to find out why the Python server/client pair is so
abominably slow when going across our lightly loaded 100/10 Mbit/s
Ethernet.


I modified the time-client.py script to allow setting of the server name
using an environ script.

When using the time-client on the same machine I see
C:\Python\tmp\test_servers>time-client.py
Connecting to ('localhost', 8080)
Time to connect: 0.040000

Sending a long string to the server
Time to send a string of 21000 chars, 0.000000

Recieving a long stirng from the server
Time to receive a string of 22000 chars, 0.000000

Sending lots of ints to the server
Time to send 5000 ints, 32.297000 (0.006459 per call)

On a machine on the same local net I see
R:\Python\tmp\test_servers>time-client.py
Connecting to ('192.168.0.3', 8080)
Time to connect: 0.000000

Sending a long string to the server
Time to send a string of 21000 chars, 0.000000

Recieving a long stirng from the server
Time to receive a string of 1455 chars, 0.000000

Sending lots of ints to the server
Time to send 5000 ints, 1007.937000 (0.201587 per call)


What am I missing that causes such painfully slow connections?
--
Robin Becker

Duncan Grisby

Apr 10, 2003, 8:03:33 PM
In article <d116fbae.03041...@posting.google.com>,
Uche Ogbuji <uc...@ogbuji.net> wrote:

>True, and Mike Olson did some pretty thorough analysis on this. See:
>
>http://www-106.ibm.com/developerworks/library/ws-pyth9/

Interesting article. Unfortunately, the first two timings of the CORBA
client aren't timing what Mike thinks they are. The first time, for
"connecting to server" doesn't actually connect to the server at all.
It just creates an object reference for it.

The second time, for "send string" _is_ timing sending a string, but
that is also when the TCP connection to the server is made. A
significant portion of the time is spent setting up the connection,
not transferring the string. If I modify the client to do the first
call twice, I get

Connecting to server
Time to connect to server, 0.000386

Sending a long string to the server
Time to send a string of 21000 chars, 0.002099

Sending a long string to the server
Time to send a string of 21000 chars, 0.001034

Recieving a long stirng from the server
Time to receive a string of 22000 chars, 0.001088

Sending lots of ints to the server
Time to send 5000 ints, 0.921309 (0.000184 per call)


So you see that the first call takes about twice as long as the second
one. Another interesting thing if you care about raw speed is that
CORBA strings are not allowed to have embedded nulls in them, and
undergo code set conversion, and checking these things slows them
down. Using a sequence of octets, which doesn't have any checks or
conversions brings the time down to 0.000827 and 0.000839 for sending
and receiving respectively.
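
The measurement pitfall described above can be sketched without CORBA at all. In the following, LazyClient is a hypothetical stand-in for an object reference that only opens its TCP connection on first use, so the first timed call silently includes connection setup and the second does not. The echo server, the 21000-byte payload, and all names are assumptions for illustration; this is not the benchmark code from the article.

```python
import socket
import socketserver
import threading
import time


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:          # echo every newline-terminated request back
            self.wfile.write(line)


class LazyClient:
    """Hypothetical client that connects only when the first call is made."""

    def __init__(self, address):
        self.address = address
        self.sock = None
        self.stream = None

    def call(self, payload):
        if self.sock is None:            # connection setup is hidden inside the first call
            self.sock = socket.create_connection(self.address)
            self.stream = self.sock.makefile("rwb")
        self.stream.write(payload + b"\n")
        self.stream.flush()
        return self.stream.readline().rstrip(b"\n")

    def close(self):
        if self.sock is not None:
            self.stream.close()
            self.sock.close()


def timed(func, *args):
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run_benchmark():
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    client = LazyClient(server.server_address)
    payload = b"x" * 21000
    try:
        first_reply, first = timed(client.call, payload)    # includes the TCP connect
        second_reply, second = timed(client.call, payload)  # reuses the open connection
    finally:
        client.close()
        server.shutdown()
        server.server_close()
    return first_reply, second_reply, first, second
```

Repeating the first call, as done above with the CORBA client, is the simplest way to separate the two costs; how large the gap is will depend on the machine and network.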

For comparison, the raw socket client has these times on my machine:

Connecting to server
Time to connect to server, 0.001273

Sending a long string to the server
Time to send a string of 21000 chars, 0.000988

Recieving a long stirng from the server
Time to receive a string of 22000 chars, 0.000873

Sending lots of ints to the server
Time to send 5000 ints, 10.290595 (0.002058 per call)


Lies, damn lies, and statistics...

Irmen de Jong

Apr 11, 2003, 12:39:08 PM
Duncan Grisby wrote:

> Interesting article. Unfortunately, the first two timings of the CORBA
> client aren't timing what Mike thinks they are. The first time, for
> "connecting to server" doesn't actually connect to the server at all.
> It just creates an object reference for it.
>
> The second time, for "send string" _is_ timing sending a string, but
> that is also when the TCP connection to the server is made. A
> significant portion of the time is spent setting up the connection,
> not transferring the string. If I modify the client to do the first
> call twice, I get
>
> Connecting to server
> Time to connect to server, 0.000386
>
> Sending a long string to the server
> Time to send a string of 21000 chars, 0.002099
>
> Sending a long string to the server
> Time to send a string of 21000 chars, 0.001034
>
> Recieving a long stirng from the server
> Time to receive a string of 22000 chars, 0.001088
>
> Sending lots of ints to the server
> Time to send 5000 ints, 0.921309 (0.000184 per call)
>


To add some more to the mix, I benchmarked Pyro (3.2):

Connecting to server
Time to connect to server, 0.026004

Sending a long string to the server
Time to send a string of 21000 chars, 0.003536

Recieving a long stirng from the server
Time to receive a string of 22000 chars, 0.003387

Sending lots of ints to the server
Time to send 5000 ints, 10.057741 (0.002012 per call)

I find this very fast for a pure Python solution....

I also measured the message sizes with tcpdump as mentioned in the article:

Actual message size sending 1,000 characters: 1390
Actual message size sending 100 integers: 27698 (with CORBA amongst the
smallest)


> Lies, damn lies, and statistics...

Amen.

--Irmen de Jong

Robin Becker

Apr 11, 2003, 12:47:11 PM
In article <3e96efac$0$49107$e4fe...@news.xs4all.nl>, Irmen de Jong
<irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes
>Duncan Grisby wrote:
......

>
>To add some more to the mix, I benchmarked Pyro (3.2) :
>
>Connecting to server
>Time to connect to server, 0.026004
>
>Sending a long string to the server
>Time to send a string of 21000 chars, 0.003536
>
>Recieving a long stirng from the server
>Time to receive a string of 22000 chars, 0.003387
>
>Sending lots of ints to the server
>Time to send 5000 ints, 10.057741 (0.002012 per call)
>
>I find this very fast for a pure Python solution....
>
>I also measured the message sizes with tcpdump as mentioned in the article:
>
>Actual message size sending 1,000 characters: 1390
>Actual message size sending 100 integers: 27698 (with CORBA amongst the
>smallest)
.....was this on the same machine? I find things seem to be awful off
the machine even if I pass in a dotted numeric address. I'm getting the
idea that I'm doing name lookups even if there's no apparent need.
>--Irmen de Jong
>

--
Robin Becker

Irmen de Jong

Apr 11, 2003, 12:59:45 PM
Robin Becker wrote:
> In article <3e96efac$0$49107$e4fe...@news.xs4all.nl>, Irmen de Jong
> <irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes
>>To add some more to the mix, I benchmarked Pyro (3.2) :
[...]

> .....was this on the same machine? I find things seem to be awful off
> the machine even if I pass in a dotted numeric address. I'm getting the
> idea that I'm doing name lookups even if there's no apparent need.

Yep that was on the same machine. Running the Pyro server on a different
machine on my 100 Mbit lan gives:

Connecting to server
Time to connect to server, 0.024740

Sending a long string to the server
Time to send a string of 21000 chars, 0.003866

Recieving a long stirng from the server
Time to receive a string of 22000 chars, 0.003579

Sending lots of ints to the server
Time to send 5000 ints, 4.586772 (0.000917 per call)

So you see, it's even faster when the server is running on another machine.
This can be explained by the pickling/unpickling being split across two
CPUs; when you run client and server on a single CPU, they fight for cycles.

--Irmen de Jong

Robin Becker

Apr 11, 2003, 1:58:08 PM
In article <3e96f481$0$49099$e4fe...@news.xs4all.nl>, Irmen de Jong
<irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes

>Sending lots of ints to the server
>Time to send 5000 ints, 4.586772 (0.000917 per call)
>
is that 5000 connections with new sockets? Seems really fast to me.

>So you see, it's even faster when the server is running on another machine.
>This can be explained because the pickling/unpickling is split across two
>CPUs, when you run client+server on a single CPU they are fighting for cycles.
>
>--Irmen de Jong
>

--
Robin Becker

Irmen de Jong

Apr 11, 2003, 2:34:36 PM
Robin Becker wrote:
> In article <3e96f481$0$49099$e4fe...@news.xs4all.nl>, Irmen de Jong
> <irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes
>
>>Sending lots of ints to the server
>>Time to send 5000 ints, 4.586772 (0.000917 per call)
>>
>
> is that 5000 connections with new sockets? Seems really fast to me.

No, Pyro reuses socket connections for method calls.

--Irmen

Robin Becker

Apr 11, 2003, 3:08:03 PM
In article <3e970abc$0$49101$e4fe...@news.xs4all.nl>, Irmen de Jong

<irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes
>Robin Becker wrote:
>> In article <3e96f481$0$49099$e4fe...@news.xs4all.nl>, Irmen de Jong
>> <irmen@-NOSPAM-REMOVE-THIS-xs4all.nl> writes
>>
>>>Sending lots of ints to the server
>>>Time to send 5000 ints, 4.586772 (0.000917 per call)
>>>
>>
>> is that 5000 connections with new sockets? Seems really fast to me.
>
>No, Pyro reuses socket connections for method calls.
>
>--Irmen
>
sigh :(
for real distributed computing I wouldn't want to hold the sockets open.

I just tried the original server.py/time-client.py pair again, this time
on a FreeBSD system:

FreeBSD-->localhost 500 secs
FreeBSD-->192.168.0.3 50 secs (a local win32 machine).


I guess I'm just not doing what I think I'm doing.
--
Robin Becker

Irmen de Jong

Apr 11, 2003, 5:04:01 PM
Robin Becker wrote:

> sigh :(
> for real distributed computing I wouldn't want to hold the sockets open.

Sorry if I missed that requirement. I was just responding to Duncan's
CORBA timings...

> I just tried the original server.py/time-client.py pair again this time
> on a freeBSD system
>
> freeBSD-->localhost 500 secs
> freeBSD-->192.168.0.3 50 secs (a local win32 machine).
>
>
> I guess I'm just not doing what I think I'm doing.

I really can't explain these numbers either. Localhost is 10 times slower
than over the network???

Unless your client & server both struggle for CPU cycles on the same machine
at the same time...

--Irmen

David Mertz

Apr 11, 2003, 6:46:09 PM
As much as I like Pyro, it doesn't quite fit into the roundup Mike and
Uche wrote. Or at least not entirely. Raw sockets, CORBA, XML-RPC, and
SOAP are all *language independent*, while Pyro is Python-specific. For
many projects, accessing the same services in different languages is not
an issue, but in some projects it matters. Unfortunately, if that
specification matters, Pyro is ruled out (as is, for example, Java RMI).

This isn't to say that Irmen's time and size data isn't interesting and
worth knowing. But just to point out that the article wasn't simply
snubbing Pyro for no reason.

Yours, David...

--
mertz@ _/_/_/_/_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY:_/_/_/_/ v i
gnosis _/_/ Postmodern Enterprises _/_/ s r
.cx _/_/ MAKERS OF CHAOS.... _/_/ i u
_/_/_/_/_/ LOOK FOR IT IN A NEIGHBORHOOD NEAR YOU_/_/_/_/_/ g s


Bengt Richter

Apr 11, 2003, 7:06:32 PM
On Fri, 11 Apr 2003 23:04:01 +0200, Irmen de Jong <irmen@-NOSPAM-REMOVETHIS-xs4all.nl> wrote:

>Robin Becker wrote:
>
>> sigh :(
>> for real distributed computing I wouldn't want to hold the sockets open.
>
>Sorry if I missed that requirement. I was just responding to Duncan's
>CORBA timings...
>
>> I just tried the original server.py/time-client.py pair again this time
>> on a freeBSD system
>>
>> freeBSD-->localhost 500 secs
>> freeBSD-->192.168.0.3 50 secs (a local win32 machine).
>>
>>
>> I guess I'm just not doing what I think I'm doing.
>
>I really can't explain these numbers either? Localhost is 10 times slower
>than over the network???

IIRC, there was some discussion some time ago about this being a particular
FreeBSD pessimality problem that wasn't true of Linux, but someone was fixing it?
But I thought FreeBSD had a good networking reputation, so this needs
double-checking for sure. Maybe it was for an old version?

>
>Unless your client & server both struggle for CPU cycles on the same machine
>at the same time...

Or the struggle involves unnecessary scheduling waits of some kind?

Regards,
Bengt Richter

Robin Becker

Apr 12, 2003, 5:18:57 AM
In article <b77hpo$59a$0...@216.39.172.122>, Bengt Richter <bo...@oz.net>
writes
.....

>>I really can't explain these numbers either? Localhost is 10 times slower
>>than over the network???
>
>IIRC, there was some discussion some time ago about this being a particular
>FreeBSD pessimality problem that wasn't true of Linux, but someone was fixing
>it?
>But FreeBSD has good networking reputation I thought, so this needs double
>check,
>for sure. Maybe it was for an old version?
>
....this is version 4.5. I guess things would be easier if one could be
sure that it's not a DNS-type thing, but I see from the dev list that
there's already some discussion about how and where DNS lookups can be
avoided or sped up for sockets that have a supposedly known address.
Naively, I thought it would be relatively easy to just get a 32- or 64-bit
int back from some socket module function and then shove that into the
address part of the socket address.

Given the above, perhaps it's not DNS at all, but some other things that
get in the way. Do OSes impose limits on the number of ports or
connections per second, etc.?
>Regards,
>Bengt Richter

--
Robin Becker

Steve Holden

Apr 13, 2003, 9:58:56 AM
"Robin Becker" <ro...@jessikat.fsnet.co.uk> wrote in message
news:6ZY$RKABo9...@jessikat.fsnet.co.uk...

Seems that the easiest way to get insight into the network traffic you're
generating is to install Ethereal or some similar software and simply look
to see what's going over the wire.

regards
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
Did you miss PyCon DC 2003? Would you come to PyCon DC 2004?

Duncan Grisby

Apr 13, 2003, 1:48:38 PM
In article <71M$aHCTKx...@jessikat.demon.co.uk>,
Robin Becker <ro...@jessikat.fsnet.co.uk> wrote:

>sigh :(
>for real distributed computing I wouldn't want to hold the sockets open.

Why ever not!? Most protocols people use for distributed computing do
hold sockets open. As long as connections are seen as a cache, and can
be closed if necessary, it's a good thing. TCP start-up overhead is
huge, especially if the network has high latency.

Robin Becker

Apr 13, 2003, 2:14:29 PM
In article <1e740$3e99a2f6$516049d2$15...@nf1.news-service.com>, Duncan
Grisby <dunca...@grisby.org> writes

>In article <71M$aHCTKx...@jessikat.demon.co.uk>,
> Robin Becker <ro...@jessikat.fsnet.co.uk> wrote:
>
>>sigh :(
>>for real distributed computing I wouldn't want to hold the sockets open.
>
>Why ever not!? Most protocols people use for distributed computing do
>hold sockets open. As long as connections are seen as a cache, and can
>be closed if necessary, it's a good thing. TCP start-up overhead is
>huge, especially if the network has high latency.
>
>Cheers,
>
>Duncan.
>
Well after thinking about it I guess you're right. I guess I'm harking
back to times when only small numbers of file handles could be held and
thinking that sockets might be subject to the same kind of limits.
--
Robin Becker

Martin v. Löwis

Apr 13, 2003, 4:35:51 PM
Robin Becker <ro...@jessikat.fsnet.co.uk> writes:

> Well after thinking about it I guess you're right. I guess I'm harking
> back to times when only small numbers of file handles could be held and
> thinking that sockets might be subject to the same kind of limits.

You are right that file descriptors are a limited resource, and any
carefully designed distributed-computing library needs to take this
into account. *However*, if taken into account, you can make good use
of this limited resource by nearly exhausting it:

Allow your library to consume a certain number of file descriptors,
leaving just enough for local file I/O. Then the
library should implement some collection mechanism for unused socket
connections, and shut down anything that has not been used for a
while. The protocol needs to allow shut-downs from either side, as
both partners may experience file descriptor exhaustion.

In CORBA, you need to keep open all connections from which you expect
a response. So if your allocated socket descriptor pool is exhausted,
you first close those connections that have no outstanding requests.
If that is still insufficient, you also close the socket on which you
have been waiting for a response longest, and tell the application
that this connection has timed out.
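
The "connections as a bounded cache" idea can be sketched roughly as follows. ConnectionCache is a hypothetical class, not taken from any ORB: it keeps at most max_open sockets and closes the least recently used one when a new connection is needed. It deliberately leaves out the harder parts described above (tracking outstanding requests, idle timeouts, and shutdowns initiated by the peer).

```python
import socket
from collections import OrderedDict


class ConnectionCache:
    """Hypothetical cache of open sockets, bounded to max_open descriptors."""

    def __init__(self, max_open, connect=socket.create_connection):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self.max_open = max_open
        self.connect = connect               # injectable so the sketch can be exercised offline
        self.open = OrderedDict()            # address -> socket, least recently used first

    def get(self, address):
        if address in self.open:
            self.open.move_to_end(address)   # mark as most recently used
            return self.open[address]
        while len(self.open) >= self.max_open:
            _, idle = self.open.popitem(last=False)   # evict the longest-unused connection
            idle.close()
        sock = self.connect(address)
        self.open[address] = sock
        return sock

    def close_all(self):
        while self.open:
            _, sock = self.open.popitem()
            sock.close()
```

A caller would use cache.get((host, port)) wherever it would otherwise open a fresh socket; a closed connection is simply reopened on the next get, which is what makes it a cache rather than a hard limit on peers.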

Regards,
Martin

Robin Becker

Apr 13, 2003, 6:06:57 PM
In article <m37k9y2...@mira.informatik.hu-berlin.de>, Martin v.
Löwis <mar...@v.loewis.de> writes
.....

>In CORBA, you need to keep open all connections from which you expect
>a response. So if your allocated socket descriptor pool is exhausted,
>you first close those connections that have no outstanding requests.
>If that is still insufficient, you also close the socket on which you
>have been waiting for a response longest, and tell the application
>that this connection has timed out.
>
>Regards,
>Martin
So how do master/slave implementations handle thousands of slaves? If
the master is really a 'master' and not a SETI-type server, it would seem
that these resource limits might play a part. How do GRID systems handle
this?
--
Robin Becker

Martin v. Löwis

Apr 14, 2003, 1:12:13 AM
Robin Becker <ro...@jessikat.fsnet.co.uk> writes:

> so how do master slave implementations handle thousands of slaves? If
> the master is really a 'master' and not a SETI type server it would seem
> that these resource limits might play a part. How do GRID systems work
> this?

In a compute-intensive application, it is sensible to close the TCP
connection after you have communicated the job. Keeping the connection
alive is important to avoid the TCP connection setup; however, that cost is
negligible if you spend several minutes or more in computation.

I don't actually know how grid computing protocols work, but I believe
they are not concerned about networking performance, as they use HTTP
and SOAP to communicate jobs.

Regards,
Martin
