
omniORB misses GIOPs


Thomas Richter

Jan 6, 2005, 10:36:58 AM
Hi,

maybe someone's able to figure out what's happening here; I'm running out
of luck. The problem seems to be that omniORB misses a lot of GIOP
messages when under load. The situation is as follows:

I have one server running omniORB as the ORB, serving my objects. There
is one heavily used object that provides a Get() method to retrieve
some data from. This method usually blocks the client call until some
data becomes available at the server. In my situation it is heavily
called from two clients running concurrently; each of them launches
several threads, putting the server under load, and omniORB launches
around 110 threads to handle the requests. Interestingly, the server
stops reacting to *any* message somewhere in the middle of the
requests coming from the client started last. From that point on, not a
single message makes it through to the server.

I've enabled GIOP dumping on both the server and the client side; from what
I see there, the GIOPs are sent out correctly by both clients, but
omniORB at that point dumps nothing; it just sits there and waits. I
also ran Ethereal to sniff the network traffic, and the GIOPs from both
clients do in fact make it to the interface. They appear on the network,
be it on eth0 or the local loopback interface.

I see no interesting output from omniORB at the point this happens:

omniORB: recieve codeset service context and set TCS to (ISO-8859-1,UTF-16)
omniORB: AsyncInvoker: thread id = 107 has started. Total threads = 106
omniORB: giopWorker task execute.
omniORB: inputMessage: from giop:tcp:130.149.13.150:51635 88 bytes
omniORB:
4749 4f50 0102 0000 0000 004c 0000 00c8 GIOP.......L....
0300 0000 0000 0000 0000 000e fed1 2cdd ..............,.
4100 0027 5600 0000 001d 0000 0000 0004 A..'V...........
4765 7400 0000 0002 0000 0001 0000 000c Get.............
0000 0000 0001 0001 0001 0109 4e45 4f00 ............NEO.
0000 0002 000a 0000 ........
omniORB: recieve codeset service context and set TCS to (ISO-8859-1,UTF-16)
omniORB: AsyncInvoker: thread id = 108 has started. Total threads = 107
omniORB: giopWorker task execute.
omniORB: inputMessage: from giop:tcp:130.149.13.150:51635 88 bytes
omniORB:
4749 4f50 0102 0000 0000 004c 0000 00c9 GIOP.......L....
0300 0000 0000 0000 0000 000e fed1 2cdd ..............,.
4100 0027 5600 0000 001d 0000 0000 0004 A..'V...........
4765 7400 0000 0002 0000 0001 0000 000c Get.............
0000 0000 0001 0001 0001 0109 4e45 4f00 ............NEO.
0000 0002 000a 0000 ........
omniORB: recieve codeset service context and set TCS to (ISO-8859-1,UTF-16)
omniORB: Scan for idle connections (1105014032,917168000)
omniORB: Scan for idle connections done (1105014032,917168000).
omniORB: Scan for idle connections (1105014037,927326000)
omniORB: Scan for idle connections done (1105014037,927326000).

After the Get request with request id 0xc9 dumped above, the server
just sits idle, even though requests continue to come in and are
visible on the network.

Any idea what this could be? I'm badly stuck here...

So long,
Thomas

Thomas Richter

Jan 7, 2005, 4:25:39 AM
Hi,

> maybe someone's able to figure out what's happening here, I'm getting out of
> luck. The problem seems to be that the omniORB seems to miss a lot of
> GIOPs when under load. The situation is as follows:

/* snip */

Found it. omniORB simply ran out of threads and couldn't handle additional
incoming requests. Raising the thread limit made the program work again.
However, I would prefer it if the ORB's debug output said a word
about this when it happens - that would have sped up my debugging
enormously. (-;
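For later readers: the limit in question is, I believe, omniORB 4's
maxServerThreadPoolSize configuration entry, which defaults to 100 -
consistent with the roughly 110 threads observed above. A sketch of the
corresponding omniORB.cfg entry (the value 500 is just an example, not a
recommendation):

```
# omniORB.cfg -- raise the cap on server-side worker threads
# (assumption: maxServerThreadPoolSize, default 100 in omniORB 4)
maxServerThreadPoolSize = 500
```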

So long,
Thomas

Duncan Grisby

Jan 7, 2005, 5:02:32 AM
In article <crlkij$1mk$2...@mamenchi.zrz.TU-Berlin.DE>,
Thomas Richter <th...@cleopatra.math.tu-berlin.de> wrote:

Unless you are doing oneway requests, omniORB should never drop
requests, even if it runs out of threads. It should just queue them
up. What version of omniORB are you using? Please can you try the
current cvs version, since that has at least one fix that might help
you.

If you can reproduce the problem with the current cvs version, please
send me a complete trace from -ORBtraceLevel 40 -ORBtraceThreadId 1
-ORBtraceInvocations 1. Don't post it here, because it'll be huge.

Cheers,

Duncan.

--
-- Duncan Grisby --
-- dun...@grisby.org --
-- http://www.grisby.org --

Ciaran McHale

Jan 7, 2005, 5:44:39 AM
Thomas Richter (th...@cleopatra.math.tu-berlin.de) writes:
>[an omniORB server hangs when it receives lots of requests from clients
>and each request takes a long time to execute]

I suspect that your server is blocking because it is running out of threads
in a thread pool that has an upper limit on its size. Look in the omniORB
configuration file (or the discussion of configuration entries in the
omniORB manual) and see if increasing the value of some config entries
resolves the problem.


Regards,
Ciaran.

Thomas Richter

Jan 7, 2005, 8:01:24 AM
Hi Duncan,

>>Found it. omniORB simply ran out of threads and couldn't handle additional
>>incoming requests. Raising the thread limit made the program work again.
>>However, I would prefer it if the ORB's debug output said a word
>>about this when it happens - that would have sped up my debugging
>>enormously. (-;

> Unless you are doing oneway requests, omniORB should never drop
> requests, even if it runs out of threads. It should just queue them
> up. What version of omniORB are you using?

After reading your post above, I upgraded to the latest version I could
find, 4.0.5 on the SourceForge page. The problem - if it is one -
remained, so to say. By now I believe that this is a result of the
current design rather than a bug in omniORB. No, these requests are
not oneway.

> Please can you try the current cvs version, since that has at least
> one fix that might help you.

> If you can reproduce the problem with the current cvs version, please
> send me a complete trace from -ORBtraceLevel 40 -ORBtraceThreadId 1
> -ORBtraceInvocations 1. Don't post it here, because it'll be huge.

Thanks, I'll try that. Would this request queuing show up in the GIOP
debug output, or is the GIOP only dumped once the queue finds a thread
able to handle the request? In the latter case, I wouldn't be able to
tell request dropping and request queuing apart, since my client cannot
continue if some other, more vital requests don't make it to the server
core because there are no spare threads.

The main problem I have here is that the clients may run a lot of (not
that important) "Get()" requests that have to block until new data
arrives. Unfortunately, these eat up the threads in the thread pool,
stealing them from other, more vital activities.

Is there some way to assign a separate thread pool to each object, or
to each POA? In the latter case, I could handle these "not so vital"
objects in another POA with its own thread pool.

So long,
Thomas

Michi Henning

Jan 7, 2005, 4:28:41 PM
"Thomas Richter" <th...@cleopatra.math.tu-berlin.de> wrote in message
news:crm174$9o8$3...@mamenchi.zrz.TU-Berlin.DE...

>
> The main problem I've here is that the clients may run a lot of (not
> that important) "Get()" requests that have to block until new data
> arrives. Unfortunately, these eat up the threads in the thread pool,
> stealing threads for other, more vital activities.

That's a design issue that's fundamental to CORBA. If you have
long-running requests, such as your Get() requests, that block
the client for a long time, each request ties up a thread in the
server. That's fundamentally non-scalable, because
you won't get all that many threads before your server runs
out of resources. (The number of threads will typically be in the
hundreds, rather than in the thousands or tens of thousands, unless
you have a machine with *huge* amounts of memory.)
Unfortunately, that is also the reason why the pull model of
the event service can't scale.

CORBA doesn't offer the equivalent of AMI for the server side,
so there really is no way to fix this (other than to split your Get()
calls into an asynchronous StartGet() call and a GetCompleted()
callback from the server). But doing that is awkward and raises
other problems, such as how to deal with disconnected/non-responsive
clients and how to reap their state.

If you are interested, have a look at Ice. It offers asynchronous method
dispatch for the server side, which allows you to service an arbitrary
number of concurrent client requests with a single thread.

> Is there some way to assign a thread-pool separately to each object, or
> each POA? In the latter case, I would handle these "not so vital"
> objects in some other POA with its own thread pool?

Not in any standardized way -- the CORBA spec doesn't talk about threads
(well, not quite; there is a bit of talk about threads, but nowhere near
enough for your purposes). But most ORBs have proprietary means that allow
you to configure thread pool sizes and such.

Cheers,

Michi.

--
Michi Henning Ph: +61 4 1118-2700
ZeroC, Inc. http://www.zeroc.com

Ciaran McHale

Jan 7, 2005, 5:00:41 PM
Thomas Richter (th...@cleopatra.math.tu-berlin.de) wrote:
>The main problem I've here is that the clients may run a lot of (not
>that important) "Get()" requests that have to block until new data
>arrives. Unfortunately, these eat up the threads in the thread pool,
>stealing threads for other, more vital activities.
>
>Is there some way to assign a thread-pool separately to each object, or
>each POA? In the latter case, I would handle these "not so vital"
>objects in some other POA with its own thread pool?

The CORBA specification does not define clear semantics for the
ORB_CTRL_MODEL POA policy, which means that different CORBA products
can implement different threading models (for example, thread-pool,
thread-per-request, ...). And, of course, a CORBA product could define
proprietary POA policies that allow other threading models. For
example, a proprietary POA policy in Orbix allows you to obtain a
thread-pool-per-POA model. It's been a while since I looked at what
flexibility omniORB offers in this regard, so I don't know whether it
can offer something similar.

However, switching to a thread-pool-per-POA or a
thread-pool-per-object model (either within omniORB or in a different
CORBA product) is a bad idea, in my opinion. This is because the
design of your client-server interaction seems to me to be
intrinsically unscalable. Throwing more and more threads at it will
help it scale up a bit more, but probably not by much. Instead, I think
you should consider alternative, more scalable designs. One that
springs to mind is using callback objects. In other words, the
client implements a callback object and passes the reference of this
callback object to the server. The server then invokes an operation on
the callback object whenever new data becomes available. The server
has to maintain a list of callback objects, but doing that is less
overhead than having lots of threads blocked, waiting a long time for
data to become available.

Regards,
Ciaran.

Ciaran McHale

Jan 8, 2005, 10:24:38 AM
Michi Henning (mi...@triodia.com) wrote:
>[...] CORBA doesn't offer the equivalent of AMI for the server side,
>so there really is no way to fix this (other than to split your Get()
>calls into an asynchronous StartGet() call and GetCompleted()
>callback from the server. But doing that is awkward and raises
>other problems, such as how to deal with disconnected/non-responsive
>clients and how to reap their state.
>
>If you are interested, have a look at Ice. It offers asynchronous
>method dispatch for the server side, which allows you to service an
>arbitrary number of concurrent client requests with a single thread.

If I remember correctly, TAO has a proprietary enhancement that is
basically a server-side equivalent of AMI. So, yes, there is no
portable way to do this in CORBA, but if Thomas wants to do it rather
than use callback objects then it would probably be less work to port
his existing code from omniORB to TAO than from omniORB to a non-CORBA
system.

Thomas, if you do decide to switch from omniORB to TAO, then I suggest
that you download the CORBA Utilities package, which is available
from:

www.iona.com/devcenter/corba/utilities.htm

In the PDF documentation, there is a chapter called "Portability of
C++ CORBA Applications" that will make the port easier.


Regards,
Ciaran.

Thomas Richter

Jan 9, 2005, 7:47:17 AM
Ciaran McHale wrote:

> However, switching to a thread-pool-per-poa or a
> thread-pool-per-object model (either within omniORB or in a different
> CORBA product) is a bad idea, in my opinion. This is because the
> design of your client-server interaction seems to be to be
> intrinsically unscalable. Throwing more and more threads at it will
> help it scale up a bit more but probably not by much. Instead, I think
> you should consider alternative, more scalable designs. One that
> springs to mind is that of using callback objects.

*Sigh*. *Big sigh*. Yes, I know. That's the way I had it before I changed
the design. Unfortunately, callback objects just open another can of
worms. The problem is that they require the client of my application to
also act as a CORBA server (for the callbacks, that is), but when a
callback object is called from the server into the client, every
firewall I tried (the WinXP built-in one, iptables on Linux, ...)
screams out loud on detecting an incoming connection without a related
outgoing connection. In other words, even though this is the more
scalable option, it is a show-stopper in my application as well, because
I cannot expect all my customers - mostly students - to be capable of
configuring the WinXP or similar firewalls to allow these connections.
And even if they are capable, they might be suspicious about what the
heck is messing with their system (at least I would be if something
wanted to connect to mine). It makes things more complicated for my
audience, and I have to avoid that.

> In other words, the
> client implements a callback object and passes the reference of this
> callback object to the server. The server then invokes an operation on
> the callback object whenever new data becomes available. The server
> has to maintain a list of callback objects, but doing that is less
> overhead than having lots of threads block waiting a long time for
> data to become available.

Yes, exactly my old design. Unfortunately, not suitable. )-:

Thus, thanks a lot! But - any other clever ideas? *Sigh*.

So long,
Thomas

Rob Ratcliff

Jan 9, 2005, 3:21:25 PM
Hi Thomas,

I noticed that omniORB supports bidirectional GIOP; did you try that
for your firewall problem? (I'd be curious to hear about any issues
related to that and firewalls. A cost-effective, general-purpose,
simple firewall solution for CORBA has always been its Achilles' heel.)

Another approach could be to poll the server for a response; it's not
the prettiest approach, but perhaps it is more scalable than adding
more threads? (This is what you'd have to do with HTTP- or RMI-based
solutions, by the way. Of course, there is always the IIOP proxy, but
that really isn't cost-effective or simple enough for a typical home
user behind a firewall.)

Looking forward to hearing the solution to your dilemma!

Rob

> *Sigh*. *Big Sigh*. Yes, I know. That's the way I had it before I made
> a change in the design. Unfortunately, callback objects just open
> another can of worms. The problem is that this requires the client of
> my application also act as a CORBA server (for the callbacks, that
> is), but if a callback object is called from the server into the
> client, every firewall I tried (WinXP built-in, iptables on Linux...)
> screams out loud by detecting an ingoing connection without a related
> outgoing connection. In other words, even though this is a more
> scaleable option, this is a show-stopper in my application as well
> because I cannot expect that all my customers - students, mostly - are
> capable enough of configuring the Win-XP or related firewalls such
> that these connections are allowed. And even if they are capable
> enough, they might be suspicious about what the heck is messing with
> their system (at least I would if something wants to connect to it).
> It makes things more complicated for my audience, and I have to avoid
> that.

Stelios G. Sfakianakis

Jan 10, 2005, 3:00:46 AM
Michi Henning wrote:
> [snip]

>
> CORBA doesn't offer the equivalent of AMI for the server side, so
> there really is no way to fix this (other than to split your Get()
> calls into an asynchronous StartGet() call and GetCompleted()
> callback from the server. But doing that is awkward and raises other
> problems, such as how to deal with disconnected/non-responsive
> clients and how to reap their state.
>
> If you are interested, have a look at Ice. It offers asynchronous
> method dispatch for the server side, which allows you to service an
> arbitrary number of concurrent client requests with a single thread.
>

If using TAO then AMH (Asynchronous Method Handling) could be useful:

"The TAO asynchronous method handling (AMH) is a mechanism, which
extends the concepts of AMI from clients to servers. Servers with AMH
capability can return immediately from (potentially) long, blocking
requests. This makes the servers capable of higher throughput."

(from
<http://www.cs.wustl.edu/~schmidt/ACE_wrappers/TAO/docs/releasenotes/amh.html>)

Check also: <http://www.cs.wustl.edu/~schmidt/PDF/AMH.pdf>,
<http://www.cs.wustl.edu/~schmidt/PDF/DOA-02.pdf>

Cheers
Stelios

--
Stelios G. Sfakianakis | Center of Medical Informatics
Voice: +30-2810-391650 | Institute of Computer Science
PGP Key ID: 0x5F30AAC2 | FORTH, http://www.ics.forth.gr/

Duncan Grisby

Jan 10, 2005, 5:29:28 AM
In article <crm174$9o8$3...@mamenchi.zrz.TU-Berlin.DE>,
Thomas Richter <th...@cleopatra.math.tu-berlin.de> wrote:

>The main problem I've here is that the clients may run a lot of (not
>that important) "Get()" requests that have to block until new data
>arrives. Unfortunately, these eat up the threads in the thread pool,
>stealing threads for other, more vital activities.

Ah, ok. Your initial post made it sound as though omniORB was
completely dropping the requests. It sounds like it is actually just
queueing them behind other requests, which is the correct behaviour.

>Is there some way to assign a thread-pool separately to each object, or
>each POA? In the latter case, I would handle these "not so vital"
>objects in some other POA with its own thread pool?

Not in omniORB, no. omniORB allocates threads to calls before it knows
which POA / object the request is for, so that isn't feasible.

As others have said, trying to use blocking requests for these Get()
calls is not a very good way to do it. The three better approaches are
callbacks (and bidirectional GIOP should help with your firewall
issues), explicit polling by your clients (which is ugly but would
work), or to implement some kind of asynchronous message handling in
omniORB. The latter is the clean way to do it, but is clearly the most
work.

Michi Henning

Jan 10, 2005, 7:49:48 AM
"Thomas Richter" <th...@math.tu-berlin.de> wrote in message
news:crr941$g1n$1...@mamenchi.zrz.TU-Berlin.DE...

>
> *Sigh*. *Big Sigh*. Yes, I know. That's the way I had it before I made a
> change in the design. Unfortunately, callback objects just open another
> can of worms. The problem is that this requires the client of my
> application also act as a CORBA server (for the callbacks, that is), but
> if a callback object is called from the server into the client, every
> firewall I tried (WinXP built-in, iptables on Linux...) screams out loud
> by detecting an ingoing connection without a related outgoing
> connection. In other words, even though this is a more scaleable option,
> this is a show-stopper in my application as well because I cannot expect
> that all my customers - students, mostly - are capable enough of
> configuring the Win-XP or related firewalls such that these connections
> are allowed. And even if they are capable enough, they might be
> suspicious about what the heck is messing with their system (at least I
> would if something wants to connect to it). It makes things more
> complicated for my audience, and I have to avoid that.

I hate to sound like a broken record... You might want to have a look
at Ice. Bidirectional support is built-in and, together with asynchronous
method dispatch for the server side, neatly side-steps this quite
common problem with CORBA.

Thomas Richter

Jan 10, 2005, 8:14:20 AM
Hi Duncan,

> Ah, ok. Your initial post made it sound as though omniORB was
> completely dropping the requests. It sounds like it is actually just
> queueing them behind other requests, which is the correct behaviour.

Sorry - I understand the problem much better now than I did before.
omniORB is completely innocent here; "dropping" is not the right word.
It's just that the requests never made it even to the debug output, so
I had no chance to know what was going on... (-;

>>Is there some way to assign a thread-pool separately to each object, or
>>each POA? In the latter case, I would handle these "not so vital"
>>objects in some other POA with its own thread pool?

> Not in omniORB, no. omniORB allocates threads to calls before it knows
> which POA / object the request is for, so that isn't feasible.

Sigh, too bad.

> As others have said, trying to use blocking requests for these Get()
> calls is not a very good way to do it. The three better approaches are
> callbacks (and bidirectional GIOP should help with your firewall
> issues), explicit polling by your clients (which is ugly but would
> work), or to implement some kind of asynchronous message handling in
> omniORB. The latter is the clean way to do it, but is clearly the most
> work.

Well, see below. I *had* callbacks in the first release, but that caused
a lot of trouble with firewalls, and since I want easy installation, that
is a show-stopper. Polling causes a lot of network traffic even when
nothing is really going on - requests would return basically empty. The
requests block because they wait on some kind of user interaction -
interaction of a user connected from a different client - so there is not
even a way of knowing how long a "Get()" might have to wait. Internally,
this is already handled by asynchronous messages sent from an
administrator to the callers of the "Get()" method, but there is no
"natural" CORBA model for that.

My current idea would be to have some kind of timeout for the
request, thus making the request look more or less like a
poll. Clearly, this doesn't solve the problem at its root and only
works around it in a pretty limited way, but as far as I understand
it, the problem is rather a shortcoming of the CORBA model, and I can't
fix *that*. /-:

Anyhow, thanks a lot for your expertise and your advice; it's really
welcome!

So long,
Thomas

Thomas Richter

Jan 10, 2005, 8:34:41 AM
Hi Michi,

> I hate to sound like a broken record... You might want to have a look
> at Ice. Bidirectional support is built-in and, together with asynchronous
> method dispatch for the server side, neatly side-steps this quite
> common problem with CORBA.

That sounds like exactly what I'd like to have here, and I should take
a serious look at Ice. I did a bit of browsing already, though.

Just allow me to ask a couple of questions:

o) As far as I know, Ice supports C++ in its language mappings.
Additionally, we would need Java and Python as additional languages, at
minimum: Java for the clients, Python as a scripting language for
"quick'n'dirty" setups. How well does Ice integrate with Python and
Java? Besides that, we need to support at least Linux and Win32 as
targets. The TU infrastructure is completely Linux-based; students
often use Win32 (urgh) pre-installed machines.

o) It is - to me - unclear how much it would cost to move from one
middleware to another. IOW, how much manpower would have to be invested
to understand and switch to Ice? There's of course no definite answer
to that; the question rather is: how different does Ice look?

o) How does its future look? We're building an eLearning platform for
the Technical University of Berlin, and I have to ensure some kind of
stability for the middleware I'm using, so that my "customers" (spell:
students) are able to use this platform even a couple of years from
now. CORBA, with all its shortcomings, is in a pretty "stable"
situation here.

o) Licences: yes, that's about money. /-: As you might know, the city
of Berlin is more or less bankrupt (no kidding). There's not much we
might be able to spend. Even though this project is going to be
open-sourced, it will definitely *not* use the GPL. Something
Mozilla-like sounds attractive (weak "copyleft"). It is not too unlikely
that we will try to sell the complete software, and forcing a licence
taker to buy another licence doesn't make this easier.

This is a real "university problem": there's manpower to get the job
done, but arguing for spending money on some kind of middleware is
not so easy.

Thus, after all, I do believe that Ice does a very nice job, though I'm
afraid that there might be enough problems elsewhere, not on the
technical side.

So long,
Thomas

Rob Ratcliff

Jan 10, 2005, 10:47:27 AM
Michi, I am curious: does the implementation of Ice's asynchronous
method dispatch essentially use a mechanism similar to the
callback-reference approach under the covers, the benefit being that
the user doesn't have to explicitly specify a callback object? Also,
does the client side block until it gets a response from the server?
(That'd be another benefit, simplifying the client-side logic.) Would
the "job" get added to a queue on the server side in order to
eliminate the need for a separate worker thread per requested job?
(I imagine that'd be up to the implementor.)

Thomas, I guess another approach to this kind of thing is to use a
messaging service like the Event or Notification Service. Of course,
those would require bidirectional GIOP as well if you didn't use the
polling-for-events approach, but they would give you the flexibility
to use polling or asynchronous callbacks depending on the situation.

BTW, how long does it usually take to process each request you are
talking about?

Thomas Richter

Jan 10, 2005, 12:26:39 PM
Hi,

> Michi, I am curious, does the implementation of ICE's asynchronous
> method dispatch essentially use a similar mechanism to the callback
> reference approach underneath the covers?

Me too - especially because "callbacks" might open other problems, as
I've found.

> Thomas, I guess another approach to this kind of thing is to use a
> messaging service like the Event or Notification service.
> Of course, those would require bidirectional GIOP as well
> if you didn't use the polling for events approach, but would give you
> the flexibility to
> user polling or asynchronous callbacks depending on the situation.

Yes, it seems so. Just allow me to ask a question: the pull-model event
handling is more or less what I'm doing right now manually. Given that,
would using the push model of the OMG Event Service help me work
around the firewall problems I had when programming something like
that manually? That is, if the push model (internally) works with
callback objects where the server calls into the client, I would have
the same problems back that I had before I switched to blocking "Get()"
methods. A second problem with my "push" approach was that I had
huge problems whenever the network connection to just one client
broke down. Because the server sent its messages manually to each
client, the communication hung as soon as only one of the many
connections blocked. To avoid this, I launched side threads to
provide data for each connection - running into exactly the same problem
that I have right now: I could run out of threads. )-:

> BTW, how long does it usually take to process each request that you are
> talking about?

These methods should really be seen as part of an "event handling"
mechanism. It is a matter of user interaction how long it takes until
a blocking "Get()" returns from a call, so no prediction is possible.
*If* it returns, however, it is very likely that the next Get() will
also return without blocking. What a user (at any client) is able to do
is, more or less, to switch this event source on or off; however, it is
unpredictable when this happens. Of course, I could create a
"higher-level" event in case the user turns the data transmission on,
but this would just move the problem, not solve it.

So long,
Thomas

Michi Henning

Jan 10, 2005, 3:30:37 PM
"Thomas Richter" <th...@cleopatra.math.tu-berlin.de> wrote in message
news:cru09h$7c5$1...@mamenchi.zrz.TU-Berlin.DE...

>
> Just allow me to ask a couple of questions:
>
> o) As far as I know, Ice does support C++ in its language mappings.
> Additionally, we would need java and python as additional languages,
> minimum. Java for the clients, python as a scripting language for
> "quick'n dirty" setups. How well does Ice integrate into python and
> Java? Despite that, we need to support at least Linux and Win32 as
> targets. The TU infrastructure is completely Linux based, students
> often use Win32 (urgh) pre-installed machines.

Ice supports C++, Java, Python, C#, Visual Basic .NET, and PHP.
It runs on any Windows platform from Win98 onwards, as well as
on Linux, AIX, Solaris, HP-UX, and Mac OS X. There are also
ports to FreeBSD and similar (although those are not officially
supported by us).

> o) It is - to me - unclear how much it would cost to move from one
> middleware to another. IOW, how much men power had to be invested
> to understand and switch to Ice? There's of course no definite answer
> to that; the question rather is "how much different" does Ice look
> like?

If you understand CORBA, you will have no problems understanding
Ice. The Ice object model and APIs are much simpler and the learning
curve is very flat if you have used CORBA before. But, obviously,
there is no source-code compatibility with CORBA.

> o) How does its future look like? We're building here an eLearning
> platform for the Technical University of Berlin, and I have to ensure
> that I get some kind of stability for the middleware I'm using so my
> "customers" (spell: students) are able to use this platform even in
> a couple of years to come. Corba, with all its shortcommings, is in
> a pretty "stable" situation here.

Ice is no less stable, I believe. You get the source, and there is no
reason why Ice won't be around for a long time.

> o) Licences: Yes, that's about money. /-: As you might know, the city
> of Berlin is more or less bankrupt (no kidding). There's not much we
> might be able to spend. Even though this project is going to be
> open-sourced, it will definitely *not* use the GPL. Something
> Mozilla-like sounds attractive (weak "copy-left"). It is not too unlikely
> that we will try to sell the complete software, and forcing a licence
> taker to buy another licence doesn't make this easier.

Ice is free if you GPL your code. Ice is also free if you don't make money
because Ice royalties are based on your revenues. I'm afraid though that,
if you don't want to GPL your code and sell your program, Ice is not
an option. While cheap, Ice is not entirely free.

> Thus, after all, I do believe that Ice does a very nice job, though I'm
> afraid that there might be enough problems elsewhere, not on the
> technical side.

Have a look at www.zeroc.com and give it a try. You'll be pleasantly
surprised by how easy it is to get started with Ice. And ZeroC is a flexible
company when it comes to licensing -- there are all sorts of creative
options that can be discussed.

Michi Henning

Jan 10, 2005, 3:39:54 PM
"Rob Ratcliff" <rrr...@futuretek.com> wrote in message
news:jsxEd.22842$q4....@fe1.texas.rr.com...

> Michi, I am curious, does the implementation of ICE's asynchronous
> method dispatch essentially
> use a similar mechanism to the callback reference approach underneath
> the covers? The benefit
> being that the user doesn't have to explicitly specify a callback
> object?

When you compile your interface definitions, you specify whether you want
synchronous or asynchronous dispatch. You can specify this on a per-interface
or on a per-method basis. On the server side, AMD works by generating
a class with a virtual method for each AMD operation. As for synchronous
dispatch, you specialize that class to connect the thread of control to your
own code.

When a client invokes an AMD operation, the Ice run time ends up calling
the method you have specialized, passing it all in-parameters and a reference
to a callback object. What you do inside the method is up to you. The
salient point is that you can return from the method without this causing
the RPC in the client to complete. In other words, you can release the
thread of control in the server for this particular invocation.

The callback object has ice_response() and ice_exception() methods.
You complete the AMD operation by eventually calling one of those two
methods. (The ice_response() method requires you to pass any
out-parameters and the return value, and the ice_exception() method
allows you to raise exceptions.) The callback can be called from the
invocation thread or another thread -- it doesn't matter.

> Also, does the client
> side block until it gets a response from the server?

To the client, the entire thing is transparent -- the client has no idea
whether the server implements an operation using synchronous or
asynchronous dispatch, so the client will block until the RPC completes
as usual (unless the client uses asynchronous invocation, of course).

> (That'd be another benefit to simplify the client side logic.) Would the
> "job" get added
> to a queue on the server side in order to eliminate the need for a
> separate worker thread
> per requested job? (I imagine that'd be up to the implementor.)

Exactly. A typical implementation would be to bundle the parameters
that are passed to a method into some sort of job record and to put
that on the queue. Some other threads then fetch items from the queue,
process them, and, when a method is complete, invoke ice_response()
or ice_exception().
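The job-record/queue pattern described above can be sketched in a few lines. This is purely an illustration in Python, not Ice code: `AMDCallback`, `amd_get`, and `worker` are invented stand-ins, and only the `ice_response()`/`ice_exception()` naming mirrors the callback interface named in this thread:

```python
import queue
import threading

class AMDCallback:
    """Stand-in for the callback object passed to an AMD method."""
    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None

    def ice_response(self, result):
        # Completes the client's RPC with the return value.
        self.result = result
        self.done.set()

    def ice_exception(self, exc):
        # Completes the client's RPC with an exception.
        self.error = exc
        self.done.set()

job_queue = queue.Queue()

def amd_get(cb, request):
    """AMD-style dispatch: bundle the parameters into a job record,
    enqueue it, and return at once -- the RPC is not yet complete."""
    job_queue.put((cb, request))

def worker():
    # Worker threads fetch job records and complete them later.
    while True:
        cb, request = job_queue.get()
        try:
            result = request.upper()     # stand-in for the real work
        except Exception as exc:
            cb.ice_exception(exc)
        else:
            cb.ice_response(result)

threading.Thread(target=worker, daemon=True).start()

cb = AMDCallback()
amd_get(cb, "data")      # returns immediately; dispatch thread released
cb.done.wait(timeout=5)
print(cb.result)         # -> DATA
```

The salient point is in the last four lines: amd_get() returns immediately, and the RPC completes only when a worker thread later calls ice_response().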

> Thomas, I guess another approach to this kind of thing is to use a
> messaging service like the
> Event or Notification service. Of course, those would require
> bidirectional GIOP as well
> if you didn't use the polling for events approach, but would give you
> the flexibility to
> use polling or asynchronous callbacks depending on the situation.

I'm not sure that this would work. The CORBA notification and event
service offer the pull model, and that pull model suffers from the same
original problem: each invocation hogs a thread in the server for the
duration of the invocation and, if lots of invocations block for any
length of time, that's non-scalable because the server runs out of threads.
So, a CORBA event or notification service does not fix this problem
but just pushes it to a different part of the system.

Michi Henning

Jan 10, 2005, 3:45:49 PM
"Thomas Richter" <th...@cleopatra.math.tu-berlin.de> wrote in message
news:crudsf$g6p$1...@mamenchi.zrz.TU-Berlin.DE...

> Hi,
>
> > Michi, I am curious, does the implementation of ICE's asynchronous
> > method dispatch essentially use a similar mechanism to the callback
> > reference approach underneath the covers?
>
> Me too. And that's because "callbacks" might open other problems, as
> I've found.

Be careful -- these aren't callbacks on the wire. AMI and AMD in Ice
happen purely at the API (language mapping) level and AMI and AMD
are client-server transparent (the exact same bits go over the wire
in either case, so the client-server contract is never invalidated by changing
from synchronous to asynchronous invocation or dispatch). With Ice,
you definitely do not have to worry about garbage collecting callback objects
that have gone stale, or dealing with broken references to such objects.

> Just allow me to ask a question: The pull model event
> handling is more or less what I'm doing right now manually. Given that,
> would using the push model of the OMG Event Service help me to work
> around the firewall problem I had when programming something like
> that manually?

No. Under the covers, whenever client and server roles are reversed,
a separate connection is required. If the client is behind a firewall that
disallows incoming connections, the server cannot connect back to
the client to invoke the push() operation. Whether the server is written
by you or is a CORBA event service is irrelevant. The only way around
this is to use bidirectional IIOP (but not all ORBs support that).

Karl Waclawek

Jan 10, 2005, 4:04:38 PM
Michi Henning wrote:
>
> Ice is free if you GPL your code. Ice is also free if you don't make
> money, because Ice royalties are based on your revenues. I'm afraid,
> though, that if you don't want to GPL your code and you sell your
> program, Ice is not a no-cost option. While cheap, Ice is not entirely
> free.

How does this work in a typical corporate situation where Ice would be
used to integrate different systems (some could be legacy systems)?

The integration code would not be for sale, and it would be pretty hard
to determine to which degree the system integration contributes to the
revenue of the corporate business units using these systems, if they
are directly revenue generating at all.

Karl

Michi Henning

Jan 10, 2005, 4:24:28 PM
"Karl Waclawek" <ka...@waclawek.net> wrote in message
news:H5CEd.88141$vO1.5...@nnrp1.uunet.ca...

If you don't move the code to a physically different location (that is,
distribute it to third parties), you don't owe us anything. A license
fee is due if you sell a product that uses Ice and you do not GPL
the code for that product.

See http://www.zeroc.com/licensing.html for more info.

Rob Ratcliff

Jan 11, 2005, 10:30:21 AM
Michi Henning wrote:

>
>I'm not sure that this would work. The CORBA notification and event
>service offer the pull model, and that pull model suffers from the same
>original problem: each invocation hogs a thread in the server for the
>duration of the invocation and, if lots of invocations block for any
>length of time, that's non-scalable because the server runs out of threads.
>So, a CORBA event or notification service does not fix this problem
>but just pushes it to a different part of the system.
>
>
>

What I had in mind would be to start the long-running task using a
typical method call on the custom server that would just start the task
and return immediately, but the final response(s) would be received
asynchronously from the CORBA event service using push or pull models.
That way there would be only one callback object required, and perhaps
less code would have to be written to support push and pull models. The
GUI would display the results whenever they arrived. The GUI could also
always block until it received the asynchronous response if it was
required to perform a next step. The negative is that you have to
demultiplex the various events. (Of course, this could be done with
custom callback objects as well.)

(I've used this approach for starting tasks that require hours to days,
and the GUI just monitored the progress using the event service.)

Michi Henning

Jan 11, 2005, 5:12:23 PM
"Rob Ratcliff" <rrr...@futuretek.com> wrote in message
news:hiSEd.7346$RW3...@fe1.texas.rr.com...

> Michi Henning wrote:
>
> >
> >I'm not sure that this would work. The CORBA notification and event
> >service offer the pull model, and that pull model suffers from the same
> >original problem: each invocation hogs a thread in the server for the
> >duration of the invocation and, if lots of invocations block for any
> >length of time, that's non-scalable because the server runs out of threads.
> >So, a CORBA event or notification service does not fix this problem
> >but just pushes it to a different part of the system.
> >
> >
> What I had in mind would be to start the long running task using a
> typical method call on the custom
> server that would just start the task and return immediately, but the
> final response/responses would be received
> asynchronously from the CORBA event service using push or pull models.

Ah, I see what you mean. The custom server would work as a demultiplexer
for events. I guess that would reduce the load on the event service because
a single thread in the custom server would be sufficient to pass events
to many clients.

> That way there would only
> be one callback object required and perhaps less code would have to be
> written to support push and pull
> models. The GUI would display the results whenever they arrived. The GUI
> could also always block until it received the asynchronous response if
> it was required to perform
> a next step. The negative is that you have demultiplex the various events.
> (Of course, this could be done with custom callback objects as well.)

Well, it also means that you have to bite into all the other associated
problems. For example, it forces an explicit asynchronous programming
style on clients when the natural style would be to use a blocking pull
model. What is worse, you have to write your IDL definitions to make
things explicitly asynchronous, that is, the clients have to provide
callback objects to your custom server, and those callback objects have
to be explicitly described in IDL. Having to do this is nasty because
it forces you to change your conceptually synchronous type model to an
asynchronous one in order to accommodate a limitation of the platform.

Moreover, those clients that still would like to call synchronously can no
longer do that, unless you provide both synchronous and asynchronous
interfaces to your custom server. But that breaks transparency, because
the server is now aware of whether a client is calling synchronously or
asynchronously and executes different code in either case. The beauty
of using AMI and AMD in Ice is that it is transparent: you can use
a synchronous or an asynchronous invocation to invoke on an object
that is implemented using either synchronous or asynchronous dispatch,
and neither client nor server is aware of what the other side is doing.
In addition, with AMI and AMD, the client-server contract is identical
on the wire, so you can change your mind and switch from AMI to AMD
at any time without affecting anything that's already deployed.

Also, if you use your own callback objects, you get into all the hairy
scenarios of what to do when things go wrong. What happens if
a server attempts a callback and gets a TRANSIENT exception?
Under what circumstances does the server conclude that the client
really has gone away forever and reap the client's callback registration?
What happens if clients are ill-behaved and the callback from the
server is slow or blocks? If a single client misbehaves (e.g. blocks
the callback for a long time or is slow to respond), what are the
effects of that on all the other clients? Is service to other clients
degraded as a result? If not, what threading strategy is to be used
to ensure quality of service, and at what point and for what reasons
does the server decide not to dedicate any more threads to clients?
How does the server reap threads that are permanently stuck inside
an invocation?

There are answers to all of these questions, of course. But the point
is that the engineering you need to answer these questions is typically
way beyond what application developers are willing to confront. After
all, that's why we build infrastructure to deal with such stuff.

Thomas Richter

Jan 12, 2005, 6:20:30 AM
Hi Rob,

>> What I had in mind would be to start the long running task using a
>> typical method call on the custom
>> server that would just start the task and return immediately, but the
>> final response/responses would be received
>> asynchronously from the CORBA event service using push or pull models.

> Ah, I see what you mean. The custom server would work as a demultiplexer
> for events. I guess that would reduce the load on the event service because
> a single thread in the custom server will be sufficient pass events to many
> clients.

As far as I see, I can now collect events from the client side using the
event mechanism, either pulling them out or getting them pushed in from
the server using "helper objects". However,

i) In case I'm pulling events out, how would that reduce the scaling
problem? Each client waiting for an event to happen would need to have
a separate thread on the server side, thus causing the scaling problem
again.

ii) In case the server pushes events to the clients, I would avoid
having a thread for each client at first glance (this is how it used to
work). However, problem #1 would be that this still requires an
incoming connection at the client (bad for firewalls), and problem #2
happens in case one of the clients shuts down the connection. The
"demultiplexer" code in the server would then try to deliver the event
to this client, waiting for a network connection that would never be
established, thus breaking the connection to all other clients. To work
around this, I could launch a thread for each client that wants to
listen, but this won't scale well, once again.

>> That way there would only
>> be one callback object required and perhaps less code would have to be
>> written to support push and pull
>> models. The GUI would display the results whenever they arrived. The GUI
>> could also always block until it received the asynchronous response if
>> it was required to perform
>> a next step. The negative is that you have demultiplex the various events.
>> (Of course, this could be done with custom callback objects as well.)

> Well, it also means that you have to bite into all the other associated
> problems.
> For example, it forces an explicit asynchronous programming style on clients
> when the natural style would be to use a blocking pull model.

Not a real problem for me since that is how the client code was
designed anyhow (namely, around a "push events" model, though this is
currently not used), including some techniques to avoid typical "push"
related problems, for example "overrunning" a client with events
faster than it can handle, plus a synchronization mechanism to
repair and re-establish the connection afterwards. All that is
there. However, how to do that without incoming connections? How to
avoid the problem of clients disconnecting in the middle of an event
transmission, thus blocking the server in its activity to distribute
the outgoing events amongst clients?

> What is worse,
> you have to write your IDL definitions to make things explicitly
> asynchronous, that is, the clients have to provide callback objects to
> your custom server, and those callback objects have to be explicitly
> described in IDL.

Also not a problem since these parts are still around (as part of the
earlier design). Thus, the design change (backwards, so to say) doesn't
frighten me, that's what CVS is good for. But I don't see how this
solves my problems, namely #1 and #2 from above. (-;

> Having to do this is nasty because it forces you to change your conceptually
> synchronous type model to an asynchronous one in order to accommodate
> a limitation of the platform.

> Moreover, those clients that still would like to call synchronously can no
> longer do that, unless you provide both synchronous and asynchronous
> interfaces to your custom server.

That is also done and in the code. There are in fact two interfaces,
one sending asynchronous events in case something's going on, and another
one that explicitly requests an event to "refresh" the state of the client,
i.e. to bring it back in sync.

> But that breaks transparency, because
> the server is now aware of whether a client is calling synchronously or
> asynchronously and executes different code in either case. The beauty
> of using AMI and AMD in Ice is that it is transparent: you can use
> a synchronous or an asynchronous invocation to invoke on an object
> that is implemented using either synchronous or asynchronous dispatch,
> and neither client nor server is aware of what the other side is doing.
> In addition, with AMI and AMD, the client-server contract is identical
> on the wire, so you can change your mind and switch from AMI to AMD
> at any time without affecting anything that's already deployed.

> Also, if you use your own callback objects, you get into all the hairy
> scenarios of what to do when things go wrong. What happens if
> a server attempts a callback and gets a TRANSIENT exception?

Yes, that *is* a problem -- #2 on my list, exactly. What worries me
most here is not the exception itself but the time to recover from it;
this breaks the steadiness (quality of service) of the communication
with other clients.

> Under what circumstances does the server conclude that the client
> really has gone away forever and reap the client's callback registration?

Well spoken. Exactly.

> What happens if clients are ill-behaved and the callback from the
> server is slow or blocks? If a single client misbehaves (e.g. blocks
> the callback for a long time or is slow to respond), what are the
> effects of that on all the other clients? Is service to other clients
> degraded as a result? If not, what threading strategy is to be used
> to ensure quality of service and at what point and for what reasons
> does the server decide to not dedicate any more threads to clients.
> How does the server reap threads that are permanently stuck inside
> an invocation?

And, if I need to launch threads in the first place to avoid the
problem, how does that have to be designed to avoid the scaling problem
I have now?

> There are answers to all of these questions, of course.

Well? (-;

> But the point
> is that the engineering you need to answer these questions is typically
> way beyond what application developers are willing to confront. After
> all, that's why we build infrastructure to deal with such stuff.

Oh, I'm willing to take that. At least, think about it.

Anyhow, a big thank you to all who contributed to this discussion.
It is really extremely helpful!

So long,
Thomas

Michi Henning

Jan 12, 2005, 4:00:00 PM
"Thomas Richter" <th...@cleopatra.math.tu-berlin.de> wrote in message
news:cs315u$doe$1...@mamenchi.zrz.TU-Berlin.DE...

>
> As far as I see, I can now collect events from the client side using the
> event mechanism, either pulling them out or getting them pushed in from
> the server using "helper objects". However,
>
> i) In case I'm pulling events out, how would that reduce the scaling
> problem? Each client waiting for an event to happen would need to have
> a separate thread on the server side, thus causing the scaling problem
> again.

Right. You'd have to use a push model again, but then you might as
well use the push model with the event service directly.

> ii) in case the server pushes events to the clients I would avoid having
> a thread for each client at first glance (this is how it used to work).
> However, problem #1 would be that this would still require an incoming
> connection at the client (bad for firewalls) and problem #2 happens in
> case one of the clients would shut down the connection. The
> "demultiplexer" code in the server would then try to deliver the event to
> this client, waiting for a network connection that would never be
> established, thus breaking the connection to all other clients. To work
> around this, I could launch a thread for each client that wants to listen,
> but this won't scale well, once again.

Yes. Threads are a fairly limited resource -- you can't afford to have
hundreds or thousands.

> > Well, it also means that you have to bite into all the other associated
> > problems.
> > For example, it forces an explicit asynchronous programming style on
> > clients when the natural style would be to use a blocking pull model.
>
> Not a real problem for me since that is how the client code was
> designed anyhow (namely, around a "push events" model, though this is
> currently not used), including some techniques to avoid typical "push"
> related problems. As for example, "overrunning" a client with events
> faster than what it could handle plus a synchronization mechanism to
> repair and reestablish the connection afterwards. All that is
> there. However, how to do that without incoming connections? How to
> avoid the problem of clients disconnecting in the middle of an event
> transmission, thus blocking the server in its activity to distribute
> the outgoing events amongst clients?

You can't -- that's the problem. Basically, whenever a server calls back
into a client, the server subjects itself to having the client misbehave and
lose a thread of control. The only way to deal with this is to use multiple
threads -- see above.

> > Also, if you use your own callback objects, you get into all the hairy
> > scenarios of what to do when things go wrong. What happens if
> > a server attempts a callback and gets a TRANSIENT exception?
>
> Yes, that *is* a problem. #2 on my list, exactly. What worries me
> most here is not the exception itself, but the time to recover
> from it; this breaks the steadyness (quality of service) of the
> communication to other clients.

Not to mention that it complicates the code a lot.

> And, if I need to launch threads in first place to avoid the problem,
> how does that have to be designed to avoid the scaling problem I've
> now?
>
> > There are answers to all of these questions, of course.
>
> Well? (-;

Timeouts on invocations. Monitoring client behavior and, if a client
misbehaves, cancelling the client's callback registration. Making
clients responsible for maintaining a heartbeat with the server to
periodically renew their callback registration. Separate thread pools
for different jobs inside the server, so exhausting one thread pool
doesn't affect the others. Using UDP to deliver events instead of TCP.

There are quite a few options (not all of them realizable with CORBA).
But they are all complex, quite a lot of work to implement and test,
and definitely not the sort of thing you want to engineer every time
you build an application that needs a little bit of asynchronous notification.
This sort of thing really is best encapsulated in a service that you design
and build once, and then reuse over and over.
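As a rough illustration of one of those options, the heartbeat idea can be reduced to a lease map: a registration stays valid only as long as the client keeps renewing it. This is a hypothetical Python sketch, not taken from any ORB; `CallbackRegistry` and all its method names are invented:

```python
import time

class CallbackRegistry:
    """Callback registrations treated as leases: a client must send a
    heartbeat within `lease` seconds or the server reaps its entry."""
    def __init__(self, lease):
        self.lease = lease
        self.last_seen = {}        # client id -> time of last heartbeat

    def register(self, client_id):
        self.last_seen[client_id] = time.monotonic()

    def heartbeat(self, client_id):
        # Renew the lease; unknown clients must re-register.
        if client_id in self.last_seen:
            self.last_seen[client_id] = time.monotonic()

    def reap(self):
        """Drop clients whose lease has expired; return the reaped ids."""
        now = time.monotonic()
        dead = [c for c, t in self.last_seen.items()
                if now - t > self.lease]
        for c in dead:
            del self.last_seen[c]
        return dead

registry = CallbackRegistry(lease=1.0)
registry.register("client-a")
registry.register("client-b")
time.sleep(0.6)
registry.heartbeat("client-a")   # client-a renews in time
time.sleep(0.6)                  # client-b's lease has now expired
print(registry.reap())           # -> ['client-b']
```

A server would call reap() periodically (e.g. from a timer thread) and simply stop delivering events to the reaped clients, instead of blocking on their dead connections.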

Mike

Jan 12, 2005, 5:57:28 PM
Thomas,

What about some sort of load balancing? Here are a couple of links:

Visibroker, but the same ideas should apply to omniORB
http://www.cuj.com/documents/s=8053/cuj9904anderson/anderson.htm

CORBA - Kommunikation und Management by Claudia Linnhoff-Popien (covers
load balancing)
http://www.amazon.com/exec/obidos/tg/detail/-/3540640134/qid=1105567767/sr=8-4/ref=sr_8_xs_ap_i4_xgl14/102-1379604-4761735?v=glance&s=books&n=507846

Article with various load balancing strategies
http://dsonline.computer.org/0103/features/oth0103_print.htm

-Mike

Michi Henning

Jan 12, 2005, 6:18:01 PM
"Mike" <kup...@yahoo.com> wrote in message
news:1105570648.5...@c13g2000cwb.googlegroups.com...

> Thomas,
>
> What about some sort of load balancing?

It will help to some degree, but throwing more CPUs at it is a
brute-force approach: assume your current server can hold 500 blocked
clients before it runs out of threads. If you buy 10 servers, that
gives you 5000 clients, which isn't exactly cheap. Yet, with a better
dispatch mechanism, chances are that the same single server could
handle all 5000 clients without any problems.

The root cause of the problem is that, without an asynchronous
dispatch mechanism, it's simply not feasible to block a client
for an indefinite amount of time inside an RPC, because that
ties up a thread for the entire duration of the RPC. Throwing
more servers at it is unlikely to solve the problem. And, moreover,
it's not immediately obvious how any load balancing mechanism
would realize when a server has reached the point where it
cannot accept another blocking client request.

Rob Ratcliff

Jan 20, 2005, 11:17:24 AM
Hi Thomas,

I wanted to clarify what I was proposing and how it might reduce the
scaling problem. It is my understanding that the reason the current
approach doesn't scale is that the server-side processing per method
call takes so long that the server runs out of thread resources to
process lots of simultaneous requests. The other part of the problem is
that many of the clients have to negotiate a firewall, so callbacks are
problematic if you don't use bidirectional GIOP and such.

So one way I thought to reduce the time required of a thread was to
have the method return a "session" object reference immediately for the
given request. The request would be added to a job event queue that
would be processed by as many worker threads as you've allocated to it.
When the job was finished, the result object would be added to the
"session" object held in a hash table (for instance) somewhere. The
client could poll your server for the result. The poll request would
temporarily use a thread, but it would be very fast. Thus the
contention for the resources would be reduced. (This, of course, is how
Java servlets deal with the HTTP protocol, leveraging the idea of a
session cookie.)
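That session-and-poll scheme might look like the following outline. Python is used purely for illustration; `SessionServer`, `submit`, and `poll` are hypothetical names, and `payload * 2` stands in for the long-running computation:

```python
import itertools
import queue
import threading
import time

class SessionServer:
    """Sketch of the session pattern: submit() returns a handle at
    once, worker threads do the slow part, poll() checks for the
    result."""
    def __init__(self, workers=4):
        self.jobs = queue.Queue()
        self.results = {}          # session id -> finished result
        self.lock = threading.Lock()
        self.ids = itertools.count(1)
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, payload):
        # Fast call: enqueue the job and return a session id at once.
        sid = next(self.ids)
        self.jobs.put((sid, payload))
        return sid

    def poll(self, sid):
        # Fast call: return the result if ready, else None.
        with self.lock:
            return self.results.pop(sid, None)

    def _worker(self):
        while True:
            sid, payload = self.jobs.get()
            result = payload * 2   # stand-in for the long-running task
            with self.lock:
                self.results[sid] = result

server = SessionServer()
sid = server.submit(21)
while (result := server.poll(sid)) is None:
    time.sleep(0.01)               # the client polls until ready
print(result)                      # -> 42
```

Because submit() and poll() both return quickly, each client request ties up a dispatch thread only briefly, which is exactly the contention reduction described above.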

A variation on the theme would be to create a channel per client with
the event server. The result object event would be pushed to the event
server, which would then push the result to the client or allow the
client to pull the result (via polling) from the event server. The
advantages of the event server are that some implementations support
QoS features, they support push and pull operations for any events, and
the interface is already defined for you. The event processing from the
event server would be very fast as well, so it wouldn't use as many
threads simultaneously either.

(Of course, for consistency, the initial request could also be pushed
by the client to the event server, which would then push the event to
your server's job queue. It's kind of nice to be able to make type-safe
method calls to your server, though.)

Did you get a chance to play with Bi-directional GIOP?

Thanks,

Rob
