CORBA Strings

Steve Atkinson

Aug 12, 1997

Somebody posted a request for a small dissertation on CORBA string
objects the other day - here is something we put together at work.

The CORBA string type maps to the C++ char * type, so using strings is
usually fairly familiar. The big requirement however is that all
variables which go to and from the ORB (i.e. all in, out, inout
parameters and return values) must be allocated dynamically using the
functions provided by the ORB. In the case of Orbix, these functions
are CORBA::string_alloc() and CORBA::string_free() (to release them).

In other words, for char * variables which have been mapped from
variables within IDL, don't use malloc(), operator new, or point them
to any other place which wasn't the result of string_alloc(). So for
the following IDL:

interface Anthony
{
    string MyOperation ();
};

the following function is generated:

    char * AnthonyImpl::MyOperation ();

Within this function, you should declare a local variable and allocate
it like this:

char * AnthonyImpl::MyOperation ()
{
    char * my_string;
    char * other_string;
    /*
       Do some stuff and make other_string point to the
       result.
    */
    other_string = DoSomeStuff ();

    /*
       We must now allocate space for my_string
       using string_alloc() and copy other_string
       into my_string.
       It would not be legal to directly return
       other_string as (for the purposes of this
       example) it was not allocated using string_alloc()
    */
    my_string = CORBA::string_alloc (strlen (other_string) + 1);
    strcpy (my_string, other_string);

    return my_string;
}

Things you can't do in the above example include:

    /*
       Can't use malloc
    */
    my_string = malloc (strlen (other_string) + 1);
    /*
       Can't use operator new
    */
    my_string = new char [strlen (other_string) + 1];
    /*
       Can't point my_string at locations that
       weren't allocated with string_alloc
    */
    my_string = other_string;
    /*
       Again - can't point my_string at
       locations not allocated by string_alloc
    */
    my_string = "hello world";


>1) Null pointers and strings

CORBA in general doesn't like null pointers. You therefore cannot use
a null pointer as either an in, out, inout, or return string. I.e. you
cannot say
my_string = NULL;
return my_string;

All the above is required due to the memory management rules of CORBA.
The various rules state that the ORB will assume ownership of
my_string at various times. For example, if you have an operation
returning a string, then when you return the char * variable from the
C++ implementation function the ORB will take ownership of it. It will
copy the string data which the char * variable points to and send it
over the network to the client, and then it will free the memory
pointed to by the string. It will use CORBA::string_free() to do this,
and it can only do this if the string was allocated with
CORBA::string_alloc() in the first place.
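
The ownership handoff described above can be sketched without an ORB. This is only an illustration under stated assumptions: the `toy` namespace stands in for CORBA::string_alloc() and CORBA::string_free(), `my_operation()` and `orb_dispatch()` are hypothetical names, and the `live_strings` counter exists purely to make the pairing of allocation and release visible. The toy allocator reserves the terminating byte itself, as a later reply in this thread says a conforming string_alloc() does.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Toy stand-ins for CORBA::string_alloc / CORBA::string_free (assumption:
// a real ORB supplies these; the counter is for illustration only).
namespace toy {
    int live_strings = 0;

    char *string_alloc(std::size_t len) {
        ++live_strings;
        // Room for len characters plus the terminating '\0'.
        return static_cast<char *>(std::malloc(len + 1));
    }

    void string_free(char *s) {
        if (s != nullptr) {
            --live_strings;
            std::free(s);
        }
    }
}

// Hypothetical servant method: the returned string belongs to the caller.
char *my_operation() {
    const char *other_string = "result";    // not allocated by string_alloc
    char *my_string = toy::string_alloc(std::strlen(other_string));
    std::strcpy(my_string, other_string);
    return my_string;                        // ownership passes to the caller
}

// Plays the role of the ORB: copies the value out, then releases it.
std::string orb_dispatch() {
    char *ret = my_operation();
    assert(ret != nullptr);                  // null strings are not allowed
    std::string wire(ret);                   // "marshal" a copy of the data
    toy::string_free(ret);                   // valid only for string_alloc'd memory
    return wire;
}
```

Calling `orb_dispatch()` yields the string contents and leaves no toy allocation outstanding. If `my_operation()` had returned a literal or a malloc'd buffer, the release step would be handed memory it does not own.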

>2) String manager class
In Orbix, IDL structures which contain strings are mapped to a C++
structure where the string member is mapped to a String_mgr type. This
String_mgr type is an Orbix specific implementation detail. The CORBA
standard doesn't say what type is actually used, so other ORBs may map
the string member to a different type. The CORBA standard only says
that whatever type is used for the string member, it must behave like
a char *, and also it should have a copy constructor which copies the
member's storage and an assignment operator which releases the
member's old storage.

Trying to figure out what happens when you use string structure
members can look confusing but isn't really once you get used to it.

First of all, when you assign a char * to a String_mgr, the String_mgr
assumes ownership of the memory pointed to by the char *. This means
that you must always use string_alloc() to allocate space for the
String_mgr. The following example is in the Orbix reference guide,
page 40:

VariableLengthStruct vls;
char * s1 = CORBA::string_alloc (5+1);
char * s2 = CORBA::string_alloc (6+1);
strcpy (s1, "first");
strcpy (s2, "second");
/*
   Assigning s1 to vls.str results in vls.str taking
   ownership of s1
*/
vls.str = s1;
/*
   Assigning s2 to vls.str results in vls.str releasing
   the space pointed to by s1 and taking ownership
   of s2.
*/
vls.str = s2;

At the end of the function, s1 no longer points to valid memory as it
was freed when s2 was assigned to vls.str.
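
To make the adopt-versus-copy behaviour concrete, here is a minimal sketch of a string member type. It is not Orbix's String_mgr: `toy::StringMember` and the `toy` allocation functions are hypothetical stand-ins, written only to show a char * assignment taking ownership while a const char * or member-to-member assignment makes a deep copy.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

namespace toy {
    // Stand-ins for the ORB's string functions (assumption).
    char *string_alloc(std::size_t len) {
        return static_cast<char *>(std::malloc(len + 1));
    }
    char *string_dup(const char *s) {
        char *p = string_alloc(std::strlen(s));
        return std::strcpy(p, s);
    }
    void string_free(char *s) { std::free(s); }

    // Hypothetical string struct member, loosely in the spirit of String_mgr.
    class StringMember {
    public:
        StringMember() : p_(nullptr) {}
        StringMember(const StringMember &o)
            : p_(o.p_ ? string_dup(o.p_) : nullptr) {}
        ~StringMember() { string_free(p_); }

        // char *: adopt the pointer without copying, release the old value.
        StringMember &operator=(char *s) {
            if (s != p_) { string_free(p_); p_ = s; }
            return *this;
        }
        // const char *: deep copy, release the old value.
        StringMember &operator=(const char *s) {
            assert(s != nullptr);
            char *copy = string_dup(s);
            string_free(p_);
            p_ = copy;
            return *this;
        }
        // Another member: deep copy.
        StringMember &operator=(const StringMember &o) {
            if (this != &o) {
                char *copy = o.p_ ? string_dup(o.p_) : nullptr;
                string_free(p_);
                p_ = copy;
            }
            return *this;
        }
        const char *c_str() const { return p_; }
    private:
        char *p_;
    };
}
```

With this sketch, `m = s1` (where s1 is a char * from `toy::string_dup`) leaves `m.c_str()` equal to s1, whereas `m = "second"` or `n = m` leaves the two sides pointing at separate storage.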

If you assign a String_mgr variable to another String_mgr variable, or
assign a String_var variable to a String_mgr variable (or to another
String_var) a deep copy will be made.

VariableLengthStruct vls;
CORBA::String_var s1 = CORBA::string_alloc (5+1);
CORBA::String_var s2 = CORBA::string_alloc (6+1);
strcpy (s1, "first");
strcpy (s2, "second");
vls.str = s1; // Copy made of s1
vls.str = s2; // Copy made of s2

At the end of the function, vls.str points to an area containing
"second", and s1 and s2 each point to their own original areas.

>5) strings inside and outside of sequences

Strings inside sequences need to have space allocated for them as in
other places, using CORBA::string_alloc(). You can't set an element of
a string sequence to a null pointer.

>6) what does the standard say and how do the vendors differ?

Almost all of what I have said is specified by the standard, except
for the String_mgr type which is Orbix specific. Other ORB vendors
must supply a similar type for string members of structures which
behaves in the same way.

Of course I may have made some mistakes here, so please shout if I've
missed anything out.
--
Steve Atkinson.

"To err is human. To really foul things up, you need a computer"
(Remove antispam from address when replying)

Mikko Holappa

Aug 13, 1997

Steve Atkinson (st...@starbeck.demon.co.uk) wrote:

: In other words, for char * variables which have been mapped from
: variables within IDL, don't use malloc(), operator new, or point them
: to any other place which wasn't the result of string_alloc(). So for
: the following IDL:

All right, I have interpreted the instructions in the same way, too - but:

1) In Orbix programming Guide (for example page 160), client code is as
follows:

Account_var aVar = bVar->newAccount("Chris");

Does Orbix use CORBA::string_alloc() automatically for the "Chris" string?
Or is this call against the string allocation rules, too?

: my_string = "hello world";

Steve Vinoski

Aug 13, 1997

I think you got everything right except:

Steve Atkinson wrote:
> The CORBA string type maps to the C++ char * type, so using strings is
> usually fairly familiar. The big requirement however is that all
> variables which go to and from the ORB (i.e. all in, out, inout
> parameters and return values) must be allocated dynamically using the
> functions provided by the ORB.

Actually, "in" strings need not be dynamically allocated, since the ORB
only reads them and does not attempt to free them. All other directions
(inout, out, and return) must be dynamically allocated, as you correctly
specified.

--steve

--
Steve Vinoski vinoski at iona.com
Senior Architect 1-800-ORBIX-4U
IONA Technologies, Inc. Cambridge, MA USA 02138
60 Aberdeen Ave. http://www.iona.com/hyplan/vinoski/

Steve Atkinson

Aug 13, 1997

In article <33F20D...@NOSPAMiona.com>, Steve Vinoski
<vin...@NOSPAMiona.com> writes

>I think you got everything right except:
>
>Steve Atkinson wrote:
>> The CORBA string type maps to the C++ char * type, so using strings is
>> usually fairly familiar. The big requirement however is that all
>> variables which go to and from the ORB (i.e. all in, out, inout
>> parameters and return values) must be allocated dynamically using the
>> functions provided by the ORB.
>
>Actually, "in" strings need not be dynamically allocated, since the ORB
>only reads them and does not attempt to free them. All other directions
>(inout, out, and return) must be dynamically allocated, as you correctly
>specified.
>
>--steve
>


But don't you still need to allocate the string in the client?

for example (this is from memory so the syntax may not be exactly right)

interface x
{
    struct some_struct
    {
        string some_string;
    };

    char some_func(in some_struct s);
};

to call this function from the client, you need to create a some_struct
structure and allocate space for some_string :-

...

some_struct my_struct;
my_struct.some_string = CORBA::string_alloc(20);

....

Michi Henning

Aug 14, 1997

On Tue, 12 Aug 1997, Steve Atkinson wrote:

> Somebody posted a request for a small dissertation on Corba string
> objects the other day - here is something we put together at work.
>
> The CORBA string type maps to the C++ char * type, so using strings is
> usually fairly familiar. The big requirement however is that all
> variables which go to and from the ORB (i.e. all in, out, inout
> parameters and return values) must be allocated dynamically using the
> functions provided by the ORB. In the case of Orbix, these functions
> are CORBA::string_alloc() and CORBA::string_free() (to release them).

As Steve Vinoski already pointed out, in string parameters need not
be dynamically allocated.

An additional restriction is that in and inout strings passed to a server by
the client must not be modified in place by the server. Similarly,
inout, out and return value strings passed from the server to the client
must not be modified by the client. As an example, to modify an inout string
in the server:

void
someImpl::
some_op(char * & sinout)                    // sinout is an inout string
{
    //
    // Convert sinout to upper case
    //
    char *copy = CORBA::string_dup(sinout); // Make copy
    char *p = copy;
    while (*p) {                            // Modify copy
        *p = toupper((unsigned char)*p);
        p++;
    }
    CORBA::string_free(sinout);             // Destroy in value
    sinout = copy;                          // Return out value
}

You must treat strings passed across IDL interfaces as read-only because
for co-located client and server, a clever ORB could play allocation
games for efficiency (such as passing a string pointer which points
into ROM). The C++ mapping mandates read-only treatment of strings (and
object references) to permit such optimized implementations.


> /*
> We must now allocate space for my_string
> using string_alloc() and copy other_string
> into my_string.
> It would not be legal to directly return
> other_string as (for the purposes of this
> example) it was not allocated using string_alloc()
> */
> my_string = CORBA::string_alloc (strlen (other_string) + 1);
> strcpy (my_string, other_string);
>
> return my_string;
> }

A minor correction there...

CORBA::string_alloc allocates len + 1 bytes, so the above code wastes
a byte at the end of the string. It is perfectly correct to write
the above instead as:

my_string = CORBA::string_alloc(strlen(other_string));
strcpy(my_string, other_string);

If your ORB doesn't allocate the extra byte, it is broken - the updated
C++ mapping spec mandates that the extra byte be allocated. The reason
is historical - the original mapping spec forgot to state how many bytes
should be allocated, and the only way to fix it without breaking existing
code was to require string_alloc to allocate an extra byte.

You can do the same thing more easily though. CORBA defines CORBA::string_dup()
which combines the allocation and the copy:

my_string = CORBA::string_dup(other_string);

The implementation of string_dup() is of course:

char *
CORBA::string_dup(const char *s)
{
    char *p = CORBA::string_alloc(strlen(s));
    return strcpy(p, s);
}

Your ORB must provide CORBA::string_dup() (at least if it claims to conform
to the current C++ mapping).

Cheers,

Michi.
--
Michi Henning +61 7 33654310
DSTC Pty Ltd +61 7 33654311 (fax)
University of Qld 4072 mi...@dstc.edu.au
AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html


Steve Vinoski

Aug 14, 1997

Steven W. Brenneis wrote:
> Since the door is open on CORBA strings (at least partially open), I
> was wondering why the CORBA string does not map to the ANSI Standard
> string class.

1) There was no standard ANSI C++ string class when we put the C++
mapping together.
2) Certain vendors demanded that the C++ mapping for strings have the
same layout in memory as the C mapping for strings so that they could
have direct interoperability in the same address space.
3) The ANSI C++ string class has changed a lot over the years and is
still not available on many platforms, so having an unusable C++ mapping
string standard for the past 3 or 4 years would not have been a good
idea.
4) Because of item (3), ORB vendors would have been forced to provide
their own ANSI C++ string class implementations, which ultimately would
clash with those supplied by the C++ compiler vendors. At least char*
and const char* are known to work seamlessly with the ANSI C++ string
class.

> Using char's in C++ is at the very least undesirable.
> Mapping to char allows every implementation to vary its coded CORBA
> string class.

What CORBA string class do you speak of? CORBA::String_var? I don't see
why vendors would need to vary that class, as it's pretty clearly
specified in the C++ mapping.

> Mapping to the standard string class would at least
> result in predictable behavior. It would also support more than the
> worn-out ASCII character set.

CORBA has had internationalized types added to it, so IDL now has wchar
and wstring types as well. (Not sure if anyone actually supports them
yet, though.)

> Finally, it would greatly simplify the
> marshalling and unmarshalling routines.

I certainly don't buy this. No matter how you wrap them up, a string is
ultimately composed of characters, and each of those characters must be
read to be marshaled. Marshaling walks the string whether it's inside a
string class or whether it's simply accessed by its pointer. When using
a char*, it's sometimes helpful to have a helper class that does the
marshaling and unmarshaling, but a helper class would also be required
for an ANSI C++ string class, since ORB vendors can't just add
marshaling functions to the one provided with the customer's platform.
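
The point that marshaling walks the characters either way can be illustrated with a small sketch. The wire format here is an assumption made for the example (a 32-bit length in host byte order that counts the terminator, followed by the bytes and a '\0'); it is not presented as any ORB's actual encoding, and `marshal`, `marshal_chars`, and `unmarshal` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Append a 32-bit length (terminator included), the characters, and '\0'.
void marshal_chars(Buffer &out, const char *s, std::size_t len) {
    assert(s != nullptr);
    std::uint32_t n = static_cast<std::uint32_t>(len + 1);
    unsigned char hdr[sizeof n];
    std::memcpy(hdr, &n, sizeof n);
    out.insert(out.end(), hdr, hdr + sizeof n);
    out.insert(out.end(), s, s + len);
    out.push_back('\0');
}

// A char * and a std::string both end up in the same helper.
void marshal(Buffer &out, const char *s) {
    marshal_chars(out, s, std::strlen(s));
}
void marshal(Buffer &out, const std::string &s) {
    marshal_chars(out, s.data(), s.size());
}

// Read one string starting at offset pos and advance pos past it.
std::string unmarshal(const Buffer &in, std::size_t &pos) {
    std::uint32_t n = 0;
    assert(pos + sizeof n <= in.size());
    std::memcpy(&n, &in[pos], sizeof n);
    pos += sizeof n;
    assert(n >= 1 && pos + n <= in.size());
    std::string s(reinterpret_cast<const char *>(&in[pos]), n - 1);
    pos += n;
    return s;
}
```

Marshaling "hello" as a literal and as a std::string produces identical buffers in this sketch; the string class changes how the bytes are reached, not how many of them are copied.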

Michi Henning

Aug 14, 1997

On Wed, 13 Aug 1997, Steve Atkinson wrote:

> In article <33F20D...@NOSPAMiona.com>, Steve Vinoski
> <vin...@NOSPAMiona.com> writes
> >

> >Actually, "in" strings need not be dynamically allocated, since the ORB
> >only reads them and does not attempt to free them. All other directions
> >(inout, out, and return) must be dynamically allocated, as you correctly
> >specified.
>

> but dont you still need to allocate the string in the client ?
>
> for example (this is from memory so the syntax may not be exactly right)
>
> interface x
> {
>
> struct some_struct
> {
> string some_string;
> };
>
> char some_func(in some_struct);
> };
>
> to call this function from the client, you need to create a some_struct
> structure and allocate space for some_string :-
>
> ...
>
> some_struct my_struct;
> my_struct.some_string = CORBA::string_alloc(20);
>

Yes, but that is not what Steve is saying. Your example deals with
assigning a string to a structure member, not with passing a string
as a parameter.

Your original statement said (paraphrased) "all string parameters must
be dynamically allocated". This is not true for in string parameters.
To illustrate, the following is perfectly correct:

//
// IDL
//
interface foo {
void op(in string s);
};


//
// client-side C++
//
foo_var fv = ...; // Get foo reference

fv->op("Hello world"); // Invoke op

In this case, the parameter value "Hello world" is *not* dynamically
allocated, and it does not need to be.
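
The per-direction rules can be summarised in a short sketch. The functions below are hypothetical stand-ins for generated operations, and the `toy` functions stand in for CORBA::string_dup() and CORBA::string_free(); the parameter types follow the char * mapping discussed in this thread, simplified for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace toy {
    // Stand-ins for the ORB's string functions (assumption).
    char *string_dup(const char *s) {
        char *p = static_cast<char *>(std::malloc(std::strlen(s) + 1));
        return std::strcpy(p, s);
    }
    void string_free(char *s) { std::free(s); }
}

// in: only read by the callee, so the caller keeps ownership and may pass
// a literal or any other storage.
std::size_t op_in(const char *s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// out: the callee allocates, the caller releases with string_free.
void op_out(char *&s) {
    s = toy::string_dup("from callee");
}

// return value: the callee allocates, the caller releases with string_free.
char *op_ret() {
    return toy::string_dup("returned");
}
```

A caller can write `op_in("Hello world")` directly, but must call `toy::string_free()` on whatever `op_out()` and `op_ret()` hand back.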

Incidentally, *in* parameters of *any* type need not be dynamically allocated
(with the exception of object references, which can *only* be dynamically
allocated by calling duplicate()). So, if you want to pass a struct
as an in parameter, you need not dynamically allocate the struct value
either. For example:

//
// IDL
//
struct MyStruct {
long long_member;
string string_member;
};

interface foo {
    void op(in MyStruct s);
};


//
// client-side C++
//
foo_var fv = ...; // Get foo reference

MyStruct s; // MyStruct as an automatic variable

// Initialize struct
s.long_member = 5;
s.string_member = CORBA::string_dup("Hello world");

fv->op(s); // Pass automatic struct

Again, the structure is not dynamically allocated (only the string it
contains is).

Incidentally, another rule relating to IDL strings:

Never pass an uninitialized user-defined structured type across an IDL
interface - it is illegal if the structured type contains strings.

Reason:

Strings inside user-defined complex types (sequences, arrays,
unions, and structs) behave like a String_var. The default
constructor for String_var initializes the char * inside the
String_var to NULL. It is illegal to pass NULL pointers across
IDL interfaces. Therefore, passing an uninitialized user-defined
structured type containing strings is also illegal, because
unless the contained strings are initialized, their value is
the NULL pointer.

I raised an issue with the C++ mapping revision task force several months
ago about this. I think it would be much nicer to define that String_var
(and its variants) must initialize the char * to the empty string (instead
of NULL). This would make it safe to pass a partially initialized (or
uninitialized) structured type across an IDL interface.

It would also eliminate quite a bit of coding annoyance, where (depending
on the IDL) a lot of effort can go into initializing contained strings
just to make them safe (the Naming Service IDL with its 'id' and 'kind'
fields in name components makes this painfully obvious).
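
The initialization chore described here might look like the following sketch. `toy::NameComponent` is a hypothetical struct that only imitates the shape of a name component (two string fields that start out null); raw char * members are used for brevity, where a real mapping would use a managed member type. The helper names are likewise assumptions.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

namespace toy {
    // Stand-ins for the ORB's string functions (assumption).
    char *string_dup(const char *s) {
        char *p = static_cast<char *>(std::malloc(std::strlen(s) + 1));
        return std::strcpy(p, s);
    }
    void string_free(char *s) { std::free(s); }

    // Hypothetical struct with two string members that default to null.
    struct NameComponent {
        char *id = nullptr;
        char *kind = nullptr;
    };
}

// True when no string member is null, i.e. the struct is legal to send.
bool is_sendable(const toy::NameComponent &nc) {
    return nc.id != nullptr && nc.kind != nullptr;
}

// Give every null string member an empty-string value.
void make_sendable(toy::NameComponent &nc) {
    if (nc.id == nullptr)   nc.id = toy::string_dup("");
    if (nc.kind == nullptr) nc.kind = toy::string_dup("");
    assert(is_sendable(nc));
}

// Release both members and reset them to null.
void release(toy::NameComponent &nc) {
    toy::string_free(nc.id);   nc.id = nullptr;
    toy::string_free(nc.kind); nc.kind = nullptr;
}
```

A caller that sets only `id` would run `make_sendable()` before the invocation so that `kind` holds an empty string rather than a null pointer.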

I also believe that initializing String_var strings to the empty string
would be (largely) backward compatible, and not cause too much grief.

Steven W. Brenneis

Aug 14, 1997

Steve Atkinson <st...@starbeck.demon.co.uk> wrote:

>Somebody posted a request for a small dissertation on Corba string
>objects the other day - here is something we put together at work.
>

[snip]


>Steve Atkinson.
>
>"To err is human. To really foul things up, you need a computer"
>(Remove antispam from address when replying)

Since the door is open on CORBA strings (at least partially open), I
was wondering why the CORBA string does not map to the ANSI Standard
string class. Using char's in C++ is at the very least undesirable.
Mapping to char allows every implementation to vary its coded CORBA
string class. Mapping to the standard string class would at least
result in predictable behavior. It would also support more than the
worn-out ASCII character set. Finally, it would greatly simplify the
marshalling and unmarshalling routines.

Just wondering.

Michi Henning

Aug 15, 1997

On Thu, 14 Aug 1997, Steven W. Brenneis wrote:

> Since the door is open on CORBA strings (at least partially open), I

I don't understand what you mean by that. CORBA strings in C++ are
well-defined, I can't see any partially open door here. Could you explain?

> was wondering why the CORBA string does not map to the ANSI Standard
> string class.

At the time the C++ mapping was produced, the ANSI C++ standard was nowhere
near as advanced as it is today, and implementations were not generally
available. This meant that a minimal mapping was chosen for strings,
which would be unlikely to get in the way of the eventual ANSI C++ spec.

Michi Henning

Aug 18, 1997

On Thu, 14 Aug 1997, I wrote:

> As an example, to modify an inout string in the server:
>
> void
> someImpl::
> some_op(char * & sinout)                    // sinout is an inout string
> {
>     //
>     // Convert sinout to upper case
>     //
>     char *copy = CORBA::string_dup(sinout); // Make copy
>     char *p = copy;
>     while (*p) {                            // Modify copy
>         *p = toupper((unsigned char)*p);
>         p++;
>     }
>     CORBA::string_free(sinout);             // Destroy in value
>     sinout = copy;                          // Return out value
> }

I've just had some discussions with Steve Vinoski, who pointed out the
error of my ways... (thanks Steve!)

The above code is correct and will work, but it is needlessly inefficient.
As it turns out, the server *can* modify an inout string in place,
so we can write this as:

void
someImpl::
some_op(char * & sinout)        // sinout is an inout string
{
    //
    // Convert sinout to upper case
    //
    char *p = sinout;
    while (*p) {
        *p = toupper((unsigned char)*p);
        p++;
    }
}

There is no need to deallocate the input string and return a new copy
just because I want to modify the string contents. However, deallocation
*is* necessary if the returned string needs to be longer than the input
string.
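
The two cases, same-length modification in place and a longer result that needs reallocation, can be sketched side by side. The `toy` functions stand in for CORBA::string_alloc() and CORBA::string_free(), and `append_inout()` with its extra suffix parameter is a hypothetical helper rather than a generated operation signature.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cstring>

namespace toy {
    // Stand-ins for the ORB's string functions (assumption).
    char *string_alloc(std::size_t len) {
        return static_cast<char *>(std::malloc(len + 1));
    }
    void string_free(char *s) { std::free(s); }
}

// Same length: the existing buffer is reused, no reallocation.
void to_upper_inout(char *&sinout) {
    assert(sinout != nullptr);
    for (char *p = sinout; *p != '\0'; ++p)
        *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
}

// Longer result: allocate a new buffer, copy, then free the old one.
void append_inout(char *&sinout, const char *suffix) {
    assert(sinout != nullptr && suffix != nullptr);
    char *grown = toy::string_alloc(std::strlen(sinout) + std::strlen(suffix));
    std::strcpy(grown, sinout);
    std::strcat(grown, suffix);
    toy::string_free(sinout);   // release the incoming value
    sinout = grown;             // hand back the longer value
}
```

After `to_upper_inout(s)` the pointer is unchanged and only the contents differ; after `append_inout(s, " WORLD")` the pointer refers to new storage and the old buffer has been released.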

However, it is still true that the server must not modify an in string
passed to it by the client, nor can the client modify an out string
or returned string passed to it by the server (other than call
CORBA::string_free() on it).

My misunderstanding arose from the wording in the C++ mapping spec, which
can be interpreted to mean what I stated initially. However, Steve
has reliably informed me that my interpretation was not correct
(he should know, having written a lot of the spec himself ;-)

Erik Groeneveld

Aug 18, 1997

Doesn't section 3.10.2 of the CORBA spec say that the output string may
not be longer than the input string in the case of an inout string parameter?

If this is true, deallocation is only necessary if the out string is
_shorter_ or if you already have a string.

--
-- Erik Groeneveld

Baan Tech
Barneveld, The Netherlands

Michi Henning <mi...@foxtail.dstc.edu.au> wrote in article
<Pine.OSF.3.95.970818...@foxtail.dstc.edu.au>...

Steve Vinoski

Aug 18, 1997

Erik Groeneveld wrote:
> Doesn't section 3.10.2 of the CORBA spec not say that the output string may
> not be longer than the input string in case of an inout string parameter?
>
> If this is true, deallocation is only necessary if the out string in
> _shorter_
> or if you already have a string.

Ugh. That was written way way back when C was the only language
supported by CORBA. The C++ mapping definitely allows reallocation and
longer outputs than inputs, see case 4 of Table 28 in CORBA 2.0. Other
language mappings probably do as well. The text you cite should have
been removed long ago.

Michi Henning

Aug 19, 1997

On Mon, 18 Aug 1997, Erik Groeneveld wrote:

> Doesn't section 3.10.2 of the CORBA spec not say that the output string may
> not be longer than the input string in case of an inout string parameter?
>
> If this is true, deallocation is only necessary if the out string in
> _shorter_
> or if you already have a string.

You are right, it actually says that (page 3-29):

"When an unbounded string or sequence is passed as an inout
parameter, the returned value cannot be longer than the
input value."

I'd never noticed that before, thanks for pointing this out!

The question is whether it is really meant to be that way - I suspect
it is a defect in the spec. If inout strings and sequences cannot
be longer for the returned value, what about structs, arrays, or
unions containing such things?

Consider:

typedef sequence<octet> buf_t;

struct foo {
string s;
buf_t b;
};

interface bar {
void op(inout foo param);
};

Since the spec doesn't say anything about the members of structures,
it looks like the server would be allowed to grow the string and the
sequence inside the inout struct. Of course, that requires reallocation,
which seems to make the restriction on page 3-29 totally pointless.

After all, if I can grow a string or a sequence inside an inout struct,
why shouldn't I be able to grow them when they appear as the parameter
type on their own (without being contained in something else)?

Steven W. Brenneis

Aug 19, 1997

Steve Vinoski <vin...@NOSPAMiona.com> wrote:

>Steven W. Brenneis wrote:
>> Since the door is open on CORBA strings (at least partially open), I

>> was wondering why the CORBA string does not map to the ANSI Standard
>> string class.
>

>1) There was no standard ANSI C++ string class when we put the C++
>mapping together.
>2) Certain vendors demanded that the C++ mapping for strings have the
>same layout in memory as the C mapping for strings so that they could
>have direct interoperability in the same address space.
>3) The ANSI C++ string class has changed a lot over the years and is
>still not available on many platforms, so having an unusable C++ mapping
>string standard for the past 3 or 4 years would not have been a good
>idea.
>4) Because of item (3), ORB vendors would have been forced to provide
>their own ANSI C++ string class implementations, which ultimately would
>clash with those supplied by the C++ compiler vendors. At least char*
>and const char* are known to work seamlessly with the ANSI C++ string
>class.
>

Agreed and understood, probably wishful thinking on my part trying to
push vendors to adopting standard compilers.

>> Using char's in C++ is at the very least undesirable.
>> Mapping to char allows every implementation to vary its coded CORBA
>> string class.
>

>What CORBA string class do you speak of? CORBA::String_var? I don't see
>why vendors would need to vary that class, as it's pretty clearly
>specified in the C++ mapping.
>

The problem comes in the implementation of CORBA::String_member in
which constructors and assignment operator overloads that take char*
(*not* const char*) are used. In most implementations, the
constructor simply copies the pointer (a.k.a shallow copy), an unsafe
C++ operation. I fully understand how handy this can be due to the
requirement for freeing string references following a function
execution. Unfortunately, this causes all sorts of unhappy
side-effects, e.g. string literals must be cast since the compiler is
required to type a string literal as a non-const char reference and
the string storage recovery routines will fail if they attempt to
delete the storage. This necessarily leads to vendors needing to
allow for this and other side-effects, thereby possibly creating more
(and sometimes more undesirable) side-effects.

>> Mapping to the standard string class would at least
>> result in predictable behavior. It would also support more than the
>> worn-out ASCII character set.
>

>CORBA has had internationalized types added to it, so IDL now has wchar
>and wstring types as well. (Not sure if anyone actually supports them
>yet, though.)
>

Once again, a convincing argument to adopt the standard string class
which is actually a template using a traits structure. To vary the
underlying traits of the string, you simply substitute the
characteristics of any character set and code re-use is maximized.

>> Finally, it would greatly simplify the
>> marshalling and unmarshalling routines.
>

>I certainly don't buy this. No matter how you wrap them up, a string is
>ultimately composed of characters, and each of those characters must be
>read to be marshaled. Marshaling walks the string whether it's inside a
>string class or whether it's simply accessed by its pointer. When using
>a char*, it's sometimes helpful to have a helper class that does the
>marshaling and unmarshaling, but a helper class would also be required
>for an ANSI C++ string class, since ORB vendors can't just add
>marshaling functions to the one provided with the customer's platform.
>

As anyone who has marshalled text to a stream will attest, the ANSI
string class has standardized and simplified the process. The c_str()
member is used to insert the string in a binary stream, the explicit
const char* constructor is used to extract it. About one or two lines
of code each and extremely consistent behavior.

Steve Vinoski

Aug 19, 1997

Steven W. Brenneis wrote:
> The problem comes in the implementation of CORBA::String_member in
> which constructors and assignment operator overloads that take char*
> (*not* const char*) are used. In most implementations, the
> constructor simply copies the pointer (a.k.a shallow copy), an unsafe
> C++ operation.

Actually, this should be a trait of all implementations of the string
member class, since this behavior is required by the spec.

> I fully understand how handy this can be due to the
> requirement for freeing string references following a function
> execution. Unfortunately, this causes all sorts of unhappy
> side-effects, e.g. string literals must be cast since the compiler is
> required to type a string literal as a non-const char reference and
> the string storage recovery routines will fail if they attempt to
> delete the storage. This necessarily leads to vendors needing to
> allow for this and other side-effects, thereby possibly creating more
> (and sometimes more undesirable) side-effects.

Two things:

1) Get in the habit of doing a CORBA::string_dup("string literal") when
assigning string literals. Even if you cast to const char*, a string_dup
is what's going on under the covers, and calling string_dup looks better
than the cast.

2) ANSI C++ has changed the type of a string literal from char* to const
char*, so once this feature becomes widespread in C++ compilers you
won't have this problem.

Steven W. Brenneis

Sep 3, 1997

Michi Henning <mi...@foxtail.dstc.edu.au> wrote:

>On Thu, 14 Aug 1997, Steven W. Brenneis wrote:
>
>> Since the door is open on CORBA strings (at least partially open), I
>

>I don't understand what you mean by that. CORBA strings in C++ are
>well-defined, I can't see any partially open door here. Could you explain?
>

>> was wondering why the CORBA string does not map to the ANSI Standard
>> string class.
>

>At the time the C++ mapping was produced, the ANSI C++ standard was nowhere
>near as advanced as it is today, and implementations were not generally
>available. This meant that a minimal mapping was chosen for strings,
>which would be unlikely to get in the way of the eventual ANSI C++ spec.
>

> Cheers,
>
> Michi.
>--
>Michi Henning +61 7 33654310
>DSTC Pty Ltd +61 7 33654311 (fax)
>University of Qld 4072 mi...@dstc.edu.au
>AUSTRALIA http://www.dstc.edu.au/BDU/staff/michi-henning.html
>

I guess maybe the basis for my discussion would be more along the line
of, "Do we have to wait for a completely new CORBA standard to get a
new C++ mapping?" I agree that the C++ standard is evolving still,
but that evolution seems to have slowed somewhat and it might be
appropriate for OMG to look to that mapping. The standard library and
the standard template library offer a lot of opportunities that, IMHO,
could make the CORBA/C++ combination even more powerful.


Michi Henning

Sep 8, 1997

On 4 Sep 1997, Steinar Bang wrote:

> >>>>> bren...@surry.net (Steven W. Brenneis):


>
> > Michi Henning <mi...@foxtail.dstc.edu.au> wrote:
> > I guess maybe the basis for my discussion would be more along the line
> > of, "Do we have to wait for a completely new CORBA standard to get a
> > new C++ mapping?" I agree that the C++ standard is evolving still,
> > but that evolution seems to have slowed somewhat and it might be
> > appropriate for OMG to look to that mapping. The standard library and
> > the standard template library offer a lot of opportunities that, IMHO,
> > could make the CORBA/C++ combination even more powerful.

Steinar, you misquoted me here - this was actually said by someone else,
not myself.

> In short:
> - IDL "sequence" to STL "vector"
> - IDL "string" to ISO C++ "string"
> - IDL "boolean" to ISO C++ "bool"
>
> I've been wishing for these three since 1995. How much of a problem
> is it that code will be using the existing binding?

If you do make that change, a lot of existing code will simply break.
At the very least, any changes along the lines you suggest will require
a migration strategy. A switch on the IDL compiler may be sufficient though.

Part of the reason for the current C++ mapping is that some of the
specifiers felt very strongly that the C++ data representation should
be compatible with the C data representation (in other words, the binary
format of the data should be the same for C and C++).

Personally, I don't think that was wise. The way the spec is written,
it is possible, but not required, to use the same binary representation,
so there is no guarantee.

In addition, some of the mapping ended up quite ugly because of the idea
that binary data should be the same in both C and C++ mappings.

IDL sequences are particularly affected by this. They really aren't
sequences, but variable-length vectors in the C++ mapping. As far as
a sequence abstraction is concerned, they don't really make the grade
(for example, you can't insert or delete an element in the middle of
a sequence).

At the same time, the mapping does *not* guarantee that sequences are
allocated in a contiguous block of memory, so you end up with the worst
of both worlds - the API limits sequences to variable-length vectors
without giving you any of the advantages associated with that :-(
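
To make the missing "insert in the middle" operation concrete, here is a
minimal sketch. MockLongSeq is a hypothetical stand-in for an
IDL-generated sequence<long> class, not a real ORB type; it assumes only
a length accessor, a length modifier and operator[], which is roughly
the interface the discussion above relies on. insert_at() is likewise an
illustrative helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IDL-generated sequence<long> class.
// NOT a real ORB type: it only models a length accessor, a length
// modifier and element access by index.
class MockLongSeq {
public:
    std::size_t length() const { return data_.size(); }
    void length(std::size_t n) { data_.resize(n); }  // grow or truncate
    long& operator[](std::size_t i) { return data_[i]; }
    const long& operator[](std::size_t i) const { return data_[i]; }
private:
    std::vector<long> data_;  // storage detail, invisible to callers
};

// Insert value at index pos. With no insert operation on the sequence
// itself, the caller has to grow it by one and shift the tail by hand.
void insert_at(MockLongSeq& seq, std::size_t pos, long value) {
    assert(pos <= seq.length());
    const std::size_t old_len = seq.length();
    seq.length(old_len + 1);
    for (std::size_t i = old_len; i > pos; --i) {
        seq[i] = seq[i - 1];  // move each tail element one slot up
    }
    seq[pos] = value;
}
```

With an STL vector the same operation is a single insert() call, which
is the kind of convenience the proposed mapping would bring.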

> I find myself
> doing conversions between the above types, to be able to pass them as
> CORBA parameters, while using the native C++ types at both ends
> (OK... sequence/vector conversions are too costly, but you get my
> drift...)

Sequence/vector conversions may be entirely affordable. It simply
depends on the relative costs of the additional data copy versus the
cost of implementing the body of an operation. Given that most IDL
calls arrive from remote clients (and take an eternity compared to
a local call), the relative cost of converting sequences to vectors and
back is very small (unless the sequences are truly huge).
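
The conversion in question is one element-by-element copy in each
direction. A minimal sketch follows, again using a hypothetical
MockLongSeq in place of a real IDL-generated sequence class; the helper
names to_vector(), to_sequence() and demo_round_trip() are made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IDL-generated sequence<long> class.
// NOT a real ORB type.
class MockLongSeq {
public:
    std::size_t length() const { return data_.size(); }
    void length(std::size_t n) { data_.resize(n); }
    long& operator[](std::size_t i) { return data_[i]; }
    const long& operator[](std::size_t i) const { return data_[i]; }
private:
    std::vector<long> data_;
};

// Copy a sequence into a vector: one pass over the elements.
std::vector<long> to_vector(const MockLongSeq& seq) {
    std::vector<long> out;
    out.reserve(seq.length());
    for (std::size_t i = 0; i < seq.length(); ++i) {
        out.push_back(seq[i]);
    }
    return out;
}

// Copy a vector into a sequence: set the length once, then assign.
MockLongSeq to_sequence(const std::vector<long>& v) {
    MockLongSeq seq;
    seq.length(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        seq[i] = v[i];
    }
    return seq;
}

// Usage: a round trip through the sequence type preserves the elements.
void demo_round_trip() {
    std::vector<long> v(3, 7L);  // three elements, each equal to 7
    assert(to_vector(to_sequence(v)) == v);
}
```

Each direction costs one linear copy, which is the extra work being
weighed against the cost of the call itself.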

For booleans, not all C++ compilers support type bool yet. If your
compiler does, lobby your ORB supplier to take advantage of that and
to map IDL boolean to C++ bool (that's a compliant mapping).

Steinar Bang

Sep 8, 1997, 3:00:00 AM

>>>>> Michi Henning <mi...@foxtail.dstc.edu.au>:

> On 4 Sep 1997, Steinar Bang wrote:

>> In short:
>> - IDL "sequence" to STL "vector"
>> - IDL "string" to ISO C++ "string"
>> - IDL "boolean" to ISO C++ "bool"

>> I've been wishing for these three since 1995. How much of a problem
>> is it that code will be using the existing binding?

> If you do make that change, a lot of existing code will simply
> break.

I know. My own among others.

> At the very least, any changes along the lines you suggest will
> require a migration strategy. A switch on the IDL compiler may be
> sufficient though.

It might. I would be willing to go through the migration process, to
get more elegant, more efficient and simpler source code.

> Part of the reason for the current C++ mapping is that some of the
> specifiers felt very strongly that the C++ data representation
> should be compatible with the C data representation (in other words,
> the binary format of the data should be the same for C and C++).

I know.

> Personally, I don't think that was wise.

I agree.

[snip!]

>> I find myself doing conversions between the above types, to be able
>> to pass them as CORBA parameters, while using the native C++ types
>> at both ends (OK... sequence/vector conversions are too costly, but
>> you get my drift...)

> Sequence/vector conversions may be entirely affordable. It simply
> depends on the relative costs of the additional data copy versus the
> cost of implementing the body of an operation. Given that most IDL
> calls arrive from remote clients (and take an eternity compared to
> a local call), the relative cost of converting sequences to vectors and
> back is very small (unless the sequences are truly huge).

Most of my IDL calls are, and will be, local.

> For booleans, not all C++ compilers support type bool yet. If your
> compiler does, lobby your ORB supplier to take advantage of that and
> to map IDL boolean to C++ bool (that's a compliant mapping).

Hm, interesting... Where in the spec?

Michi Henning

Sep 9, 1997, 3:00:00 AM

On 8 Sep 1997, Steinar Bang wrote:

> >>>>> Michi Henning <mi...@foxtail.dstc.edu.au>:


>
> > For booleans, not all C++ compilers support type bool yet. If your
> > compiler does, lobby your ORB supplier to take advantage of that and
> > to map IDL boolean to C++ bool (that's a compliant mapping).
>
> Hm, interesting... Where in the spec?

It's on page 16-12 in the new portability submission (orbos/97-04-14).
The spec for the boolean mapping is interesting for what it doesn't say,
rather than for what it does say...

If you read it carefully, you will find that IDL boolean can in fact be
mapped to absolutely any C++ type (including crazy stuff, like a 10 kB
class). The only requirement is that the mapped boolean must be
distinguishable for overloading from the other basic types (except char
and octet). The spec also requires (obliquely) that you must be able
to assign 0 and 1 to the type. In addition, for ANSI C++ compilers, you
also must be able to assign true and false (however, the spec does not
require that IDL boolean must be mapped to C++ bool in ANSI C++ environments).
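
As an illustration of how loose those requirements are, the sketch below
shows a class that would satisfy them as read above: it accepts 0 and 1,
accepts true and false, and is distinct from the other basic types for
overloading. MockBoolean and kind() are hypothetical names invented for
this example and are not taken from the spec or from any ORB; a real
mapping would more likely use a plain typedef or C++ bool. The code
assumes a compiler that supports type bool.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class type playing the role of the mapped IDL boolean.
// NOT from any ORB; it only demonstrates the loose requirements.
class MockBoolean {
public:
    MockBoolean(int v = 0) : value_(v != 0) {}  // accepts 0 and 1
    MockBoolean(bool v) : value_(v) {}          // accepts true and false
    operator bool() const { return value_; }    // usable in conditions
private:
    bool value_;
};

// Overloads showing the type is distinguishable from other basic types.
const char* kind(MockBoolean) { return "boolean"; }
const char* kind(long)        { return "long"; }
const char* kind(double)      { return "double"; }

// Usage: every assignment the requirements call for compiles and works.
void demo_boolean() {
    MockBoolean b;
    b = 1;
    assert(b);
    b = 0;
    assert(!b);
    b = true;
    assert(b);
    assert(std::strcmp(kind(b), "boolean") == 0);
}
```

Nothing here limits the size or layout of the type, which is the point
being made about a 10 kB class being technically acceptable.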
