Hoping not to do the ugly

Richard Harter

Mar 23, 2010, 5:53:14 PM

This is a request for suggestions about a good way to do
something. The context is message passing. The program(s) in
question are divided into system code and user code. User code
consists of lots of autonomous elements that pass messages back
and forth. System code takes care of scheduling user code
elements and handling the mechanics of message handling.

There is a lot more to it than that, but that will do for
context. The prototype for a user code element that responds to
a message currently looks like this:

void response_function(sigil_s, void *, packet_s);

The first and third arguments are typedefs for structs. There
are very good reasons why they aren't pointers to structs so
don't go there. For sundry reasons we don't want the fields of
these structs visible to users. (Keepen der sticky Fingers offen
die Buttons).

User code interacts with system code via an API; the game is to
get users to access structs via API calls.

One way to do this is to create structs that look like this:

struct pseudo_sigil {char[n]}; /* n is sizeof(sigil_s) */

In the API we can do something charming like:

sigil_s foo;
...
foo = (sigil_s)bar; /* bar is a pseudo_sigil struct */

This is mildly cumbersome but I think it can be made to work
except for one small glitch - we don't want to hard code the size
of sigil_s et al.

I can think of a couple of ways to hide the sizeof using
preprocessor magic but ugly is as ugly does.

If anybody has any good suggestions, I would be delighted to hear
them.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
It's not much to ask of the universe that it be fair;
it's not much to ask but it just doesn't happen.

Seebs

Mar 23, 2010, 6:22:10 PM

On 2010-03-23, Richard Harter <c...@tiac.net> wrote:
> I can think of a couple of ways to hide the sizeof using
> preprocessor magic but ugly is as ugly does.

Hmm.

A couple of questions:

1. Is it necessary to completely hide the structure, or do you think you
can get away with just TELLING people "this stuff here is not for using"?
2. Can you do some kind of code-generation-on-export? e.g.:
        #include <stdio.h>
        #include <internal_foo.h>

        int main(void) {
                printf("struct foo { char INTERNAL[%d]; };",
                        (int) sizeof(internal_foo));
                return 0;
        }
3. If you provide a friendly enough API for manipulating the objects,
will users trust you and just use the API?
4. What if you do that, and then regularly and sadistically break
any code which relies on the internals, and keep telling them "hey, we
TOLD you this was internal?"
5. How is your budget for ninja death squads to eliminate those who
oppose you? (Life at $dayjob is a lot better since we got a martial
artist in the group.)

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Eric Sosman

Mar 23, 2010, 6:26:40 PM

On 3/23/2010 5:53 PM, Richard Harter wrote:
> This is a request for suggestions about a good way to do
> something. The context is message passing. The program(s) in
> question are divided into system code and user code. User code
> consists of lots of autonomous elements that pass messages back
> and forth. System code takes care of scheduling user code
> elements and handling the mechanics of message handling.
>
> There is a lot more to it than that, but that will do for
> context. The prototype for a user code element that responds to
> a message currently looks like this:
>
>
> void response_function(sigil_s, void *, packet_s);
>
> The first and third arguments are typedefs for structs. There
> are very good reasons why they aren't pointers to structs so
> don't go there. For sundry reasons we don't want the fields of
> these structs visible to users. (Keepen der sticky Fingers offen
> die Buttons).
>
> User code interacts with system code via an API; the game is to
> get users to access structs via API calls.
>
> One way to do this is to create structs that look like this:
>
> struct pseudo_sigil {char[n]}; /* n is sizeof(sigil_s) */

(Missing an identifier.)

> In the API we can do something charming like:
>
> sigil_s foo;
> ...
> foo = (sigil_s)bar; /* bar is a pseudo_sigil struct */

Won't work, since you can't cast to a struct type. Other
possibilities:

        sigil_s foo;
        memcpy (&foo, &bar, sizeof foo);

        sigil_s foo = *(sigil_s*) &bar;

        sigil_s *pfoo = (sigil_s*) &bar;

The latter two could run into alignment problems that may or may
not be tractable. (In the last, `pfoo' might be spelled `pfui'.)
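A minimal sketch of the memcpy variant, for illustration only. The fields of sigil_s are invented here (the thread never shows them), the helper names sigil_from_pseudo and pseudo_from_sigil are hypothetical, and sizing the stand-in with sizeof(sigil_s) is a shortcut that a real public header could not take.

```c
#include <string.h>

/* Hypothetical internal type; the real fields are not shown in the thread. */
typedef struct { int owner; long serial; } sigil_s;

/* Public stand-in: same number of bytes, contents opaque to user code.
 * (A real public header could not use sizeof(sigil_s) like this.) */
typedef struct { unsigned char innards[sizeof(sigil_s)]; } pseudo_sigil;

/* memcpy copies byte by byte, so it does not care how bar is aligned. */
static sigil_s sigil_from_pseudo(pseudo_sigil bar)
{
    sigil_s foo;
    memcpy(&foo, &bar, sizeof foo);
    return foo;
}

static pseudo_sigil pseudo_from_sigil(sigil_s foo)
{
    pseudo_sigil bar;
    memcpy(&bar, &foo, sizeof foo);
    return bar;
}
```

Unlike the two pointer-cast forms, this never reads a sigil_s through an address that was only guaranteed to suit a char array.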

> This is mildly cumbersome but I think it can be made to work
> except for one small glitch - we don't want to hard code the size
>
> of sigil_s et al.
>
> I can think of a couple of ways to hide the sizeof using
> preprocessor magic but ugly is as ugly does.
>
> If anybody has any good suggestions, I would be delighted to hear

I think I'd write a "helper" program to generate a header
with the pseudo declarations, getting the sizes from the real
ones, something along the lines of:

        #include <stdio.h>
        #include "real_structs.h"

        static void
        declare(const char *tag, size_t size) {
            printf ("struct %s {\n"
                    "    char innards[%zu];\n"
                    "};\n",
                    tag, size);
        }

        int main(void) {
            declare("pseudo_sigil", sizeof(sigil_s));
            declare("pseudo_giggle", sizeof(giggle_s));
            ...
            return 0;
        }

If you're trying to deal with alignment issues, you'll
probably generate declarations more along the lines of

        struct pseudo_sigil {
            union {
                long align;
                char innards[42];
            } guts;
        };
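As a rough sketch, the helper could emit the aligned form directly. The function name declare_aligned is hypothetical, and it formats into a caller-supplied buffer rather than printing, purely so the result is easy to inspect. Choosing `long` as the alignment member follows the snippet above and is an assumption about what the real struct needs.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats an aligned stand-in declaration for `tag` into buf.
 * Returns what snprintf returns: the length that was (or would have
 * been) written, or a negative value on error. */
static int declare_aligned(char *buf, size_t buflen,
                           const char *tag, size_t size)
{
    return snprintf(buf, buflen,
                    "struct %s {\n"
                    "    union {\n"
                    "        long align;\n"
                    "        char innards[%zu];\n"
                    "    } guts;\n"
                    "};\n",
                    tag, size);
}
```

One thing to watch: the union's size is rounded up to a multiple of its alignment, so the stand-in can end up larger than the array length that was requested.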

--
Eric Sosman
eso...@ieee-dot-org.invalid

Ben Bacarisse

Mar 23, 2010, 8:07:21 PM

c...@tiac.net (Richard Harter) writes:

> This is a request for suggestions about a good way to do
> something. The context is message passing. The program(s) in
> question are divided into system code and user code. User code
> consists of lots of autonomous elements that pass messages back
> and forth. System code takes care of scheduling user code
> elements and handling the mechanics of message handling.
>
> There is a lot more to it than that, but that will do for
> context. The prototype for a user code element that responds to
> a message currently looks like this:
>
>
> void response_function(sigil_s, void *, packet_s);
>
> The first and third arguments are typedefs for structs. There
> are very good reasons why they aren't pointers to structs so
> don't go there. For sundry reasons we don't want the fields of
> these structs visible to users. (Keepen der sticky Fingers offen
> die Buttons).

Mildly silly idea: For this "real" struct:

struct sigil_s { char a; int b, c; };

you publish

union sigil_s {
    struct { char a; int b, c; } p1;
    struct { char a; int c, b; } p2;
    struct { int b, c; char a; } p3;
    struct { int c, b; char a; } p4;
    struct { int b; char a; int c; } p5;
    struct { int c; char a; int b; } p6;
};

but you never say which union member the API actually uses!

When some users have cracked this one, you wrap that in a struct with
a member that /encodes/ which union is to be used. This gets chosen
at random when the API allocates the object initially. That'll hold
'em off for a few more weeks.

Slightly more serious: Can you include a "check sum" or is that too
costly? Obviously, by "check sum" I mean a member that encodes some
secret function of the other members so that any changes not made by
API functions are detectable.
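A minimal sketch of the "check sum" idea, under invented assumptions: the fields of sigil_s, the mixing function, and the names sigil_seal and sigil_is_intact are all hypothetical. The mixing function is arbitrary and not cryptographic; it only illustrates the shape.

```c
#include <stdint.h>

/* Hypothetical layout; the thread never shows the real fields. */
typedef struct {
    int32_t  owner;
    int32_t  serial;
    uint32_t check;   /* depends on the other members */
} sigil_s;

/* Any function the system code keeps to itself would do here. */
static uint32_t sigil_mix(const sigil_s *s)
{
    uint32_t h = 0x9E3779B9u;
    h ^= (uint32_t)s->owner;
    h *= 0x01000193u;
    h ^= (uint32_t)s->serial;
    h *= 0x01000193u;
    return h;
}

/* Called by API functions after every legitimate change. */
static void sigil_seal(sigil_s *s)
{
    s->check = sigil_mix(s);
}

/* Called by API functions on entry; 0 means somebody else edited it. */
static int sigil_is_intact(const sigil_s *s)
{
    return s->check == sigil_mix(s);
}
```

This detects edits made outside the API; it does nothing to stop users from reading the fields.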

Finally, I'd suggest a "handle" interface where the system code maps
handles to real struct objects, but that is functionally so much like
a pointer that I imagine it is ruled out for the same reasons.
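For comparison, a handle interface might look roughly like the following. Everything here is assumed for illustration: the table size, the names sigil_register and sigil_lookup, and the fields of sigil_s.

```c
#include <stddef.h>

typedef struct { int owner; long serial; } sigil_s;  /* hypothetical fields */
typedef unsigned sigil_handle;     /* all that user code holds; 0 is invalid */

enum { MAX_SIGILS = 16 };
static sigil_s sigil_table[MAX_SIGILS];
static int     sigil_used[MAX_SIGILS];

/* System side: store the real object, hand out a small integer. */
static sigil_handle sigil_register(sigil_s s)
{
    for (unsigned i = 0; i < (unsigned)MAX_SIGILS; i++) {
        if (!sigil_used[i]) {
            sigil_used[i] = 1;
            sigil_table[i] = s;
            return i + 1;
        }
    }
    return 0;   /* table full */
}

/* System side: map a handle back to the object, or NULL if it is bogus. */
static const sigil_s *sigil_lookup(sigil_handle h)
{
    if (h == 0 || h > (unsigned)MAX_SIGILS || !sigil_used[h - 1])
        return NULL;
    return &sigil_table[h - 1];
}
```

User code never sees a sigil_s at all, only the integer, and a forged handle is caught by the lookup.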

<snip>
--
Ben.

BGB / cr88192

Mar 23, 2010, 8:08:34 PM

"Richard Harter" <c...@tiac.net> wrote in message
news:4ba93834....@text.giganews.com...

2 ideas:
if only struct pointers are needed, then one can simply use incomplete
structs for the client code, since the compiler generally only complains
about structs being incomplete if one tries to dereference them, ...

typedef struct foo_s foo;

#ifdef __MAGIC_THIS_IS_LIB_INTERNAL
struct foo_s {
...
};
#endif

pass-by-value, or giving the client something to hold onto, is a little
harder.
one way I have seen before is to have a dummy struct which is simply padding
(as well as a little other trickery to keep them the same size).

so, the internal code will see the full struct, and the client code will
simply see a struct full of padding...

personally though, I rarely use pass-by-value in these cases, and so the
incomplete struct strategy works fairly well...
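a rough single-file sketch of the incomplete struct strategy, with the two halves marked. the function names foo_create, foo_value and foo_destroy and the single field are made up for the example.

```c
#include <stdlib.h>

/* --- what the public header would contain --- */
typedef struct foo_s foo;      /* incomplete: clients cannot see the fields */
foo *foo_create(int value);
int  foo_value(const foo *f);
void foo_destroy(foo *f);

/* --- what only the library source would contain --- */
struct foo_s {
    int value;                 /* hypothetical field */
};

foo *foo_create(int value)
{
    foo *f = malloc(sizeof *f);
    if (f != NULL)
        f->value = value;
    return f;
}

int foo_value(const foo *f)
{
    return f->value;
}

void foo_destroy(foo *f)
{
    free(f);
}
```

client code can hold and pass a foo * freely, but writing f->value or sizeof(foo) there is a compile error because the type is incomplete.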

another strategy is to have void pointers or no-op structs, but these may
require casts to use...

Richard Harter

Mar 23, 2010, 11:24:36 PM

On 23 Mar 2010 22:22:10 GMT, Seebs <usenet...@seebs.net>
wrote:

>On 2010-03-23, Richard Harter <c...@tiac.net> wrote:
>> I can think of a couple of ways to hide the sizeof using
>> preprocessor magic but ugly is as ugly does.
>
>Hmm.
>
>A couple of questions:
>
>1. Is it necessary to completely hide the structure, or do you think you
>can get away with just TELLING people "this stuff here is not for using"?

My view is that it is necessary. It comes under the "There is
always ten percent that doesn't get the word" rule.

>2. Can you do some kind of code-generation-on-export? e.g.:
> #include <internal_foo.h>
>
> int main(void) {
> printf("struct foo { char INTERNAL[%d]; };",
> (int) sizeof(internal_foo));
> return 0;
> }

I expect it could be done. It's not an attractive choice.

>3. If you provide a friendly enough API for manipulating the objects,
>will users trust you and just use the API?

If all goes well end users don't even see the API. Environments
in which applications exist are on top of this code; they use the
API, and it is supposed to be solid even if the engine is
revised.

>4. What if you do that, and then regularly and sadistically break
>any code which relies on the internals, and keep telling them "hey, we
>TOLD you this was internal?"

We can't keep them from digging into the source code, but we can
remove temptation from view.

>5. How is your budget for ninja death squads to eliminate those who
>oppose you? (Life at $dayjob is a lot better since we got a martial
>artist in the group.)

There are things worse than death.

Richard Harter

Mar 23, 2010, 11:34:22 PM

On Tue, 23 Mar 2010 18:26:40 -0400, Eric Sosman
<eso...@ieee-dot-org.invalid> wrote:

>On 3/23/2010 5:53 PM, Richard Harter wrote:
>> This is a request for suggestions about a good way to do
>> something. The context is message passing. The program(s) in
>> question are divided into system code and user code. User code
>> consists of lots of autonomous elements that pass messages back
>> and forth. System code takes care of scheduling user code
>> elements and handling the mechanics of message handling.
>>
>> There is a lot more to it than that, but that will do for
>> context. The prototype for a user code element that responds to
>> a message currently looks like this:
>>
>>
>> void response_function(sigil_s, void *, packet_s);
>>
>> The first and third arguments are typedefs for structs. There
>> are very good reasons why they aren't pointers to structs so
>> don't go there. For sundry reasons we don't want the fields of
>> these structs visible to users. (Keepen der sticky Fingers offen
>> die Buttons).
>>
>> User code interacts with system code via an API; the game is to
>> get users to access structs via API calls.
>>
>> One way to do this is to create structs that look like this:
>>
>> struct pseudo_sigil {char[n]}; /* n is sizeof(sigil_s) */
>
> (Missing an identifier.)

You don't use zero length identifiers?

>
>> In the API we can do something charming like:
>>
>> sigil_s foo;
>> ...
>> foo = (sigil_s)bar; /* bar is a pseudo_sigil struct */
>
> Won't work, since you can't cast to a struct type. Other
>possibilities:
>
> sigil_s foo;
> memcpy (&foo, &bar, sizeof foo);
>
> sigil_s foo = *(sigil_s*) &bar;

This is what I had in mind but I got lazy. Mea culpa.

>
> sigil_s *pfoo = (sigil_s*) &bar;
>
>The latter two could run into alignment problems that may or may
>not be tractable. (In the last, `pfoo' might be spelled `pfui'.)

I don't think alignment will be a problem in this usage; the
general case is up for grabs.

Thanks for the suggestions; they will improve the result if I do
go the helper route.

Richard Harter

Mar 24, 2010, 12:19:22 AM

Snort.

>
>When some users have cracked this one, you wrap that in a struct with
>a member that /encodes/ which union is to be used. This gets chosen
>at random when the API allocates the object initially. That'll hold
>'em off for a few more weeks.
>
>Slightly more serious: Can you include a "check sum" or is that too
>costly? Obviously, by "check sum" I mean a member that encodes some
>secret function of the other members so that any changes not made by
>API functions are detectable.

I don't think that will work. I don't really want users to be
able to see inside these structs except through the eyes of the
API, nor do I want them to create them except by an API call.

>
>Finally, I'd suggest a "handle" interface where the system code maps
>handles to real struct objects, but that is functionally so much like
>a pointer that I imagine it is ruled out for the same reasons.

In the case of the sigil_s struct I can't use a handle because
the information about where the code is running and the identity
of the user function is embedded in the sigil. It is, so to
speak, a passport that the user function can use to access the
API.

In the case of the packet_s struct I could use a handle because
the sigil exists.

It occurs to me now that I can wrap the calls to the user code.
Currently response functions are invoked with

exec(auton->sigil,auton->globals,pkg->packet);

where exec is a function pointer for a response function. This
formulation puts a copy of auton->sigil and of pkg->packet on the
stack. The user can mess up the stack of course, but I can't do
much about that. However having copies on the stack means that
the user can't reach back into the internals of the system code
via pointers.

However what I could do is make copies (as automatic variables)
in the routine that calls the response functions. The code then
becomes:

    ...
    sigil_s sigil;
    packet_s packet;
    ...
    sigil = auton->sigil;
    packet = pkg->packet;
    exec(&sigil,auton->globals,&packet);

Now the user code is looking at pointers, but the pointers aren't
pointing into system code so that is okay.

Life is not all roses. Now the user code no longer has a valid
sigil; it only has a pointer to a sigil so that is the only thing
it can pass to the API which means that every API function has to
be changed. Bummer. We also have to do some checks on that
pointer.

Oh yes, we also want to change the response function prototype to

void response_function(void *, void *, void *);

I think this may work, but there may be a gotcha I'm not seeing
at the moment.
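A small sketch of the wrapped call, to show what the copies buy. The struct fields, the auton and package layouts, and the names dispatch and rude_response are all invented for the example; only the three-pointer prototype comes from the text above.

```c
typedef struct { int owner; } sigil_s;      /* hypothetical fields */
typedef struct { int payload; } packet_s;   /* hypothetical fields */

typedef void response_fn(void *sigil, void *globals, void *packet);

struct auton   { sigil_s sigil; void *globals; };
struct package { packet_s packet; };

/* System side: user code only ever receives the addresses of the
 * automatic copies, never pointers into system data. */
static void dispatch(response_fn *exec, const struct auton *auton,
                     const struct package *pkg)
{
    sigil_s  sigil  = auton->sigil;
    packet_s packet = pkg->packet;
    exec(&sigil, auton->globals, &packet);
}

/* A badly behaved response function that scribbles over what it got. */
static void rude_response(void *sigil, void *globals, void *packet)
{
    (void)globals;
    ((sigil_s *)sigil)->owner = -1;
    ((packet_s *)packet)->payload = -1;
}
```

After dispatch returns, the originals in the auton and the package are unchanged; only the short-lived copies were damaged. The globals pointer is still passed straight through, as in the original call.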

Nick Keighley

Mar 24, 2010, 5:07:32 AM

I don't see how you avoid it. Depending of course on what you mean by
"hard code". At compile time the value of n must be known no matter
how you wrap it up.

How have you got into the state where you trust your users so little?
Would you be better off with a less low level language where you have
more control over what they do?

Richard Bos

Mar 24, 2010, 5:48:14 AM

c...@tiac.net (Richard Harter) wrote:

> void response_function(sigil_s, void *, packet_s);
>
> The first and third arguments are typedefs for structs. There
> are very good reasons why they aren't pointers to structs so
> don't go there. For sundry reasons we don't want the fields of
> these structs visible to users. (Keepen der sticky Fingers offen
> die Buttons).

Those are more or less contradictory requirements. All solutions are
going to be ugly, because passing information without passing
information (i.e., prevaricating to your caller) _is_ ugly. Sorry.

Richard

Eric Sosman

Mar 24, 2010, 9:00:36 AM

On 3/23/2010 11:34 PM, Richard Harter wrote:
> On Tue, 23 Mar 2010 18:26:40 -0400, Eric Sosman
> <eso...@ieee-dot-org.invalid> wrote:

>> [...]
>> sigil_s *pfoo = (sigil_s*)&bar;
>>
>> The latter two could run into alignment problems that may or may
>> not be tractable. (In the last, `pfoo' might be spelled `pfui'.)
>
> I don't think alignment will be a problem in this usage; the
> general case is up for grabs.

Since the `bar' struct contains only a char[] array, it
might be aligned no more strictly than char itself -- that
is, at any address whatsoever. If the actual `sigil_s' struct
contains only char elements, that's probably all right. If
it contains any multi-byte elements, I think you should be
concerned.

--
Eric Sosman
eso...@ieee-dot-org.invalid

ImpalerCore

Mar 24, 2010, 10:13:57 AM

On Mar 23, 6:22 pm, Seebs <usenet-nos...@seebs.net> wrote:
> On 2010-03-23, Richard Harter <c...@tiac.net> wrote:
>
> > I can think of a couple of ways to hide the sizeof using
> > preprocessor magic but ugly is as ugly does.
>
> Hmm.
>
> A couple of questions:
>
> 1. Is it necessary to completely hide the structure, or do you think you
> can get away with just TELLING people "this stuff here is not for using"?
> 2. Can you do some kind of code-generation-on-export? e.g.:
> #include <internal_foo.h>
>
> int main(void) {
> printf("struct foo { char INTERNAL[%d]; };",
> (int) sizeof(internal_foo));
> return 0;
> }
> 3. If you provide a friendly enough API for manipulating the objects,
> will users trust you and just use the API?
> 4. What if you do that, and then regularly and sadistically break
> any code which relies on the internals, and keep telling them "hey, we
> TOLD you this was internal?"
> 5. How is your budget for ninja death squads to eliminate those who
> oppose you? (Life at $dayjob is a lot better since we got a martial
> artist in the group.)

6. Use a language that actually allows you to control data and
interface visibility through language supported features. I know it's
a long shot, but it is there.

> -s
> --
> Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
> http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
> http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Richard Harter

Mar 24, 2010, 10:21:47 AM

On Wed, 24 Mar 2010 09:00:36 -0400, Eric Sosman
<eso...@ieee-dot-org.invalid> wrote:

>On 3/23/2010 11:34 PM, Richard Harter wrote:
>> On Tue, 23 Mar 2010 18:26:40 -0400, Eric Sosman
>> <eso...@ieee-dot-org.invalid> wrote:
>>> [...]
>>> sigil_s *pfoo = (sigil_s*)&bar;
>>>
>>> The latter two could run into alignment problems that may or may
>>> not be tractable. (In the last, `pfoo' might be spelled `pfui'.)
>>
>> I don't think alignment will be a problem in this usage; the
>> general case is up for grabs.
>
> Since the `bar' struct contains only a char[] array, it
>might be aligned no more strictly than char itself -- that
>is, at any address whatsoever. If the actual `sigil_s' struct
>contains only char elements, that's probably all right. If
>it contains any multi-byte elements, I think you should be
>concerned.

I'm probably not going to use that technique, but I might, and I
think it is worth while exploring the issue to get it right. The
only use of bar is to be passed in calling sequences from the
system to user code which in turn passes it to a system API. It
is, so to speak, encoded when the system passes it to the user and
decoded when the API receives it.

In the system code we would have

    sigil_s sigil;
    ...
    response_function_pointer(*(pseudo_sigil *)&sigil,g,p);

In the response function we would have

    void response_function(pseudo_sigil bar, /* etc */)
    {
        ...
        api_call(bar,/* etc */);
        ...
    }

And in the API code we would have

    void api_call(pseudo_sigil bar, /* etc */)
    {
        sigil_s sigil;
        sigil = *(sigil_s *)&bar;
        ...
    }

Now this is unsafe if the user makes a copy of bar and mucks it
up, but let's assume that the user just passes bar on. Bar is a
struct; can it be unaligned? Can a passed aligned char array be
unaligned when received?

Supposing that it can because the only contents are a char array
can the following ever be unaligned?

    struct pseudo_sigil {
        union u {
            long dummy;
            char payload[24];
        } u;
    };

Richard Harter

Mar 24, 2010, 10:22:42 AM

On Wed, 24 Mar 2010 09:48:14 GMT, ral...@xs4all.nl (Richard Bos)
wrote:

I gather that you never use opaque pointers or handles.

Richard Harter

Mar 24, 2010, 10:59:44 AM

On Wed, 24 Mar 2010 02:07:32 -0700 (PDT), Nick Keighley
<nick_keigh...@hotmail.com> wrote:

>On 23 Mar, 21:53, c...@tiac.net (Richard Harter) wrote:

[snip]

>
>How have you got into the state where you trust your users so little?
>Would you be better off with a less low level language where you have
>more control over what they do?

It is not a question of not trusting users; the objective is to
create trustworthy code. If the user can easily or inadvertently
break my code then it is not very trustworthy. The closer code
is to the bottom of the pyramid the more trustworthy it should
be.

I wouldn't use a different language just to solve this little
problem, but one might have been appropriate for the entire task.

ImpalerCore

Mar 24, 2010, 11:24:08 AM

On Mar 24, 10:59 am, c...@tiac.net (Richard Harter) wrote:
> On Wed, 24 Mar 2010 02:07:32 -0700 (PDT), Nick Keighley
> <nick_keighley_nos...@hotmail.com> wrote:
> >On 23 Mar, 21:53, c...@tiac.net (Richard Harter) wrote:
>
> [snip]
>
>
>
> >How have you got into the state where you trust your users so little?
> >Would you be better off with a less low level language where you have
> >more control over what they do?
>
> It is not a question of not trusting users; the objective is to
> create trustworthy code. If the user can easily or inadvertently
> break my code then it is not very trustworthy. The closer code
> is to the bottom of the pyramid the more trustworthy it should
> be.

If these are errors of ignorance, maybe more education is needed? If
these errors aren't being made maliciously, maybe a FAQ or a set of
verbosely documented examples for your users to follow would be more
useful.

There's a difference to me between navigating a doxygen generated API
and seeing an example using that function in the proper way. It's
definitely more work to generate these examples, but it may be worth
the effort depending on the kinds of user issues you have.

> I wouldn't use a different language just to solve this little
> problem, but one might have been appropriate for the entire task.
>
> Richard Harter, c...@tiac.net
> http://home.tiac.net/~cri, http://www.varinoma.com

Nick Keighley

Mar 24, 2010, 11:58:16 AM

On 24 Mar, 14:22, c...@tiac.net (Richard Harter) wrote:
> On Wed, 24 Mar 2010 09:48:14 GMT, ralt...@xs4all.nl (Richard Bos)
> wrote:
>
> >c...@tiac.net (Richard Harter) wrote:
>
> >> void response_function(sigil_s, void *, packet_s);
>
> >> The first and third arguments are typedefs for structs. There
> >> are very good reasons why they aren't pointers to structs so
> >> don't go there. For sundry reasons we don't want the fields of
> >> these structs visible to users. (Keepen der sticky Fingers offen
> >> die Buttons).
>
> >Those are more or less contradictory requirements. All solutions are
> >going to be ugly, because passing information without passing
> >information (i.e., prevaricating to your caller) _is_ ugly. Sorry.
>
> I gather that you never use opaque pointers or handles.

you aren't using opaque pointers or handles, you're using opaque
values. Handles would make more sense to me.

Branimir Maksimovic

Mar 24, 2010, 12:58:31 PM

Well, at least in one DirectX driver the programmer used the last bit
of a pointer to tell a pointer from a handle, which made any
address above 2 GB in Win32 unusable.

Greets!

--
http://maxa.homedns.org/

Sometimes online sometimes not

Seebs

Mar 24, 2010, 1:50:50 PM

On 2010-03-24, Richard Harter <c...@tiac.net> wrote:
> On Wed, 24 Mar 2010 09:48:14 GMT, ral...@xs4all.nl (Richard Bos)
> wrote:
>>Those are more or less contradictory requirements. All solutions are
>>going to be ugly, because passing information without passing
>>information (i.e., prevaricating to your caller) _is_ ugly. Sorry.

> I gather that you never use opaque pointers or handles.

I don't think that follows. The problem is that you're requiring
the caller to actually pass the object, without knowing what the object
is. Opaque pointers and handles work by giving the user something other
than the underlying data to hand around. As long as you're requiring
them to pass the actual struct object, it's going to be pretty hard to
hide its contents, and no way of doing so will be elegant.

-s
--

Eric Sosman

Mar 24, 2010, 1:57:37 PM

On 3/24/2010 10:21 AM, Richard Harter wrote:
> [...]

> In the system code we would have
>
> sigil_s sigil;
> ...
> response_function_pointer(*(pseudo_sigil *)&sigil,g,p);
>
> In the response function we would have
>
> void response_function(pseudo_sigil bar, /* etc */)
> {
> ...
> api_call(bar,/* etc */);
> ...
> }
>
> And in the API code we would have
>
> void api_call(pseudo_sigil bar, /* etc */)
> {
> sigil_s sigil;
> sigil = *(sigil_s *)&bar;
> ...
> }
>
> Now this is unsafe if the user makes a copy of bar and mucks it
> up, but let's assume that the user just passes bar on. Bar is a
> struct; can it be unaligned? Can a passed aligned char array be
> unaligned when received?

The very act of passing by value *is* copying -- maybe to
a CPU register, maybe to somewhere on a stack, maybe to a bit
of memory set aside for the purpose, but a copy is made. Note
that if the receiving function changes the received struct (by
memset(), say), the changes do not affect the caller's version:
The two are therefore independent, the received parameter having
been copied from the caller's argument. Since you've chosen to
pass by value, you've chosen to copy. The "no copying" assumption
is a non-starter.
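    A tiny sketch of that point, with a hypothetical function named wipe standing in for the receiving function:

```c
#include <string.h>

struct pseudo_sigil { char payload[24]; };

/* bar is the receiver's own copy; clearing it cannot reach the
 * object the caller passed. */
static void wipe(struct pseudo_sigil bar)
{
    memset(&bar, 0, sizeof bar);
}
```

    If the caller fills a struct pseudo_sigil with 'x' bytes and calls wipe() on it, the caller's bytes are still 'x' afterwards. Where the copy lives (register, stack, elsewhere) is up to the implementation.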

So: The caller passes a struct pseudo_sigil, and the value
thereof (nit: possibly excluding padding bytes) is copied to
somewhere in the receiver. The receiver puts the bytes somewhere
that's appropriate for a struct pseudo_sigil object -- register,
stack, whatever, just somewhere. What I'm saying is that there's
no a priori reason to expect that the "somewhere" will also be
an appropriate place for a struct sigil_s object, which might
have a stricter alignment requirement.

> Supposing that it can because the only contents are a char array
> can the following ever be unaligned?
>
> struct pseudo_sigil {
>     union u {
>         long dummy;
>         char payload[24];
>     } u;
> };

This will be aligned at least as strictly as a `long'.
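    With a C11 compiler (which postdates this thread) the claim can be checked at compile time. A sketch, assuming <stdalign.h> and static_assert are available; the union member is given a name here so that the declaration is valid C:

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */

struct pseudo_sigil {
    union {
        long dummy;
        char payload[24];
    } u;
};

/* A struct is aligned at least as strictly as its strictest member. */
static_assert(alignof(struct pseudo_sigil) >= alignof(long),
              "pseudo_sigil should be at least long-aligned");

/* The payload must fit; the total may be rounded up for alignment. */
static_assert(sizeof(struct pseudo_sigil) >= 24,
              "pseudo_sigil should hold the 24-byte payload");
```

    Whether `long' is strict enough for the real sigil_s depends on what that struct contains, which the thread does not show.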

In the original post you said there were "very good reasons"
to pass by value instead of passing pointers. Since the pass-by-
value decision is the cause of many of your difficulties, I hope
the reasons really are good enough to justify all the extra work
and unpleasantness ...

--
Eric Sosman
eso...@ieee-dot-org.invalid

Richard Harter

Mar 24, 2010, 2:06:19 PM

On 24 Mar 2010 17:50:50 GMT, Seebs <usenet...@seebs.net>
wrote:

>On 2010-03-24, Richard Harter <c...@tiac.net> wrote:
>> On Wed, 24 Mar 2010 09:48:14 GMT, ral...@xs4all.nl (Richard Bos)
>> wrote:
>>>Those are more or less contradictory requirements. All solutions are
>>>going to be ugly, because passing information without passing
>>>information (i.e., prevaricating to your caller) _is_ ugly. Sorry.
>
>> I gather that you never use opaque pointers or handles.
>
>I don't think that follows. The problem is that you're requiring
>the caller to actually pass the object, without knowing what the object
>is. Opaque pointers and handles work by giving the user something other
>than the underlying data to hand around. As long as you're requiring
>them to pass the actual struct object, it's going to be pretty hard to
>hide its contents, and no way of doing so will be elegant.

Not so. As far as the user is concerned, they have a handle.
The fact that it is a one to one map of the bytes of the object
is irrelevant. They can't access the contents because they don't
know the structure of the object.

Seebs

Mar 24, 2010, 6:12:15 PM

On 2010-03-24, Richard Harter <c...@tiac.net> wrote:
> On 24 Mar 2010 17:50:50 GMT, Seebs <usenet...@seebs.net>
> wrote:
>>I don't think that follows. The problem is that you're requiring
>>the caller to actually pass the object, without knowing what the object
>>is. Opaque pointers and handles work by giving the user something other
>>than the underlying data to hand around. As long as you're requiring
>>them to pass the actual struct object, it's going to be pretty hard to
>>hide its contents, and no way of doing so will be elegant.

> Not so. As far as the user is concerned, they have a handle.
> The fact that it is a one to one map of the bytes of the object
> is irrelevant. They can't access the contents because they don't
> know the structure of the object.

Except that, unless you figure out a way to hand them some other object
which can be memcpy'd to/from the object you want, they have to have the
structure of the object available in order to pass it as an argument.

Richard Harter

Mar 24, 2010, 7:08:05 PM

On 24 Mar 2010 22:12:15 GMT, Seebs <usenet...@seebs.net>
wrote:

>On 2010-03-24, Richard Harter <c...@tiac.net> wrote:
>> On 24 Mar 2010 17:50:50 GMT, Seebs <usenet...@seebs.net>
>> wrote:
>>>I don't think that follows. The problem is that you're requiring
>>>the caller to actually pass the object, without knowing what the object
>>>is. Opaque pointers and handles work by giving the user something other
>>>than the underlying data to hand around. As long as you're requiring
>>>them to pass the actual struct object, it's going to be pretty hard to
>>>hide its contents, and no way of doing so will be elegant.
>
>> Not so. As far as the user is concerned, they have a handle.
>> The fact that it is a one to one map of the bytes of the object
>> is irrelevant. They can't access the contents because they don't
>> know the structure of the object.
>
>Except that, unless you figure out a way to hand them some other object
>which can be memcpy'd to/from the object you want, they have to have the
>structure of the object available in order to pass it as an argument.

To repeat myself, not so. There are two struct declarations,
sigil_s and pseudo_sigil. They have the same size. User code
and API code expects the pseudo_sigil format (see below). When
the system code calls user code a sigil_s struct is converted
into a pseudo_sigil struct via a cast. When the user code calls
an API it passes the pseudo_sigil struct (or copy thereof) it
received to the API. The API converts the pseudo_sigil struct it
received to a sigil_s struct, again using a cast.

The user code only sees the pseudo_sigil structure; it never sees
the sigil_s structure. It's quite straightforward.

Incidentally, as Eric pointed out upthread, the pseudo_sigil
structure needs a union to force alignment, e.g.,

    struct pseudo_sigil {
        union u {
            long dummy; /* use the appropriate type */
            char c[24]; /* get the right size somehow */
        } u;
    };

The technique is simple enough, albeit not particularly pretty.
The real objection is that there is no clean way (that I can
think of) to get the size.
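One partial answer, sketched under assumptions: keep a hand-written size in the public declaration and have the system code refuse to compile if it goes stale. This uses C11 static_assert and alignof, which postdate this thread; the macro PSEUDO_SIGIL_SIZE, the fields of sigil_s, and the names sigil_encode and sigil_decode are all invented for the example.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <string.h>

/* Public side (hypothetical): the size is still written by hand. */
#define PSEUDO_SIGIL_SIZE 24

struct pseudo_sigil {
    union {
        long dummy;
        char c[PSEUDO_SIGIL_SIZE];
    } u;
};

/* System side only; these fields are invented for the sketch. */
typedef struct { long serial; int owner; int flags; } sigil_s;

/* The build breaks if sigil_s outgrows the public stand-in. */
static_assert(sizeof(sigil_s) <= sizeof(struct pseudo_sigil),
              "PSEUDO_SIGIL_SIZE is too small for sigil_s");
static_assert(alignof(sigil_s) <= alignof(struct pseudo_sigil),
              "pseudo_sigil is not aligned strictly enough");

static struct pseudo_sigil sigil_encode(sigil_s s)
{
    struct pseudo_sigil p;
    memset(&p, 0, sizeof p);     /* define any bytes beyond sizeof s */
    memcpy(&p, &s, sizeof s);
    return p;
}

static sigil_s sigil_decode(struct pseudo_sigil p)
{
    sigil_s s;
    memcpy(&s, &p, sizeof s);
    return s;
}
```

This does not remove the hard-coded number; it only guarantees that a wrong one is caught when the system code is built. The sketch allows the stand-in to be larger than sigil_s rather than exactly the same size.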

Richard Harter

Mar 24, 2010, 7:09:17 PM

On Wed, 24 Mar 2010 13:57:37 -0400, Eric Sosman
<eso...@ieee-dot-org.invalid> wrote:

Good stuff. Thanks for the comments.

Richard Harter

Mar 24, 2010, 8:16:41 PM

On Wed, 24 Mar 2010 08:24:08 -0700 (PDT), ImpalerCore
<jadi...@gmail.com> wrote:

>On Mar 24, 10:59 am, c...@tiac.net (Richard Harter) wrote:
>> On Wed, 24 Mar 2010 02:07:32 -0700 (PDT), Nick Keighley
>>
>> <nick_keighley_nos...@hotmail.com> wrote:
>> >On 23 Mar, 21:53, c...@tiac.net (Richard Harter) wrote:
>>
>> [snip]
>>
>>
>>
>> >How have you got into the state where you trust your users so little?
>> >Would you be better off with a less low level language where you have
>> >more control over what they do?
>>
>> It is not a question of not trusting users; the objective is to
>> create trustworthy code. If the user can easily or inadvertently
>> break my code then it is not very trustworthy. The closer code
>> is to the bottom of the pyramid the more trustworthy it should
>> be.
>
>If these are errors of ignorance, maybe more education is needed? If
>these errors aren't being made maliciously, maybe a FAQ or a set of
>verbosely documented examples for your users to follow would be more
>useful.

These are always good things, of course, but not quite to the
point. If a user error breaks my code, I share the blame. If my
code is robust user errors won't break it. The purpose of
education is to help the user use the software more effectively;
it shouldn't be about navigating around booby traps left in the
code.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com

Eric Sosman

Mar 24, 2010, 8:50:16 PM

On 3/24/2010 7:08 PM, Richard Harter wrote:
> On 24 Mar 2010 22:12:15 GMT, Seebs<usenet...@seebs.net>
> wrote:
>
>> On 2010-03-24, Richard Harter<c...@tiac.net> wrote:
>>> On 24 Mar 2010 17:50:50 GMT, Seebs<usenet...@seebs.net>
>>> wrote:

>>>> [...] As long as you're requiring

>>>> them to pass the actual struct object, it's going to be pretty hard to
>>>> hide its contents, and no way of doing so will be elegant.
>>
>>> Not so. As far as the user is concerned, they have a handle.
>>> The fact that it is a one to one map of the bytes of the object
>>> is irrelevant. They can't access the contents because they don't
>>> know the structure of the object.
>>
>> Except that, unless you figure out a way to hand them some other object
>> which can be memcpy'd to/from the object you want, they have to have the
>> structure of the object available in order to pass it as an argument.
>
> To repeat myself, not so. There are two struct declarations,

> sigil_s and pseudo_sigil. They have the same size.[...]

Ah, this is obviously some strange usage of the word "elegant"
that I wasn't previously aware of.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Richard Harter

Mar 24, 2010, 9:35:54 PM

Oh please, it should be obvious that the reference of "not so" is
to "pretty hard to hide its contents" and not to "will be
elegant". Your misreading is, ah, not elegant.

Nick

Mar 25, 2010, 3:11:14 AM

c...@tiac.net (Richard Harter) writes:

I'm interested in why you absolutely have to pass structures and not
pointers-to-structures which would - of course - allow you to hide
everything. Is it to avoid having allocation and destruction functions,
which I can understand, or is there more to it than that?
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

ImpalerCore

Mar 25, 2010, 9:47:48 AM

The reason I mention it is that C itself has a rather large FAQ to
educate people on the common missteps of C.

Are these user errors made from C programmers, config/script file
writers, people interacting with a GUI with no programming background?


Tim Rentsch

Mar 25, 2010, 10:32:10 AM

c...@tiac.net (Richard Harter) writes:

> This is a request for suggestions about a good way to do
> something. The context is message passing. The program(s) in
> question are divided into system code and user code. User code
> consists of lots of autonomous elements that pass messages back
> and forth. System code takes care of scheduling user code
> elements and handling the mechanics of message handling.
>
> There is a lot more to it than that, but that will do for
> context. The prototype for a user code element that responds to
> a message currently looks like this:
>
>

> void response_function(sigil_s, void *, packet_s);
>
> The first and third arguments are typedefs for structs. There
> are very good reasons why they aren't pointers to structs so
> don't go there. For sundry reasons we don't want the fields of
> these structs visible to users. (Keepen der sticky Fingers offen
> die Buttons).
>

> User code interacts with system code via an API; the game is to
> get users to access structs via API calls.
>
> One way to do this is to create structs that look like this:
>
> struct pseudo_sigil {char[n]}; /* n is sizeof(sigil_s) */
>
> In the API we can do something charming like:
>
> sigil_s foo;
> ...
> foo = (sigil_s)bar; /* bar is a pseudo_sigil struct */
>
> This is mildly cumbersome but I think it can be made to work
> except for one small glitch - we don't want to hard code the size
> of sigil_s et al.
>

> I can think of a couple of ways to hide the sizeof using
> preprocessor magic but ugly is as ugly does.
>

> If anybody has any good suggestions, I would be delighted to hear
> them.

Reading your comments here and also several of your followup
responses, it's not clear to me just how "firm" you want the
solution to be technically or whether your requirements (and I'm
still not sure what those are) might be achievable through a
combination of technical barriers and "human engineering". If
the latter combination is acceptable (and assuming the client
data type can be a union rather than a struct), one plausible
approach is something like this:

/* Two types, the first being the system internal/private */
/* data type, and the second being the user code public */
/* data type. */

typedef struct { /* as appropriate */ } xyz_PRIVATE;

typedef union {
struct { unsigned char stuff[ sizeof (xyz_PRIVATE) ]; } contents;
const xyz_PRIVATE dont_even_THINK_of_using_this;
} user_sigil;

User code uses the 'user_sigil' type exclusively. The 'const'
on the internal type member makes it unavailable for ordinary
assignment. The 'user_sigil' type is usable directly for
initialization and argument passing. For assignment (eg,
if user code wants to store a 'user_sigil' into a malloc'ed
area), user code needs to use the 'contents' member, such
as:

user_sigil a = one_thing(), b = two_thing(), c;

/* set c to one of a or b */
c.contents = who_knows() ? a.contents : b.contents;

Obviously this general approach presents a fairly high barrier
(in human engineering terms) against _writing_ members in the
internal data type, but a decidedly lower barrier against
_reading_ members in the internal data type (although, using a
member with a name like 'dont_even_THINK_of_using_this' would be
a fairly bright red flag for most people). And of course casting
could be used by "malicious" users to defeat these protection
barriers completely if someone were of a mind to do that. I
think the more technical kinds of solutions have been covered
pretty well in some of the other followups.

However, since I don't really know the shape of the space of the
constraints you want to satisfy, I offer these ideas up for your
consideration in case they help your thinking move in a good
direction.
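
A compilable sketch of the approach above, under stated assumptions. The two int fields of xyz_PRIVATE and the helper names make_sigil, sigil_engine and pick_engine are hypothetical and stand in for whatever the real system provides; make_sigil plays the role of one_thing() and two_thing().

```c
#include <assert.h>
#include <string.h>

/* System-internal type; the fields are assumed for illustration. */
typedef struct { int engine; int serial; } xyz_PRIVATE;

/* User-visible type, as in the post above. */
typedef union {
    struct { unsigned char stuff[ sizeof (xyz_PRIVATE) ]; } contents;
    const xyz_PRIVATE dont_even_THINK_of_using_this;
} user_sigil;

/* System code: build a user_sigil from internal data. */
user_sigil make_sigil(int engine, int serial)
{
    user_sigil s = { { { 0 } } };
    xyz_PRIVATE priv;

    priv.engine = engine;
    priv.serial = serial;
    memcpy(s.contents.stuff, &priv, sizeof priv);
    return s;
}

/* API code: read a field back out without exposing the layout. */
int sigil_engine(user_sigil s)
{
    xyz_PRIVATE priv;

    memcpy(&priv, s.contents.stuff, sizeof priv);
    return priv.engine;
}

/* User code: initialization and argument passing work directly,
   but assignment has to go through the 'contents' member. */
int pick_engine(int use_first)
{
    user_sigil a = make_sigil(1, 10), b = make_sigil(2, 20), c;

    /* c = a;  would be rejected: the union has a const member */
    c.contents = use_first ? a.contents : b.contents;
    return sigil_engine(c);
}
```

The whole-union assignment is the part the compiler refuses; everything else a user normally needs (declare, initialize, pass, return) still works on the type as a unit.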

Nick Keighley

Mar 25, 2010, 11:50:42 AM

On 25 Mar, 13:47, ImpalerCore <jadil...@gmail.com> wrote:
> On Mar 24, 8:16 pm, c...@tiac.net (Richard Harter) wrote:
> > On Wed, 24 Mar 2010 08:24:08 -0700 (PDT), ImpalerCore
> > <jadil...@gmail.com> wrote:
> > >On Mar 24, 10:59 am, c...@tiac.net (Richard Harter) wrote:
> > >> On Wed, 24 Mar 2010 02:07:32 -0700 (PDT), Nick Keighley
> > >> <nick_keighley_nos...@hotmail.com> wrote:
> > >> >On 23 Mar, 21:53, c...@tiac.net (Richard Harter) wrote:

> > >> >How have you got into the state where you trust your users so little?
> > >> >Would you be better off with a less low level language where you have
> > >> >more control over what they do?
>
> > >> It is not a question of not trusting users; the objective is to
> > >> create trustworthy code. If the user can easily or inadvertently
> > >> break my code then it is not very trustworthy. The closer code
> > >> is to the bottom of the pyramid the more trustworthy it should
> > >> be.
>
> > >If these are errors of ignorance, maybe more education is needed? If
> > >these errors aren't being made maliciously, maybe a FAQ or a set of
> > >verbosely documented examples for your users to follow would be more
> > >useful.
>
> > These are always good things, of course, but not quite to the
> > point. If a user error breaks my code, I share the blame. If my
> > code is robust user errors won't break it. The purpose of
> > education is to help the user use the software more effectively;
> > it shouldn't be about navigating around booby traps left in the
> > code.
>
> The reason I mention it is that C itself has a rather large FAQ to
> educate people on the common missteps of C.

Fascinating though the FAQ is, I don't think it really guides you away
from many of the pitfalls. I'm sure it doesn't have

Q. 42.13 When is it appropriate to violate the documented API and
hack the data structures directly?

Richard Harter

Mar 26, 2010, 10:56:29 AM

On Thu, 25 Mar 2010 06:47:48 -0700 (PDT), ImpalerCore
<jadi...@gmail.com> wrote:

>On Mar 24, 8:16 pm, c...@tiac.net (Richard Harter) wrote:
>> On Wed, 24 Mar 2010 08:24:08 -0700 (PDT), ImpalerCore
>> <jadil...@gmail.com> wrote:
>> >On Mar 24, 10:59 am, c...@tiac.net (Richard Harter) wrote:
>> >> On Wed, 24 Mar 2010 02:07:32 -0700 (PDT), Nick Keighley
>>
>> >> <nick_keighley_nos...@hotmail.com> wrote:
>> >> >On 23 Mar, 21:53, c...@tiac.net (Richard Harter) wrote:
>>
>> >> [snip]
>>
>> >> >How have you got into the state where you trust your users so little?
>> >> >Would you be better off with a less low level language where you have
>> >> >more control over what they do?
>>
>> >> It is not a question of not trusting users; the objective is to
>> >> create trustworthy code. If the user can easily or inadvertently
>> >> break my code then it is not very trustworthy. The closer code
>> >> is to the bottom of the pyramid the more trustworthy it should
>> >> be.
>>
>> >If these are errors of ignorance, maybe more education is needed? If
>> >these errors aren't being made maliciously, maybe a FAQ or a set of
>> >verbosely documented examples for your users to follow would be more
>> >useful.
>>
>> These are always good things, of course, but not quite to the
>> point. If a user error breaks my code, I share the blame. If my
>> code is robust user errors won't break it. The purpose of
>> education is to help the user use the software more effectively;
>> it shouldn't be about navigating around booby traps left in the
>> code.
>
>The reason I mention it is that C itself has a rather large FAQ to
>educate people on the common missteps of C.
>
>Are these user errors made from C programmers, config/script file
>writers, people interacting with a GUI with no programming background?

Answer 1:
Yes.

Answer 2:
"There are no user errors, there are only defensive programming
errors."

Answer 3:
When you make your code idiot proof, the universe creates better
idiots.

Answer 4:
I am creating an environment. Multiple instances of user code
are directly or indirectly resident in that environment. Bugs
and programming errors in user code should not be able to crash
the environment.

Users may be at various levels of sophistication.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com

Richard Harter

Mar 26, 2010, 2:45:34 PM

I'm not smart enough to give you a short answer so I will give
you a longer answer. The program in question is the heart of a
data flow programming engine. An analogous program would be the
Erlang run time system. In the model of programming I am
supporting programs are composed of autonomous computational
elements that can reside on different cores, threads, or machines.

The autonomous computational elements (I call them autons)
conceptually have several input ports and several output ports,
along with persistent state data. Data/messages/documents
magically appear at input ports; the engine is responsible for
moving data about, maintaining queues, and waking up autons.

The relevant code in an auton are the input response functions,
one for each input port. The engine invokes them. Each invoked
function runs to completion; it can send out messages that go
elsewhere (it doesn't need to know where they go). The code in
these response functions could be C or it could be in some other
language as long as it has a C interface. All user/user and
user/engine communication is done via API calls. There is no
visible shared memory, no visible locks, and no visible threads.

The calling sequence for an input response function has three
basic arguments: a sigil, a pointer to the auton's persistent
memory, and a packet containing the input data for this
particular invocation. In effect the sigil is a passport that is
needed for API calls.

The fundamental issue is that I don't want user code to crash the
engine, either accidentally or maliciously. For this reason I
don't want to pass pointers - they are an open invitation to
buffer attacks. Don't I trust my users? No. I don't know who
they might be in the future.

This is why at some point in developing the implementation I
decided to pass structs rather than pointers. (In the internal
engine code structs are often passed for other reasons.) In the
latest review it became clear that passing structs is also
problematic.

After all of this discussion I think I see a fairly clean and
safe alternative. The key is that multiple engines can be
running at once. The essential element in a sigil is the engine
identifier. The calling engine knows who it called so it can
keep the relevant data and pass it to any API calls. In turn the
user code has no alternative but to use the API. That is what I
should have done in the first place.

There. Now you know. Aren't you sorry you asked.
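
One way the engine-identifier idea might look, as a rough sketch only. Every name here (engine_id, engine_create, engine_enter, engine_leave, api_send, engine_sent_count) is hypothetical, as is the choice of a small integer for the identifier. The point it illustrates is that user code holds nothing but an identifier, while the engine keeps the real state and checks the identifier on every API call.

```c
#include <assert.h>

#define MAX_ENGINES 8

typedef int engine_id;        /* the only thing user code ever holds */

typedef struct {
    int in_use;
    int current_auton;        /* auton being run right now, -1 if none */
    int messages_sent;
} engine_state;

/* Private to the system code; never exposed to users. */
static engine_state engines[MAX_ENGINES];

/* Validate an identifier before touching any state. */
static engine_state *lookup(engine_id id)
{
    if (id < 0 || id >= MAX_ENGINES || !engines[id].in_use)
        return 0;
    return &engines[id];
}

/* System code: start an engine; returns -1 if the table is full. */
engine_id engine_create(void)
{
    int i;

    for (i = 0; i < MAX_ENGINES; i++) {
        if (!engines[i].in_use) {
            engines[i].in_use = 1;
            engines[i].current_auton = -1;
            engines[i].messages_sent = 0;
            return i;
        }
    }
    return -1;
}

/* System code: record which auton is about to be invoked. */
int engine_enter(engine_id id, int auton)
{
    engine_state *e = lookup(id);

    assert(auton >= 0);       /* internal invariant of the trusted caller */
    if (e == 0)
        return -1;
    e->current_auton = auton;
    return 0;
}

/* System code: the response function has returned. */
void engine_leave(engine_id id)
{
    engine_state *e = lookup(id);

    if (e != 0)
        e->current_auton = -1;
}

/* API call available to user code: 0 on success, -1 if rejected. */
int api_send(engine_id id, int port)
{
    engine_state *e = lookup(id);

    if (e == 0 || e->current_auton < 0 || port < 0)
        return -1;
    e->messages_sent++;       /* the engine knows who is sending */
    return 0;
}

/* System code: inspect the count, e.g. for diagnostics. */
int engine_sent_count(engine_id id)
{
    engine_state *e = lookup(id);

    return e ? e->messages_sent : -1;
}
```

A bogus identifier, or a call made when no auton is running, is simply refused; there is no struct or pointer in user hands to corrupt.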

Hallvard B Furuseth

Apr 1, 2010, 4:07:28 AM

Richard Harter writes:
> void response_function(sigil_s, void *, packet_s);
>
> The first and third arguments are typedefs for structs. There
> are very good reasons why they aren't pointers to structs so
> don't go there. For sundry reasons we don't want the fields of
> these structs visible to users. (Keepen der sticky Fingers offen
> die Buttons).
>
> User code interacts with system code via an API; the game is to
> get users to access structs via API calls.
> One way to do this is to create structs that look like this:
>
> struct pseudo_sigil {char[n]}; /* n is sizeof(sigil_s) */

Coming a bit late to this, but I'm curious if the following is right or
if I've gotten lost in the twisty mazes of the Standard:

When I looked at a similar problem (not due to my own requirements) I
solved it by giving up, which seemed a nicer solution than what I could
think of. I've seen some of my reasons in this thread, but not all:

It'd work to memcpy to a struct pseudo_sigil and pass that, but that
didn't fit my case. Ignoring that solution:

Structs sigil and pseudo_sigil must *both* have the same alignment
requirements and same size, otherwise either passing it from user code
to user-invisible code or vice versa can trap. Unless users never pass
the struct back to the user-invisible code.

So one struct and one union doesn't work, and one struct with a union
and one without is dodgy at best. The union itself could have alignment
requirements that do not exist for the members or the struct. In any
case, I don't remember anything to stop the compiler from using
different calling conventions for different structs?

This likely means sigil and pseudo_sigil should either be equal sans the
names, or they should both be unions: The structs as first member, other
members ensuring alignment and size. (This stopped me: Wanted backwards
binary compat.) The size member can be larger than the expected struct
size, wasting some space but allowing the user-visible code to not know
the exact size of the real struct. A compile-time assert in the
user-invisible part can check that the size is not too small after all.
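
A sketch of the two-union layout with a compile-time check, under stated assumptions. SIGIL_STORAGE, the alignment members, the fields of sigil_s, and the helpers sigil_export and sigil_import are all hypothetical. The assertion uses the old negative-array-size trick so it works without C11; the conversion goes through memcpy, which sidesteps the aliasing questions rather than answering them.

```c
#include <assert.h>
#include <string.h>

#define SIGIL_STORAGE 64   /* assumed generous upper bound on the real size */

/* User-visible union: storage plus members that force alignment. */
typedef union pseudo_sigil_u {
    unsigned char bytes[SIGIL_STORAGE];
    long   align_l;
    double align_d;
    void  *align_p;
} pseudo_sigil_u;

/* User-invisible side: the real struct as first member, then the
   same size and alignment members as the public union. */
typedef struct sigil_s { int engine; int auton; double stamp; } sigil_s;

typedef union sigil_u {
    sigil_s real;
    unsigned char bytes[SIGIL_STORAGE];
    long   align_l;
    double align_d;
    void  *align_p;
} sigil_u;

/* Compile-time asserts: the array size is -1 if a condition fails. */
typedef char check_real_fits[(sizeof(sigil_s) <= SIGIL_STORAGE) ? 1 : -1];
typedef char check_same_size[(sizeof(sigil_u) == sizeof(pseudo_sigil_u)) ? 1 : -1];

/* System code: copy the private union into the public one. */
pseudo_sigil_u sigil_export(sigil_s real)
{
    sigil_u priv;
    pseudo_sigil_u pub;

    memset(&priv, 0, sizeof priv);
    priv.real = real;
    memcpy(&pub, &priv, sizeof pub);
    return pub;
}

/* API code: copy back and read the real struct. */
sigil_s sigil_import(pseudo_sigil_u pub)
{
    sigil_u priv;

    memcpy(&priv, &pub, sizeof priv);
    return priv.real;
}
```

If the real struct ever outgrows SIGIL_STORAGE, or acquires a stricter alignment that changes the union size, the build of the user-invisible part fails instead of the program misbehaving at run time.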

But then there are the aliasing rules. A static or stack object has
the effective type its variable was declared with, and (roughly) should
be accessed as that type on the pain of undefined behavior. Union
members are an exception - but reading other members than the one last
written is undefined, which looks like an exception to the exception.

For that matter, what if some code accesses a float in union sigil_u at
offset 4, which came from a union pseudo_sigil_u object and that doesn't
have a float at that offset? That's not supposed to happen, so the
compiler's implementation of aliasing rules need not be careful about
it. I suppose it might notice that this float can't possibly have been
set, or something like that, and optimize on it.

In this case that'd require more magical link-time optimization than
what I'd heard about so far, so it probably would work and keep
working. But that's not quite how I prefer to describe my programs.

Anyway, am I getting too paranoid here?

Some other random notes from reading this thread:

Richard mentioned passing an ID around instead of pointers. Reminded me
of a pointer representation which allows Boehm-style garbage collection
to be compacting, if that's of any use in this case: C pointers consist
of (object ID, offset), and the ID is an index to an internal table of
(object address, length). Thus the system can move the object and
change the address without needing to update the C pointer value.

Regarding "hiding pointers", I did that once with ~(intptr_t)(void*)ptr.
That was to hide the pointer from valgrind though, not the user. And it
doesn't help against wild pointers. Still, if users are sufficiently
determined to break the rules for the program, they'll succeed anyway.

--
Hallvard

Richard Harter

Apr 2, 2010, 11:54:30 AM

I got lost in your arguments. Clearly alignment has to be taken
care of, but it suffices to create a union of all the relevant
types as a component. Or, IIANM, it suffices to get all structs
from the storage allocator since that must return aligned
addresses.

Did your application absolutely require consistent sizes?
Otherwise I don't see why the pseudo_sigil struct can't be
larger.

>
>
>Some other random notes from reading this thread:
>
>Richard mentioned passing an ID around instead of pointers. Reminded me
>of a pointer representation which allows Boehm-style garbage collection
>to be compacting, if that's of any use in this case: C pointers consist
>of (object ID, offset), and the ID is an index to an internal table of
>(object address, length). Thus the system can move the object and
>change the address without needing to update the C pointer value.

Cute. I've done things like that, but not in the context of
coordinating with a garbage collector. If I were to do that I
would use "pointers" of the form (index,seqno,offset) with the
table holding (seqno,address,length). Whenever a table slot is
reused the sequence number is bumped. The point of this little
song and dance is that you now have a check on stale pointers.
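
A small sketch of the (index,seqno,offset) scheme, assuming a fixed-size table and hypothetical names (handle, slot, handle_register, handle_release, handle_resolve). A slot with a null address is treated as free, which is an assumption of this sketch and not something stated above.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SLOTS 16

/* The "pointer" that gets handed around. */
typedef struct {
    unsigned index;
    unsigned seqno;
    size_t   offset;
} handle;

/* One table entry; a null address marks a free slot. */
typedef struct {
    unsigned       seqno;
    unsigned char *address;
    size_t         length;
} slot;

static slot table[TABLE_SLOTS];

/* Claim a slot for an object; returns 0 on success, -1 on failure. */
int handle_register(void *address, size_t length, handle *out)
{
    unsigned i;

    if (address == NULL || out == NULL)
        return -1;
    for (i = 0; i < TABLE_SLOTS; i++) {
        if (table[i].address == NULL) {
            table[i].seqno++;              /* bumped whenever the slot is reused */
            table[i].address = address;
            table[i].length = length;
            out->index = i;
            out->seqno = table[i].seqno;
            out->offset = 0;
            return 0;
        }
    }
    return -1;
}

/* Give the slot back; handles with an old seqno are ignored. */
void handle_release(handle h)
{
    if (h.index < TABLE_SLOTS && table[h.index].seqno == h.seqno) {
        table[h.index].address = NULL;
        table[h.index].length = 0;
    }
}

/* Turn a handle into a real address, or NULL if it is stale,
   out of range, or points past the end of the object. */
void *handle_resolve(handle h)
{
    const slot *s;

    if (h.index >= TABLE_SLOTS)
        return NULL;
    s = &table[h.index];
    if (s->address == NULL || s->seqno != h.seqno || h.offset >= s->length)
        return NULL;
    return s->address + h.offset;
}
```

Because the table holds the only real address, the system can move an object by updating one slot, and a handle kept past its object's lifetime fails the seqno comparison instead of silently reaching whatever now occupies the slot.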

>
>Regarding "hiding pointers", I did that once with ~(intptr_t)(void*)ptr.
>That was to hide the pointer from valgrind though, not the user. And it
>doesn't help against wild pointers. Still, if users are sufficiently
>determined to break the rules for the program, they'll succeed anyway.