Generic callback mechanism in NCI


par...@seamons.com

Oct 19, 2007, 12:31:59 AM
to perl6-i...@perl.org
I started to write an OpenGL library and was only a couple of dozen
lines into the pir when I remembered the documentation about callbacks
in docs/pdds/draft/pdd16_native_call.pod .

Currently only two signatures are supported for callbacks: one
with user_data and extern_data and the other with extern_data and
user_data (only the positions differ). These are nice, general
functions, and they work well because the user_data is (or should be)
opaque to the calling C function, provides a place to
store the PMC, and is used to look up the interpreter
that should run the sub stored in the PMC. The PDD says
that outside the two provided signatures, anybody wanting to implement
NCI connections to callback functions will need to do some hand C
coding. Hand C coding isn't at all bad, but it would be nice to have
a generic mechanism for supporting callbacks from C functions that don't
provide a slot for opaque user_data. I was on my way to coding a
solution with my meager C skills and wanted to make sure that what I
was doing was sane (for some definition of the word sane).

So my proposal goes something like this:

- Get a new unique id from a new op called get_callback_id
- Call a variant of the new_callback op, passing in the unique id
- The new_callback variant stores the user data under the unique id
- The callback references a unique C function that will return that unique id
- When the callback is fired, the unique id is used to look up the user_data
- Any external data is parsed as needed
- The registered sub is invoked
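
The steps above can be sketched in plain C. This is only an illustration under stated assumptions: a pool of two hand-written trampolines for the `void (*)(void)` signature, with every name (`cb_slot`, `cb_register`, `trampoline_0`, `dispatch`, `count_call`) made up for the example rather than taken from Parrot. In Parrot the stored data would be a PMC and the handler would run a sub in the right interpreter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not Parrot code: a callback with no user_data slot
 * is given one by binding each pre-compiled C function to a fixed id. */
typedef void (*c_callback)(void);            /* what the C library accepts */
typedef void (*handler_fn)(void *user_data); /* stands in for the Parrot sub */

#define POOL_SIZE 2

struct cb_slot { handler_fn handler; void *user_data; };
static struct cb_slot slots[POOL_SIZE];

/* Look up the data stored under `id` and run the registered handler. */
static void dispatch(int id)
{
    assert(id >= 0 && id < POOL_SIZE);
    if (slots[id].handler)
        slots[id].handler(slots[id].user_data);
}

/* Each trampoline "returns" its id simply by having it compiled in. */
static void trampoline_0(void) { dispatch(0); }
static void trampoline_1(void) { dispatch(1); }
static const c_callback trampolines[POOL_SIZE] = { trampoline_0, trampoline_1 };

/* Store handler and user data under `id`; hand back the C function to pass
 * to the library. Returns NULL if the id is out of range or already taken. */
c_callback cb_register(int id, handler_fn handler, void *user_data)
{
    if (id < 0 || id >= POOL_SIZE || slots[id].handler != NULL)
        return NULL;
    slots[id].handler = handler;
    slots[id].user_data = user_data;
    return trampolines[id];
}

/* Example handler: adds one to the int that user_data points to. */
void count_call(void *user_data) { ++*(int *)user_data; }
```

The C library only ever sees a plain `void (*)(void)` pointer; the id travels in the identity of the function itself, which is why the pool size has to be fixed at compile time unless the functions can be generated at runtime.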

Ok, sounds hard enough - but it gets worse. Here is how I would
implement this generic callback system and what I think would be
required.

There will probably need to be a file similar to src/call_list.txt, such
as src/callback_list.txt, with entries similar to:

10 v v
10 v ii

Where the first number is the number of functions to pre-create, and
the other two parts are the return type and signature. I'd rather not
hardcode the number, and would prefer a JIT solution or some other
form of runtime C function generation, but I'm not aware of
anything that would work besides pre-compiled C functions. I think it
would be nice if libraries could generate/compile the needed functions
independent of anything hardcoded (and that goes for call_list.txt too,
for that matter).

The entries in callback_list.txt would generate functions similar to the
following in nci.c:

/* parrot callback functions */
void pcbf_v_JV_0(void); /* name depends upon the signature */
void pcbf_v_JV_1(void);
...
void pcbf_v_JV_9(void);
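
One way such a pool could be stamped out is with the C preprocessor. The sketch below assumes an entry of the form "4 v v" (four functions, to keep it short) and uses the hypothetical names `pcbf_v_v_N`, `pcbf_dispatch_v_v`, `pcbf_v_v_get` and `pcbf_last_fired`; the id list is written by hand here, where a build script would presumably emit it from the count column.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: stamp out the pool of thunks for one
 * callback_list.txt entry ("4 v v" here) with the preprocessor. */
typedef void (*thunk_v_v)(void);

static int last_fired_id = -1;   /* stand-in for "look up user_data by id" */

static void pcbf_dispatch_v_v(int id)
{
    assert(id >= 0);
    last_fired_id = id;
}

/* X-macro listing the ids; a build script could emit this list from the
 * count in the first column of callback_list.txt. */
#define PCBF_V_V_IDS(X) X(0) X(1) X(2) X(3)

/* Define pcbf_v_v_0 .. pcbf_v_v_3, each with its id compiled in. */
#define DEFINE_THUNK(n) static void pcbf_v_v_##n(void) { pcbf_dispatch_v_v(n); }
PCBF_V_V_IDS(DEFINE_THUNK)

/* Table mapping id -> thunk, built from the same list. */
#define THUNK_ENTRY(n) pcbf_v_v_##n,
static const thunk_v_v pcbf_v_v_table[] = { PCBF_V_V_IDS(THUNK_ENTRY) };

enum { PCBF_V_V_COUNT = sizeof pcbf_v_v_table / sizeof pcbf_v_v_table[0] };

/* Return the thunk for `id`, or NULL when the pool has no such entry. */
thunk_v_v pcbf_v_v_get(int id)
{
    return (id >= 0 && id < PCBF_V_V_COUNT) ? pcbf_v_v_table[id] : NULL;
}

/* Report which thunk fired most recently (-1 if none has). */
int pcbf_last_fired(void) { return last_fired_id; }
```

Because the function definitions and the lookup table come from the same list, changing the count in one place keeps the two in step.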


I would then add two more ops (I shudder to say that - I'm not sure if
adding ops is a frowned upon thing). Those ops are (I haven't played
with it to know if this is the right format):

op get_callback_id(out INT, in STR)
# the output is a unique INT for the given signature in STR
# it would fail if called more times than the 10 listed
# in callback_list.txt (unless a jit solution could be implemented)

op delete_callback_id(in INT, in STR)
# deletes the user_data in the storage structure, freeing it
# for later use
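
The bookkeeping behind these two ops might look roughly like the following C sketch. It assumes one fixed pool of ten ids per signature and writes signatures as a single string with the return type first ("vv", "vii"); that encoding, the `sig_pool` structure and the -1 error convention are illustrative choices, not something the proposal specifies.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch of the bookkeeping behind get_callback_id and
 * delete_callback_id: one fixed pool of ids per signature string. */
#define IDS_PER_SIG 10            /* the count column in callback_list.txt */

struct sig_pool {
    const char *sig;              /* assumed form: return type, then args */
    unsigned char used[IDS_PER_SIG];
};

static struct sig_pool pools[] = { { "vv", {0} }, { "vii", {0} } };

/* Find the pool for `sig`, or NULL if the signature was never listed. */
static struct sig_pool *find_pool(const char *sig)
{
    size_t i;
    assert(sig != NULL);
    for (i = 0; i < sizeof pools / sizeof pools[0]; i++)
        if (strcmp(pools[i].sig, sig) == 0)
            return &pools[i];
    return NULL;
}

/* Returns the lowest free id for `sig`, or -1 if the signature is unknown
 * or all pre-created functions are in use. */
int get_callback_id(const char *sig)
{
    struct sig_pool *p = find_pool(sig);
    int id;
    if (!p) return -1;
    for (id = 0; id < IDS_PER_SIG; id++)
        if (!p->used[id]) { p->used[id] = 1; return id; }
    return -1;
}

/* Frees `id` for later reuse; returns 0 on success, -1 on a bad argument. */
int delete_callback_id(int id, const char *sig)
{
    struct sig_pool *p = find_pool(sig);
    if (!p || id < 0 || id >= IDS_PER_SIG || !p->used[id]) return -1;
    p->used[id] = 0;
    return 0;
}
```

Handing out the lowest free id means a deleted id is reused before the pool grows toward its limit, which matters when the limit is only ten.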


Currently the following op is defined in src/ops/core.ops:

op new_callback(out PMC, invar PMC, invar PMC, in STR)

I want to add one more variant of this op:

op new_callback(out PMC, in INT, invar PMC, invar PMC, in STR)

Another required piece is a global
ResizablePMCArray (actually there may be several, depending on the
registered signatures). Again, I can hear everybody shuddering at the
mention of a global. I don't know enough about Parrot yet to know if
it should be a true global variable, or if it should be tied to an
interpreter, or if there is somewhere I should register the PMC,
or if there is already a central structure that would take care of
things like this.

My questions are:

- Does anybody else want a generic callback function mechanism in NCI?
- Is this a relatively sane/clean way to do it?
- Is there a better way to generate the functions for each signature?
- What is the right way to store that global user_data until the callbacks
are fired?

I don't think I've been clear enough, but I'll post and then answer
questions. I think that it would be good to have something that
libraries could use without having to drop down to the C level - I'm
just not sure whether it is too much bloat to implement.

Paul Seamons
I've been here all along - I'm just really quiet.

Paul Seamons

Oct 18, 2007, 11:32:59 PM
to perl6-i...@perl.org

Allison Randal

Oct 19, 2007, 10:35:52 PM
to par...@seamons.com, perl6-i...@perl.org
par...@seamons.com wrote:
> I started to write an OpenGL library and was only a couple of dozen
> lines into the pir when I remembered the documentation about callbacks
> in docs/pdds/draft/pdd16_native_call.pod .
>
[...]

>
> My questions are:
>
> - Does anybody else want a generic callback function mechanism in NCI?
> - Is this a relatively sane/clean way to do it?
> - Is there a better way to generate the functions for each signature?
> - What is the right way to store that global user_data until the callbacks
> are fired?

NCI isn't fully specified yet, but I'll outline some of our current
thoughts and likely directions. Discussion welcome.

We would like to eliminate the massive list of precompiled thunks for C
function calls generated by call_list.txt. The tricky part is that you
can't compile a C function at runtime. The best you can do is JIT it,
and that depends on having a working JIT for the platform. We might be
able to use LLVM's JIT, which would gain us a working JIT on a number of
platforms.

The JIT solution would continue to use the dlfunc interface, but instead
of looking up a precompiled thunking function for the passed in
signature, it would JIT one. As with the precompiled thunk, the JITed
thunk is incorporated into an NCI sub object, which can be stored in a
namespace, or passed around anonymously, and invoked, just like an
ordinary sub object.

That said, it's unlikely that we'll ever completely eliminate the list
of precompiled thunks. Some platforms just won't have a JIT, and we
can't afford to cut off NCI for the lack of a JIT. (For one thing,
Parrot internals use NCI quite heavily.) But we can make them more
manageable. We'll probably end up with something similar to
src/ops/*.ops, with multiple files of signatures. The core file would be
the absolute minimum required to run Parrot without loading external C
libraries. Then we could add a file for each subsystem (MySQL, Postgres,
pcre, SDL, Python builtins, and Tcl are a few already mentioned in
call_list.txt). A configuration option when compiling Parrot could
decide whether to precompile a restricted set, the full set, or add in
some additional signatures for another external system. Some signature
files for a particular library could be generated by a limited C parser,
but it would always need to be checked by a human. Duplicate signatures
between files would be merged. (And remember, this is only a fallback
for platforms that can't JIT.)

So, that's one part of the question. The other part is callbacks.
If/when we develop a JIT solution for NCI call thunks, we can use the
same technique for generating callback thunks. In the meantime, we'll
have to continue with the precompiled callback thunks.

Callbacks will use the concurrency scheduler. (There's some info about
the concurrency scheduler in the new Events PDD, but more details will
be in the Concurrency PDD next month.) For the moment all you need to
know is that the concurrency scheduler is a central dispatcher in the
Parrot interpreter that handles events, exceptions, async I/O, threads, etc.

When you call 'new_callback', you pass it a Parrot sub, a user_data
argument, and a signature. At the moment the signature is limited to two
alternatives; in the future it will allow all the same signature options
as an NCI call. The signature of the Parrot sub should match the
signature passed into 'new_callback'.

The 'new_callback' op will take these arguments and create a
CallbackHandler PMC which stores the user data and the passed in sub, as
well as any other information needed to properly invoke the right sub in
the right interpreter. It then registers that CallbackHandler with the
concurrency scheduler. Registering a callback handler returns a unique
concurrency id (CID) for the handler, kind of like a process id.

After registering the callback handler, 'new_callback' will look up a
precompiled thunk or JIT a thunk with the requested signature. For the
JITed thunk it will embed the CID as a constant within the thunk. For
precompiled thunks, we're probably going to have to add more information
to the CallbackHandler PMC (the signature, possibly a library
identifier, etc).

When the callback thunk is called from C, it bundles the C arguments
into appropriate Parrot arguments, then notifies the concurrency
scheduler that it has a callback with a particular CID, or particular
set of characteristics (similar to scheduling an event of a particular
type). The concurrency scheduler will search its registered
callback handlers for a match and, if it finds one,
invoke it, passing it the arguments that were passed to the C thunk.

Essentially, this uses the concurrency scheduler as your global data
store and as the source of unique identifiers. But, it's integrated with
a core system.

The immediate solution, to get OpenGL working now without waiting for
the implementation of the concurrency scheduler and JITed call/callback
thunks, is to add a few more callback signatures to the current set of
alternatives, and to write a little bit of custom C code for the cases
that can't pass dispatch information in a user data argument.

I wouldn't go so far as implementing the callback_list.txt option at
this point. You'll have at least a basic implementation of the
concurrency scheduler by December 1st as we need it for events.

Allison
