Idea: method_missing like functionality

Arno Rehn

Mar 17, 2011, 2:43:43 PM

to crack-l...@googlegroups.com

Hi,

I've already mentioned this on IRC:
Something like method_missing as in Ruby or PHP would be really nice for
bindings developers (like me :)). It would make it possible to look up methods
at runtime instead of generating a whole bunch of source files for the target
language. This greatly reduces the effort needed to create bindings. Ideally
this would boil down to something like this (in the C++ Extension API):

> // argv would probably have a different type, maybe use va_args
> void *method_missing(void *this, int argc, void *argv) { ... }
>
> myType->addMethod("method_missing", (void*) &method_missing);

An extension to this - probably faster at runtime, might take more time to
initialize and consume some more memory - would be to support functionals as
target C++ 'methods'. This way we could save all memory offsets or indices for
methods beforehand, resulting in something like this:

> void *method_missing(int methodId, void *this, int argc, void *argv) { ... }
>
> for (int i = 0; i < numMethods; ++i) {
> myType->addMethod(getMethodName(i), std::bind1st(&method_missing, i));
> }

In theory something like this would be possible with C++0x's lambda support,
if g++ or clang had proper support for variadic lambdas (which is not the case
yet).
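
A minimal C++ sketch of the second variant, for illustration only. Note that
std::bind1st only adapts binary function objects, so the sketch uses std::bind
with placeholders instead (a capturing lambda would work as well). The Args
type, the toy method IDs and buildMethodTable() are hypothetical stand-ins and
are not part of the crack Extension API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an argument list; the real Extension API would
// need its own representation.
using Args = std::vector<int>;

// Signature every registered method has once its ID is bound.
using BoundMethod = std::function<int(void *self, const Args &args)>;

// Generic handler: methodId selects the behaviour. IDs 0 and 1 are toy
// operations (sum of arguments, number of arguments).
int methodMissing(int methodId, void *self, const Args &args) {
    (void)self;  // unused in this toy example
    if (methodId == 0) {
        int sum = 0;
        for (int a : args) sum += a;
        return sum;
    }
    assert(methodId == 1);
    return static_cast<int>(args.size());
}

// Hypothetical registry playing the role of myType->addMethod().
std::map<std::string, BoundMethod> buildMethodTable() {
    using namespace std::placeholders;
    const std::vector<std::string> names = {"sum", "count"};
    std::map<std::string, BoundMethod> table;
    for (int i = 0; i < static_cast<int>(names.size()); ++i) {
        // Bind the method ID now, so no name lookup is needed at call time.
        table[names[i]] = std::bind(&methodMissing, i, _1, _2);
    }
    return table;
}
```

With this table, calling the entry registered as "sum" forwards straight to
methodMissing with ID 0 already bound; the caller never passes a name or an ID.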

So, what do you think? :)

--
Arno Rehn
ar...@arnorehn.de

Michael Muller

Mar 17, 2011, 8:32:16 PM

to Arno Rehn, crack-l...@googlegroups.com

Hi Arno,

Arno Rehn wrote:
> Hi,
>
> I've already mentioned this on IRC:
> Something like method_missing as in Ruby or PHP would be really nice for
> bindings developers (like me :)). It would make it possible to look up methods
> at runtime instead of generating a whole bunch of source files for the target
> language. This greatly reduces the effort needed to create bindings. Ideally
> this would boil down to something like this (in the C++ Extension API):
>
> > // argv would probably have a different type, maybe use va_args
> > void *method_missing(void *this, int argc, void *argv) { ... }
> >
> > myType->addMethod("method_missing", (void*) &method_missing);

I think there are three things going on here. One of them I like, one is
impossible, and the third I don't understand :-)

Basically what you're talking about is dynamic dispatch - when we have an
object x which is an instance of class A, normally when we do x.f(1):

- we look up an overload compatible with f(int) in the symbol table of A and
its base classes
- if the method is virtual, we generate code to get the function pointer from
the vtable entry at runtime, if the method is final we just get the constant
function pointer
- we then generate code that calls the function with the argument values.

With dynamic dispatch, we can say something like: "if a lookup fails on
f(int), do something user defined." In your case, the user defined thing is
to generate a call to a function that accepts the name of the function and the
runtime arguments to the function.
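
The "do something user defined when lookup fails" idea can be sketched in C++
as follows. This is only a toy model: in crack the lookup happens at compile
time, whereas this sketch resolves names at runtime to keep it short.
ClassInfo, onMissing and callMethod are hypothetical names, not crack compiler
API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Toy model of a class's symbol table: method name -> implementation.
struct ClassInfo {
    std::map<std::string, std::function<std::string(int)>> methods;
    // Optional user-defined hook, consulted only when normal lookup fails.
    std::function<std::string(const std::string &name, int arg)> onMissing;
};

// Resolve a call x.name(arg): normal lookup first, then the fallback hook.
std::string callMethod(const ClassInfo &cls, const std::string &name, int arg) {
    auto it = cls.methods.find(name);
    if (it != cls.methods.end())
        return it->second(arg);
    if (cls.onMissing)
        return cls.onMissing(name, arg);
    return "error: no method " + name;
}

// Build a class with one real method "f" and a catch-all hook.
ClassInfo makeExample() {
    ClassInfo a;
    a.methods["f"] = [](int x) { return "f(" + std::to_string(x) + ")"; };
    a.onMissing = [](const std::string &name, int x) {
        return "missing:" + name + "(" + std::to_string(x) + ")";
    };
    return a;
}

// Usage: "f" resolves normally, "g" falls through to the hook.
void demoFallback() {
    ClassInfo a = makeExample();
    assert(callMethod(a, "f", 1) == "f(1)");
    assert(callMethod(a, "g", 2) == "missing:g(2)");
}
```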

So far, so good. This is the part I like. I thought about dynamic dispatch a
long time ago, and have thought about it again recently as something we could
do with annotations. With the basic plumbing in place, it should certainly be
possible to add support to the extensions API.

Now we get to the impossible part - we can't do something like argc, argv in
crack because the language is statically typed and uses an efficient
representation of primitive types at runtime and supports operator
overloading. The best we could do is provide a dynamic dispatch override that
converts the arguments of an unmatched method to a datastructure (we could use
a high-level Array[Object]) containing primitive and non-primitive types
autoboxed to Object derivatives.

There's still the issue of the return type - every crack function needs to
have a return type defined at compile time. In this case, we could do
anything from always assuming a return type of void to always assuming a
return type of a complex object.

Now the problem moves from impossible to merely complicated and costly :-)
That is to say, it is relatively costly to create such a datastructure
(compared to a normal function call), and it will be complicated for the
function to extract the arguments and do its own "overload matching" and
return value construction.
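
To make the cost concrete, here is a rough C++ sketch of what a handler has to
do once its arguments arrive boxed. The Boxed type is a hypothetical stand-in
for Object-derived wrappers and dynamicAdd is an invented example handler;
neither reflects how crack would actually implement autoboxing.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical boxed value standing in for an Object-derived wrapper.
struct Boxed {
    enum class Kind { Int, Str };
    Kind kind;
    long long intValue;    // meaningful when kind == Kind::Int
    std::string strValue;  // meaningful when kind == Kind::Str

    static Boxed fromInt(long long v) { return Boxed{Kind::Int, v, ""}; }
    static Boxed fromStr(const std::string &v) { return Boxed{Kind::Str, 0, v}; }
};

// The handler does its own "overload matching": add(int, int) or
// add(string, string). The result is boxed too, since the return type has
// to be fixed at compile time.
Boxed dynamicAdd(const std::vector<Boxed> &args) {
    assert(args.size() == 2 && "add expects exactly two arguments");
    const Boxed &a = args[0];
    const Boxed &b = args[1];
    if (a.kind == Boxed::Kind::Int && b.kind == Boxed::Kind::Int)
        return Boxed::fromInt(a.intValue + b.intValue);
    if (a.kind == Boxed::Kind::Str && b.kind == Boxed::Kind::Str)
        return Boxed::fromStr(a.strValue + b.strValue);
    throw std::invalid_argument("no matching overload for add");
}
```

Every call pays for building the vector, tagging each value and re-checking
the tags, which is the overhead a normal statically resolved call avoids.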

That said, I can see use cases for it - for example, if you wanted to create a
shell process wrapper that converted method names to shell functions, so you
might be able to call something like 'shell.ls("-l")'. You might also want to
do this to create bindings for a language that actually uses dynamic dispatch
(such as Ruby, or Python).

That brings us to the part that I don't understand: what is it that you want
to put in your extension's method_missing() function? If it's just code to
match function names and method types and dispatch to the corresponding C/C++
functions, you're going to incur a huge amount of overhead for something
that can be much more easily accomplished using the extension generator (have
you played with the extension generator? It's like a stripped down,
crack-centric version of SWIG).

So yes, I expect some form of dynamic dispatch will make it into the language,
although I'm not sure when (wanna take a stab at it?) And it's probably
better referred to as "overridable dispatch" because you can do much more with
it than just dynamic dispatch.

>
> An extension to this - probably faster at runtime, might take more time to
> initialize and consume some more memory - would be to support functionals as
> target C++ 'methods'. This way we could save all memory offsets or indices for
> methods beforehand, resulting in something like this:
>
> > void *method_missing(int methodId, void *this, int argc, void *argv) { ... }
> >
> > for (int i = 0; i < numMethods; ++i) {
> > myType->addMethod(getMethodName(i), std::bind1st(&method_missing, i));
> > }
>
> In theory something like this would be possible with C++0x's lambda support,
> if g++ or clang had proper support for variadic lambdas (which is not the case
> yet).
>
> So, what do you think? :)
>
> --
> Arno Rehn
> ar...@arnorehn.de
>

=============================================================================
michaelMuller = mmu...@enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
Government is not reason, it is not eloquence; it is force. Like fire, it
is a dangerous servant and a fearsome master. - George Washington
=============================================================================

Arno Rehn

Mar 18, 2011, 6:41:20 AM

to crack-l...@googlegroups.com

Right.

> Now we get to the impossible part - we can't do something like argc, argv
> in crack because the language is statically typed and uses an efficient
> representation of primitive types at runtime and supports operator
> overloading. The best we could do is provide a dynamic dispatch override
> that converts the arguments of an unmatched method to a datastructure (we
> could use a high-level Array[Object]) containing primitive and
> non-primitive types autoboxed to Object derivatives.

Yes, I just didn't want to get too deep into the implementation details yet,
that's why I put a generic argc and argv there (well, it's not really generic,
but you got the idea, obviously :))

> There's still the issue of the return type - every crack function needs to
> have a return type defined during compile time. In this case, we could do
> anything from always assume a return type of void to always assuming a
> return type of a complex object.
>
> Now the problem moves from impossible to merely complicated and costly :-)
> That is to say, it is relatively costly to create such a datastructure
> (compared to a normal function call), and it will be complicated for the
> function to extract the arguments and do its own "overload matching" and
> return value construction.

I see. I wouldn't mind having two different versions of method_missing - one
for a void return type, one for an Object return type. Apart from being ugly,
it would probably still require some casting...

> That said, I can see use cases for it - for example, if you wanted to
> create a shell process wrapper that converted method names to shell
> functions, so you might be able to call something like 'shell.ls("-l")'.
> You might also want to do this to create bindings for a language that
> actually uses dynamic dispatch (such as Ruby, or Python).
>
> That brings us to the part that I don't understand: what is it that you
> want to put in your extension's method_missing() function? If it's just
> code to match function names and method types and dispatch to the
> corresponding C/C++ functions, you're going to incur a huge amount of
> overhead for something that can be much more easily accomplished using the
> extension generator (have you played with the extension generator? It's
> like a stripped down, crack-centric version of SWIG).

In KDE, we already have a system in place for dynamically looking up C++
classes and methods and invoking them, called SMOKE [0]. It was specifically
engineered for this kind of dynamic invocation by making it nearly trivial to
look up a method and call it, so it's relatively easy to create bindings
for dynamic languages like Perl, Ruby or Python (it's of course not as fast as
having the whole bindings pre-generated in some file, but being blazingly fast
is usually not a requirement when you write something in a scripting
language).

We've also used it to create bindings for C#, a static language, but this
involves an additional step of creating a C# library from the information
stored in SMOKE. I'd like to avoid this for crack, because it's quite a
hassle.
I've only quickly looked over the extension generator, but it seems to be
rather static. To use it, I'd have to do:

SMOKE -> Generate extension generator input -> generate extension -> some
manual glue code -> crack

If the extension generator can dynamically define types and methods from
strings, I could interface with SMOKE directly in the input file, eliminating
one step above. This would be more or less the same solution we have for C#.
Still not quite what I wish for ;).

This is why I've also proposed to have functionals as possible C++
counterparts of crack methods (see below).

[0] http://techbase.kde.org/Development/Languages/Smoke

> So yes, I expect some form of dynamic dispatch will make it into the
> language, although I'm not sure when (wanna take a stab at it?) And it's
> probably better referred to as "overridable dispatch" because you can do
> much more with it than just dynamic dispatch.
>
> > An extension to this - probably faster at runtime, might take more time
> > to initialize and consume some more memory - would be to support
> > functionals as target C++ 'methods'. This way we could save all memory
> > offsets or indices for methods beforehand, resulting in something like
> > this:
> > > void *method_missing(int methodId, void *this, int argc, void *argv) { ... }
> > >
> > > for (int i = 0; i < numMethods; ++i) {
> > > myType->addMethod(getMethodName(i), std::bind1st(&method_missing, i));
> > > }
> >
> > In theory something like this would be possible with C++0x's lambda
> > support, if g++ or clang had proper support for variadic lambdas (which
> > is not the case yet).

In SMOKE, we can iterate over all methods of all classes in a specific
library. So it would be possible to create all classes and methods in crack on
module initialization. If we could assign functionals as the methods' C++
counterparts, we could bind the method index of SMOKE to the functional, so
the actual invocation would not require an additional lookup of the method.

This way, module initialization will take some more time (probably not really
noticeable), but we don't have much overhead later. We don't have to worry
about return types, either, because we can tell at initialization what a
method is going to return.
The actual C++ method that's being called doesn't have to have an Array of
parameters either, we can use a variadic function here. Since the SMOKE method
index is bound, we already have all the meta-information for the actual
method.

Summing it up, there are 3 possibilities:
a) Use method_missing -> fast init, overhead for looking up a method
b) Use functionals -> slower init, nearly no overhead later (I prefer this)
c) Use extension generator -> fast init, overhead for looking up a method,
writing the actual generator is lame ;)

Sorry to bore you with all these details, but I hope it sheds some light onto
the difficulties :)

P.S. I'm subscribed to the list, so you don't need to CC me.

--
Arno Rehn
ar...@arnorehn.de

Michael Muller

Mar 18, 2011, 9:55:38 AM

to crack-l...@googlegroups.com

On the contrary, everything makes a lot more sense now.

For what you're describing, I think there is a particularly awesome
solution that is very much in line with my goals for the Crack
executor. During an import, crack should always try to "do the right
thing." So if it finds a .crk file in the search path, it
compiles/executes/links in that file. If it finds a shared library,
it assumes that the shared library is an extension and looks for an
initialization function in it, then runs the initialization function
which provides meta-data for the compiler and function addresses for
the runtime. We aim to support other kinds of import, too, and the
SMOKE mechanism is just the sort of thing I'd like to see supported.

If SMOKE can introspect a C++ shared library, we should be able to
write a general SMOKE-based wrapper. The only piece of information we
need is the name of the shared library. So how about this:

- if we find a shared library on the search path, look for the crack
init function for the library
- if there is no crack init function, and SMOKE is available (linked
into crack as the result of a configuration switch), load the library
using the SMOKE loader, which generates the extensions dynamically.
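
As a rough C++ sketch of that decision order (the Candidate struct, the
ImportAction enum and resolveImport are hypothetical names for illustration,
not part of the crack executor):

```cpp
#include <cassert>

// What the importer found on the search path (hypothetical model).
struct Candidate {
    bool isCrkSource;      // a .crk file
    bool isSharedLibrary;  // a shared library
    bool hasCrackInit;     // the library exports a crack init function
};

enum class ImportAction { CompileSource, RunExtensionInit, LoadViaSmoke, Fail };

// Decide how to import a candidate. smokeEnabled models the configuration
// switch that links SMOKE support into crack.
ImportAction resolveImport(const Candidate &c, bool smokeEnabled) {
    if (c.isCrkSource)
        return ImportAction::CompileSource;
    if (c.isSharedLibrary) {
        if (c.hasCrackInit)
            return ImportAction::RunExtensionInit;
        if (smokeEnabled)
            return ImportAction::LoadViaSmoke;
    }
    return ImportAction::Fail;
}

// Usage: a plain shared library is only importable when SMOKE is enabled.
void demoResolveImport() {
    Candidate plainLib{false, true, false};
    assert(resolveImport(plainLib, true) == ImportAction::LoadViaSmoke);
    assert(resolveImport(plainLib, false) == ImportAction::Fail);
}
```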

I _think_ we can even get this to work in AOT/native mode, assuming
that SMOKE gives us access to either the symbol names or the
underlying function pointers. This would be a really exciting
addition because it essentially allows us to load C++ shared libraries
directly without the need to write an extension.

Alternately, if we discover that we need more meta-data, we can do
something similar with an annotation. So, by example, to wrap libFoo
we could define a module "foo.crk" containing something like this:

@import crack.smoke library, use;

@library 'libFoo';
@use foo; # to convert foo::symbol to just "symbol" in this module.

Does this sound like a good solution?

>
> P.S. I'm subscribed to the list, so you don't need to CC me.

Sorry, standard procedure for my mailers is to reply to both, but I've
removed you from the CC in at least this reply.

>
> --
> Arno Rehn
> ar...@arnorehn.de
>

weyrick

Mar 18, 2011, 10:27:24 AM

to crack-lang-dev

If we could load a C++ shared library upon crack import and do a
dynamic extension, it would indeed be badass. From my quick look at
smoke though, it doesn't seem like it introspects the shared lib
itself, but some magic is needed to parse the corresponding header
files and do something with that information. Arno is that correct?

Anyway, what I wanted to mention was that I think the ability to have
method_missing functionality (or a more generically named "call"
method) is useful outside of extensions as well. So in crack, one
could create a class that does its own dispatch:

class Foo {

void _bar(void) { cout `in bar`; }

Object call(String method, Array[Object] args) {
if (method == "bar") _bar();
}

}

There's the performance hit of course, but as it's entirely optional I
think that's acceptable.

Shannon

Arno Rehn

Mar 18, 2011, 10:38:16 AM

to crack-l...@googlegroups.com

On Friday 18 March 2011 15:27:24 weyrick wrote:
> If we could load a C++ shared library upon crack import and do a
> dynamic extension, it would indeed be badass. From my quick look at
> smoke though, it doesn't seem like it introspects the shared lib
> itself, but some magic is needed to parse the corresponding header
> files and do something with that information. Arno is that correct?

Yes, that's correct. Still, loading arbitrary SMOKE modules with crack would
be cool. It would only work out-of-the-box in the most trivial cases, because
you often encounter complex template-based types that can't easily be wrapped
(for example QList<T> in Qt). We'll have to marshal them to the appropriate
crack type.

> Anyway, what I wanted to mention was that I think the ability to have
> method_missing functionality (or a more generically named "call"
> method) is useful outside of extensions as well. So in crack, one
> could create a class that does its own dispatch:
>
> class Foo {
>
> void _bar(void) { cout `in bar`; }
>
> Object call(String method, Array[Object] args) {
> if (method == "bar") _bar();
> }
>
> }
>
> There's the performance hit of course, but as it's entirely optional I
> think that's acceptable.

+1

> Shannon
>
> On Mar 18, 9:55 am, Michael Muller <mind...@gmail.com> wrote:
> > - if we find a shared library on the search path, look for the crack
> > init function for the library
> > - if there is no crack init function, and SMOKE is available (linked
> > into crack as the result of a configuration switch)
> > load the library using the SMOKE-loader which generates the extensions
> > dynamically.
> >
> > I _think_ we can even get this to work in AOT/native mode, assuming
> > that SMOKE gives us access to either the symbol names or the
> > underlying function pointers. This would be a really exciting
> > addition because it essentially allows us to load C++ shared libraries
> > directly without the need to write an extension.

--
Arno Rehn
ar...@arnorehn.de

Arno Rehn

Mar 18, 2011, 10:44:06 AM

to crack-l...@googlegroups.com

On Friday 18 March 2011 14:55:38 Michael Muller wrote:
> On the contrary, everything makes a lot more sense now.
>
> For what you're describing, I think there is a particularly awesome
> solution that is very much in line with my goals for the Crack
> executor. During an import, crack should always try to "do the right
> thing." So if it finds a .crk file in the search path, it
> compiles/executes/links in that file. If it finds a shared library,
> it assumes that the shared library is an extension and looks for an
> initialization function in it, then runs the initialization function
> which provides meta-data for the compiler and function addresses for
> the runtime. We aim to support other kinds of import, too, and the
> SMOKE mechanism is just the sort of thing I'd like to see supported.

Great :)

> If SMOKE can introspect a C++ shared library, we should be able to
> write a general SMOKE-based wrapper. The only piece of information we
> need is the name of the shared library. So how about this:

As Shannon already noted, SMOKE doesn't really 'introspect' a shared library.
It provides introspection, but it has to be generated from the header files
first. But once that's done, you can do with it what you want. We have SMOKE
modules in place for every Qt module and most KDE stuff. There once was one
for Wt even. It's relatively easy to generate a new SMOKE module.

> - if we find a shared library on the search path, look for the crack
> init function for the library
> - if there is no crack init function, and SMOKE is available (linked
> into crack as the result of a configuration switch)
> load the library using the SMOKE-loader which generates the extensions
> dynamically.
>
> I _think_ we can even get this to work in AOT/native mode, assuming
> that SMOKE gives us access to either the symbol names or the
> underlying function pointers. This would be a really exciting
> addition because it essentially allows us to load C++ shared libraries
> directly without the need to write an extension.

SMOKE doesn't provide function pointers. It uses a more 'safe' way that's
based on a method index obtained from the other data structures in the lib and
an array of unions as a 'call stack'.
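
For readers unfamiliar with that style of invocation, here is a small C++
illustration of "method index plus array of unions". It is loosely modelled on
the description above and is NOT the real SMOKE API: StackItem, the Point
class, the method indices and the convention of using slot 0 for the return
value are all assumptions of this sketch.

```cpp
#include <cassert>

// One slot of the 'call stack': holds a value of any supported type.
union StackItem {
    int i;
    double d;
    void *ptr;
};

// In this sketch, slot 0 receives the return value and slots 1..n carry
// the arguments.
using Stack = StackItem *;

struct Point {
    int x;
    int y;
};

// One dispatcher per class: a switch on the method index takes the place of
// per-method function pointers.
void callPointMethod(int methodIndex, void *obj, Stack stack) {
    Point *p = static_cast<Point *>(obj);
    switch (methodIndex) {
        case 0:  // int Point::sum()
            stack[0].i = p->x + p->y;
            break;
        case 1:  // void Point::moveBy(int dx, int dy)
            p->x += stack[1].i;
            p->y += stack[2].i;
            break;
        default:
            assert(false && "unknown method index");
    }
}
```

A binding layer would look up the index once, fill the stack from the
caller's arguments, invoke the dispatcher and read slot 0 back.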

> Alternately, if we discover that we need more meta-data, we can do
> something similar with an annotation. So, by example, to wrap libFoo
> we could define a module "foo.crk" containing something like this:
>
> @import crack.smoke library, use;
>
> @library 'libFoo';
> @use foo; # to convert foo::symbol to just "symbol" in this module.
>
> Does this sound like a good solution?

This looks pretty much like what I want to create, yes :) But as said above,
SMOKE doesn't provide function pointers. So we have to dispatch the call to a
lookup function or some static functor via std::bind.

--
Arno Rehn
ar...@arnorehn.de

Michael Muller

Mar 18, 2011, 12:40:16 PM

to crack-l...@googlegroups.com, Arno Rehn

That's ok. I think it could still be made to work for AOT/native,
albeit not efficiently. If you want efficient bindings, there are
other alternatives.

I think this would be an awesome addition, but it's not something I
can work on this year - are you interested in doing it? Alternately,
we could add it to the GSoC ideas.

>
> --
> Arno Rehn
> ar...@arnorehn.de
>

Michael Muller

Mar 18, 2011, 1:39:16 PM

to crack-l...@googlegroups.com

On Fri, Mar 18, 2011 at 10:38 AM, Arno Rehn <ar...@arnorehn.de> wrote:
> On Friday 18 March 2011 15:27:24 weyrick wrote:
>> If we could load a C++ shared library upon crack import and do a
>> dynamic extension, it would indeed be badass. From my quick look at
>> smoke though, it doesn't seem like it introspects the shared lib
>> itself, but some magic is needed to parse the corresponding header
>> files and do something with that information. Arno is that correct?
> Yes, that's correct. Still, loading arbitrary SMOKE modules with crack would
> be cool. It would only work out-of-the-box in the most trivial cases, because
> you often encounter complex template-based types that can't easily be wrapped
> (for example QList<T> in Qt). We'll have to marshal them to the appropriate
> crack type.
>
>> Anyway, what I wanted to mention was that I think the ability to have
>> method_missing functionality (or a more generically named "call"
>> method) is useful outside of extensions as well. So in crack, one
>> could create a class that does its own dispatch:
>>
>> class Foo {
>>
>> void _bar(void) { cout `in bar`; }
>>
>> Object call(String method, Array[Object] args) {
>> if (method == "bar") _bar();
>> }
>>
>> }
>>
>> There's the performance hit of course, but as it's entirely optional I
>> think that's acceptable.
> +1

+1

I actually have something very general in mind for this, I'll write up
a formal design at some point, but the idea is that it should be
possible to extend lookups from the annotation system. So your
example could be coded something like this:

dyndispatch.crk:
... boilerplate import stuff ...

class _Dispatch @implements LookupCallbacks {
    Func func;
    oper init(Func func0) : func = func0 {}

    Var onFailedLookup(String name, List[Expr] args) {
        # generate a wrapper function that collects arguments and uses them to
        # populate a datastructure and then passes it to func.
        return wrapperCache.generateWrapper(args, func);
    }
}

# define an annotation that expects to be followed by a function definition
# and defines a set of callbacks on that function definition.
@func_annotation dynamic_dispatch {
    void onClose(CrackContext ctx, Func func) {
        # set the lookup of the class context.
        ctx.getParent().setLookupDefault(_Dispatch(func));
    }
}

yourmodule.crk:
@import dyndispatch dynamic_dispatch;

class Foo {
    void _bar() { cout `in bar\n`; }
    @dynamic_dispatch Object call(String method, Array[Object] args) {
        if (method == 'bar') _bar();
    }
}

>

Shannon Weyrick

Mar 21, 2011, 11:10:02 AM

to crack-l...@googlegroups.com

Looks great! Seems like this could be extended to dynamic properties, as
well?

@import dynprops dynamic_prop_get, dynamic_prop_set;

class Foo {
    IntWrapper _bar;
    @dynamic_prop_get Object get(String varName) {
        if (varName == 'bar') return _bar;
    }
    @dynamic_prop_set Object set(String varName, Object val) {
        if (varName == 'bar') _bar = val;
    }
}

Michael Muller

Mar 21, 2011, 5:29:23 PM

to Shannon Weyrick, crack-l...@googlegroups.com

Shannon Weyrick wrote:
> Looks great! Seems like this could be extended to dynamic properties, as
> well?

Exactly. That was actually my original use-case for the concept.

It also works for converting normal attributes to accessors, for example:

class Foo {
@attr(readonly) String fullName {
get() { return parent.fullName + shortName; }
}
}

=============================================================================
michaelMuller = mmu...@enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------

The natural progress of things is for liberty to yield and government to
gain control. - Thomas Jefferson
=============================================================================

Shannon Weyrick

Mar 21, 2011, 9:51:14 PM

to crack-l...@googlegroups.com

On 03/21/2011 05:29 PM, Michael Muller wrote:
>
> Shannon Weyrick wrote:
>> Looks great! Seems like this could be extended to dynamic properties, as
>> well?
>
> Exactly. That was actually my original use-case for the concept.
>
> It also works for converting normal attributes for accessors, for example:
>
> class Foo {
> @attr(readonly) String fullName {
> get() { return parent.fullName + shortName; }
> }
> }
>

Cool, I was just thinking about this the other day and I was going to
ask what the plans were. Regarding visibility, are we going to make it a
compile time error to try to access a "private" property of a class?

class Foo {
int __mySpecialNum;
}

Foo f;
f.__mySpecialNum= 5;

XX error? XX

So would we do something like this?

class Foo {

int __mySpecialNum;

@attr(readwrite) int fooNum {
get { return __mySpecialNum; }
set { __mySpecialNum = value; }
}

}

C# has something like this.

Michael Muller

Mar 23, 2011, 10:51:46 AM

to Shannon Weyrick, crack-l...@googlegroups.com

Shannon Weyrick wrote:
> On 03/21/2011 05:29 PM, Michael Muller wrote:
> >
> > Shannon Weyrick wrote:
> >> Looks great! Seems like this could be extended to dynamic properties, as
> >> well?
> >
> > Exactly. That was actually my original use-case for the concept.
> >
> > It also works for converting normal attributes for accessors, for example:
> >
> > class Foo {
> > @attr(readonly) String fullName {
> > get() { return parent.fullName + shortName; }
> > }
> > }
> >
>
> Cool, I was just thinking about this the other day and I was going to
> ask what the plans were. Regarding visibility, are we going to make it a
> compile time error to try to access a "private" property of a class?
>
> class Foo {
> int __mySpecialNum;
> }
>
> Foo f;
> f.__mySpecialNum= 5;
>
> XX error? XX

Yes.

>
> So would we do something like this?
>
> class Foo {
>
> int __mySpecialNum;
>
> @attr(readwrite) int fooNum {
> get { return __mySpecialNum; }
> set { __mySpecialNum = value; }
> }
>
> }

And yes :-)

>
> C# has something like this.

I am continually impressed by how much good stuff there is in C#.

There is no way to find the best design except to try out as many designs as
possible and discard the failures. - Freeman Dyson
=============================================================================
