Help: Ruby<->C++ callbacks

Luigi Ballabio

Apr 4, 2002, 6:05:15 AM

Hi all,
I'm writing a C++ extension to Ruby. I need to write a class which can
take a Ruby procedure and store it to be used as a callback, i.e.,

irb> c = Callback(proc { puts "Hello world!" })
irb> c.doit
"Hello world!"
irb>

This is not much use inside the Ruby interpreter itself; however, it would
allow me to wrap Ruby code and use it as a callback in the C++ library I'm
interfacing with.

However, my concern is how to make sure that the procedure is not
garbage-collected as long as there are Callback objects storing a reference
to it.
Also, the lifetime of the procedure does not in principle depend on the
lifetime of c since copies of the latter could be made in the C++ library
which are not known to Ruby.

Research on readme.ext and Programming Ruby yielded:
1) rb_global_variable: it can protect a VALUE, but I don't know if the
process is undoable, i.e., if the VALUE can be marked again for gc.
2) rb_gc_mark: it marks an object for gc, but can it be applied to one
previously protected by rb_global_variable? Also, if two callback objects
refer to a VALUE and one calls rb_gc_mark, will the latter be collected
regardless of the fact that another reference remains in use, or is some
kind of reference counting implemented?

At the end, I came up with the following design: however, holes remain
which should be filled before the class is safely useable. Any suggestions?

Thanks in advance,
Luigi

class RubyCallback : public Callback {
  public:
    RubyCallback(VALUE p)
    : p_(p), refCount_(new int(1)) {
        /* how to make sure the Ruby proc stays alive
           as long as we need it?
           Is the following the right way? */
        rb_global_variable(p_);
    }
    RubyCallback(const RubyCallback& o)
    : p_(o.p_), refCount_(o.refCount_) {
        /* how to make sure the Ruby proc stays alive
           as long as we need it? */
        o.refCount_++;
    }
    RubyCallback& operator=(const RubyCallback& o) {
        if (refCount_ != o.refCount_) {
            if (--*refCount == 0) {
                // tell p_ it can go as far as we are concerned
                delete refCount_;
            }
            p_ = o.p_;
            refCount_ = o.refCount_;
            // make sure p_ stays alive
            *refCount_++;
        }
        return *this;
    }
    ~RubyCallback() {
        if (--*refCount_ == 0) {
            // now it can go as far as we are concerned
            // is the following the right way?
            rb_gc_mark(p_);
            delete refCount_;
        }
    }
    void doit() {
        static ID callId = rb_intern("call");
        rb_funcall(p_,callId,0);
    }
  private:
    VALUE p_;
    int* refCount_; // is this needed?
};

ts

Apr 4, 2002, 6:32:17 AM

>>>>> "L" == Luigi Ballabio <ball...@mac.com> writes:

L> /* how to make sure the Ruby proc stays alive
L> as long as we need it?
L> Is the following the right way? */
L> rb_global_variable(p_);

no,

L> }
L> /* how to make sure the Ruby proc stays alive
L> as long as we need it? */
L> o.refCount_++;

no,

L> }

When you return an object from your C++ code you must use
Data_Wrap_Struct() or Data_Make_Struct().

The 2nd argument of Data_Wrap_Struct (3rd for Data_Make_Struct) is the
mark function, which will be called when the GC marks the current object.

Define such a function when you return a new object, and in this function
make a call to rb_gc_mark() to mark the Proc object.

Guy Decoux

Tobias Peters

Apr 4, 2002, 9:37:14 AM

On Thu, 4 Apr 2002, Luigi Ballabio wrote:
> Hi all,
> I'm writing a C++ extension to Ruby. I need to write a class which can
> take a Ruby procedure and store it to be used as a callback, i.e.,

Be sure to understand ruby's memory management before attempting to design
your classes. Since you want to write a "C++ extension to Ruby", you
should defer all memory management to the ruby interpreter.

You should start with an extension library that does practically nothing
and just prints a few statements to cout when objects are created, freed,
and used, to get a feeling for how this is done.

Suggestion for a base class that honors Ruby's memory management:

// subclass this for real Ruby C++ extension classes.
class Rb_Base {
    // pointer back to ruby DATA object. Must not be used in
    // destructor!
    VALUE rb_value;

    // Garbage collection "mark" interface function
    static void mark(void * ptr) {
        reinterpret_cast<Rb_Base*>(ptr)->mark_others();
    }
    // Garbage collection "free" interface function
    static void free(void * ptr) {
        delete reinterpret_cast<Rb_Base*>(ptr);
    }
    // disallow copying
    Rb_Base(const Rb_Base &);
    void operator=(const Rb_Base &);
  protected:
    // mark other known ruby objects here
    virtual void mark_others() {}

    // Extension object constructor, needs to know the ruby class of
    // this object
    Rb_Base(VALUE rb_class)
    {
        rb_value =
            Data_Wrap_Struct(rb_class, &Rb_Base::mark,
                             &Rb_Base::free, this);
    }
    // destructor. Derived classes release C++ resources here.
    virtual ~Rb_Base() {}
  public:
    // convenience function for other C++ extension objects that want
    // to mark this object
    virtual void mark_self() {
        rb_gc_mark(rb_value);
    }
};

>
> irb> c = Callback(proc { puts "Hello world!" })
> irb> c.doit
> "Hello world!"
> irb>
>
> This has not much use within the Ruby interpreter: however, it would allow
> me to wrap Ruby code and use it as a callback in the C++ library I'm
> interfacing.

There is no need for this Callback class. The method of your extension
class should take a block, store the VALUE pointer to it somewhere and
call rb_gc_mark on it in its mark_others method.

> Research on readme.ext and Programming Ruby yielded:
> 1) rb_global_variable: it can protect a VALUE, but I don't know if the
> process is undoable, i.e., if the VALUE can be marked again for gc.

No need to do this.

> 2) rb_gc_mark: it marks an object for gc, but can it be applied to one
> previously protected by rb_global_variable? Also, if two callback objects
> refer to a VALUE and one calls rb_gc_mark, will the latter be collected
> regardless of the fact that another reference remains in use, or is some
> kind of reference counting implemented?

Sorry, but you misunderstood the purpose of the rb_gc_mark function.
Yes, it marks an object on behalf of the garbage collector, but receiving
a mark means that an object can still be accessed and will *not* be
garbage collected. The garbage collector will collect only those objects
that did not receive a mark. And you should only call rb_gc_mark during
the mark phase of the garbage collector, that is, from within the mark
function that you registered with Data_Wrap_Struct.
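
To make the marking rule concrete, here is a toy mark-and-sweep model in plain C++. It is only a sketch and not Ruby's actual collector: ToyHeap, ToyObject, mark_fn and demo are hypothetical names. mark_fn merely plays the role of the mark function registered with Data_Wrap_Struct, and ToyHeap::mark stands in for rb_gc_mark.

```cpp
// Toy mark-and-sweep model. NOT Ruby's collector; every name here is
// hypothetical and exists only to illustrate the marking rule.
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

struct ToyHeap;

struct ToyObject {
    bool marked = false;
    // Plays the role of the mark function given to Data_Wrap_Struct:
    // it runs only while the collector is marking this object.
    std::function<void(ToyHeap&)> mark_fn;
};

struct ToyHeap {
    std::vector<std::unique_ptr<ToyObject>> objects;
    std::vector<ToyObject*> roots;  // e.g. variables still in scope

    ToyObject* allocate() {
        objects.push_back(std::unique_ptr<ToyObject>(new ToyObject()));
        return objects.back().get();
    }

    // Stand-in for rb_gc_mark: "this object is still in use".
    void mark(ToyObject* obj) {
        assert(obj != nullptr);
        if (obj->marked) return;
        obj->marked = true;
        if (obj->mark_fn) obj->mark_fn(*this);
    }

    // One full cycle: clear marks, mark from the roots, free the rest.
    // Returns the number of objects that were freed.
    std::size_t collect() {
        for (auto& o : objects) o->marked = false;
        for (ToyObject* r : roots) mark(r);
        const std::size_t before = objects.size();
        std::vector<std::unique_ptr<ToyObject>> survivors;
        for (auto& o : objects)
            if (o->marked) survivors.push_back(std::move(o));
        objects = std::move(survivors);  // unmarked objects are deleted here
        return before - objects.size();
    }
};

struct DemoResult {
    std::size_t freed_while_in_scope;
    std::size_t freed_after_scope;
};

// A wrapper object holds a proc that the heap cannot see on its own.
DemoResult demo(bool wrapper_marks_proc) {
    ToyHeap heap;
    ToyObject* proc = heap.allocate();
    ToyObject* wrapper = heap.allocate();
    if (wrapper_marks_proc)
        wrapper->mark_fn = [proc](ToyHeap& h) { h.mark(proc); };
    heap.roots.push_back(wrapper);  // the wrapper is still in scope
    DemoResult result;
    result.freed_while_in_scope = heap.collect();
    heap.roots.clear();             // the wrapper goes out of scope
    result.freed_after_scope = heap.collect();
    return result;
}
```

In this model nothing ever needs to be "unmarked": marks are recomputed on every cycle, so once the wrapper is no longer reachable its mark function is simply not called and the proc is swept along with it. With demo(true) the first collection frees nothing and the second frees both objects; with demo(false) the proc is already lost in the first collection.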

Tobias

Luigi Ballabio

Apr 4, 2002, 10:57:37 AM

Guy,
thanks a lot for the reply.
The scaffolding is not a big problem: I'm using SWIG to export the stuff,
so the class destructor is automatically passed to Data_[Wrap|Make]_Struct
as the mark function, so I can put there a call to rb_gc_mark for the proc.

What I'm concerned about is the following scenario:

// the Callback constructor makes a copy of the VALUE pointing to
// the proc and stores it away, but the Ruby interpreter doesn't know.
irb> c = Callback.new(proc { puts "Hello World!" })

// for all I know, here the interpreter could decide to gc the
// proc object since the only reference that it saw is already
// out of scope

// therefore, the following call could try to use a pointer to
// a gc'ed object and fail spectacularly.
irb> c.doit
Goodbye, cruel world.

The question really was, how do I tell the interpreter that I'm holding a
reference to a Ruby object from C++?

Thanks again,
Luigi

ts

Apr 4, 2002, 11:19:45 AM

>>>>> "L" == Luigi Ballabio <ball...@mac.com> writes:

L> // the Callback constructor makes a copy of the VALUE pointing to
L> // the proc and stores it away, but the Ruby interpreter doesn't know.

irb> c = Callback.new(proc { puts "Hello World!" })

Well, here you return an object with its mark and free functions.

L> // for all I know, here the interpreter could decide to gc the
L> // proc object since the only reference that it saw is already
L> // out of scope

No. When the GC runs it will mark all live objects. This means that it will
mark the variable `c' (`c' is still in scope), and to do this it calls the
mark function that you have given to Data_Wrap_Struct. Because your mark
function calls rb_gc_mark() on the Proc object, the GC will not try to
remove it (i.e. the Proc object is still alive for the GC).

L> The question really was, how do I tell the interpreter that I'm holding a
L> reference to a Ruby object from C++?

You just define a mark function (given to Data_Wrap_Struct) and in this
mark function you call rb_gc_mark() on all Ruby objects held by C++.

Guy Decoux

Luigi Ballabio

Apr 4, 2002, 11:41:48 AM

Thanks a lot, Tobias.

At 12:11 AM 4/5/02 +0900, Tobias Peters wrote:
>Since you want to write a "C++ extension to Ruby", you
>should defer all memory management to the ruby interpreter.

Ok, now I see that I got rb_gc_mark backwards (and it showed in my answer
to Guy Decoux. My apologies, Guy.)

I still have a question, though---rb_gc_mark protects the object I'm
storing, but how do I unmark it? That is, how do I tell the garbage
collector that it can dispose of it when I'm done with the object that
stores it?

Also, now that I knew what to look for, I noticed that SWIG doesn't pass
the "mark" function to Data_Wrap_Struct, only the "free" function...
Lyle, are you reading? Can we do something about this? Or should I inject
the relevant code by hand, instead of relying upon SWIG_NewPointerObj?

>There is no need for this Callback class.

Yes, there is. Here is the full picture, as much simplified as possible.
I'm not writing a C++ extension from scratch: I'm writing bindings for an
existing C++ library. The latter defines an abstract Callback class as

class Callback {
  public:
    virtual ~Callback();
    virtual void doit() = 0;
};

which can be subclassed to give the desired behavior. The usual Strategy
pattern stuff. Another class takes Callbacks and uses them:

class AnotherClass {
  public:
    void setCallback(Callback* c) {
        // store the callback
    }
    void doYourStuff() {
        // among other things,
        c_->doit();
    }
};

My aim is to let the user define callbacks from Ruby, which is what he
would do in C++ by subclassing Callback. He should be able to write:

irb> foo = AnotherClass.new
irb> bar = RubyCallback.new(proc { puts "It's me again" })
irb> foo.setCallback(bar)
irb> foo.doYourStuff
It's me again

AnotherClass does not store Ruby objects: it stores Callback pointers.
What I'm trying to do is to define a subclass of Callback whose doit()
method proxies a Ruby block which is passed upon construction.

Thanks again,
Luigi

Luigi Ballabio

Apr 4, 2002, 11:41:50 AM

Responding to myself:

>I still have a question, though---rb_gc_mark protects the object I'm
>storing, but how do I unmark it? That is, how do I tell the garbage
>collector that it can dispose of it when I'm done with the object that
>stores it?

I'm an ass. It's clear now. Don't bother answering.

Thank you all,
Luigi

Lyle Johnson

Apr 4, 2002, 12:43:31 PM

> Also, now that I knew what to look for, I noticed that SWIG doesn't pass
> the "mark" function to Data_Wrap_Struct, only the "free" function...
> Lyle, are you reading? Can we do something about this? Or should I inject
> the relevant code by hand, instead of relying upon SWIG_NewPointerObj?

Actually, I added this feature to SWIG 1.3.12 (i.e. the CVS development
version) a while back and have been meaning to get around to documenting it
;)

You can use SWIG's %markfunc directive to specify the name of a "mark"
function for a class:

%markfunc RubyCallback "RubyCallback_markfunc";

%{
static void RubyCallback_markfunc(void *ptr) {
    RubyCallback *callback = static_cast<RubyCallback *>(ptr);
    rb_gc_mark(callback->p_);
}
%}

class RubyCallback {
// whatever
};

For this example, SWIG should pass the function name "RubyCallback_markfunc"
to Data_Wrap_Struct() when it calls that. I'm using this all over the place
in FXRuby. Let me know if you run into any trouble with it.

Tobias Peters

Apr 5, 2002, 2:59:38 AM

On Thu, 4 Apr 2002, Luigi Ballabio wrote:

> >There is no need for this Callback class.
>
> Yes, there is. Here is the full picture, as much simplified as possible.

No. It might be convenient for you to introduce this class, because you
want to use swig to do the wrapping, but the resulting ruby programming
interface would be ugly. Your method that expects a Callback should also
accept a block instead. You can easily add this feature in native ruby
after you created your wrapper with swig:

# File: cpplib.rb
require "cpplib.so"
class Callback_Taker
  alias cpp_add_callback add_callback
  def add_callback(*args, &block)
    if (block)
      cpp_add_callback(Callback.new(block))
    elsif (args.size > 0)
      cpp_add_callback(args[0])
    end
  end
end

and let your users require "cpplib", which will prefer "cpplib.rb" over
"cpplib.so".

> I'm not writing a C++ extension from scratch: I'm writing bindings for an
> existing C++ library.

which library?

Tobias

Luigi Ballabio

Apr 5, 2002, 3:53:07 AM

At 05:33 PM 4/5/02 +0900, Tobias Peters wrote:
>It might be convenient for you to introduce this class, because you
>want to use swig to do the wrapping, but the resulting ruby programming
>interface would be ugly. Your method that expects a Callback should also
>accept a block instead. You can easily add this feature in native ruby
>after you created your wrapper with swig:

Nice. I'll add this.

> > I'm not writing a C++ extension from scratch: I'm writing bindings for an
> > existing C++ library.
>

>which library?

http://www.quantlib.org/

Thanks,
Luigi

Paul Brannan

Apr 8, 2002, 7:23:35 PM

On Thu, Apr 04, 2002 at 08:03:21PM +0900, Luigi Ballabio wrote:
> class RubyCallback : public Callback {
> public:
> RubyCallback(VALUE p)
> : p_(p), refCount_(new int(1)) {
> /* how to make sure the Ruby proc stays alive
> as long as we need it?
> Is the following the right way? */
> rb_global_variable(p_);
> }

Why is refCount_ a pointer?

> RubyCallback(const RubyCallback& o)
> : p_(o.p_), refCount_(o.refCount_) {
> /* how to make sure the Ruby proc stays alive
> as long as we need it? */
> o.refCount_++;

Here, you are incrementing o.refCount_, when you should be incrementing
(*o.refCount_).
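
The same precedence trap applies to the `*refCount_++` in the assignment operator. Here is a small standalone sketch of the difference (the function and struct names are made up for illustration, and a two-element array is used so that moving the pointer stays in bounds). Note also that in the copy constructor `o` is a const reference, so a compiler should reject `o.refCount_++` outright.

```cpp
#include <cassert>

struct IncrementResult {
    int count;           // value of the shared counter afterwards
    bool pointer_moved;  // whether the pointer itself was changed
};

// What `*refCount_++` does: postfix ++ binds tighter than unary *, so the
// pointer is advanced and the old pointee is only read.
IncrementResult increment_pointer() {
    int slots[2] = {1, 0};
    int* ref_count = &slots[0];
    int old_value = *ref_count++;  // parsed as *(ref_count++)
    assert(old_value == 1);        // the count was read, not changed
    (void)old_value;
    return {slots[0], ref_count != &slots[0]};
}

// What was intended: increment the shared count through the pointer.
IncrementResult increment_count() {
    int slots[2] = {1, 0};
    int* ref_count = &slots[0];
    ++*ref_count;                  // same effect as (*ref_count)++
    return {slots[0], ref_count != &slots[0]};
}
```

increment_pointer() leaves the count at 1 and moves the pointer; increment_count() raises the count to 2 and leaves the pointer alone.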

> }
> RubyCallback& operator=(const RubyCallback& o) {
> if (refCount_ != o.refCount_) {
> if (--*refCount == 0) {
> // tell p_ it can go as far as we are concerned
> delete refCount_;
> }
> p_ = o.p_;
> refCount_ = o.refCount_;
> // make sure p_ stays alive
> *refCount_++;
> }
> return *this;
> }
> ~RubyCallback() {
> if (--*refCount_ == 0) {
> // now it can go as far as we are concerned
> // is the following the right way?
> rb_gc_mark(p_);
> delete refCount_;
> }
> }

rb_gc_mark doesn't mark an object for deletion; it marks the object as
being in use. You should call this from the mark function that was
passed to Data_Wrap_Struct.

> void doit() {
> static ID callId = rb_intern("call");
> rb_funcall(p_,callId,0);
> }

This is dangerous. Ruby uses longjmp() to implement exceptions. If
your Ruby code throws an exception, and the function that calls doit()
has an object on the stack, then that object's destructor will not get
called.
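
The usual way around this is to make the risky call through something that reports failure via a status value instead of unwinding, and only then throw a C++ exception. Here is a minimal sketch of that pattern in plain C++ with no Ruby involved: protected_call is a hypothetical stand-in that only imitates the state argument of rb_protect, and Tracer, CallbackError, call_checked and run are likewise made-up names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a protected call into the interpreter: it
// never unwinds through the caller and reports failure only via *state.
int protected_call(bool should_fail, int* state) {
    *state = should_fail ? 1 : 0;
    return should_fail ? 0 : 42;
}

// Records construction and destruction so the order can be inspected.
struct Tracer {
    std::vector<std::string>& log;
    explicit Tracer(std::vector<std::string>& l) : log(l) {
        log.push_back("Foo");
    }
    ~Tracer() { log.push_back("~Foo"); }
};

struct CallbackError : std::runtime_error {
    CallbackError() : std::runtime_error("callback failed") {}
};

// Turn a non-zero status into a C++ exception, so that normal stack
// unwinding (and therefore every destructor) takes over from here.
int call_checked(bool should_fail) {
    int state = 0;
    int result = protected_call(should_fail, &state);
    if (state != 0) throw CallbackError();
    return result;
}

std::vector<std::string> run(bool should_fail) {
    std::vector<std::string> log;
    try {
        Tracer guard(log);
        call_checked(should_fail);
        log.push_back("done");
    } catch (const CallbackError&) {
        log.push_back("caught");
    }
    return log;
}
```

In the failing case the log reads Foo, ~Foo, caught: the destructor runs before the handler, which is exactly what a longjmp across the frame would skip.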

I do like the use of a static variable here for callId.

> private:
> VALUE p_;
> int* refCount_; // is this needed?
> };

You probably want something like this:

#include <ruby.h>
#include <intern.h>

// Hack to get this to work on gcc3
#define rb_gc_mark(value) ((void (*)(VALUE))(rb_gc_mark))(value)
#if defined(RUBY_METHOD_FUNC)
#undef RUBY_METHOD_FUNC
#endif
typedef VALUE (*RUBY_METHOD_FUNC)();

extern VALUE ruby_errinfo;

struct Ruby_Exception {
    VALUE ex;
};

class RubyCallback {
  public:
    RubyCallback(VALUE p)
      : p_(p)
      , ruby_obj_(Data_Wrap_Struct(rb_cObject, RubyCallback::mark, 0, this))
    { }

    // the default copy constructor and assignment operator will work

    // making this work with an arbitrary number of arguments is left as
    // an exercise to the reader.
    VALUE call();

  private:
    static VALUE call_ruby_proc(VALUE p) {

        static ID callId = rb_intern("call");

        return rb_funcall(p, callId, 0);
    }

    static void mark(void * obj) {
        RubyCallback * rc(static_cast<RubyCallback *>(obj));
        rb_gc_mark(rc->p_);
    }

  private:
    VALUE p_;
    VALUE ruby_obj_;
};

VALUE RubyCallback::call() {
    int state = 0;
    VALUE retval = rb_protect(
        RUBY_METHOD_FUNC(RubyCallback::call_ruby_proc),
        p_,
        &state);
    if(state != 0) {
        Ruby_Exception ex = { ruby_errinfo };
        throw ex;
    }
    return retval;
}

// Some test code...

#include <iostream>

#define RUBY_TRY \
    extern VALUE ruby_errinfo; \
    ruby_errinfo = Qnil; \
    try

#define RUBY_CATCH \
    catch (Ruby_Exception & ex) { \
        rb_exc_raise(ex.ex); \
    } \
    catch (...) \
    { \
        /* Can't raise the exception from here, because the C++ exception \
         * won't get properly destroyed. */ \
        ruby_errinfo = rb_exc_new2(rb_eRuntimeError, "Unknown error"); \
    } \
    if(!NIL_P(ruby_errinfo)) { \
        rb_exc_raise(ruby_errinfo); \
    }

VALUE foo(VALUE /* self */, VALUE cb) {
    RUBY_TRY
    {
        struct Foo {
            Foo() { std::cout << "Foo" << std::endl; }
            ~Foo() { std::cout << "~Foo" << std::endl; }
        } foo;
        RubyCallback rc(cb);
        rc.call();
    }
    RUBY_CATCH
    // a function declared to return VALUE must return one
    return Qnil;
}

VALUE run_test(VALUE /* self */) {
    // should print "Foo\nfoo!\n~Foo\n"
    rb_eval_string("foo(proc { puts 'foo!' })");

    // should print "Foo\n~Foo\n" then print an exception msg.
    rb_eval_string("foo(proc { raise 'foo!' })");

    return Qnil;
}

int main() {
    int argc = 3;
    char * argv[] = { "test", "-e" , "run_test()" };
    ruby_init();
    ruby_init_loadpath();
    ruby_options(argc, argv);

    rb_define_global_function("foo", RUBY_METHOD_FUNC(foo), 1);
    rb_define_global_function("run_test", RUBY_METHOD_FUNC(run_test), 0);

    ruby_run();
}

Hope this helps,

Paul