Procs -> Sexp question (concerning JRuby)

Werner Schuster (murphee)

Jan 30, 2008, 8:28:53 PM

to ambit...@googlegroups.com

howdy,

I was trying to get ambition to run under JRuby, which didn't work.
JRuby has a version of ParseTree (jParseTree) that I maintain.
Problem: JRuby 1.0.x can't support the way you currently implement
to_sexp
(as I understand from this:
http://github.com/defunkt/ambition/tree/master/lib/ambition/core_ext.rb ):
- take the proc
- stuff it inside a method proc_to_method on ProcHolder using
define_method
- tell ParseTree to sexpr-ify this proc_to_method Method

I won't go into details why JRuby doesn't support that with 1.0.x, but
it doesn't (there's a good chance I can get jParseTree to work with
JRuby 1.1).

Now: I'm wondering: is the current implementation of this the absolutely
perfect one?
jParseTree can manage most dynamically generated stuff, but you
managed to implement the one that it can't handle ;-)

Also - an unrelated issue: I might be wrong, but the ProcHolder
approach seems to be not threadsafe. Unless I'm missing something, the
ProcHolder's proc_to_method method keeps getting re-defined. This could
potentially fail if two threads try to do this simultaneously - i.e.
thread A could define the method, be preempted by thread B define-ing
_its_ method and to_method-ing it. Once the thread A runs again, it'd
get thread B's proc.
A solution would be to define a new class with a unique name every time
this process happens.
(This might not be a problem for MRI, but systems like JRuby with native
threads can run into this).

Also: a matter of taste... but wouldn't it be nicer to mix in the
various to_* methods into Object, Method, and Proc _objects_ instead of
opening the classes? I don't know if this would impact performance (much),
but at least it wouldn't pollute Object, Method, or Proc.

Cheers,
murphee

Chris Wanstrath

Jan 30, 2008, 10:52:52 PM

to ambit...@googlegroups.com

On 1/30/08, Werner Schuster (murphee) <werner....@gmail.com> wrote:

> I was trying to get ambition to run under JRuby, which didn't work.
> JRuby has a version of ParseTree (jParseTree) that I maintain.
> Problem: JRuby 1.0.x can't support the current implementation of the way
> you implement to_sexp
> (as I understand from this:
> http://github.com/defunkt/ambition/tree/master/lib/ambition/core_ext.rb ):
> - take the proc
> - stuff it inside a method proc_to_method on ProcHolder using
> define_method
> - tell ParseTree to sexpr-ify this proc_to_method Method
>
> I won't go into details why JRuby doesn't support that with 1.0.x, but
> it doesn't (there's a good chance I can get jParseTree to work with
> JRuby 1.1).

Hi! I'm glad Ambition caught your eye -- I want to do whatever I can
to help make it run on JRuby.

> Now: I'm wondering: is the current implementation of this the absolutely
> perfect one?
> jParseTree can manage most dynamically generated stuff, but you
> managed to implement the one that it can't handle ;-)

We stole our to_sexp from ruby2ruby, with some minor modifications. I
have nothing against using a different method which works better.

All we want is the sexp for the proc, so however we can hand that to
Ambition is fine with me.

> Also - an unrelated issue: I might be wrong, but the ProcHolder
> approach seems to be not threadsafe. Unless I'm missing something, the
> ProcHolder's proc_to_method method keeps getting re-defined. This could
> potentially fail if two threads try to do this simultaneously - i.e.
> thread A could define the method, be preempted by thread B define-ing
> _its_ method and to_method-ing it. Once the thread A runs again, it'd
> get thread B's proc.
> A solution would be to define a new class with a unique name every time
> this process happens.
> (This might not be a problem for MRI, but systems like JRuby with native
> threads can run into this).

This is true. The ruby2ruby implementation creates a new, uniquely
named method for each proc -- which leaks memory in MRI. I originally
took the path of killing the leak at the expense of thread safety. It
was fine because Ambition was ActiveRecord-only, and ActiveRecord is not
thread-safe. But,
obviously, that's not the case anymore and this lack of thread safety
will not stand.

You bring up the 'class per proc' idea -- I'll try that out this week
and report back, hopefully with good news. Thanks for the suggestion!

> Also: a matter of taste... but wouldn't it be nicer to mix in the
> various to_* methods into Object, Method, and Proc _objects_ instead of
> opening the classes? I don't know if this would impact performance (much),
> but at least it wouldn't pollute Object, Method, or Proc.

I think it's just a matter of taste. Both produce the same end result
in MRI, not sure if this pollution is some JRuby thing. Please
elaborate if so.

- Chris

Werner Schuster (murphee)

Feb 3, 2008, 3:23:58 PM

to ambit...@googlegroups.com

Chris Wanstrath wrote:
> On 1/30/08, Werner Schuster (murphee) <werner....@gmail.com> wrote:
>
>> jParseTree can manage most dynamically generated stuff, but you
>> managed to implement the one that it can't handle ;-)
>>
> We stole our to_sexp from ruby2ruby, with some minor modifications. I
> have nothing against using a different method which works better.
>

OK - I'll have to see what works with JRuby 1.0.
BTW are you planning any releases in the near future? If not, I might
just ignore JRuby 1.0 and it should be possible to use the current way
on JRuby 1.1.

>> Also: a matter of taste... but wouldn't it be nicer to mix in the
>> various to_* methods into Object, Method, and Proc _objects_ instead of
>> opening the classes? I don't know if this would impact performance (much),
>> but at least it wouldn't pollute Object, Method, or Proc.
>>
>
> I think it's just a matter of taste. Both produce the same end result
> in MRI, not sure if this pollution is some JRuby thing. Please
> elaborate if so.
>

Sorry, I wasn't clear: by "pollute" I mean that Ambition will add a
method to a very basic class in Ruby, Proc (and also a few others like
Method).
This is problematic, particularly if other libraries do the same, which
can lead to clashes. Eg. another library might add a method with the
same name which does something completely different. This gets even
worse if these changes depend on loading order of libraries, which might
not be deterministic (eg. if a library is only loaded the first time a
feature is used).

Before you say this is a theoretical problem, just consider running
Ambition alongside an application which loads Ruby2Ruby. The code for
to_sexp might be the same _now_, but either Ambition or Ruby2Ruby might
decide to do something else in a future version.
Actually, I happened to stumble over another potentially conflicting
library: Sequel: http://sequel.rubyforge.org/files/sequel/README.html
It's an ORM, but it has one similarity with Ambition: it also has a
feature where you can write queries in Ruby blocks, which are then
turned into s-exprs with ParseTree.
And here's the collision:
http://code.google.com/p/ruby-sequel/source/browse/trunk/sequel_core/lib/sequel_core/dataset/sequelizer.rb
Scroll all the way down to this:
    class Proc
      # replacement for Proc#to_sexp, if it's not available
      unless instance_methods.include?('to_sexp')
        def to_sexp
          block = self
          c = Class.new {define_method(:m, &block)}
          ParseTree.translate(c, :m)[2]
        end
      end
    end

So any Ruby program that loads both Ambition and Sequel is likely to
behave oddly.
I'd say putting the to_sexp method into a Proc object (one
particular Proc _object_) right before it's used is safest, because it
doesn't add the method to the Proc class, and this particular Proc
object is going to be discarded right away anyway.
I talk a bit more about this here (although this article is more about
storing data in singleton classes):
http://www.infoq.com/articles/prototypes-for-metadata
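
As a rough illustration of the per-object idea, here is a small Ruby sketch. The module name SexpConvertible and the placeholder return value are assumptions made for the example; a real version would call ParseTree where the comment says so.

```ruby
# Hypothetical module; the body stands in for the real ParseTree call.
module SexpConvertible
  def to_sexp
    # Placeholder result: a real version would hand this proc to ParseTree.
    [:placeholder_sexp, arity]
  end
end

query = proc { |m| m }
query.extend(SexpConvertible)   # only this one object's singleton class changes

other = proc { |m| m }
query_has = query.respond_to?(:to_sexp)      # the extended proc has the method
other_has = other.respond_to?(:to_sexp)      # an unrelated proc does not
class_has = Proc.method_defined?(:to_sexp)   # the Proc class itself is untouched
```

Because the method lives on a single object, a second library that defines Proc#to_sexp cannot be overwritten by this code, and for the extended proc the module's version is the one that gets called.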

Chris Wanstrath

Feb 7, 2008, 1:18:26 AM

to ambit...@googlegroups.com

On 2/3/08, Werner Schuster (murphee) <werner....@gmail.com> wrote:

> OK - I'll have to see what works with JRuby 1.0.
> BTW are you planning any releases in the near future? If not, I might
> just ignore JRuby 1.0 and it should be possible to use the current way
> on JRuby 1.1.

Nothing major, no.

> >> Also: a matter of taste... but wouldn't it be nicer to mix in the
> >> various to_* methods into Object, Method, and Proc _objects_ instead of
> >> opening the classes? I don't know if this would impact performance (much),
> >> but at least it wouldn't pollute Object, Method, or Proc.
> >>
> >
> > I think it's just a matter of taste. Both produce the same end result
> > in MRI, not sure if this pollution is some JRuby thing. Please
> > elaborate if so.
> >
> Sorry, I wasn't clear: by "pollute" I mean that Ambition will add a
> method to a very basic class in Ruby, Proc (and also a few others like
> Method).
>
> This is problematic, particularly if other libraries do the same, which
> can lead to clashes. Eg. another library might add a method with the
> same name which does something completely different. This gets even
> worse if these changes depend on loading order of libraries, which might
> not be deterministic (eg. if a library is only loaded the first time a
> feature is used).

Okay, but this is not fixed by 'mixing in' methods -- mixins are
modules which effectively inject instance methods into their receiver,
accomplishing the same end result as re-opening the class and adding
instance methods directly.

That said, I completely agree with you. I'll make a new class which
wraps up this functionality instead of putting the methods on core
classes. Thanks for pointing it out.

--
Chris Wanstrath
http://errfree.com // http://errtheblog.com
http://github.com // http://famspam.com

Werner Schuster (murphee)

Feb 13, 2008, 7:27:59 PM

to ambit...@googlegroups.com

Chris Wanstrath wrote:
> On 2/3/08, Werner Schuster (murphee) <werner....@gmail.com> wrote:
>
>
> Okay, but this is not fixed by 'mixing in' methods -- mixins are
> modules which effectively inject instance methods into their receiver,
> accomplishing the same end result as re-opening the class and adding
> instance methods directly.
>

Yes... you're right, there's no need to use mixins (although it's
possible).

Anyway: something else has been rumbling around the Ruby neighborhoods
of my mind:
As far as I can tell, the Ambition queries are extracted every time the
query code runs, is that right?
Ie. the Proc's ParseTree is ripped out (in whatever way it's done now),
run through the processor, etc.

LISP Macros, however, are expanded at compile time, which means none of
that runtime overhead happens.
I saw something in this blog entry:
http://www.relevancellc.com/2008/2/13/have-you-killed-a-design-pattern-today
that gave me an idea to have LISP-like behavior for Ruby.

The trick uses bindings & FILE/LINE, which gave me the following idea:
- think of a call site that defines an Ambition query (ie. a Ruby
Block) a la: foo.select { ambition_query}
- when it's called, it checks a cache to see if the query has already
  been translated, by extracting the FILE and LINE variables and looking
  them up in the cache (eg. as a tuple [FILE, LINE])
  + in case of a cache miss: the AST extraction takes place and the
    cache entry is filled with the result

The FILE/LINE variables for the query block can be looked up by:
line = eval("__LINE__", query_block.binding)

Problems:
- not sure if there could be problems with binding or eval
- reloading a file that contains a query is a problem; it could either
keep an old version around (if the position of the query doesn't change),
or (bad case) use a different query if the positions line up correctly.
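
The call-site cache described above could look roughly like the following Ruby sketch. QueryCache and the translate block are hypothetical names, and the sketch swaps the eval/binding lookup for Proc#source_location, which returns the same [file, line] pair on Ruby 1.9 and later; as far as I know, eval("__LINE__", binding) no longer reports the block's own position on Ruby 3.0 and later.

```ruby
# Hypothetical cache; `translate` stands in for the expensive sexp extraction.
class QueryCache
  attr_reader :misses

  def initialize(&translate)
    @translate = translate
    @entries = {}
    @lock = Mutex.new
    @misses = 0
  end

  # Returns the cached translation for the block's call site,
  # running the translation only on a cache miss.
  def fetch(&query_block)
    # [file, line] of the block literal; nil for procs without Ruby
    # source, which this sketch does not treat specially.
    key = query_block.source_location
    @lock.synchronize do
      @entries.fetch(key) do
        @misses += 1
        @entries[key] = @translate.call(query_block)
      end
    end
  end
end

cache = QueryCache.new { |blk| "translated line #{blk.source_location.last}" }
3.times { cache.fetch { |m| m } }   # same call site on every iteration
```

Keying on [file, line] inherits both problems listed above: a reloaded file can hit a stale entry, and two blocks written on the same line would share one entry.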

Hmm... any thoughts?

Chris Wanstrath

Feb 16, 2008, 5:49:24 AM

to ambit...@googlegroups.com

On 2/13/08, Werner Schuster (murphee) <werner....@gmail.com> wrote:

> The trick uses bindings & FILE/LINE, which gave me the following idea:
> - think of a call site that defines an Ambition query (ie. a Ruby
> Block) a la: foo.select { ambition_query}
> - when it's called, it checks a cache to see if the query has already
>   been translated, by extracting the FILE and LINE variables and looking
>   them up in the cache (eg. as a tuple [FILE, LINE])
>   + in case of a cache miss: the AST extraction takes place and the
>     cache entry is filled with the result
>
> The FILE/LINE variables for query block can be looked up by:
> line = eval("__LINE__", query_block.binding)
>
> Problems:
> - not sure if there could be problems with binding or eval
> - reloading a file that contains a query is a problem; it could either
> keep an old version around (if the position of the query doesn't change),
> or (bad case) use a different query if the positions line up correctly.
>
>
> Hmm... any thoughts?

I implemented this tonight using Proc#to_s. In some places I'm seeing
a 20% speedup, which rocks.

It seems that Proc#to_s does what we want reliably on MRI. Not sure
about the other implementations yet.

Also, Ambition no longer adds #to_sexp and instead uses a thread safe
SexpGenerator.

proc caching:
http://github.com/defunkt/ambition/commit/4fa480c45ff43540fe411831b650fe35ba1ba7cf

#to_sexp removal:
http://github.com/defunkt/ambition/commit/82636d6518c8492896559a6bce98b0f75e4175da

- chris
