> -----Original Message-----
> From: Carlo Dapor [mailto:cat...@gmail.com]
> Sent: Friday, October 27, 2006 4:36 PM
> To: David Griswold
> Subject: Future of StrongTalk
>
>
> David
>
>
> What does it mean for StrongTalk once Java is open sourced?
> After all, HotSpot is based on it, with a lot more to it.
>
> Will StrongTalk still continue? Will it inherit all the
> intellectual property from HotSpot?
The HotSpot VM is for Java, not Smalltalk, and the languages and
implementations are very different. The Java VM has static implementation
type information available, and it did not use the type-feedback
capabilities of the Strongtalk VM (the last I heard), which was a decision I
was against. So while it also does extensive inlining, it uses different
algorithms that inline things in different ways, and while it can do better
inlining for some things than Strongtalk, it does worse inlining for other
things, so the tradeoffs are not very clear (and I don't think anyone has
ever done a detailed comparison of the impact of the different inlining
strategies).
Nor does the Java VM use tagging, which hurts Java a lot as a language,
since it makes it really painful and expensive to treat basic types as
objects. So I don't think the Java VM will ever be a better VM to host true
dynamic languages like Smalltalk (JavaScript, Ruby, etc), since they use
tagging and need type-feedback.
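
To make the tagging point concrete, here is a minimal sketch in Ruby of one common scheme. It assumes the low bit of a machine word is 1 for a small integer and 0 for a heap pointer. The bit layout and the names tag_int, untag_int, small_int? and add_tagged are illustrative assumptions, not Strongtalk's actual encoding.

```ruby
# Hypothetical tagging scheme: low bit 1 = small integer, low bit 0 = pointer.
TAG_BITS = 1
INT_TAG  = 1

# Pack an integer into a tagged word.
def tag_int(n)
  (n << TAG_BITS) | INT_TAG
end

# True if the word carries the small-integer tag.
def small_int?(word)
  (word & INT_TAG) == INT_TAG
end

# Recover the integer; the arithmetic shift keeps the sign.
def untag_int(word)
  word >> TAG_BITS
end

# Add two tagged integers without allocating an object:
# (2x + 1) + (2y + 1) - 1 == 2(x + y) + 1
def add_tagged(a, b)
  a + b - INT_TAG
end

sum = add_tagged(tag_int(20), tag_int(22))
puts untag_int(sum)
```

The point of the sketch is that an integer can behave like any other object reference while arithmetic on it stays a couple of machine instructions, with no boxing step.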
Another big disadvantage of the Java VM is that because it has to count on
the type system for dynamic safety, rather than using the flexibility of
type-feedback, it has an enormous amount of complexity in things like the
bytecode verifier, which must prove that the bytecode type information is
valid. And it has all those different kinds of basic types in addition to
objects. So it is more complex than Strongtalk.
The one big engineering advantage of the HotSpot VM is that it is internally
multi-threaded, which means it can take better advantage (right now) of
multi-core and native preemptive threading. But hopefully that engineering
will eventually be done in Strongtalk too.
And a bigger point is that, although I don't know for sure, I doubt that Sun
will release the Java VM under a totally open-source license like BSD. It
will probably be under a more proprietary license like the one they use for
other Java open-source stuff, which I don't think is nearly as nice as the
Strongtalk license.
-Dave
I'm wondering how much needs to be added (if anything) to host Ruby. I see
the problems as:
Ruby is more dynamic than Smalltalk. All instance variables are currently
implemented as slots, i.e. you can add a new instance variable to an object in a
running program, and you need to be able to add mixins to a class and/or
change methods and expect the change to affect all of the live objects.
So:
1. Adding new instance variables isn't exactly the same as "become" because
it has to be able to handle live stack frames referencing objects whose
definition changed, whereas in Smalltalk you just expect the system to crash
in that case.
2. Changing the definition of methods can invalidate inlining optimizations
in live frames: deoptimization plus reoptimization anyone?
3. Even adding methods to unrelated classes can change optimizations based
on assuming that only a single class (or a limited number of them) uses a
given selector.
4. One optimization that makes sense in Ruby but not in Smalltalk is to
figure out which instance variables are rarely instantiated but which are
taking up a lot of memory and changing them into being stored in a hidden
subobject. I assume that using a hidden variable holding a hidden object is
better than trying to implement a completely general, safe and fast become
(that's almost impossible, I think).
Josh Scholar
By the way, the exciting thing about Ruby is that there is a large
community of people using it.
Josh wrote:
> > Nor does the Java VM use tagging, which hurts Java a lot as a language,
> > since it makes it really painful and expensive to treat basic types as
> > objects. So I don't think the Java VM will ever be a better VM to host
> true
> > dynamic languages like Smalltalk (JavaScript, Ruby, etc), since they use
> > tagging and need type-feedback.
>
> I'm wondering how much needs to be added (if anything) to host
> Ruby. I see
> the problems as:
>
> Ruby is more dynamic than Smalltalk. All instance variables are currently
> implemented as slots, ie you can add new instance variable to an
> object in a
> running program, and you need to be able to add mixins to a class and or
> change methods and expect the change to effect all of the live objects.
>
> So:
>
> 1 Adding new instance variables isn't exactly the same as "become" because
> it has to be able to handle live stack frames referencing objects whoes
> definition changed, whereas in Smalltalk you just expect the
> system to crash
> in that case.
From your description I don't understand how that is different than become.
In Smalltalk live stack frames can reference objects that change via become,
and it should work just fine if become is implemented properly. If adding
instance variables has to be *fast*, however, that is another issue. It
would require an object table (or a forwarding wrapper for every object),
and to make the class modification itself fast would require changes that
would slow down all instance var accesses, unless you discriminated against
the new instance variables performance-wise. That is probably a big reason
why Ruby is so slow.
> 2. Changing the definition of methods can invalidate inlining
> optimizations
> in live frames: deoptimization plus reoptimization anyone?
That machinery is already there; deoptimizing live frames is done all the
time in Strongtalk. They don't have to be immediately reoptimized, you can
do what is necessary with the interpreted frames after deoptimization, and
if they are used frequently thereafter, they will get reoptimized by the
normal mechanism.
> 3. Even adding methods to unrelated classes can change optimizations based
> on assuming that only a single class (or a limited number of them) uses a
> given selector.
Once again, deoptimization solves this problem. This exact issue exists in
the JVM, and once you have deoptimization it is trivial. We don't inline in
Strongtalk right now based on the number of method implementations, because
type-feedback can already inline those methods anyway (because if there is
only one implementation, then any send of that message is by definition
going to be monomorphic, which is what Strongtalk looks at).
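
As a rough illustration of the type-feedback idea, here is a sketch of a call site that records which receiver classes it has seen. The CallSite class and its method names are hypothetical, and a real VM keeps this information in inline caches rather than in a Ruby hash.

```ruby
# Hypothetical model of a call site that records receiver classes (type feedback).
class CallSite
  attr_reader :selector, :seen

  def initialize(selector)
    @selector = selector
    @seen = Hash.new(0)   # receiver class => number of sends observed
  end

  # Perform the send and record the receiver's class.
  def send_to(receiver, *args)
    @seen[receiver.class] += 1
    receiver.public_send(@selector, *args)
  end

  # Only one receiver class has been observed at this site.
  def monomorphic?
    @seen.size == 1
  end

  # The single method an optimizing compiler could inline, or nil.
  def inline_candidate
    return nil unless monomorphic?
    @seen.keys.first.instance_method(@selector)
  end
end

site = CallSite.new(:size)
site.send_to([1, 2, 3])
site.send_to([])
puts site.monomorphic?   # only Array has been seen so far
site.send_to("abc")
puts site.monomorphic?   # String has now been seen as well
```

A selector with a single implementation can only ever look monomorphic to such a site, which is why counting implementations adds nothing once the feedback is available.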
> 4. One optimization that makes sense in Ruby but not in Smalltalk is to
> figure out which instance variables are rarely instanciated but which are
> taking up a lot of memory and changing them into being stored in a hidden
> subobject. I assume that using a hidden variable holding a
> hidden object is
> better than trying to implement a completely general, safe and fast become
> (that's almost impossible, I think).
Yes, adding a slower mechanism for handling lazily-added instance variables
as an uncommon case would probably be the way to do it.
Ruby would probably be a great fit for a modified Strongtalk VM, but Ruby
apparently has other problems like an ill-defined grammar that would have to
be dealt with first. But if someone wanted to build a StrongRuby, I would
encourage them all I could.
-Dave
> I'm wondering how much needs to be added (if anything) to host
> Ruby. I see
> the problems as:
>
> Ruby is more dynamic than Smalltalk. All instance variables are
> currently
> implemented as slots, ie you can add new instance variable to an
> object in a
> running program, and you need to be able to add mixins to a class
> and or
> change methods and expect the change to effect all of the live
> objects.
No, Smalltalk does all this stuff dynamically as well. Smalltalk
typically doesn't have mixins, but it's not hard to implement them,
and Strongtalk has explicit support for them. In fact, Ruby's object
model is almost exactly that of Strongtalk. Only the syntax is
different.
> So:
>
> 1 Adding new instance variables isn't exactly the same as "become"
> because
> it has to be able to handle live stack frames referencing objects
> whoes
> definition changed, whereas in Smalltalk you just expect the system
> to crash
> in that case.
Not so. In Smalltalk, when you add an instance variable to a class a
new version of the class is built and all instances of the old class
are converted to instances of the new class using become. It doesn't
crash, live stack frames continue to function correctly.
> 2. Changing the definition of methods can invalidate inlining
> optimizations
> in live frames: deoptimization plus reoptimization anyone?
In Smalltalk, methods used by active contexts aren't converted to the
new version of the method, but all subsequent invocations of the
method use the new version. If it's the same in Ruby, (and if it's
not, I'd like to know how the contexts are migrated) then the
Strongtalk VM should handle this transparently.
> 3. Even adding methods to unrelated classes can change
> optimizations based
> on assuming that only a single class (or a limited number of them)
> uses a
> given selector.
Again, this is normal and expected in Smalltalk, and the Strongtalk
VM is designed to handle it correctly.
> 4. One optimization that makes sense in Ruby but not in Smalltalk
> is to
> figure out which instance variables are rarely instanciated but
> which are
> taking up a lot of memory and changing them into being stored in a
> hidden
> subobject. I assume that using a hidden variable holding a hidden
> object is
> better than trying to implement a completely general, safe and fast
> become
> (that's almost impossible, I think).
That kind of optimization is probably better done up in Ruby code. If
you add instance variables on the fly, then the ones that never get
used will never be added anyway. Once they do, the only cost will be
an extra pointer in all instances, so it's not catastrophic.
In general, though, I think you're right. The Ruby object model is
almost exactly that of Strongtalk. The main issues that come to my
mind are:
Singleton objects: The runtime would have to create a custom subclass
and convert the object to an instance of it.
Continuations: currently not supported by Strongtalk, but David has
said it's doable.
Ruby message sends can have a variable number of arguments. The
runtime would have to capture sends with an "unexpected" number of
arguments (probably using #doesNotUnderstand:) and create specialized
versions of the method.
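
For readers less familiar with Ruby, this short example shows the two behaviours from the list above that a Strongtalk-hosted runtime would have to reproduce: per-object (singleton) methods and sends with a varying number of arguments. The Greeter class is made up for illustration.

```ruby
class Greeter
  # One Ruby method accepts one, two, or more arguments.
  def greet(name, greeting = "Hello", *extras)
    ([greeting, name] + extras).join(" ")
  end
end

a = Greeter.new
b = Greeter.new

# A singleton method: defined on one object, not on its class.
def a.greet(name, *rest)
  "Yo #{name}"
end

puts a.greet("Dave")                 # uses the singleton method
puts b.greet("Dave")                 # uses the class's method
puts b.greet("Dave", "Hi", "there")  # same method, three arguments
puts a.singleton_methods.inspect
puts Greeter.instance_method(:greet).arity
```

The negative arity reported on the last line is how Ruby marks a method with optional arguments, which is the case a fixed-arity Smalltalk selector cannot express directly.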
Probably the biggest chunk of work in implementing Ruby-on-Strongtalk
would be writing a Ruby-to-Strongtalk-bytecode compiler. Ruby is
notoriously difficult to parse, so it would be a fair amount of work
to make sure all the nooks and crannies of the grammar get covered.
Pulling this off would be a huge coup, though, as it would be way,
way faster than the existing Ruby implementation. I'd really like to
see this happen.
Colin
No, Ruby is slow because it doesn't have a VM. It's an interpreter
that walks the AST, evaluating each node as it goes. The core classes
are implemented in C, so you get acceptable performance out of them.
It's easy to create Ruby bindings for C code, so people just
implement performance critical code in C and call it from Ruby.
I claim adding instance variables doesn't have to be fast. It's going
to happen at most a handful of times per class. Migrating all the
instances won't take very long - the pathological case of adding lots
of instance variables after there are lots of instances is going to
be extremely rare. The common case will be that all the methods will
be added before any instances are created, and in that case no object
migration will be needed.
Colin
Mutation of all the instances can be deferred;
they do not need to be mutated immediately.
In the Visual Smalltalk implementation, objects can change
their method dictionaries to have a per-instance lookup path.
That is valuable when mutating instances, because the mutation
can be deferred until the old-format object is actually used.
The "old" instance is given a method dictionary that
implements only #doesNotUnderstand:, and when a message
reaches the object it is mutated and "becomed" to the new shape.
Done this way, instances are not lost when changes
are reverted, and the time to change all instances
of a class is not paid if the mapping from the obsolete class to
the current class does not impose a size change... (I do not know
whether VS avoids the #become in such situations, but I
think it is interesting to evaluate the convenience of
not having a reference to "the class of the object" in
the object header.)
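
A rough Ruby analogue of the deferred mutation described above, using method_missing in place of #doesNotUnderstand:. Ruby has no become, so this sketch forwards to the migrated object instead of swapping identities. StaleInstance and Point are hypothetical names, and this is not how Visual Smalltalk implements it.

```ruby
# The current shape of the class, with a newly added variable z.
class Point
  attr_accessor :x, :y, :z

  def initialize(x, y, z = 0)
    @x, @y, @z = x, y, z
  end
end

# Stand-in for an instance of the obsolete class: it understands nothing,
# so the first real message triggers the migration.
class StaleInstance < BasicObject
  def initialize(old_values, &migrate)
    @old_values = old_values
    @migrate = migrate
    @current = nil
  end

  def method_missing(name, *args, &block)
    @current ||= @migrate.call(@old_values)   # mutate on first use only
    @current.__send__(name, *args, &block)
  end
end

migrations = 0
stale = StaleInstance.new([1, 2]) do |values|
  migrations += 1
  Point.new(*values)
end

puts migrations   # nothing has been migrated yet
puts stale.x
puts stale.z
puts migrations   # the migration ran once, on first use
```

Objects that are never touched again are never migrated, which is the saving being described.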
Ale.
>
> > So:
> >
> > 1 Adding new instance variables isn't exactly the same as "become"
> > because
> > it has to be able to handle live stack frames referencing objects
> > whoes
> > definition changed, whereas in Smalltalk you just expect the system
> > to crash
> > in that case.
>
> Not so. In Smalltalk, when you add an instance variable to a class a
> new version of the class is built and all instances of the old class
> are converted to instances of the new class using become. It doesn't
> crash, live stack frames continue to function correctly.
>
I think that, in most smalltalks, you can generally only call "become" safely
to change an object into one that has the same instance variables in the same
order - otherwise live contexts will involve code that mis-accesses instance
variables.
I suppose adding new instance variables on to the end of an object is the
only case that is safe, after all.
> > 2. Changing the definition of methods can invalidate inlining
> > optimizations
> > in live frames: deoptimization plus reoptimization anyone?
>
> In Smalltalk, methods used by active contexts aren't converted to the
> new version of the method, but all subsequent invocations of the
> method use the new version. If it's the same in Ruby, (and if it's
> not, I'd like to know how the contexts are migrated) then the
> Strongtalk VM should handle this transparently.
>
The problem I was thinking about was that of inlined functions that aren't
officially part of the active context. But I guess that deoptimization can
handle it.
As an example consider this code:

    foo: someObject
        ||
        [someObject bar] whileTrue: [someObject baz]
        ...

Imagine that the function has inlined "baz" but the definition of "baz"
changes while this thread is up a frame evaluating "bar". The definition
of "foo:" hasn't changed but the definition of optimized "foo:" has.
I guess you have to keep track of which functions inline which others and
deoptimize their frames when the functions they inline change.
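
The bookkeeping described above might look something like the following sketch. InlineDependencies and its methods are hypothetical names, and a real VM would record this per compiled method rather than by name.

```ruby
# Hypothetical bookkeeping: which compiled methods inlined which callees.
class InlineDependencies
  def initialize
    @inlined_by = Hash.new { |h, k| h[k] = [] }   # callee => compiled callers
  end

  # Record that the compiled code for caller_name contains callee_name inline.
  def record_inlining(caller_name, callee_name)
    callers = @inlined_by[callee_name]
    callers << caller_name unless callers.include?(caller_name)
  end

  # Called when a method is redefined. Returns the compiled methods whose
  # frames must be deoptimized, following inlining transitively.
  def invalidate(callee_name)
    stale = []
    work = [callee_name]
    until work.empty?
      name = work.pop
      @inlined_by.delete(name).to_a.each do |caller_name|
        next if stale.include?(caller_name)
        stale << caller_name
        work << caller_name
      end
    end
    stale
  end
end

deps = InlineDependencies.new
deps.record_inlining("foo:", "bar")
deps.record_inlining("foo:", "baz")
p deps.invalidate("baz")   # the optimized foo: is stale once baz changes
```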
>
> > 4. One optimization that makes sense in Ruby but not in Smalltalk
> > is to
> > figure out which instance variables are rarely instanciated but
> > which are
> > taking up a lot of memory and changing them into being stored in a
> > hidden
> > subobject. I assume that using a hidden variable holding a hidden
> > object is
> > better than trying to implement a completely general, safe and fast
> > become
> > (that's almost impossible, I think).
>
> That kind of optimization is probably better done up in Ruby code. If
> you add instance variables on the fly, then the ones that never get
> used will never be added anyway. Once they do, the only cost will be
> an extra pointer in all instances, so it's not catastrophic.
>
As I was going to say in my response to Dave, I think the optimal answer for
Ruby isn't an object table or extra indirection for all variables, but
rather a system that recognizes that mutating classes is relatively rare:
1. the ability to mutate all instances during a stop-the-world collect and
compact later on. This collect would also have to mark affected stack
frames for deoptimization.
2. a stopgap where all objects have a free pointer or two in order to hold
possible additions to the class until such time as a stop and collect can
change the definition.
> In general, though, I think you're right. The Ruby object model is
> almost exactly that of Strongtalk. The main issues that come to my
> mind are:
>
> Singleton objects: The runtime would have to create a custom subclass
> and convert the object to an instance of it.
>
> Continuations: currently not supported by Strongtalk, but David has
> said it's doable.
>
> Ruby message sends can have a variable number of arguments. The
> runtime would have to capture sends with an "unexpected" number of
> arguments (probably using #doesNotUnderstand:) and create specialized
> versions of the method.
>
> Probably the biggest chunk of work in implementing Ruby-on-Strongtalk
> would be writing a Ruby-to-Strongtalk-bytecode compiler. Ruby is
> notoriously difficult to parse, so it would be a fair amount of work
> to make sure all the nooks and crannies of the grammar get covered.
>
> Pulling this off would be a huge coup, though, as it would be way,
> way faster than the existing Ruby implementation. I'd really like to
> see this happen.
>
> Colin
>
The one fly in the ointment here is that there IS a project underway to give
Ruby a VM, but since development is going on in Japanese I can't really be
sure how sophisticated it is planned to be. So far I don't think it's
showing much speedup but it could be that their plans are ambitious.
Josh Scholar
I wonder to what extent the conversion of Ruby extensions to a Strongtalk
Ruby could be facilitated by a limited C or C++ to Strongtalk or to
Strongtalk VM compiler.
I've been thinking about something similar for my own non-Strongtalk Ruby
project: throwing together a C compiler that supports a copying,
non-conservative collector. Partially I meant it as a test for my code
generator, but it seemed like a cool hack for facilitating the conversion of
existing libraries and extensions.
Anyway I may switch over to strongtalk.
Josh Scholar
I wouldn't worry too much about that. There are currently something
like 6 or 7 projects underway to either port Ruby to an existing vm or
write a new one. Matz, the creator of Ruby, seems to be encouraging
this. I'm sure none of them have the price/performance ratio that
building a VM on top of Strongtalk would have, but at least two of them
have pretty big budgets and are pretty invested in their current VMs,
namely .net and JVM. YARV, the Japanese one, is being written from
scratch essentially by one guy, and seems to be moving pretty slowly.
Probably most interesting from a Smalltalk point of view is Rubinius,
which was created by its author after reading through the Blue Book.
Anyway, getting Ruby to run on Strongtalk was a project I flagged as
'something that would be cool to do if I had time', but after looking
into it a bit and hearing horror stories about actually parsing Ruby
correctly and figuring out what its semantics are supposed to be in
corner cases, I decided to look elsewhere for a hobby project. Don't
let that discourage you, though!
> I think that, in most smalltalks, generally you can call "become"
> to change
> an object to any object that has the same instance variables in the
> same
> order - otherwise live contexts will involve code that mis-accesses
> instance
> variables.
>
> I suppose adding new instance variables on to the end of an object
> is the
> only case that is safe, after all.
Right... luckily that's what we'd be doing
> The problem I was thinking about was that of inlined functions that
> aren't
> officially part of the active context. But I guess that
> deoptimization can
> handle it.
>
> As an example consider this code:
>
> foo: someObject
> ||
>
> [someObject bar] whileTrue: [someObject baz]
> ...
>
> Imagine that the function has inlined "baz" but the definition of
> "baz"
> changes while this thread is up a frame evaluating "bar". The
> definintion
> of "foo:" hasn't changed but the definition of optimized foo has.
>
> I guess you have to keep track of which functions inline which
> others and
> deoptimize their frames when the functions they inline change.
When you modify methods, you do want to flush any native versions
that have been compiled. But you don't have to modify existing stack
frames. Consider the following ruby code:
class Alpha
  def foo
    garple = 3
    bar
    garple
  end

  def bar
    self.class.class_eval {
      def foo
        garple = 4
        bar
        garple
      end
    }
  end
end

alpha = Alpha.new
puts alpha.foo
puts alpha.foo
The first time it's called, #foo answers 3. The second time, it
answers 4. It works the same way in Smalltalk.
> As I was going to say in my response to Dave, I think the optimal
> answer for
> Ruby isn't an object table or extra indirection for all variables, but
> rather a system that recognizes that mutating classes is relatively
> rare:
>
> 1. the ability to mutate all instances during a stop-the-world
> collect and
> compact later on. This collect would also have to mark effected stack
> frames for deoptimization.
Yeah, you'd want mutation of the instances to be uninterruptible, but
it needn't be tied to garbage collection. Some Smalltalks have a
primitive to do mass becomes, but even if Strongtalk doesn't, I'm
sure you could do #valueUninterruptibly, or something similar.
Again, I don't see why you'd have to deoptimize stacks. Ruby doesn't
have a way of specifying the internal layout of an object, so you
could always add instvars to the end of the object. This would
mean that existing stacks needn't be deoptimized.
> 2. a stopgap where all objects have a free pointer or two in order
> to hold
> possible additions to the class until such time as a stop and
> collect can
> change the definition.
I don't see why this has to be so complicated. Why not just do
something like this:
1. A method is compiled that references an instance variable not yet
   present in the class.
2. The compiler notices this, and does a migration:
   a. It creates a new class with the new variable after the existing
      variables
   b. It enumerates the existing instances, creating instances of the
      new class
   c. All the old instances are converted to the new instances in a
      mass become
3. The new method is installed in the new class
It's actually pretty easy.
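
The migration steps above can be sketched in Ruby as a toy model. Here a "class" is just a list of slot names, an instance is a slot array, and mutating the instance in place stands in for the mass become. Layout, Instance and add_slot are hypothetical names, not Strongtalk code.

```ruby
# Hypothetical model: a layout is a named list of slots, and an instance
# is a slot array plus a reference to its layout.
Layout = Struct.new(:name, :slots)

class Instance
  attr_accessor :layout, :values

  def initialize(layout)
    @layout = layout
    @values = Array.new(layout.slots.size)
  end

  def get(slot)
    @values[@layout.slots.index(slot)]
  end

  def set(slot, value)
    @values[@layout.slots.index(slot)] = value
  end
end

# Steps 2a to 2c: build the new layout, then convert every instance.
# Mutating the existing Instance stands in for a mass become, since every
# reference to the old object now sees the new shape.
def add_slot(old_layout, new_slot, instances)
  new_layout = Layout.new(old_layout.name, old_layout.slots + [new_slot])
  instances.each do |obj|
    next unless obj.layout.equal?(old_layout)
    obj.values = obj.values + [nil]   # existing slots keep their offsets
    obj.layout = new_layout
  end
  new_layout
end

point = Layout.new("Point", [:x, :y])
p1 = Instance.new(point)
p1.set(:x, 3)
point = add_slot(point, :z, [p1])
p1.set(:z, 9)
puts p1.get(:x)
puts p1.get(:z)
```

Because the new slot goes after the existing ones, code that already knows the offsets of x and y keeps working unchanged.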
Colin
Unless the class has subclasses, in which case the added instance vars
aren't at the end for the subclasses.
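
A tiny sketch of that subclass problem, assuming the usual flat layout in which a subclass's slots follow its superclass's slots. The layout helper and the slot names are made up for illustration.

```ruby
# Hypothetical flat layout: a subclass's slots follow its superclass's slots.
def layout(superclass_slots, own_slots)
  superclass_slots + own_slots
end

base_before = [:a, :b]
sub_before  = layout(base_before, [:c])

base_after  = base_before + [:new_var]   # appended at the end of the base class
sub_after   = layout(base_after, [:c])

puts sub_before.index(:c)   # offset of c before the change
puts sub_after.index(:c)    # offset of c after the change
```

The variable is last in the base class but lands in the middle of the subclass, so compiled code that cached the old offset of c would read the wrong slot.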
-Dave
However, this work could (and should) be done last, after a full
proof of concept had been completed without it. There are at least
two independent parsers of Ruby's syntax (JRuby and the standard C
Ruby), either of which could be used to preprocess Ruby source into a
form that could be much more easily handled (say s-expr or an XML
serialization of a parse tree). It would then be easy enough to
produce Smalltalk source code from this that could be fed to any
Smalltalk compiler, like Strongtalk's. It would certainly be
awkward, but it would be enough to make for a killer demo and allow
serious benchmarking, which I would think should produce enough
momentum to get people interested in building a new parser.
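
As an illustration of the s-expression route, Ruby versions that ship the ripper standard library (newer than the 1.8.5 discussed in this thread) can dump a parse tree as nested arrays. This is only a sketch of the preprocessing idea, not one of the parsers mentioned above.

```ruby
require 'ripper'

# Parse a small method definition into nested arrays (an s-expression).
source = "def add(a, b)\n  a + b\nend\n"
sexp = Ripper.sexp(source)

puts sexp.first.inspect   # the tag of the root node
puts sexp.inspect         # the whole tree
```

A translator could walk this tree and emit Smalltalk source, leaving the hard parsing work to the existing Ruby parser.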
> Pulling this off would be a huge coup, though, as it would be way,
> way faster than the existing Ruby implementation. I'd really like to
> see this happen.
Me too. I've been advocating and tinkering with Ruby-on-Smalltalk
for at least two years now (sadly, far more of the former than the
latter), but having a high-performance liberally licensed VM
available makes the story much more compelling.
Avi
As long as I can get useful help from this list, it should take me much less
time to adapt the StrongTalk VM than to write my own... I got over my "not
invented here" attitude about a day ago when I realized that the time to get
something usable was a year less this way, and that every optimization I
wanted that StrongTalk doesn't do, I could probably implement at least as
quickly working with StrongTalk as working on my own. And then my code
would be useful to other projects as well, if it gets accepted.
I tend to change directions quickly and often, but my current inclination is
to devote my spare time in the next month to trying to understand and gut
Ruby 1.8.5's yacc and lex files to make an adapted parser. Then I don't
have to worry about duplicating the context dependent grammar, I'll use the
original.
I like the idea of parsing to s-expressions as well, that has been my plan
for a while. I intended my Ruby to always convert to s-expressions as the
first step in parsing and to extend the language so that the s-expression
form of all code is always available in order to facilitate metaprogramming.
----- Original Message -----
From: "Avi Bryant" <avi.b...@gmail.com>
To: <strongtal...@googlegroups.com>
Sent: Sunday, October 29, 2006 8:06 PM
Subject: Re: Future of StrongTalk
Cool.
Josh Scholar
> > I suppose adding new instance variables on to the end of an object
> > is the
> > only case that is safe, after all.
>
> Right... luckily that's what we'd be doing
As David pointed out, that's not always what we'll be doing.
I think we're going to need a little bit of support in the VM, at least for
getting Ruby fully optimized.
Sure, the behavior you want is simple, but the problem is that getting
optimized, inlined code to give you that simple behavior is very hard. It's
a kind of multitasking problem, and anyone who's tried to write
multiprocessor code knows that you have to consider every combination of
states.
In fact I have my doubts that Strongtalk can preserve the semantics you're
asking for if it does global optimization and scheduling beyond simple
inlining.
Imagine that you're inside of a loop that has an inlined function. The
unoptimized loop would have called "baz" but now it's just a loop that
intertwines "baz" code with whatever other code runs in that loop; maybe
it's even unrolled the loop by 4 and intertwined 4 instances of "baz". In
that case you're basically screwed if you have to simulate changing "baz" on
an index that isn't a multiple of 4.
You'd get similar problems if you inlined up more than one level. And I
haven't even taken the time to think about what happens to code and data
that were folded, partially executed in compilation and optimized out of
existence if you have to change some level in the inlined code...
If the code was changed by another thread, then you can just pretend that
the other thread waited until a safer time, but if code in one thread
modifies itself at just the wrong point, then you can't use the trick of
slipping time between threads, so the compiler has to be completely
correct in detecting that possibility and preventing the wrong optimization.
I hope that's possible.
Even if you don't have that sort of aggressive optimization, mere inlining
makes the situation awfully complicated. You have to take a stack frame for
a single function (that has inlines inside of it) and turn it into nested
stack frames for each of the functions inlined at the current program
counter position.
And that's on top of the complications that every smalltalk has, like the
need to keep obsolete code around until all of the callers have counted out.
No doubt Mr. Griswold is infinitely more familiar with these problems than I
am and can correct my misconceptions.
Well, in a smalltalk without indirection through an object table, every
become requires the equivalent to a full collect, because you have to find
every single reference to the object and update it.
I suppose there's a trick where you change the object in place to hold some
sort of forwarding object, but you'll be stuck with forwarding objects until
you do a full collect that replaces them all. And forwarding objects are
going to gunk up the type feedback and optimization until they're gone.
And forwarding objects will force a deoptimization of active frames because
the forwarding object will not be the type already optimized for.
And as David pointed out, subclasses of changed objects will have to force
deoptimization even if we do a full collect and don't bother with
forwarding.
But my idea of having extra pointers waiting to hold extra slots would allow
us to postpone deoptimization and the full sweep. Postponing is a good
thing because it lets you wait until you've aggregated enough work to be
worth your while and prevent thrashing on worst cases.
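
A sketch of that stopgap, with a Ruby hash standing in for the spare pointer. StopgapObject, its layout hash and compact! are hypothetical names, and compact! only imitates what a later stop-the-world pass would do to a single object.

```ruby
# Hypothetical object with a fixed slot array plus one spare "overflow"
# pointer that holds late-added variables until the next full collection.
class StopgapObject
  attr_reader :fixed, :overflow

  def initialize(layout)
    @layout = layout                  # slot name => offset in @fixed
    @fixed = Array.new(layout.size)
    @overflow = nil                   # the spare pointer, unused until needed
  end

  def get(name)
    offset = @layout[name]
    return @fixed[offset] if offset
    @overflow && @overflow[name]
  end

  def set(name, value)
    offset = @layout[name]
    if offset
      @fixed[offset] = value
    else
      (@overflow ||= {})[name] = value   # slower path for late additions
    end
  end

  # What a later stop-the-world pass would do: fold overflow into the layout.
  def compact!(new_layout)
    old_layout = @layout
    old_fixed = @fixed
    old_overflow = @overflow || {}
    @layout = new_layout
    @fixed = Array.new(new_layout.size)
    new_layout.each do |name, offset|
      old_offset = old_layout[name]
      @fixed[offset] = old_offset ? old_fixed[old_offset] : old_overflow[name]
    end
    @overflow = nil
  end
end

obj = StopgapObject.new({ x: 0, y: 1 })
obj.set(:x, 1)
obj.set(:tag, "late")        # not in the layout yet, goes to the overflow
puts obj.get(:tag)
obj.compact!({ x: 0, y: 1, tag: 2 })
puts obj.get(:tag)           # same value, now read from a fixed slot
puts obj.overflow.inspect
```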
Josh Scholar
Hi Josh,
Been following this thread, and I'm pleased that you have stepped up to
the plate. There was recently a symposium called Lang.NET. One of the
presentations was by John Gough from Queensland University. Apparently
he has already written a mostly working Ruby compiler for the CLR. The
back-end as I understand it generates C#. He has also created some yacc
and lex like tools specifically for the job, all open source.
Here is a link to the Symposium website:
http://www.langnetsymposium.com/speakers.asp
John has a presentation here. I like Avi's suggestion of compiling to
s-expressions also. One of the things on my wish list is better
interoperability between OO languages. Using s-expressions as a
lingua franca just sounds right.
BTW: Has anyone thought about approaching the universities? This kind of
stuff sounds like it's made for a PhD thesis. Surely there are
professors and post grads out there eager and able to help?
Paul.
Well it's lisp. Perhaps a younger person than me would have picked a more
structured intermediate language built on XML. XML is the new s-expression.
Josh Scholar
> The only problem I had with Smalltalk before was that it wasn't
> truly multithreaded - and I think that's the only thing that Java
> really has going for it (aside from a backer who is a champion at
> shameless self-promotion).
>
> Is Strongtalk going to be fully multithreaded?
Strongtalk's Process class is already mapped to native threads, yes,
as discussed in the conversation about continuations. The idea is
that eventually it will support M:N style multiprocessing to support
lighter-weight concurrency and control-flow changes.
--
Brian T. Rice
http://briantrice.com
-----Original Message-----
From: strongtal...@googlegroups.com [mailto:strongtal...@googlegroups.com]On Behalf Of John Kwon
Sent: Monday, October 30, 2006 4:33 PM
To: strongtal...@googlegroups.com
Subject: Re: Future of StrongTalk
The only problem I had with Smalltalk before was that it wasn't truly multithreaded - and I think that's the only thing that Java really has going for it (aside from a backer who is a champion at shameless self-promotion).
Is Strongtalk going to be fully multithreaded?
>>> I suppose adding new instance variables on to the end of an object
>>> is the
>>> only case that is safe, after all.
>>
>> Right... luckily that's what we'd be doing
>
> As David pointed out, that's not always what we'll be doing.
A simple way around this would be to compile accessor methods for all
instance variables and have the Ruby compiler generate accessor calls
rather than direct variable accesses. Then when you add a variable,
you can put it anywhere you like; you just have to make sure to
update all the accessor methods. With inlining, it wouldn't even be
much of a performance hit.
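
A sketch of that accessor indirection in Ruby. Slots live in an array, all access goes through generated accessors, and inserting a variable at any position only requires regenerating the accessors. Shape and SlotObject are hypothetical names, and the sketch ignores threads entirely.

```ruby
# Hypothetical object model: an instance is just an array of slots.
class SlotObject
  attr_accessor :slots

  def initialize(size)
    @slots = Array.new(size)
  end
end

# A shape owns the slot order and the accessor methods that encode it.
class Shape
  def initialize(names)
    @names = names.dup
    @accessors = Module.new
    @instances = []
    regenerate_accessors
  end

  def new_instance
    obj = SlotObject.new(@names.size)
    obj.extend(@accessors)
    @instances << obj
    obj
  end

  # Insert a variable at any position, then fix up instances and accessors.
  def insert_variable(name, position)
    @names.insert(position, name)
    @instances.each { |obj| obj.slots.insert(position, nil) }
    regenerate_accessors
  end

  private

  # Only these generated methods know the slot offsets.
  def regenerate_accessors
    @names.each_with_index do |name, offset|
      @accessors.send(:define_method, name) { slots[offset] }
      @accessors.send(:define_method, "#{name}=") { |value| slots[offset] = value }
    end
  end
end

shape = Shape.new([:x, :y])
pt = shape.new_instance
pt.x = 1
pt.y = 2
shape.insert_variable(:w, 0)   # new variable goes first, not last
pt.w = 7
puts pt.x
puts pt.y
puts pt.slots.inspect
```

Since no other code knows an offset, the position of the new variable stops mattering, at the cost of one extra send per access unless the accessor is inlined.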
[snip a bunch of stuff, here and at the end]
> Imagine that you're inside of a loop that's has an inlined
> function. The
> unoptimized loop would have called "baz" but now its just a loop that
> intertwines "baz" code with whatever other code runs in that loop,
> maybe
> it's even unrolled the loop by 4 and interwined 4 instances of
> "baz". In
> that case you're basically screwed if you have to simulate changing
> "baz" on
> an index that isn't a multiple of 4.
Ok, I think I see where you're coming from. If you've got code that
calls a method inside a loop, and that method is modified, you've now
inlined the wrong implementation and you've got to deoptimize the
stack with the partially-executed loop intact so that subsequent
invocations of the changed method will work correctly.
Fair enough, but I still don't see the need for explicit support for
Ruby in the Strongtalk VM. My logic goes like this:
1. Ruby and Smalltalk both allow methods to be modified at arbitrary
times during execution.
2. The Strongtalk VM supports this for Smalltalk code today. (As far
as I know...)
3. Ruby methods and Smalltalk methods would be indistinguishable at
the bytecode level.
4. Therefore dynamically modified Ruby methods should work just as
well as Smalltalk methods.
My assumptions might be wrong, and there might be hidden gotchas, but
I do think this is pretty straightforward. My inclination would be to
implement the Ruby compiler and runtime support entirely in
Smalltalk, and focus on getting the semantics right, even at the
expense of speed. That would still produce a Ruby implementation
faster than the existing one, and it could then be tuned further if
need be.
Colin
>
> A simple way around this would be to compile accessor methods for all
> instance variables and have the Ruby compiler generate accessor calls
> rather than direct variable accesses. Then when you add a variable,
> you can put it anywhere you like; you just have to make sure to
> update all the accessor methods. With inlining, it wouldn't even be
> much of a performance hit.
>
> [snip a bunch of stuff, here and at the end]
>
Two problems:
1. Assuming that Strongtalk is neither more safe nor less safe on "become:"
than other Smalltalks, then conceptually you've just shortened the window in
which the program can screw up from whole functions to short
accessor functions. But in multitasking algorithms, even a window of a
couple of instructions that can crash the system is unacceptable. So, in
principle, if you call "become:" on an object when another thread is just
entering an accessor function, you'll still get the mis-access and the
crash. So it will still need some VM support to avoid that problem.
Now I'm not saying VM support is impossible, but it's not high-level
Smalltalk. But, assuming, as I said, that Strongtalk is neither more safe
nor less safe on "become:" than other Smalltalks, this would be good
enough for a demonstration, but not for a release.
Still, I'll believe that deoptimization works 100% without causing
crashes when I see it. Being able to deoptimize is VERY subtle and very
new, and I'd be surprised if it's bug-free. It may be that these edge
conditions I'm describing are not handled perfectly.
2. More importantly, we need VM support for a _mass_ become (or we need
another level of indirection, which is also a VM change), because if the
damn system has to walk the stack and look for routines to deoptimize on
every "become", then walking the objects of a specific class and calling
"become" on all of them could take hours. And if it does the "walk the
whole image and update pointers" on every become, it could take weeks.
Having a correct (or nearly correct) algorithm that can be implemented in
high level smalltalk isn't enough, it has to run acceptably quickly.
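The cost argument can be illustrated with a toy heap in Python. This is a sketch under stated assumptions: it models a one-way, forwarding-style become rather than the traditional two-way swap, the heap is just a list of objects, and `Obj`, `become_one`, and `become_many` are hypothetical names. It shows only why batching matters, not how any Smalltalk VM implements it.

```python
# Hypothetical sketch: a toy heap, not a model of any real Smalltalk VM.

class Obj:
    """A heap object whose fields hold references to other objects."""
    def __init__(self, fields=None):
        self.fields = list(fields or [])


def become_one(heap, old, new):
    """Repoint every reference to `old` at `new`; scans the whole heap."""
    for obj in heap:
        for i, ref in enumerate(obj.fields):
            if ref is old:
                obj.fields[i] = new
    return len(heap)  # number of objects visited


def become_many(heap, pairs):
    """Apply all (old, new) pairs in a single heap scan."""
    forwarding = {id(old): new for old, new in pairs}
    for obj in heap:
        for i, ref in enumerate(obj.fields):
            target = forwarding.get(id(ref))
            if target is not None:
                obj.fields[i] = target
    return len(heap)  # number of objects visited
```

With N instances to migrate and a heap of H objects, calling `become_one` per instance visits N times H objects, while `become_many` visits H. That gap is the difference between a per-object and a mass become described above.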
Josh Scholar
Of course it isn't Ruby until it has preemptive scheduling!
.........
That said, the second problem still holds. Unless Strongtalk has a
fast "become:" (or some sort of lazy, aggregating "become:" that I
think is impossible in an optimized system) then your suggestion would
work but be too slow on any test with more than one or two objects.
> I just realized that the first problem I mentioned isn't really a
> problem until Strongtalk has real multitasking which it doesn't have
> yet.
>
> Of course it isn't Ruby until it has preemptive scheduling!
Right. Strongtalk doesn't run Smalltalk processes concurrently on
multiple processor machines, so we don't have to worry about
optimized methods being preempted at the wrong moment. However, that
doesn't mean that Smalltalk processes are not preemptively scheduled!
The VM uses cooperative multitasking to ensure that processes are
preempted at safe points (e.g., between bytecodes), but from the point
of view of Smalltalk code, the scheduling is preemptive.
Ruby works the same way - Ruby threads do not map directly to OS
threads the way Java threads do, but they are preemptively scheduled
by the Ruby interpreter. Mapping Ruby processes to Smalltalk
processes ought to work fine.
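A small Python sketch of what "preempted at safe points" means: generators stand in for green threads, and each `yield` plays the role of a safepoint poll inserted between "bytecodes". The names `process` and `run` and the fixed `quantum` are assumptions for illustration only, and the sketch makes no claim about how the Strongtalk VM or the Ruby interpreter actually schedules.

```python
# Hypothetical sketch: generators as green threads, yields as safepoints.
from collections import deque


def process(name, steps, log):
    """A green thread: each `yield` is a safepoint between units of work."""
    for n in range(steps):
        log.append((name, n))  # one unit of work
        yield                  # safepoint: the scheduler may switch here


def run(processes, quantum):
    """Round-robin scheduler that preempts a process after `quantum` safepoints."""
    ready = deque(processes)
    while ready:
        current = ready.popleft()
        try:
            for _ in range(quantum):
                next(current)
        except StopIteration:
            continue           # process finished; do not requeue it
        ready.append(current)


log = []
run([process("a", 3, log), process("b", 3, log)], quantum=2)
# log is now [("a", 0), ("a", 1), ("b", 0), ("b", 1), ("a", 2), ("b", 2)]
```

The process bodies never ask to give up control; the switch happens wherever the scheduler decides, but only at a safepoint.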
Colin
As David just wrote, Smalltalk semantics traditionally did not include
preemptive scheduling.
I wish you were right that Strongtalk already has preemptive scheduling, but
I don't think you are. As he also wrote, things are likely to start
breaking when we DO turn preemptive scheduling on.
And consider that cooperative preemptive scheduling means sticking polling
code in every basic block.
Josh Scholar
Another point is that if there was green-thread preemptive scheduling, then
unless that was specifically designed to make accessor functions safely fix
that obscure object redefinition problem we've been talking about, there
would be no guarantee that it would be any safer than OS threading.
It IS obscure because Smalltalk isn't Ruby.
And as I said before, I believe that perfectly safe and transparent
deoptimization in the case of classes being redefined while in use rather
severely limits what sorts of optimizations are completely safe. Since that
sort of event would be extremely rare in a Smalltalk program, and since
Strongtalk hasn't been completely debugged anyway, I wouldn't be surprised
to find out that there are some unsafe optimizations going on.
Josh Scholar