Garbage collection problem

dontrango

Mar 1, 2004, 10:47:09 PM

Hi,

I have a garbage collection problem below. After line 15, why would the
object referenced by a be eligible for garbage collection whereas that
referenced by b is not?

Thanks for the help.

1 class TestA {
2 TestB b;
3 TestA ( ) { b = new TestB (this); }
4 }
5
6 class TestB {
7 TestA a;
8 TestB (TestA a) { this.a = a; }
9 }
10
11 class TestAll {
12 public static void main (String [ ] args) {
13 new TestAll.makeThings ( );
14 // ... code
15 }
16
17 void makeThings ( ) { testA test = new TestA ( ); }
18 }

John C. Bollinger

Mar 2, 2004, 9:25:43 AM

dontrango wrote:

> I have a garbage collection problem below. After line 15, why would the
> object referenced by a be eligible for garbage collection whereas that
> referenced by b is not?
>
> Thanks for the help.

If you want homework help then it is to your benefit to show at least
some evidence of trying to solve the problem yourself. Some reasoning
supporting at least a partial position on the question, something to
show that you have at least read the relevant part of your text or (even
better) the actual specification -- give us something to work with.
Neither you nor any of the rest of us is well served if you manage to
pass your class without actually knowing the material.

John Bollinger
jobo...@indiana.edu

dontrango

Mar 2, 2004, 10:45:23 AM

"John C. Bollinger" <jobo...@indiana.edu> wrote in message
news:c225j9$719$1...@hood.uits.indiana.edu...

> dontrango wrote:
>
> > I have a garbage collection problem below. After line 15, why would the
> > object referenced by a be eligible for garbage collection whereas that
> > referenced by b is not?
> >

1 class TestA {

2 TestB b;
3 TestA ( ) { b = new TestB (this); }
4 }
5
6 class TestB {
7 TestA a;
8 TestB (TestA a) { this.a = a; }
9 }
10
11 class TestAll {
12 public static void main (String [ ] args) {
13 new TestAll.makeThings ( );
14 // ... code
15 }
16
17 void makeThings ( ) { testA test = new TestA ( ); }
18 }

Here is my train of thought:

Line 13 creates an instance of the TestAll class and calls its makeThings
method, without keeping a reference to the instance.
Line 17 instantiates a TestA object, calling its constructor. In the
process, this calls the constructor of class TestB, which sets the
instance variable a of the TestB object to the TestA object.

This looks like an island of isolation, but not quite; they are instances of
different classes.

So I disagree with the statement 'After line 15, the object referenced by a
is eligible for garbage collection whereas that referenced by b is not',
since both should be eligible (the current thread has no access to
either object).

What is your opinion on that?

xarax

Mar 2, 2004, 11:05:05 AM

"dontrango" <dont...@dontrango.com> wrote in message
news:c2104d$5l9$1...@nobel2.pacific.net.sg...

> Hi,
>
> I have a garbage collection problem below. After line 15, why would the
> object referenced by a be eligible for garbage collection whereas that
> referenced by b is not?
>
> Thanks for the help.
>
> 1 class TestA {
> 2 TestB b;
> 3 TestA ( ) { b = new TestB (this); }
> 4 }
> 5
> 6 class TestB {
> 7 TestA a;
> 8 TestB (TestA a) { this.a = a; }
> 9 }
> 10
> 11 class TestAll {
> 12 public static void main (String [ ] args) {
> 13 new TestAll.makeThings ( );

You meant: new TestAll().makeThings();

> 14 // ... code
> 15 }
> 16
> 17 void makeThings ( ) { testA test = new TestA ( ); }

You meant: void makeThings() {TestA test = new TestA();}

> 18 }

1. If you must use line numbers, make them /*...*/ comments, so
that others can simply copy&paste your code.

2. Only post code that you know will compile. The above
code will not compile (even after removing the line numbers).

3. If your code has comments, be sure they are /*...*/ comments
instead of // comments, because sometimes the code wraps and
screws up the // comments.

4. The reasoning that "a" is eligible for garbage collection
and "b" is not: "b" is not eligible until all of its strong
references are nullified and "a" still has a strong reference
to "b". "b" is ineligible for GC until after "a" has been
reclaimed. "a" and "b" are not both simultaneously eligible,
but rather incrementally eligible. GC won't know that "b"
is eligible until sometime after it determines that "a"
is eligible. However, depending on the JVM implementation, GC
may defer reclaiming "a" until "b" is also eligible
for reclaim. Eligibility for reclaim and actual reclaiming
are two completely different phases of GC.
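
For reference, a version of the posted program with the two corrections noted earlier applied (the parentheses in `new TestAll()` and the capitalisation of `TestA`) might look like the sketch below. Line numbers are dropped so that it can be pasted and compiled directly; apart from that and the comments, nothing is changed from the original listing.

```java
class TestA {
    TestB b;
    TestA() { b = new TestB(this); }   // the new TestB keeps a reference back to this TestA
}

class TestB {
    TestA a;
    TestB(TestA a) { this.a = a; }
}

class TestAll {
    public static void main(String[] args) {
        new TestAll().makeThings();    // corrected: the constructor call needs its parentheses
        /* ... code */
    }

    void makeThings() { TestA test = new TestA(); }   // corrected: TestA, not testA
}
```

After construction, each TestA and its TestB hold strong references to each other, and the only outside reference to the pair is the local variable `test`.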

It may be easier to understand by examining the "reachability"
of the objects:

At line 13, the current thread can only indirectly reach "b"
through the strong reference held in the "test" local variable. When the
makeThings() method returns (line 15), its stackframe is popped
(and all local variables on that stackframe are nullified). Thus,
the thread loses its last strong reference to the single instance
of TestA (i.e., the "a" instance). At that time, GC can make a
determination that the TestA instance is eligible for reclaim.
The TestB instance (i.e., the "b" instance) is not yet eligible,
because GC hasn't actually reclaimed the TestA instance (and
nullified its reference fields). Only *after* the TestA instance
becomes eligible for reclaim will GC notice that its TestB instance
field was the last strong reference for the "b" object. The
determination of GC eligibility is an incremental process.

Note, however, that there are theoretical GC models that
can determine precisely when an instance is eligible for
reclaim without a reachability search (via reference tracking
algorithms). Such GC models still require an incremental
approach, but not a search of the heap.
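
To make the reference-tracking idea concrete, here is a toy reference-counting model. It is a sketch only: `Counted`, `retain`, `pointTo`, and `release` are hypothetical names, and no real JVM is claimed to work this way. It shows the cascading release for a simple chain, and also that a naive counter never reaches zero for two objects that point at each other, as the TestA and TestB instances do.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of an object managed by naive reference counting (hypothetical names).
final class Counted {
    final String name;
    int count;                 // number of incoming strong references
    boolean reclaimed;
    final List<Counted> fields = new ArrayList<>();

    Counted(String name) { this.name = name; }

    // Models a stack slot or other root starting to refer to this object.
    void retain() { count++; }

    // Models storing a reference to 'target' in one of this object's fields.
    void pointTo(Counted target) {
        fields.add(target);
        target.count++;
    }

    // Models one reference to this object going away.
    void release() {
        if (--count == 0) {
            reclaimed = true;
            for (Counted c : fields) {
                c.release();   // the cascade: reclaiming this object drops its outgoing references
            }
            fields.clear();
        }
    }
}

final class ToyRefCount {
    public static void main(String[] args) {
        // Chain: root -> leaf. Dropping the root reference reclaims both, one after the other.
        Counted root = new Counted("root");
        Counted leaf = new Counted("leaf");
        root.retain();
        root.pointTo(leaf);
        root.release();
        System.out.println(root.reclaimed + " " + leaf.reclaimed);   // true true

        // Cycle: a <-> b. Dropping the only outside reference reclaims neither.
        Counted a = new Counted("a");
        Counted b = new Counted("b");
        a.retain();
        a.pointTo(b);
        b.pointTo(a);
        a.release();
        System.out.println(a.reclaimed + " " + b.reclaimed);         // false false
    }
}
```

Collectors based on counting therefore need some extra mechanism for cycles; a collector that traces from roots does not have this problem.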

Hope this helps.

Chris Uppal

Mar 2, 2004, 12:47:56 PM

xarax wrote:

> Only *after* the TestA instance
> becomes eligible for reclaim will GC notice that its TestB instance
> field was the last strong reference for the "b" object. The
> determination of GC eligibility is an incremental process.

Are you using the expression "eligible for reclaim" in a technical sense,
defined as part of some Java spec somewhere ? I can't find anywhere that does
so, but could very easily have missed something.

If so then I can understand how the necessity for precision for describing the
lifetime of an object in the presence of finalisation could lead to terminology
that makes what you say exactly correct.

But if not, then I think it's wrong. Ignoring finalisation for a moment, in
what I would call "normal" terminology, an object becomes eligible for reclaim
once there is no longer any path leading from a root, such as a thread's stack
frame, to that object. (I'm also ignoring weak/soft/phantom references here).
Hence, at the moment where any one object becomes eligible, all other objects
that are only reachable via that object also become eligible -- by definition.

What may change is whether and how an actual GC algorithm can *detect* that
eligibility. Algorithms in the broad category including mark-and-sweep,
copying, etc will in one sense discover that both objects are unreachable at
the same time. In another sense they never discover that any object is
unreachable -- they are only interested in ones that are reachable, everything
else is just uninitialised RAM. GC algorithms which use some variant of
reference counting *do* have the incremental nature that you describe -- the GC
actively follows trains of unreachability: "aha this is unreachable. Good. So
that means *this* is unreachable too". And so on...
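
The tracing view can be sketched with a toy mark phase. This is an illustration under stated assumptions, not how any particular JVM is implemented: `Obj`, `ToyTracer`, and `reachableFrom` are hypothetical names, and the "heap" is just a small object graph. The collector only ever visits what is reachable from the roots; the mutually referencing pair is live together while a root points at it and unreachable together once the root is gone.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy heap object: a name plus its outgoing strong references (hypothetical model).
final class Obj {
    final String name;
    final List<Obj> refs = new ArrayList<>();
    Obj(String name) { this.name = name; }
}

final class ToyTracer {
    // Mark phase: returns every object reachable from the given roots.
    static Set<Obj> reachableFrom(Collection<Obj> roots) {
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj next = work.pop();
            if (marked.add(next)) {        // first visit: follow its references
                work.addAll(next.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        a.refs.add(b);                     // a.b = b
        b.refs.add(a);                     // b.a = a

        List<Obj> stack = new ArrayList<>();
        stack.add(a);                      // the local variable "test" while makeThings() runs
        System.out.println(reachableFrom(stack).size());   // 2

        stack.clear();                     // makeThings() has returned
        System.out.println(reachableFrom(stack).size());   // 0
    }
}
```

Anything not in the returned set is, in the terms used above, simply memory the tracer never looked at.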

When you factor finalisation into the picture then it gets more complicated,
and the terminology doesn't seem to be particularly well established. However,
one way to see it is that finalisation breaks the link between being
"unreachable from a root" and "eligible for reclaim". Another way of seeing it
(which matches the language of the JLS2 rather better) is that the system
automatically moves finalisable objects which are not otherwise reachable into
a state where they are only reachable by other objects in that state and by the
finalisation process (zero, one, or more threads). In either case (as I read
the rather opaque text) finalisation does introduce something rather like the
incremental process that you describe in that no object becomes eligible for
reclaim until it, and all chains of references to it, have been finalized
without making it reachable again.

I suspect, though, that you know all this perfectly well, and what we have here
is a difference in terminology rather than a different understanding of how
real GC algorithms work. Could you clarify please ?

-- chris

John C. Bollinger

Mar 2, 2004, 2:41:14 PM

dontrango wrote:

My opinion is that you have judged correctly, at least with regard to
the Java GC model. In Java an object is eligible for GC if it is not
reachable from a live thread via a chain of strong references. In the
example, every TestA instance is paired with a TestB instance such that
each holds a strong reference to the other. Therefore, if one is
strongly reachable then so is the other, and the two always have the
same eligibility for GC. With the precise code above, it would be
possible to break the relationship after construction of the TestA and
TestB instances by directly modifying their instance variables, but no
such thing is actually done. Both instances created during an
invocation of TestAll.makeThings() in fact become eligible for GC as
soon as makeThings() exits.
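
One way to probe this empirically is to hold only weak references to the pair and ask for a collection. The sketch below assumes hypothetical names (`PairA`, `PairB`, `GcProbe`) standing in for TestA, TestB, and TestAll. `System.gc()` is only a request, so the printed values are not guaranteed on any given run or VM; the point is that nothing in the program distinguishes the two objects.

```java
import java.lang.ref.WeakReference;

final class PairA {
    PairB b;
    PairA() { b = new PairB(this); }
}

final class PairB {
    final PairA a;
    PairB(PairA a) { this.a = a; }
}

final class GcProbe {
    // Builds the A/B pair and hands back only weak references to it,
    // so no strong reference survives the return of this method.
    static WeakReference<?>[] makeThings() {
        PairA test = new PairA();
        return new WeakReference<?>[] {
            new WeakReference<Object>(test),
            new WeakReference<Object>(test.b)
        };
    }

    public static void main(String[] args) {
        WeakReference<?>[] refs = makeThings();
        System.gc();   // a hint only; the VM may ignore it
        System.out.println("A cleared: " + (refs[0].get() == null));
        System.out.println("B cleared: " + (refs[1].get() == null));
    }
}
```

A weak reference does not keep its referent alive, which is why it can be used to watch the pair without changing its reachability.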

There are some subtleties involved in determining eligibility for GC in
Java, the most notable being "hidden" local variables. Hidden local
variables arise because Java does not actually have nested local variable
scopes at the VM level -- only at the Java source level. To the VM all
local variables are treated equally. Therefore, a local reference
variable that goes out of scope in the Java sense sticks around until
the method in which it is declared terminates, no matter how long that
may be. Any object it refers to remains strongly reachable until that time.

That said, there are other GC models where the answer might be a bit
different, perhaps including some in which the problem is not a trick
question. For Java, however, the bottom line answer is "there is no
such reason."

John Bollinger
jobo...@indiana.edu

Lee Fesperman

Mar 2, 2004, 4:51:43 PM

John C. Bollinger wrote:
>
> There are some subtleties involved in determining eligibility for GC in
> Java, the most notable being "hidden" local variables. Hidden local
> variables arise because Java does not actually have nested local variable
> scopes at the VM level -- only at the Java source level. To the VM all
> local variables are treated equally. Therefore, a local reference
> variable that goes out of scope in the Java sense sticks around until
> the method in which it is declared terminates, no matter how long that
> may be. Any object it refers to remains strongly reachable until that time.

That is not quite true. At the bytecode level, local variables are 'slots' in the stack
frame. Some compilers will reuse the slots allocated to nested local variables thus
removing the reachability of references in the reused slots.
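
A small illustration of the situation being described, as a sketch with hypothetical names (`SlotReuse`, `demo`). Whether the second local actually lands in the slot the first one used depends on the compiler, and what a JIT does afterwards is a separate matter, so the comments describe what may happen rather than what must.

```java
final class SlotReuse {
    static int demo() {
        {
            byte[] big = new byte[1 << 20];   // block-scoped local; occupies some slot in the frame
            big[0] = 1;
        }
        // 'big' is out of scope at the source level here. At the bytecode level its
        // slot may still hold the reference until something overwrites that slot.

        int next = 42;   // a compiler may assign 'next' to the slot 'big' used, overwriting the reference
        return next;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Running `javap -c` on the compiled class shows which slot numbers the compiler actually chose.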

This only applies to direct interpretation of bytecodes. Runtime compilers are free to
apply additional optimizations, including reuse of 'unnested' local variables. I have
seen JVMs that perform even more esoteric optimizations.

> That said, there are other GC models where the answer might be a bit
> different, perhaps including some in which the problem is not a trick
> question. For Java, however, the bottom line answer is "there is no
> such reason."

For Java, it is best to assume that (as you said) a local variable remains reachable
until the method terminates. However, it is not guaranteed.

--
Lee Fesperman, FirstSQL, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)

John C. Bollinger

Mar 3, 2004, 1:15:16 PM

Lee Fesperman wrote:

[...]

> At the bytecode level, local variables are 'slots' in the stack
> frame. Some compilers will reuse the slots allocated to nested local variables thus
> removing the reachability of references in the reused slots.
>
> This only applies to direct interpretation of bytecodes. Runtime compilers are free to
> apply additional optimizations, including reuse of 'unnested' local variables. I have
> seen JVMs that perform even more esoteric optimizations.

[...]

> For Java, it is best to assume that (as you said) a local variable remains reachable
> until the method terminates. However, it is not guaranteed.

Yes, I suppose I was a bit presumptive. I should have said that an
object referred to by a local variable of some method _may_ remain
reachable via that variable until the completion [abrupt or normal] of
that method's execution, regardless of whether the variable goes out of
scope in the Java source sense. The specs do not require that behavior,
and reasonable compilers might indeed emit bytecode that does not
exhibit it. The specs also do not forbid the behavior, and some
compiler / VM combinations certainly do exhibit it in at least some cases.

It is a separate question whether a VM might reuse a local variable slot
that would otherwise go unused for the remainder of the execution of
some method. Doing so requires some degree of program flow analysis in
order to determine in the first place that the slot is available. I'm
having trouble imagining a scenario where a compliant VM could
reasonably elect to do that outside the scope of JIT compilation, but
once you JIT a piece of bytecode a wide variety of optimizations are
possible.

John Bollinger
jobo...@indiana.edu

Dale King

Mar 3, 2004, 11:20:14 PM

"John C. Bollinger" <jobo...@indiana.edu> wrote in message

news:c257do$9n9$1...@hood.uits.indiana.edu...

Chris Smith and I had a long discussion on this a while back and could not
come to an agreement. Consider the following case:

public void method()
{
{ Object o = new FooBar(); }
someMethodThatExecutesALongTime();
}

I think we would agree that it would not be in error for the object to be
eligible for garbage collection during the call to the other method, since
it is no longer accessible and the variable itself has gone out of scope.

What if we remove that scope:

public void method()
{
Object o = new FooBar();
someMethodThatExecutesALongTime();
}

The question is whether the VM is allowed to make the object created
eligible for garbage collection before the called method returns. It is no
longer accessed in this method but it is still in scope.

I say that the VM should not be allowed to do this because while the
variable will not be accessed again, technically it is still visible after
the method returns and would be accessible even though it isn't actually
accessed.

But the only way to actually see an effect from this is if the finalizer for
the object had a side effect. This type of pattern is of course used all the
time in C++, but makes less sense in Java, but I'm not sure we can just
throw it out without losing program correctness.

Unfortunately, the spec is not explicitly clear on this point.

The problem is that these two methods generate the exact same byte code.

If you agree with me that in the second case it would be wrong for the
object to be garbage collected before the called method returns, then you
have to conclude that the VM is quite limited in what it can do to optimize
garbage collection and it doesn't matter what program flow analysis that you
do. The VM can only tell if the variable *will* be accessed again, but has
no way to know when it ceases to be "accessible".
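
For readers on later Java versions: Java 9 added `java.lang.ref.Reference.reachabilityFence(Object)`, which keeps its argument strongly reachable at least until the call is made. It did not exist when this thread was written. The sketch below uses the `FooBar` and `someMethodThatExecutesALongTime` names from the example above with placeholder bodies (the `calls` counter is an assumption added only so the method has an observable effect), and shows how code that depends on the object outliving the long call could say so explicitly.

```java
import java.lang.ref.Reference;

final class FenceDemo {
    // Placeholder for the FooBar class in the example; its contents are an assumption.
    static final class FooBar { }

    static int calls;

    // Placeholder body standing in for the long-running call.
    static void someMethodThatExecutesALongTime() { calls++; }

    static void method() {
        Object o = new FooBar();
        try {
            someMethodThatExecutesALongTime();
        } finally {
            // Keeps 'o' strongly reachable until this point, whatever
            // liveness analysis the VM performs on the local variable.
            Reference.reachabilityFence(o);
        }
    }

    public static void main(String[] args) {
        method();
        System.out.println(calls);   // 1
    }
}
```

Without the fence, code like this has to assume that `o` may stop being reachable as soon as it is no longer used.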

--
Dale King

Lee Fesperman

Mar 4, 2004, 8:04:05 PM

Dale King wrote:
>
> "John C. Bollinger" <jobo...@indiana.edu> wrote in message
> news:c257do$9n9$1...@hood.uits.indiana.edu..

> > Lee Fesperman wrote:
> >
> > > For Java, it is best to assume that (as you said) a local variable
> > > remains reachable until the method terminates. However, it is not
> > > guaranteed.
> >

> If you agree with me that in the second case it would be wrong for the
> object to be garbage collected before the called method returns, then you
> have to conclude that the VM is quite limited in what it can do to optimize
> garbage collection and it doesn't matter what program flow analysis that you
> do. The VM can only tell if the variable *will* be accessed again, but has
> no way to know when it ceases to be "accessible".

I'll vote for better optimization. As you say, your approach would limit VM
optimization. For instance, the runtime compiler could use a register instead of a
variable. On some machines, "accessibility" could force a register save/restore.

I fully expect VMs to do this optimization and much, much more. It would be best to heed
my caveat: "it [accessibility/reachability] is not guaranteed."

Stephen Kellett

Mar 4, 2004, 8:10:28 PM

In message <4047...@news.tce.com>, Dale King <kingd@[at].invalid>
writes

>public void method()
>{
> { Object o = new FooBar(); }
> someMethodThatExecutesALongTime();
>}
>

>eligible for garbage collection before the called method returns. It is no
>longer accessed in this method but it is still in scope.

It is *not in scope*. Read the VM spec for local variable table
definitions. This defines the locations within the method for which a
variable is valid.

StartPC and Length - ie, the variable may not be valid until XX bytes
into the method and then is only valid for YY bytes. Params are of
course valid from 0 bytes into the method and for Length bytes from
there.

Given your method, object 'o' would inhabit slot one (slot zero holds
'this' in an instance method) and would be valid from the point where it
is assigned to the offset at the start of the line with the call to
someMethod...

As such, the variable is valid for reclamation, however most GC's will
take the simple and easy approach and wait for the method to end before
thinking about such things.
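
As a rough model of those two fields: an entry in a class file's LocalVariableTable attribute says that the variable has a value for code offsets in the half-open range from start_pc up to, but not including, start_pc + length. The class below is a hypothetical stand-in for such an entry (`LocalVarEntry` and `covers` are made-up names), and the offsets used in `main` are invented for illustration rather than taken from real compiler output. The attribute is optional debugging information, so it is not necessarily present in a given class file.

```java
// Hypothetical model of one LocalVariableTable entry.
final class LocalVarEntry {
    final String name;
    final int slot;      // index of the variable in the frame's local variable array
    final int startPc;   // first code offset at which the variable has a value
    final int length;    // number of code bytes for which it stays valid

    LocalVarEntry(String name, int slot, int startPc, int length) {
        this.name = name;
        this.slot = slot;
        this.startPc = startPc;
        this.length = length;
    }

    // True if the variable is valid at the given code offset:
    // startPc inclusive, startPc + length exclusive.
    boolean covers(int pc) {
        return pc >= startPc && pc < startPc + length;
    }

    public static void main(String[] args) {
        LocalVarEntry o = new LocalVarEntry("o", 1, 8, 2);   // illustrative numbers only
        System.out.println(o.covers(7));    // false
        System.out.println(o.covers(8));    // true
        System.out.println(o.covers(9));    // true
        System.out.println(o.covers(10));   // false
    }
}
```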

Stephen
--
Stephen Kellett
Object Media Limited http://www.objmedia.demon.co.uk
RSI Information: http://www.objmedia.demon.co.uk/rsi.html

Stephen Kellett

Mar 6, 2004, 12:27:45 PM

In message <gbsvcwDE...@objmedia.demon.co.uk>, Stephen Kellett
<sn...@objmedia.demon.co.uk> writes

>In message <4047...@news.tce.com>, Dale King <kingd@[at].invalid>
>writes
>>public void method()
>>{
>> { Object o = new FooBar(); }
>> someMethodThatExecutesALongTime();
>>}
>>
>>eligible for garbage collection before the called method returns. It is no
>>longer accessed in this method but it is still in scope.
>
>It is *not in scope*. Read the VM spec for local variable table
>definitions. This defines the locations within the method for which a
>variable is valid.

Following myself up. I feel a qualification of my statement is required.

The local variable table is optional - it does not have to be present
for the java class to load and execute. Its job is to help debuggers and
so forth. I've come to this conclusion after writing a prototype java
tracer that dumped the params and locals to stdout. Many of the Sun
supplied classes don't have a local variable table - even though they
clearly do have local variables and method parameters.

To recast my original statement:
If you have a local variable table for the method and the JVM chooses to
use that information the JVM can determine the variable is out of scope.
The JVM will most likely not use the local variable table information as
it is simpler to simply wait until the end of the method.

For the occasions when the local variable table is absent, you have to
assume the variable is in scope even though you know better. Hence the
JVM won't attempt garbage collection of such local variables until the
method terminates.

Cheers

Chris Uppal

Mar 8, 2004, 4:51:50 AM

[I posted a different version of this a couple of days back, but it seems to
have vanished into thin aether.]

Dale King wrote:

> public void method()
> {
> Object o = new FooBar();
> someMethodThatExecutesALongTime();
> }
>
> The question is whether the VM is allowed to make the object created
> eligible for garbage collection before the called method returns. It is no
> longer accessed in this method but it is still in scope.

My apologies if this is a point that you covered in your thread with Chris
Smith, but it seems to me that the JLS2 is pretty clear on this point. From
12.6.1 "Implementing Finalization":

========
A reachable object is any object that can be accessed in any potential
continuing computation from any live thread. Optimizing transformations of a
program can be designed that reduce the number of objects that are reachable
to be less than those which would naively be considered reachable. For example,
a compiler or code generator may choose to set a variable or parameter that
will no longer be used to null to cause the storage for such an object to be
potentially reclaimable sooner.
========

Granted that that may not be a normative specification of what optimisations
may be performed, but it does seem to be sufficiently explicit about what such a
normative spec would say if there were any such spec. E.g. I'd expect it to be
legal for the compiler (let alone the JVM) to rewrite the above quoted snippet
to:

public void method()
{
Object o = new FooBar();

o = null;
someMethodThatExecutesALongTime();
}

-- chris

Dale King

Mar 8, 2004, 12:24:20 PM

"Stephen Kellett" <sn...@objmedia.demon.co.uk> wrote in message
news:gbsvcwDE...@objmedia.demon.co.uk...

> In message <4047...@news.tce.com>, Dale King <kingd@[at].invalid>
> writes
> >public void method()
> >{
> > { Object o = new FooBar(); }
> > someMethodThatExecutesALongTime();
> >}
> >
> >eligible for garbage collection before the called method returns. It is no
> >longer accessed in this method but it is still in scope.

> It is *not in scope*. Read the VM spec for local variable table
> definitions. This defines the locations within the method for which a
> variable is valid.

You misquoted me above. The sentence "It is no longer accessed in this
method but it is still in scope." did not apply to the example you quoted.
It applied to the example with the braces removed. The rest of your logic is
based on that mischaracterization of what I said.

You are correct that the local variable table will tell you scope
information for a variable, but that is optional and therefore cannot be
relied upon.

--
Dale King

Dale King

Mar 8, 2004, 12:35:43 PM

"Chris Uppal" <chris...@metagnostic.REMOVE-THIS.org> wrote in message
news:396dnfmITpF...@nildram.net...

> [I posted a different version of this a couple of days back, but it seems to
> have vanished into thin aether.]
>
> Dale King wrote:
>
> > public void method()
> > {
> > Object o = new FooBar();
> > someMethodThatExecutesALongTime();
> > }
> >
> > The question is whether the VM is allowed to make the object created
> > eligible for garbage collection before the called method returns. It is no
> > longer accessed in this method but it is still in scope.
>
> My apologies if this is a point that you covered in your thread with Chris
> Smith, but it seems to me that the JLS2 is pretty clear on this point. From
> 12.6.1 "Implementing Finalization":

Yes, it was discussed and is not as clear as you seem to think.

> ========
> A reachable object is any object that can be accessed in any potential
> continuing computation from any live thread.

Here is one of the unclear parts. What does *can be* accessed mean? Does
*can* be accessed require that it *will* be accessed? Or is *could have*
been sufficient?

> Optimizing transformations of a program
> can be designed that reduce the number of objects that are reachable to be less
> than those which would naively be considered reachable. For example, a compiler
> or code generator may choose to set a variable or parameter that will no longer
> be used to null to cause the storage for such an object to be potentially
> reclaimable sooner.
> ========
>
> Granted that that may not be a normative specification of what optimisations
> may be performed, but does seem to be sufficiently explicit about what such a
> normative spec would say if there were any such spec. E.g. I'd expect it to be
> legal for the compiler (let alone the JVM) to rewrite the above quoted snippet
> to:

I agree with the spec that "a compiler or code generator" may do that. I
don't agree that the JVM should be allowed to do that, because it does not
have sufficient information.

FYI, to save re-covering the same ground, here is that previous discussion:

http://groups.google.com/groups?threadm=5b5776f8.0302041235.242e4f91%40posting.google.com

Dale King

Mar 8, 2004, 1:17:58 PM

"Lee Fesperman" <firs...@ix.netcom.com> wrote in message
news:4047D1...@ix.netcom.com...

> Dale King wrote:
> >
> > "John C. Bollinger" <jobo...@indiana.edu> wrote in message
> > news:c257do$9n9$1...@hood.uits.indiana.edu..
> > > Lee Fesperman wrote:
> > >
> > > > For Java, it is best to assume that (as you said) a local variable
> > > > remains reachable until the method terminates. However, it is not
> > > > guaranteed.
> > >
> > > It is a separate question whether a VM might reuse a local variable slot
> > > that would otherwise go unused for the remainder of the execution of
> > > some method. Doing so requires some degree of program flow analysis in
> > > order to determine in the first place that the slot is available. I'm
> > > having trouble imagining a scenario where a compliant VM could
> > > reasonably elect to do that outside the scope of JIT compilation, but
> > > once you JIT a piece of bytecode a wide variety of optimizations are
> > > possible.
> >
> > Chris Smith and I had a long discussion on this a while back and could not
> > come to an agreement. Consider the following case:
>

> I'll vote for better optimization. As you say, your approach would limit VM
> optimization. For instance, the runtime compiler could use a register instead of a
> variable. On some machines, "accessibility" could force a register save/restore.

And I'll vote for correctness of program execution. C++ has the RAII
paradigm where they create an object and they rely on the fact the object
will be destructed when the variable "holding" it goes out of scope. There
are several reasons why RAII does not work well in Java but the primary
reason is that the object may not be finalized until long after the variable
goes out of scope. RAII is not very useful in that case.

But you are talking about allowing the opposite extreme where the object may
be finalized long before (could be hours or days) the variable goes out of
scope. That is just plain wrong.

> I fully expect VMs to do this optimization and much, much more. It would be best to heed
> my caveat: "it [accessibility/reachability] is not guaranteed."

Your opinion has little weight. It should be explicitly specified.

--
Dale King

Stephen Kellett

Mar 8, 2004, 1:55:17 PM

In message <404c...@news.tce.com>, Dale King <kingd@[at].invalid>
writes

>You misquoted me above. The sentence "It is no longer accessed in this
>method but it is still in scope." did not apply to the example you quoted.
>It applied to the example with the braces removed.

My apologies. I only saw the article I replied to. A different posting I
made has corrected my comments in any case.

John C. Bollinger

Mar 8, 2004, 6:09:46 PM

Dale King wrote:

I agree that the JLS is a bit vague on this point, but I think you are
taking an extreme position. To be sure, a VM that complied with your
assertion of the required behavior certainly would be compliant on this
point, but I don't interpret the spec as restrictively. In particular,
I think it is not required and not useful to interpret "can be accessed
in any potential continuing computation from any live thread" in any
other context than that of the classes currently loaded by the VM and
any that might be loaded in the course of any of the computations described.

For instance, in the example quoted above, the Java compiler or VM could
determine that there is no read of method()'s local variable "o" in the
byte code currently loaded into the VM. How could that not mean that o
cannot be accessed? I simply don't accept that the compiler or VM is
required to take into account the continuum of possible alternative
implementations of method(). The whole point of optimization is to
shortcut a program in a way that has no effect on the results except for
resource consumption and/or running time. I see absolutely no point to
interpreting the spec as you describe. I guess all this puts me in the
same camp as Chris Smith on the matter.

From where I stand "will be" is equivalent to "can be" plus "cannot be
otherwise". It is thus a stronger and inequivalent condition. It may
be impossible or impractical to evaluate a "will be" condition, so the
spec relies on the weaker "can be". The implementation is not in fact
required to make the determination precisely, so long as it doesn't
erroneously mark anything unreachable, because, as you know, it is not
required to GC those objects that are unreachable on any particular
schedule.

>>Optimizing transformations of a program
>>can be designed that reduce the number of objects that are reachable to be less
>>than those which would naively be considered reachable. For example, a compiler
>>or code generator may choose to set a variable or parameter that will no longer
>>be used to null to cause the storage for such an object to be potentially
>>reclaimable sooner.
>>========
>>
>>Granted that that may not be a normative specification of what optimisations
>>may be performed, but does seem to be sufficiently explicit about what such a
>>normative spec would say if there were any such spec. E.g. I'd expect it to be
>>legal for the compiler (let alone the JVM) to rewrite the above quoted snippet
>>to:
>
>
> I agree with the spec that "a compiler or code generator" may do that. I
> don't agree that the JVM should be allowed to do that, because it does not
> have sufficient information.

Hold your horses, there. What information is the VM missing? If it had
that information (or an equivalent) would it be permitted to perform the
transformation, via JIT or otherwise?

John Bollinger
jobo...@indiana.edu

Lee Fesperman

Mar 9, 2004, 3:33:56 AM

Dale King wrote:
>
> "Lee Fesperman" <firs...@ix.netcom.com> wrote in message
> news:4047D1...@ix.netcom.com..
> >

> > I'll vote for better optimization. As you say, your approach would
> > limit VM optimization. For instance, the runtime compiler could
> > use a register instead of a variable. On some machines,
> > "accessibility" could force a register save/restore.
>
> And I'll vote for correctness of program execution. C++ has the RAII
> paradigm where they create an object and they rely on the fact the object
> will be destructed when the variable "holding" it goes out of scope. There
> are several reasons why RAII does not work well in Java but the primary
> reason is that the object may not be finalized until long after the variable
> goes out of scope. RAII is not very useful in that case.

C++ has no relevance here since it doesn't have GC. However, a VM could use RAII
(including finalization) as an optimization. A Java programmer just can't take advantage
of it.

> But you are talking about allowing the opposite extreme where the object may
> be finalized long before (could be hours or days) the variable goes out of
> scope. That is just plain wrong.

Exactly how is it wrong (other than in your opinion)? I reread the exchange from last
year, again. You may remember that I contributed to the thread. You failed to find any
authority for your position, and you failed to convince any posters to the thread.

You also didn't deal with the finalization aspect. There is some consensus that
finalizers should be used carefully and rarely. There are those who recommend not using
them at all. I won't go that far; I have implemented finalizers for very solid reasons.
I tend to think that finalizers that are vulnerable in this situation are poorly
implemented. They really shouldn't have side effects except on external resources under
their sole control.

> > I fully expect VMs to do this optimization and much, much more. It
> > would be best to heed my caveat: "it [accessibility/reachability]
> > is not guaranteed."
>
> Your opinion has little weight. It should be explicitly specified.

Your first comment is uncalled-for, though it is typical for you to use bluster and
insults in technical discussions. I'd be glad to stack my knowledge and experience
against yours any day.

I develop very complex systems software in Java. It's of great importance to me that JVM
optimization go as far as it can. I don't want it hamstrung by C++ concepts or poor Java
programming practices.

Chris Uppal

Mar 9, 2004, 8:39:24 AM

Dale King wrote:

> FYI, to save recovering the same ground here is that previous discussion:
>
> http://groups.google.com/groups?threadm=5b5776f8.0302041235.242e4f91%40posting.google.com

Thanks. I've read it now...

But I still think the matter is clear ;-)

> > A reachable object is any object that can be accessed in any potential
> > continuing computation from any live thread.
>
> Here is one of the unclear parts. What does *can be* accessed mean? Does
> *can* be accessed require that it *will* be accessed? Or is *could have*
> been sufficient?

I'm approaching it from a different direction, starting from the next part of
the quote:

> > For example, a compiler or code generator may choose to set a variable or
> > parameter that will no longer be used to null to cause the storage for such
> > an object to be potentially reclaimable sooner.

I cannot see any way that this could be true if it were not (at least) legal
for the compiler to rewrite:

{
    Object o = new FooBar();
    someMethodThatExecutesALongTime();
}

to:

{
    Object o = new FooBar();
    o = null;
    someMethodThatExecutesALongTime();
}

That's to say, I cannot think of any simpler (i.e. easier to prove) transform
that the quote could be referring to. Note, I am not considering
"reachability" (or similar) at all, I'm only considering the legality of
transformations of the use of local variables (of course, that *affects*
reachability). Note that the transform is only using the same (putative)
"licence" as allows the compiler to use just one slot for:

{
    Object o = new FooBar();
    someMethod();
    Object p = new BarFoo();
    someMethod();
}

I assert that it is legal for the compiler to reuse the slot on the basis that
for it to be illegal there would have to be a statement to that effect
somewhere, and there ain't one.

> I agree with the spec that "a compiler or code generator" may do that. I
> don't agree that the JVM should be allowed to do that, because it does not
> have sufficient information.

The way I see it is the other way around. From the earlier point, and from the
fact that (as you noted in the original thread) the bytecodes that reach the
JVM are the same in each instance (before and after the transform), I conclude
it must be legal for the JVM to perform the equivalent of the same
optimisations. I.e. if the compiler is entitled to emit either sequence of
bytecodes, then the JVM must be entitled to consider both sequences as
equivalent and produce identical runtime behaviour for them.

OK, the "must"s in that paragraph are too strong -- it isn't a logical
entailment -- but I doubt if you could convince any JVM implementer that s/he
didn't have the (almost;-) implied freedom. And that really is the
bottom-line: I don't think that, pragmatically, it is safe to assume that JVM
implementers haven't interpreted the spec (such as it is) in the same way as
I/Chris Smith/et al.

-- chris

Dale King

Mar 10, 2004, 6:04:40 PM

"John C. Bollinger" <jobo...@indiana.edu> wrote in message

news:c2iuhv$1f5$1...@hood.uits.indiana.edu...

I am not asking it to look at all possible alternative implementations, but
that it should respect my code where I declared the actual scope of the
variable, which says when the variable is accessible. You are saying that
the JVM should be able to circumvent what is declared in the program. If I
declare a scope for the variable I may be depending on the object remaining
alive. It just seems to me that you are being overly optimistic and violating
the correctness of the program. In the end it probably doesn't matter,
because the only way being overly optimistic can matter is if you have side
effects from a finalizer, and even if you did, the garbage collector
usually delays finalization.

> > I agree with the spec that "a compiler or code generator" may do that. I
> > don't agree that the JVM should be allowed to do that, because it does not
> > have sufficient information.
>
> Hold your horses, there. What information is the VM missing? If it had
> that information (or an equivalent) would it be permitted to perform the
> transformation, via JIT or otherwise?

Sure, if the class file actually has the optional local variable table, then
the VM is free to reclaim objects referenced from variables that have gone out
of scope.

Dale King

Mar 10, 2004, 6:40:46 PM

"Lee Fesperman" <firs...@ix.netcom.com> wrote in message

news:404D80...@ix.netcom.com...

> Dale King wrote:
> >
> > "Lee Fesperman" <firs...@ix.netcom.com> wrote in message
> > news:4047D1...@ix.netcom.com..
> > >
> > > I'll vote for better optimization. As you say, your approach would
> > > limit VM optimization. For instance, the runtime compiler could
> > > use a register instead of a variable. On some machines,
> > > "accessibility" could force a register save/restore.
> >
> > And I'll vote for correctness of program execution. C++ has the RAII
> > paradigm where they create an object and they rely on the fact the object
> > will be destructed when the variable "holding" it goes out of scope. There
> > are several reasons why RAII does not work well in Java but the primary
> > reason is that the object may not be finalized until long after the variable
> > goes out of scope. RAII is not very useful in that case.
>
> C++ has no relevance here since it doesn't have GC.

I wasn't saying that C++ was relevant, but that the RAII paradigm is one
that depends upon objects not being destructed until the variable goes out
of scope. And in this sense C++ does have a GC. The stack variables are
collected automatically when they go out of scope.

C# has a mechanism for RAII with a garbage collector that eliminates the
need for most finally clauses. It uses an IDisposable interface and a using
statement. See Jon Skeet's RFE to add a similar feature to Java:

http://groups.google.com/groups?selm=MPG.1929810f411094bf98c14e%40dnews.peramon.com

> However, a VM could use RAII (including finalization) as an optimization.
> A Java programmer just can't take advantage of it.

Funny, I don't see that in the spec. I know a Java programmer should not use
RAII because the finalization is not immediate, but this optimization says
that it is fine for the VM to do it prematurely. As long as there is a
finalize method which can have side effects then it seems to me that doing
it prematurely can break code.

> > But you are talking about allowing the opposite extreme where the object may
> > be finalized long before (could be hours or days) the variable goes out of
> > scope. That is just plain wrong.
>
> Exactly how is it wrong (other than in your opinion)? I reread the exchange
> from last year, again. You may remember that I contributed to the thread.

Let's say I had code like this:

new FooBar();
callSomeMethod();

Would it be OK for the VM to simply omit the creation of the FooBar instance
or perhaps reorder it to do it after the method call? Of course not. The
reason is that the constructor can have side effects that affect the state
of objects outside of itself. So why does the same logic not apply to the
finalize method? It can have side effects that affect the state of objects
outside of itself. Why should the VM be allowed to do it for finalize, but
not for constructors? I personally do not see why the same rules do not
apply.
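
A minimal sketch of the constructor half of that argument. FooBar and callSomeMethod are the names from the snippet above; the shared log and the SideEffectOrder wrapper class are invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SideEffectOrder {
    // Shared state touched by both the constructor and the method (hypothetical).
    static final List<String> LOG = new ArrayList<>();

    static class FooBar {
        FooBar() {
            LOG.add("FooBar constructed"); // side effect visible outside the object
        }
    }

    static void callSomeMethod() {
        LOG.add("callSomeMethod ran");
    }

    static List<String> run() {
        LOG.clear();
        new FooBar();      // the instance is discarded, the side effect is not
        callSomeMethod();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [FooBar constructed, callSomeMethod ran]
    }
}
```

Within a single thread the constructor's effect has to show up before the method's, so the VM cannot drop or move the "new" unless it can prove nobody could tell. Whether a finalizer is owed the same treatment is exactly the open question here.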

> You failed to find any authority for your position, and you failed to
> convince any posters to the thread.

I wasn't looking for authority just reasoning.

> You also didn't deal with the finalization aspect. There is some consensus
> that finalizers should be used carefully and rarely. There are those who
> recommend not using them at all. I won't go that far; I have implemented
> finalizers for very solid reasons. I tend to think that finalizers that are
> vulnerable in this situation are poorly implemented. They really shouldn't
> have side effects except on external resources under their sole control.

But that is not enforceable. Finalizers can have side effects. I agree it is
not recommended practice. But I don't go as far as to say that the JVM is
free to run them prematurely.

> > > I fully expect VMs to do this optimization and much, much more. It
> > > would be best to heed my caveat: "it [accessibility/reachability]
> > > is not guaranteed."
> >
> > Your opinion has little weight. It should be explicitly specified.
>
> Your first comment is uncalled-for, though it is typical for you to use
> bluster and insults in technical discussions. I'd be glad to stack my
> knowledge and experience against yours any day.

Sorry, that was not meant as an insult at all! Very poor wording on my part.
I didn't mean that your opinion had little weight because it was *your*
opinion, but because it was just an *opinion*. My opinion doesn't have any
weight either.

My point was that our opinions don't count; the only thing that has any
weight in these sorts of matters is the specification. And the spec. needs to
be clarified on this point.

And it is certainly *not* typical for me to use bluster and insults in
technical discussions and I was not doing so here.

> I develop very complex systems software in Java. It's of great importance to
> me that JVM optimization go as far as it can. I don't want it hamstrung by C++
> concepts or poor Java programming practices.

And I would have no trouble with that if the spec. explicitly said that it
could make such an optimization and that the behavior is not guaranteed.
Until that time, I think a JVM should be conservative on that point and
guarantee it and that developers should also be conservative and follow your
advice and assume that it is not guaranteed.
--
Dale King

Timo Kinnunen

Mar 10, 2004, 9:04:03 PM

"Dale King" <kingd[at]tmicha[dot]net> wrote:

> C# has a mechanism for RAII with a garbage collector that
> eliminates the need for most finally clauses. It uses an
> IDisposable interface and a using statement.

I wouldn't call syntactic sugar for a try-finally statement a garbage
collector. The only link between a using-statement and the real GC is
that the contract for Dispose() in IDisposable says that the object
should be registered with the GC as disposed so the GC doesn't need
to call its finalizer.
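
For comparison, a sketch of the same idea on the Java side. Java releases that came out after this thread (7 and later) added try-with-resources, which, much like C#'s using, is roughly shorthand for a try-finally that calls close(). The Resource class and the event log below are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class DeterministicCleanup {
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical resource; close() plays the role of C#'s Dispose().
    static class Resource implements AutoCloseable {
        private final String name;

        Resource(String name) {
            this.name = name;
            EVENTS.add("open " + name);
        }

        @Override
        public void close() {
            EVENTS.add("close " + name);
        }
    }

    static List<String> run() {
        EVENTS.clear();

        // Hand-written form: cleanup is tied to leaving the block, not to the GC.
        Resource a = new Resource("a");
        try {
            EVENTS.add("use " + a.name);
        } finally {
            a.close();
        }

        // Shorthand form (Java 7+): close() is called when the block exits.
        try (Resource b = new Resource("b")) {
            EVENTS.add("use " + b.name);
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In both forms the cleanup point is fixed by control flow, so nothing here depends on when, or whether, the collector decides the object is unreachable.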

> Funny, I don't see that in the spec. I know a Java programmer
> should not use RAII because the finalization is not immediate, but
> this optimization says that it is fine for the VM to do it
> prematurely. As long as there is a finalize method which can have
> side effects then it seems to me that doing it prematurely can
> break code.

"Prematurely", "prematurely". Quite emotional words for technical
discussion.

--
No address munging in use. I like the smell of nuked accounts in the
morning.

Chris Smith

Mar 10, 2004, 10:29:45 PM

"Dale King" <kingd[at]tmicha[dot]net> wrote:

> > A reachable object is any object that can be accessed in any potential
> > continuing computation from any live thread.
>
> Here is one of the unclear parts. What does *can be* accessed mean? Does
> *can* be accessed require that it *will* be accessed? Or is *could have*
> been sufficient?

Dale,

I'm jumping in a bit late just to clarify something and avoid confusion.
I'm reading this thread with interest.

There are, as far as I can see, two reasonable choices for interpreting
the JLS's "can be accessed" in the phrase above. Certainly, the phrase
"could have been accessed" is one of them. However, the phrase "will be
accessed" is not one of them. There are any number of situations in
which the JRE can't reasonably determine, or it's just plain
indeterminate, whether an object will be accessed; for example, when
there are still references in scope but future behavior depends on user
input. Getting the JRE to determine "will be accessed" is clearly
beyond the scope of any specification.

The other reasonable interpretation (besides your own) is simply to
interpret "can be accessed" as referring to the set of possible
behaviors of the program right now, versus what it could have been
written to do in the past. I can't think of a replacement for the "can
be accessed" phrase that more clearly defines this. I think "can be
accessed" says it very clearly already -- but that, of course, is the
whole point of the disagreement.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Chris Smith

Mar 10, 2004, 10:41:35 PM

"Dale King" <kingd[at]tmicha[dot]net> wrote:

> Let's say I had code like this:
>
> new FooBar();
> callSomeMethod();
>
> Would it be OK for the VM to simply omit the creation of the FooBar instance
> or perhaps reorder it to do it after the method call? Of course not. The
> reason is that the constructor can have side effects that affect the state
> of objects outside of itself. So why does not the same logic apply to the
> finalize method. It can have side effects that affect the state of objects
> outside of itself. Why should the VM be allowed to do it for finalize, but
> not for constructors. I personally do not see why the same rules do not
> apply.

Two answers:

1. It's because the JLS, in defining reachability (and thus eligibility
for finalization), doesn't talk about scope. There's really no reason
to see collection of an object while there's still a reference to it in
scope as a "reordering" at all, except that it runs counter to one
obvious implementation of garbage collection -- namely, that of using
the stack pointer to determine which stack slots should be considered
part of the root set.

Basically, the comparison *depends* on interpreting the JLS in a
specific way (and one that I happen to disagree with). If the JLS is
interpreted that way, then of course it's wrong for the finalizer to be
run prior to the reference leaving scope. If it's not interpreted that
way, though, then the comparison is terribly meaningless.

2. Because, frankly, side-effects of constructors are at least vaguely
useful. Finalizers, on the other hand, are just a terrible mistake.
Aside from believing that the JLS really does not make that restriction
(and seeing compelling evidence that a majority of people within Sun
agree), I just really hope that limitations on good VM implementation
aren't made unreasonably just because of the impact of a failed language
feature. This answer wouldn't matter if I thought the JLS required
delaying finalization, but it certainly tips the scales a bit in favor
of choosing one reasonable interpretation over another.

Timo Kinnunen

Mar 11, 2004, 4:30:37 AM

Chris Smith <cds...@twu.net> wrote:

> There are, as far as I can see, two reasonable choices for
> interpreting the JLS's "can be accessed" in the phrase above.
> Certainly, the phrase "could have been accessed" is one of them.

Can you explain, please? "Can be accessed" is about the future, whereas
"could have been accessed" is about the past, so how can one be
interpreted as the other?

Chris Smith

Mar 11, 2004, 9:38:23 AM

Timo Kinnunen wrote:
> Chris Smith <cds...@twu.net> wrote:
>
> > There are, as far as I can see, two reasonable choices for
> > interpreting the JLS's "can be accessed" in the phrase above.
> > Certainly, the phrase "could have been accessed" is one of them.
>
> Can you explain, please? "Can be accessed" is about the future, whereas
> "could have been accessed" is about the past, so how can one be
> interpreted as the other?

Well, I think that Dale should explain, since he holds that opinion. I
don't. However, I justify it as at least somewhat reasonable because
one might have a mental model in mind in which the JRE is simply going
through a fetch-execute on bytecode and hence "can be accessed", to the
virtual machine, is independent of what code might be coming up next. I
have trouble providing such a description without noting my objections
to it (such as that the JLS never describes this mental model, and that
it's not a very accurate description of reality either)... which is why
Dale is probably better off explaining from his perspective.

Lee Fesperman

Mar 12, 2004, 6:16:50 PM

Dale King wrote:
>
> "Lee Fesperman" <firs...@ix.netcom.com> wrote in message

> news:404D80...@ix.netcom.com..

> > Dale King wrote:
> > >
> > > "Lee Fesperman" <firs...@ix.netcom.com> wrote in message
> > > news:4047D1...@ix.netcom.com.
> >

> > C++ has no relevance here since it doesn't have GC.
>
> I wasn't saying that C++ was relevant, but that the RAII paradigm is one
> that depends upon objects not being destructed until the variable goes out
> of scope. And in this sense C++ does have a GC. The stack variables are
> collected automatically when they go out of scope.

I don't see any resemblance to GC at all. It's still the C++ world of destroying objects
at a specific point.

> C# has a mechanism for RAII with a garbage collector that eliminates the
> need for most finally clauses. It uses an IDisposable interface and a using
> statement. See Jon Skeet's RFE to add a similar feature to Java:
>
> http://groups.google.com/groups?selm=MPG.1929810f411094bf98c14e%40dnews.peramon.com

The above is just not germane to what we're discussing. Put simply, your basic concept
is that reachability (as far as GC is concerned) be the same as scope. Let me look at
some deeper issues here...

1) In general, scope, except method scope, is lost when Java code is translated to
bytecodes (except when a valid Local Variable Table is included.) Your concept means the
JVM would have to assume that all local variables (unless their 'slots' are reused) are
reachable for the entire method.

On the other side, trying to use scope like this:

{
    Foo o = new Foo();
}
callSomeLongMethod();

wouldn't force the object referenced by 'o' to be eligible for GC before callSomeLongMethod() completes.

Additional chances for optimization will be lost here.

2) In 1), the effect of translation to bytecode can cause the scope of a variable to be
lengthened but never shortened (inappropriately). With your concept, a shortened scope
could cause incorrect behavior. To avoid this, compilers could only reuse variable slots
that are no longer in scope. I did a quick test on a couple of Sun's compilers and
didn't see an instance of this, which tends to support your ideas. However, I don't know
if it is true for all certified compilers.

> > > But you are talking about allowing the opposite extreme where
> > > the object may be finalized long before (could be hours or days)
> > > the variable goes out of scope. That is just plain wrong.
> >
> > Exactly how is it wrong (other than in your opinion)? I reread
> > the exchange from last year, again.
>

> Let's say I had code like this:
>
> new FooBar();
> callSomeMethod();
>
> Would it be OK for the VM to simply omit the creation of the FooBar instance
> or perhaps reorder it to do it after the method call? Of course not. The
> reason is that the constructor can have side effects that affect the state
> of objects outside of itself. So why does the same logic not apply to the
> finalize method? It can have side effects that affect the state of objects
> outside of itself. Why should the VM be allowed to do it for finalize, but
> not for constructors? I personally do not see why the same rules do not
> apply.

Agreed, the VM couldn't omit or reorder FooBar creation ... unless it knew there were no
side effects. However, the same only applies to finalizers if your concept must be
enforced. That doesn't add anything to your position.

Actually, your concept doesn't apply here anyway. Your concept concerns the scope of
local variables. There are no variables in the code, just expressions. The scope of an
expression is no more than the statement it's in.

> > You failed to find any authority for your position, and you failed to
> > convince any posters to the thread.
>
> I wasn't looking for authority just reasoning.

Fair enough. I am trying to clarify the basic issue and avoid side issues, like RAII and
expression scoping.

> > You also didn't deal with the finalization aspect. There is some
> > consensus that finalizers should be used carefully and rarely.
> > There are those who recommend not using them at all. I won't go that
> > far; I have implemented finalizers for very solid reasons. I tend to
> > think that finalizers that are vulnerable in this situation are
> > poorly implemented. They really shouldn't have side effects except
> > on external resources under their sole control.
>
> But that is not enforceable. Finalizers can have side effects. I agree it is
> not recommended practice. But I don't go as far as to say that the JVM is
> free to run them prematurely.

Certainly it is not enforceable. I also don't want finalizers to be run prematurely,
that is, unless there is no possibility of access (the object is
unreachable/inaccessible). The issue is that I'm thinking of 'physical' reachability
(determined by a scan of registers, stacks and then the heap). You're espousing
'logical' reachability (based on local variable scope).

Your approach can require extra code and reduces optimization options. My point above is
that its only purpose is to prevent 'unexpected' invocation of poorly implemented
finalizers.

By some weird sort of serendipity, this issue is also touched on in the current thread,
"Declaring a reference in a loop versus outside a loop". In that thread, Neal Gafter
seems to indicate that Sun doesn't support your idea.

> And I would have no trouble with that if the spec. explicitly said that it
> could make such an optimization and that the behavior is not guaranteed.
> Until that time, I think a JVM should be conservative on that point and
> guarantee it and that developers should also be conservative and follow your
> advice and assume that it is not guaranteed.

I doubt anyone is arguing that clarity in the spec is not desirable.

I do think detailing potential optimizations is generally out of place in the spec,
though I would accept that this situation is an exception.

Timo Kinnunen

Mar 13, 2004, 3:25:32 AM

Lee Fesperman <firs...@ix.netcom.com> wrote:

> I doubt anyone is arguing that clarity in the spec is not
> desirable.
>
> I do think detailing potential optimizations is generally out of
> place in the spec, though I would accept that this situation is an
> exception.

Here's the non-normative text from the corresponding section in the
C# specification:

[Note: Implementations may choose to analyze code to determine which
references to an object may be used in the future. For instance, if a
local variable that is in scope is the only existing reference to an
object, but that local variable is never referred to in any possible
continuation of execution from the current execution point in the
procedure, an implementation may (but is not required to) treat the
object as no longer in use. end note]

Food for thought.
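
What the note describes is, in essence, liveness analysis. A toy sketch, assuming straight-line code in which each statement is reduced to the set of local variable names it reads; assignments are ignored for simplicity. The Liveness class and this representation are invented here and are not a description of how any particular VM works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Liveness {
    // Helper: the set of local variable names that one statement reads.
    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /**
     * reads.get(i) is the set of locals statement i reads. The result at
     * index i holds the locals read by statement i or by any later statement,
     * i.e. the locals a continuation starting at i can still refer to.
     */
    static List<Set<String>> liveAt(List<Set<String>> reads) {
        List<Set<String>> live = new ArrayList<>();
        Set<String> readLater = new HashSet<>();
        for (int i = reads.size() - 1; i >= 0; i--) { // backward scan
            readLater.addAll(reads.get(i));
            live.add(0, new HashSet<>(readLater));
        }
        return live;
    }

    public static void main(String[] args) {
        List<Set<String>> reads = Arrays.asList(
            vars(),      // 0: Object o = new FooBar();
            vars("o"),   // 1: o.hashCode();
            vars());     // 2: someMethodThatExecutesALongTime();
        System.out.println(liveAt(reads));
    }
}
```

In this model 'o' is still in scope at statement 2 but no longer appears in the live set there, which is the situation the note says an implementation may (but need not) exploit.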

John C. Bollinger

Mar 15, 2004, 10:40:45 AM

Dale King wrote:

> "John C. Bollinger" <jobo...@indiana.edu> wrote in message
> news:c2iuhv$1f5$1...@hood.uits.indiana.edu...

>>For instance, in the example quoted above, the Java compiler or VM could
>>determine that there is no read of method()'s local variable "o" in the
>>byte code currently loaded into the VM. How could that not mean that o
>>cannot be accessed? I simply don't accept that the compiler or VM is
>>required to take into account the continuum of possible alternative
>>implementations of method(). The whole point of optimization is to
>>shortcut a program in a way that has no effect on the results except for
>>resource consumption and/or running time. I see absolutely no point to
>>interpreting the spec as you describe. I guess all this puts me in the
>>same camp as Chris Smith on the matter.
>
>
> I am not asking it to look at all possible alternative implementations, but
> that it should respect my code where I declared the actual scope of the
> variable, which says when the variable is accessible.

I agree that the Java source determines where the variable is
"accessible" in an informal sense, but as I tried to define that sense
of the term I realized that I was just coming up with something
equivalent to "in scope". That leaves me unable to take any useful
information from your statement.

The JLS does not apply the term "accessible" to local variables, and
does not use it in its defined sense in the discussion of finalization
(back to JLS 12.6.1). The JLS also does not explicitly rely on the Java
scope of reference variables to define reachability, and it certainly
does not depend on the semantics of bytecode for that definition.

> You are saying that
> the JVM should be able to circumvent what is declared in the program.

No, I'm saying that the JLS does not specify that the scope of a local
reference variable has any special relationship to the reachability of
its current referent within that scope. To read such a requirement
into the spec you are depending on interpreting "can be accessed [...]"
to include the in-scope reference variable case, but I think that is a
mistake. The context of the statement is program _execution_, to which
questions of variable scope in the source code are irrelevant.

> If I
> declare a scope for the variable I may be depending on the object remaining
> alive.

And your program may therefore be incorrect. As Chris Smith wrote a
couple days ago, even if your interpretation were technically right, JVM
implementors may, and some apparently do, use the more permissive
interpretation. You may claim the high moral ground for yourself if you
wish, but if you want your programs to be reliable with respect to this
detail then you must assume the more permissive interpretation is in use
in the JVM.
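
One way to write code that works under either interpretation, sketched on the assumption that a later Java release is available: Java 9 added java.lang.ref.Reference.reachabilityFence, which guarantees that the referenced object is not reclaimed before the call. The Handle class and longRunningWork method are hypothetical stand-ins.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class KeepAlive {
    // Hypothetical object whose cleanup must not run while work is in progress.
    static class Handle { }

    static void longRunningWork() {
        System.gc(); // only a hint; stands in for "a collection happens here"
    }

    /** Returns true if the Handle had not been cleared when the work finished. */
    static boolean workWithFence() {
        Handle h = new Handle();
        WeakReference<Handle> probe = new WeakReference<>(h);
        try {
            longRunningWork();           // h is never read in here
            return probe.get() != null;  // checked before the fence below runs
        } finally {
            // h is guaranteed to stay strongly reachable up to this call.
            Reference.reachabilityFence(h);
        }
    }

    public static void main(String[] args) {
        System.out.println("still present: " + workWithFence());
    }
}
```

Without the fence the result is whatever the particular VM happens to do, so a program that needs the object alive during the call should say so explicitly rather than lean on variable scope.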

> It just seems to me that you are being over optimistic and violating
> the correctness of the program.

Funny, I'd say exactly the same thing about your position.

> In the end it probably doesn't matter
> because the only way being over optimistic can matter is if you have side
> effects from a finalizer and even if you did the garbage collector
> usually delays finalization.

Agreed that if a program that assumes finalization will be delayed until
all references are out of scope runs in a JVM that does not ensure that
that assumption is valid then actual misbehavior still requires a
combination of unlikely circumstances. Still, no matter how unlikely,
chances are that eventually it will happen, and that the failure will be
extremely difficult to diagnose. Even if I agreed with you on the
technical point, I would be inclined to play it safe.

John Bollinger
jobo...@indiana.edu

Lee Fesperman

Mar 17, 2004, 11:01:30 PM

Lee Fesperman wrote:
>
> Dale King wrote:
> >
> > "Lee Fesperman" <firs...@ix.netcom.com> wrote in message
> > news:404D80...@ix.netcom.com.

> > > You failed to find any authority for your position.

> >
> > I wasn't looking for authority just reasoning.

And I provided that, raising a number of technical issues. No response?

> ....... this issue is also touched on in the current thread,

> "Declaring a reference in a loop versus outside a loop". In that thread,
> Neal Gafter seems to indicate that Sun doesn't support your idea.

I also discussed this issue with a JVM expert. He asserted that only 'physical'
reachability is used, and there is no consideration of logical (scope) reachability. He
also stated that anything else would be very, very expensive.

I would add that a major aspect of optimization is intra-method optimization, often
dealing with local variable usage. Restrictions in this area would be quite detrimental.

Dale King

Mar 21, 2004, 10:45:47 PM

"Lee Fesperman" <firs...@ix.netcom.com> wrote in message

news:40591E...@ix.netcom.com...

> Lee Fesperman wrote:
> >
> > Dale King wrote:
> > >
> > > "Lee Fesperman" <firs...@ix.netcom.com> wrote in message
> > > news:404D80...@ix.netcom.com.
> > > > You failed to find any authority for your position.
> > >
> > > I wasn't looking for authority just reasoning.
>
> And I provided that, raising a number of technical issues. No response?

Sorry, couldn't really get to newsgroups for a while.

I haven't seen anyone really address the technically correct semantics. What I
basically have seen is people say that the VM should be able to do this
because it can be more efficient. To me the issue is program correctness. I
see absolutely no real difference between this situation and this code:

Foo method()
{
    Foo f = new Foo();
    someMethod();
    return f;
}

Should the VM be allowed to reorder the execution so that the method was
effectively:

Foo method()
{
    someMethod();
    return new Foo();
}

Would it make any difference if I told you that the code would be more
efficient this way? After all, the VM can do the analysis and determine that
the variable is not referenced until after the method call.

Would you allow a VM to be able to make such an optimization? Of course not!
Any VM that reordered the code so that the constructor were called after
some other code that could interact with the side effects of that
constructor would be incorrect.

So why should the same rules not apply to finalization as they do to
construction? Why should it be allowed to reorder the finalization call?
This violates the expected semantics of languages like C++ where it is
guaranteed and relied upon.

I actually have no problem if Sun wants to say that those semantics are not
guaranteed, but if that is what is desired it should be specified and not
simply assumed.

> > ....... this issue is also touched on in the current thread,
> > "Declaring a reference in a loop versus outside a loop". In that thread,
> > Neal Gafter seems to indicate that Sun doesn't support your idea.
>
> I also discussed this issue with a JVM expert. He asserted that only
> 'physical' reachability is used, and there is no consideration of logical
> (scope) reachability. He also stated that anything else would be very, very
> expensive.

Well of course, it cannot use scope because the scope is not present in the
class file. But then whether it can do the optimization is begging the
question.

I did some looking and got some contradictory results. There is a paper at
CiteSeer that actually gives some experimental numbers for these kinds of
optimizations.

But on my reading of the spec., there is this oft-quoted technical article
by Sun that says:

http://java.sun.com/developer/technicalArticles/ALT/RefObj/
"An executing Java program consists of a set of threads, each of which is
actively executing a set of methods (one having called the next). Each of
these methods can have arguments or local variables that are references to
objects. These references are said to belong to a root set of references
that are immediately accessible to the program...

"All objects referenced by this root set of references are said to be
reachable by the program in its current state and must not be collected.
Also, those objects might contain references to still other objects, which
are also reachable, and so on. "

From which I read that local variables are part of a root set and all
objects referenced by this rootset cannot be collected. I see no mention of
reachability of local variables, but that local variables are part of the
rootset.

So when does a local variable enter and leave the root set? According to the
JLS:

"Local variables are declared by local variable declaration statements
(§14.4). Whenever the flow of control enters a block (§14.2) or for
statement (§14.13), a new variable is created for each local variable
declared in a local variable declaration statement immediately contained
within that block or for statement. A local variable declaration statement
may contain an expression which initializes the variable. The local variable
with an initializing expression is not initialized, however, until the local
variable declaration statement that declares it is executed. (The rules of
definite assignment (§16) prevent the value of a local variable from being
used before it has been initialized or otherwise assigned a value.) The
local variable effectively ceases to exist when the execution of the block
or for statement is complete."

That tells me that the variable ceases to exist when the execution of the
block is complete. The question is the meaning of that "effectively" word.
You might read it to say that it could actually cease to exist before then.
I might read it to say that they have to hedge the statement because the
variable may actually continue to exist after that point (which is actually
what does happen in Java since in the class files variables exist until the
end of the method).
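
To make the JLS wording concrete, here is a minimal sketch (the class and
variable names are made up for illustration; they are not from the spec). It
only shows where the language says each variable begins and ends. It says
nothing about when a collector may actually reclaim the arrays.

```java
// Hypothetical illustration; all names are invented for this sketch.
class BlockScopeSketch {
    static int run() {
        int total = 0;
        {
            // 'first' is created when control enters this block.
            int[] first = {1, 2, 3};
            total += first.length;
        } // Per the JLS wording, 'first' "effectively ceases to exist" here.

        // Uncommenting the next line is a compile error: 'first' is out of scope.
        // total += first.length;

        {
            // A new variable is created on entry to this second block.
            int[] second = {4, 5};
            total += second.length;
        }
        return total; // 3 + 2
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```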

> I would add that a major aspect of optimization is intra-method optimization, often
> dealing with local variable usage. Restrictions in this area would be quite detrimental.

As I said, optimization arguments don't mean a lot to me if they involve
breaking correct semantics.

Lee Fesperman

Mar 27, 2004, 9:01:51 PM

Dale King wrote:
>
> "Lee Fesperman" <firs...@ix.netcom.com> wrote in message

> news:40591E...@ix.netcom.com..
> >
> > ..... No response?

>
> Sorry, couldn't really get to newsgroups for a while.

Apologies. I didn't mean to rush you; I was just concerned that you had left the thread.
Take your time; I'll be here.

> I haven't seen anyone really address the technical correct semantics. What I
> basically have seen is people say that the VM should be able to do this
> because it can be more efficient. To me the issue is program correctness. I
> see absolutely no real difference with this situation than with this code:
>

> --- examples snipped ---

>
> Would you allow a VM to be able to make such an optimization? Of course not!
> Any VM that reordered the code so that the constructor were called after
> some other code that could interact with the side effects of that
> constructor would be incorrect.

We've gone through this earlier in this thread. I agreed that code reordering was
inappropriate (unless the JVM could prove it was correct).

> So why should the same rules not apply to finalization as they do to
> construction? Why should it be allowed to reorder the finalization call?

Simply because the spec does not clarify this behavior, and because of existing
practice (see below).

> This violates the expected semantics of languages like C++ where it is
> guaranteed and relied upon.

C++ is still irrelevant here. Definite execution of destructors is mandatory in C++
because they *must* destroy any held objects. This does not apply in Java. C++ semantics
are ragged in a number of areas. For instance, C++ does not guarantee that an object
won't be accessed after the destructor is called. Java guarantees that the object is not
accessible after the finalizer is called (unless the finalizer itself makes it reachable
again).

> I did some looking and got some contradictory results. There is a paper at
> citeseer that actually gives some experimental numbers for these kind of
> optimizations.
>
> But on my reading of the spec., there is this oft-quoted technical article
> by Sun that says:
>
> http://java.sun.com/developer/technicalArticles/ALT/RefObj/
> "An executing Java program consists of a set of threads, each of which is
> actively executing a set of methods (one having called the next). Each of
> these methods can have arguments or local variables that are references to
> objects. These references are said to belong to a root set of references
> that are immediately accessible to the program...
>
> "All objects referenced by this root set of references are said to be
> reachable by the program in its current state and must not be collected.
> Also, those objects might contain references to still other objects, which
> are also reachable, and so on. "
>
> From which I read that local variables are part of a root set and all
> objects referenced by this rootset cannot be collected. I see no mention of
> reachability of local variables, but that local variables are part of the
> rootset.

This is obviously about reachability. It does not specify which reference variables are
in the rootset. It most certainly can't be all (reference) local variables in the
method, since standard compiler generated class files can make that impossible to
accomplish.

> So when does a local variable enter and leave the root set. According to the
> JLS:

Both quotes (above and below) mention local variables but are not connected. They are
from different documents! The quote above is about reachability, the one below about
scope.

> "Local variables are declared by local variable declaration statements
> (§14.4). Whenever the flow of control enters a block (§14.2) or for
> statement (§14.13), a new variable is created for each local variable
> declared in a local variable declaration statement immediately contained
> within that block or for statement. A local variable declaration statement
> may contain an expression which initializes the variable. The local variable
> with an initializing expression is not initialized, however, until the local
> variable declaration statement that declares it is executed. (The rules of
> definite assignment (§16) prevent the value of a local variable from being
> used before it has been initialized or otherwise assigned a value.) The
> local variable effectively ceases to exist when the execution of the block
> or for statement is complete."
>
> That tells me that the variable ceases to exist when the execution of the
> block is complete. The question is the meaning of that "effectively" word.
> You might read it to say that it could actually cease to exist before then.
> I might read it to say that they have to hedge the statement because the
> variable may actually continue to exist after that point (which is actually
> what does happen in Java since in the class files variables exist until the
> end of the method).

I've already shown that they don't always exist until the end of the method (their slots
are reused in the class file). Obfuscators also reuse variables.
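
A rough sketch of the kind of slot reuse I mean (class and method names are
hypothetical). With javac I would expect 'a' and 'b' to share one local variable slot,
since their scopes don't overlap; running javap -c on the compiled class should show
which slot each store uses. That is an assumption about typical compiler output, not
something the spec requires.

```java
// Hypothetical example; whether the slots are shared depends on the compiler.
class SlotReuseSketch {
    static int run() {
        int sum = 0;
        {
            StringBuilder a = new StringBuilder("first");
            sum += a.length();
        }
        // 'a' is out of scope here, but its slot may still hold the old
        // reference until something else is stored there.
        {
            StringBuilder b = new StringBuilder("second");
            // If 'b' took over the slot of 'a', the frame no longer refers
            // to the first StringBuilder from this point on.
            sum += b.length();
        }
        return sum; // 5 + 6
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```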

'Effectively' is the crux of the matter, and it is vague (intentionally?). The JVM
developer that I spoke to stated --- "All specs should include the implied 'as if'
rule".

I'm not sure there is any point to 'reason' further. The JLS is not definitive on this
issue and existing practice weighs against your approach.

Existing practice: Two developers of major JVMs have asserted that only physical
reachability is considered in GC and that the JVM will not generate code to extend
reachability to match logical scope (see my example earlier in this thread.)
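
For anyone who wants to experiment, here is a rough sketch using
java.lang.ref.WeakReference (the class and method names are hypothetical, and
System.gc() is only a hint, so the first result can differ between JVMs and between
runs). The first method leaves the variable in scope but never reads it again; the
second reads it after the collection request, so the object has to stay reachable.

```java
import java.lang.ref.WeakReference;

// Hypothetical experiment; names are invented for this sketch.
class ReachabilitySketch {
    // 'obj' stays in scope to the end of the method but is never read after
    // the WeakReference is built. The result is NOT guaranteed either way.
    static boolean collectedWhileInScope() {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);
        System.gc(); // a request, not a command
        return ref.get() == null;
    }

    // 'obj' is read after the gc request, so it is still reachable during it
    // and the weak reference cannot have been cleared.
    static boolean keptWhileStillUsed() {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);
        System.gc();
        return ref.get() == obj;
    }

    public static void main(String[] args) {
        System.out.println("collected while in scope: " + collectedWhileInScope());
        System.out.println("kept while still used:    " + keptWhileStillUsed());
    }
}
```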