Large Dependency Guidelines

18 views
Skip to first unread message

Sam Berlin

unread,
Nov 8, 2007, 3:09:08 PM11/8/07
to google...@googlegroups.com
Hi Folks,

What's the best practice for requiring lots of dependencies that
spread across packages? Should an implementation within a package
depend directly on interfaces from other packages? Or should there be
some kind of facade or service (for easier testing within the package
itself) and the impl exists outside and glues in the other packages?

The main reason I'm asking is because now that we've converted to
using Guice & constructor-based injection, it becomes a real PITA with
subclasses and huge dependency lists. Consider an AbstractFoo that
requires a lot of other factories and services to run, and many
different implementations. There's two problems: writing tests
requires passing a lot of different dependencies directly (or setting
them via Guice), and if the dependencies of AbstractFoo ever change,
all the subclasses need their constructors altered.

This leads me towards wanting a single catch-all FooFacade or
FooService that interface that AbstractFoo uses, but it leaves a bad
taste in my mouth... Is it really a good idea to pretend that
AbstractFoo doesn't depend on these services (by indirectly using them
through the FooFacade)? How far should this practice go? If I end up
also needing to use a BarFactory, does a createBar method get added to
FooFacade instead of passing it into AbstractFoo (and requiring all
the subclasses change their constructors)? What about if AbstractFoo
needs a lot of dependencies from within the package? Is it a good
idea to put that into the FooFacade too?

These are the things that keep me awake at night!

Thanks,
Sam

Bob Lee

unread,
Nov 8, 2007, 3:16:07 PM11/8/07
to google...@googlegroups.com
This is where method injection comes in handy. I would use a method
instead of the constructor on AbstractFoo. Then, the subclasses don't
need to know about its dependencies.

Bob

Dhanji R. Prasanna

unread,
Nov 8, 2007, 5:26:09 PM11/8/07
to google...@googlegroups.com
I like to use a module per package and hide all the classes as
package-local, exposing only interfaces, enums and final classes as
necessary.
I am not a fan of using Abstract[Service] to inherit functionality of
[Service], cf. composition vs. inheritance in Effective Java, or Allen
Holub on "why extends is evil"

Dhanji.

Bob Lee

unread,
Nov 8, 2007, 5:28:14 PM11/8/07
to google...@googlegroups.com
I second that--I almost never use class inheritance anymore.

Bob

Josh McDonald

unread,
Nov 8, 2007, 10:00:02 PM11/8/07
to google...@googlegroups.com
Agreed, although I must admit being somewhat sceptical the first time
Dhanji explained this point of view, I'll definitely pitch in with
full support.

For anybody following this thread and didn't think to look up "why
extends is evil" thinking it was just DJ expressing himself, the
article in question can be found here:

http://www.javaworld.com/javaworld/jw-08-2003/jw-0801-toolbox.html

It gives a few great examples as well.

-Josh

--
"This is crazy! Why are we talking about going to bed with Wilma
Flintstone... She'll never leave Fred and we know it. "

:: Josh 'G-Funk' McDonald
:: 0437 221 380 :: jo...@gfunk007.com

Sam Berlin

unread,
Nov 12, 2007, 11:37:17 AM11/12/07
to google...@googlegroups.com
Thanks for all the tips.

I think we're going to shy away from using setter injection (as
opposed to constructor injection) because that'll require all the
variables to be volatile (since the object is heavily multi-threaded).
And unfortunately we don't have the time right now to redo the object
hierarchy to not use class inheritence... so I think we're just going
to have to live with difficult-to-construct objects for now.

Sam

Endre Stølsvik

unread,
Nov 15, 2007, 7:25:46 AM11/15/07
to google...@googlegroups.com
Sam Berlin wrote:
> Thanks for all the tips.
>
> I think we're going to shy away from using setter injection (as
> opposed to constructor injection) because that'll require all the
> variables to be volatile (since the object is heavily multi-threaded).

I'm not a zen master in this, but I believe that logic is a
misunderstanding of the threading issues at hand. The objects will be
created (thus constructor injected) and setter injected fully before you
get them back from the Injector, hence there is no need for the fields
to be volatile. The object isn't as such /published/ before both these
things have happened (unless you yourself do any listener registration
or whatever in the constructor or setters), and the setters may thus be
viewed as an extension of the constructor (only of course the fields may
not be declared final, which is annoying).

I read your logic as basically everything in a java program would have
to be volatile if it uses more than one thread. This isn't true, AFAIK.
As long as a thread accesses an object "for the first time", it will
have to fetch all values from main memory, so to speak. It is only if
two or more threads communicate back-and-forth through some data that
these accesses needs to be synched, or volatile - this since each thread
is free to cache the memory as hard as it wants if this isn't the case.

This is at least my understanding, and it has worked out nice so far..! :-)

Well.. On third thought - if some of these objects gets some other
singleton objects injected, with which they communicate, then these
singleton objects will have to handle synch issues. But not the newly
created objects. And in particular not if the whole application object
graph is constructed in one go, where one doesn't use the Injector as a
generic factory throughout the code (I don't, so at least one people use
it that way!).

If I've gotten this wrong, then PLEASE correct me!

Kind regards,
Endre.

Dhanji R. Prasanna

unread,
Nov 15, 2007, 8:16:38 AM11/15/07
to google...@googlegroups.com
Volatile variables do not guarantee thread safety, only memory
coherence. If, as Endre says, you can guarantee that dependencies are
never changed after injection and you do not allow the "this"
reference to escape during construction or setter invocation(s) then
you do not need to declare them as volatile because you have
established a happens-before edge between the last setter invocation
and the return of the instance from the injector.

Synchronization is a whole other matter because it deals will
concurrent threads that modify state (as opposed to safe publication).

I have heard views that all fields ought to be either final or
volatile (except in rare cases). I do not subscribe to this view
because it doesn't address any problem as such, in the abstract. While
it is good practice to declare all members final where possible, it
does not *ensure* safe publication as is often wrongly claimed (for
instance, if you allow the "this" reference to escape during
construction you can make no publication guarantees). So think about
your use case and concurrency environment before mandating finals,
volatiles or synchronizations. =)

Dhanji.

Endre Stølsvik

unread,
Nov 15, 2007, 10:08:09 AM11/15/07
to google...@googlegroups.com
Dhanji R. Prasanna wrote:
> I have heard views that all fields ought to be either final or
> volatile (except in rare cases). I do not subscribe to this view
> because it doesn't address any problem as such, in the abstract.

I concur. I hadn't even heard about this view..! :)

> While
> it is good practice to declare all members final where possible, it
> does not *ensure* safe publication as is often wrongly claimed (for
> instance, if you allow the "this" reference to escape during
> construction you can make no publication guarantees).

Concur again - why I use final is actually more to get compile time
errors (nice squiggly red lines!) if I fail to set them in the
constructor. It makes the field "guaranteed to be set"-safe, which I
think is more important than the "cannot be set again" effect. That is
why I complain about setter injection, since it obviously isn't possible
to declare a field final then.

(I think everybody interested in these things (which everybody should
be!) should get a copy of "Java Concurrency in Practice" by Brian Goetz
et. al. I thought I had most of the stuff nailed, but there are a quite
a few things to think about. The publication-problem hadn't occurred to
me, which I think is a big point for fixing issue 62
(http://code.google.com/p/google-guice/issues/detail?id=62) - "
Lifecycle support", at least a simple @Init annotation or something.)

Endre.

Sam Berlin

unread,
Nov 15, 2007, 3:42:15 PM11/15/07
to google...@googlegroups.com
Quite a good discussion of threading & memory-safety. :)

I was always under the impression that variables set within
constructors are safe to be non-final (because publishing occurs after
the constructor returns), but final is the best option. However,
non-volatile variables set after the constructor returns weren't
guaranteed to be published & visible to other threads. This is
especially a problem in the common case of thread pools, where
although it may be easy to guarantee a Runnable is processed after a
certain point, it isn't easy to guarantee that runnable is processed
in a brand new Thread.

I don't quite understand the distinction that "if it's used after the
injection finishes, you're OK". To me, that seems like an arbitrary
point in time. (And it becomes hazier when non-eager instantiation is
done, or if things are eagerly instantiated singletons and do
initialization code [even worse if that code spawns threads] when
constructed.)

With non-eager instantiation, it's almost certainly the case that
threads will exist prior to the object and its dependency-graph being
constructed, and if those pre-existing threads want to use the new
object, it isn't guaranteed that they'll see the correct values (if
those values were set after the constructor finished). This is based
on my understanding that publishing occurs when the constructor
returns, as opposed the understanding that a thread seeing a new
object for the first time will read all of its variables as if they
were published. That understanding could certainly be wrong.

Sam

Tim Peierls

unread,
Nov 16, 2007, 9:22:40 AM11/16/07
to google-guice, Tim Peierls
On Nov 15, 3:42 pm, "Sam Berlin" <sber...@gmail.com> wrote:
> I was always under the impression that variables set within
> constructors are safe to be non-final (because publishing occurs after
> the constructor returns), but final is the best option.
> However, non-volatile variables set after the constructor returns weren't
> guaranteed to be published & visible to other threads.

Unfortunately, setting fields in the constructor body does NOT make
the fields visible to other threads, except in the special case of
final fields -- that's why final is usually the best option, if it's
an option at all.

However, if the object itself is published safely and one of its
fields is set before publication (in the constructor body or not) and
never modified after publication, then concurrent reads of that field
after publication are still safe. I emphasize that this depends on the
nature of the publication of the object itself, not of its field.

Guice guarantees safe publication from its scopes and that all
injection activity happens before publication. (There's an
outstanding issue to improve the documentation of this.) If you set
your non-final fields in an injected constructor or method, as long as
you don't modify them after publication, those "effectively read-only"
values will be visible to other threads.

But Guice doesn't automatically make your objects thread-safe: If you
DO change field values after publication, you'll need to make those
fields volatile or use synchronized access to them, just as with any
other class.

(The choice of volatile vs. synchronized depends on whether you have
any additional atomicity constraints; if the field doesn't participate
with other fields in any class invariants and doesn't itself need
atomic updating, then volatile is fine, otherwise you need
synchronized access or, for atomic updating only, the Atomic classes
from java.util.concurrent.atomic.)

The same thing applies if you use a custom Scope that doesn't
guarantee safe publication: all bets are off, and your injected
methods and fields need to be final, volatile, or synchronzed
somehow.

If you're writing an injectable class for others to use and you have
non-final fields, you have a choice between (1) documenting that your
class is not safe for use with scopes that don't guarantee safe
publication, and (2) simply making your class thread-safe to guard
against possible unsafe publication.

Here's my personal feeling about this: It's never unsafe to make a non-
final field volatile. At worst, it's unnecessary (and might incur a
small performance penalty); at best, it can make the difference
between working code and broken code.

Incidentally, there are two kinds of programmers: those who would read
the previous paragraph and fixate on the word "broken", and those who
would fixate on the words "performance penalty". I call latter group
"wrong-headed". :-)


> This is especially a problem in the common case of thread pools, where
> although it may be easy to guarantee a Runnable is processed after a
> certain point, it isn't easy to guarantee that runnable is processed
> in a brand new Thread.

In general, yes, but using the task execution framework in
java.util.concurrent, things are nicer: Executing a Runnable with an
Executor is a form of safe publication, so the fields of the Runnable
may be set at any time before the call to execute(Runnable) --
assuming that the thread that creates the Runnable is also the one
that submits it for execution -- and then read safely (but not
written) during Runnable.run().

--tim

Endre Stølsvik

unread,
Nov 16, 2007, 11:21:02 AM11/16/07
to google...@googlegroups.com, Tim Peierls
Tim Peierls wrote:
> On Nov 15, 3:42 pm, "Sam Berlin" <sber...@gmail.com> wrote:
>> I was always under the impression that variables set within
>> constructors are safe to be non-final (because publishing occurs after
>> the constructor returns), but final is the best option.
>> However, non-volatile variables set after the constructor returns weren't
>> guaranteed to be published & visible to other threads.
>
> Unfortunately, setting fields in the constructor body does NOT make
> the fields visible to other threads, except in the special case of
> final fields

Seems like you've gotten that right:
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.4.5

Check "completely initialized" definition, and "Discussion":
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.5

So. Seemingly, the reference to an object may be set/published/made
available/visible to other threads before all values touched in the
constructor are committed to "main memory", except for final fields,
which are guaranteed to be set and "flushed" when the constructor
returns and the instance becomes available. Is this a correct
interpretation?
I've now chalked that up as another little thing to be concerned
about in regards to multithreading. I was hoping at some point to become
fully educated about these issues, and always believe that day is today,
but then some new nuance suddenly come popping up! :)

My statement two posts back about a thread accessing an object for the
first time will make it see the values from main memory, is very
misleading, at best - since other threads might have changed these
values plenty in their local cached view, and not flushed the changes
yet unless a happens-before relationship is established between the two
threads, one will possibly get stale data, up to un-initialized data.

It should preferably be up to Guice to establish this specific
happens-before relationship, making sure that the entire object graph is
fully published upon return of the Injector's (Provider's)
getInstance(...) methods. How is this achieved? I tried looking into the
code, but the obvious synchronized on /this/ in the get methods wasn't
there..! (and would that have been enough?)

From the JLS/JMM: "Without correct synchronization, very strange,
confusing and counterintuitive behaviors are possible.".. Right.

Endre.

Tim Peierls

unread,
Nov 16, 2007, 3:38:58 PM11/16/07
to google-guice
On Nov 16, 11:21 am, Endre Stølsvik <stols...@gmail.com> wrote:
> Seemingly, the reference to an object may be set/published/made
> available/visible to other threads before all values touched in the
> constructor are committed to "main memory", except for final fields,
> which are guaranteed to be set and "flushed" when the constructor
> returns and the instance becomes available. Is this a correct
> interpretation?

Yes, though you have to be careful with terms like "main memory" and
"flushed" -- your use of quotes suggests that you already are, but
other readers might not be -- those terms are not part of the JMM, and
no one should attempt to reason about thread-safety using them.

Shameless plug for Java Concurrency in Practice (http://jcip.net):
When writing the book, we pushed all mention of happens-before to an
appendix, using in the body of the book what we hope is a useful and
simple vocabulary that you *can* use when describing the concurrency
properties of a class. So earlier I mentioned that Guice guarantees
safe publication of its injected objects, instead of saying that Guice
guarantees a happens-before edge between the injection of an object's
constructors/methods/fields and any post-injection access of that
object. We hope this vocabulary is natural enough to use in casual
discussions even with those who haven't read the book (or the JLS),
but if you want to be sure ... buy the book! :-)


> It should preferably be up to Guice to establish this specific
> happens-before relationship, making sure that the entire object graph is
> fully published upon return of the Injector's (Provider's)
> getInstance(...) methods. How is this achieved? I tried looking into the
> code, but the obvious synchronized on /this/ in the get methods wasn't
> there..! (and would that have been enough?)

It's not obvious, and in fact I spent some time a while ago convincing
myself that it was correct. It should be better documented in the code
(maybe it has been since I last looked), because it's ferociously hard
to reason about.

But it is all too easy to write a custom Scope that does not guarantee
safe publication, by using, for example, a plain unsynchronized
HashMap to store instances. A Module that created a binding in such a
Scope would be broken unless the binding were to a thread-safe class.

--tim

Bob Lee

unread,
Nov 16, 2007, 3:48:48 PM11/16/07
to google...@googlegroups.com
On Nov 16, 2007 12:38 PM, Tim Peierls <tpei...@gmail.com> wrote:
> On Nov 16, 11:21 am, Endre Stølsvik <stols...@gmail.com> wrote:
> > It should preferably be up to Guice to establish this specific
> > happens-before relationship, making sure that the entire object graph is
> > fully published upon return of the Injector's (Provider's)
> > getInstance(...) methods. How is this achieved? I tried looking into the
> > code, but the obvious synchronized on /this/ in the get methods wasn't
> > there..! (and would that have been enough?)

Strange--I got Tim's email but haven't gotten Endre's email which Tim quoted.

When you call getInstance(), Guice constructs the object graph in the
same thread, so there's no need for synchronization there.

If you pass the object Guice returned to you off to another thread,
it's up to you to do so in a thread safe manner, in which case you
would safely publish it.

Bob

Tim Peierls

unread,
Nov 16, 2007, 3:48:59 PM11/16/07
to google-guice
On Nov 16, 3:38 pm, Tim Peierls <tpeie...@gmail.com> wrote:
> But it is all too easy to write a custom Scope that does not guarantee
> safe publication, by using, for example, a plain unsynchronized
> HashMap to store instances. A Module that created a binding in such a
> Scope would be broken unless the binding were to a thread-safe class.

Actually, it would be broken, period.

--tim

Brian Pontarelli

unread,
Nov 16, 2007, 5:11:26 PM11/16/07
to google...@googlegroups.com
Tim Peierls wrote:
On Nov 16, 11:21 am, Endre Stølsvik <stols...@gmail.com> wrote:
  
Seemingly, the reference to an object may be set/published/made
available/visible to other threads before all values touched in the
constructor are committed to "main memory", except for final fields,
which are guaranteed to be set and "flushed" when the constructor
returns and the instance becomes available. Is this a correct
interpretation?
    
Yes, though you have to be careful with terms like "main memory" and
"flushed" -- your use of quotes suggests that you already are, but
other readers might not be -- those terms are not part of the JMM, and
no one should attempt to reason about thread-safety using them.
  
But they used to be. ;) However, since Guice is JDK 5, it uses the new JMM, which tossed all that nonsense.

-bp

Tim Peierls

unread,
Nov 16, 2007, 11:16:27 PM11/16/07
to google-guice
On Nov 16, 3:48 pm, "Bob Lee" <crazy...@crazybob.org> wrote:
> When you call getInstance(), Guice constructs the object graph in the
> same thread, so there's no need for synchronization there.

If you inject Providers, then the thread-safety of the provided object
depends on the Scope, if any:

If there's no Scope or if the Scope decides to return a new instance,
then the instance is constructed in the thread that requested it, so
you're OK, as in Bob's comment.

If it returns a previously created instance and if the Scope keeps
those instances in a container that does not act as a safe handoff
(i.e., does not have a happens-before edge between storage and
retrieval of the same instance), then you need to make the provided
class thread-safe.

It might seem unlikely that a container could be thread-safe but not
provide safe handoff, but consider session scope implemented with
HttpSession attributes. While it is safe to store and retrieve objects
from different threads (it has to be, otherwise the facility would be
useless), there is no happens-before edge between a set and a get --
or rather, there might be, but the servlet spec doesn't give us enough
information to guarantee that there is -- so bindings in session scope
should be to thread-safe classes.

--tim

Dhanji R. Prasanna

unread,
Nov 17, 2007, 4:11:37 AM11/17/07
to google...@googlegroups.com
On Nov 17, 2007 6:48 AM, Bob Lee <craz...@crazybob.org> wrote:
>
> Strange--I got Tim's email but haven't gotten Endre's email which Tim quoted.

And I got Tim's reply to you before yours. Wonder if this is a spam
slowdown, the JCP lists are also gloriously backed up.

Endre Stølsvik

unread,
Nov 17, 2007, 11:22:23 AM11/17/07
to google...@googlegroups.com
Bob Lee wrote:
> On Nov 16, 2007 12:38 PM, Tim Peierls <tpei...@gmail.com> wrote:
>> On Nov 16, 11:21 am, Endre Stølsvik <stols...@gmail.com> wrote:
>>> It should preferably be up to Guice to establish this specific
>>> happens-before relationship, making sure that the entire object graph is
>>> fully published upon return of the Injector's (Provider's)
>>> getInstance(...) methods. How is this achieved? I tried looking into the
>>> code, but the obvious synchronized on /this/ in the get methods wasn't
>>> there..! (and would that have been enough?)
>
> Strange--I got Tim's email but haven't gotten Endre's email which Tim quoted.

I guess he was in the Cc:, and the list in the To:, so he got it
directly? I also didn't get my message back until just recently, but
that have happened several times lately, even though the message was
present in the archives (the group) - I was thinking maybe googlegroups
was over-smart and excluded the sender from the recipients, or I have
some mess up with my procmail.

>
> When you call getInstance(), Guice constructs the object graph in the
> same thread, so there's no need for synchronization there.

What about singletons? Synched Maps?

>
> If you pass the object Guice returned to you off to another thread,
> it's up to you to do so in a thread safe manner, in which case you
> would safely publish it.

I think there are too many opinions here now - or everybody is saying
the same thing, differently.. Since you, Bob, is the principal author of
this thing, you're probably right!

I read you as since one will have to do some synchronization, or use a
volatile variable, to actually pass the object in question safely off to
another thread, that would constitute the happens-before edge, and hence
everything would be good?

And just to get my head straight here: this one happens-before edge
between the thread setting the object, and the thread getting the
object, will also ensure a metaphorical "deep synch" of the entire
object, making all fields (final and non, from the constructor, up and
including every change up to that edge) be visible to the other thread,
also including the entire graph from that object of?

.. one happens-before edge will ensure all that?

Endre.

Endre Stølsvik

unread,
Nov 17, 2007, 11:27:40 AM11/17/07
to google...@googlegroups.com
Tim Peierls wrote:

> It might seem unlikely that a container could be thread-safe but not
> provide safe handoff, but consider session scope implemented with
> HttpSession attributes. While it is safe to store and retrieve objects
> from different threads (it has to be, otherwise the facility would be
> useless), there is no happens-before edge between a set and a get

Could you give an example of such a container? I can't seem to
understand how the set and a subsequent get would not have to be a
happens-before relationship, given that as you say, it is safe to store
and retrieve from different threads.

Thanks,
Endre.

Bob Lee

unread,
Nov 17, 2007, 12:28:00 PM11/17/07
to google...@googlegroups.com
On Nov 17, 2007 8:22 AM, Endre Stølsvik <stol...@gmail.com> wrote:
> > When you call getInstance(), Guice constructs the object graph in the
> > same thread, so there's no need for synchronization there.
>
> What about singletons? Synched Maps?

Like Tim said, scopes (like singleton) are responsible for their own
thread safety. The singleton scope, for example, uses double-checked
locking. Theoretically, I could write special code for eager
singletons which wouldn't require any synchronization because they
would all be constructed in the same thread as the injector, and then,
the injector itself would be handed off in a thread-safe way.

> I think there are too many opinions here now - or everybody is saying
> the same thing, differently.

I think we're all saying the same thing. The important thing is, the
Injector itself is thread safe, but you need to make sure that your
code is, too. Scopes definitely need to be thread safe. Objects which
our in scopes probably need to be thread safe. For example, multiple
threads can access a singleton or session-scoped object concurrent.

> And just to get my head straight here: this one happens-before edge
> between the thread setting the object, and the thread getting the
> object, will also ensure a metaphorical "deep synch" of the entire
> object, making all fields (final and non, from the constructor, up and
> including every change up to that edge) be visible to the other thread,
> also including the entire graph from that object of?
>
> .. one happens-before edge will ensure all that?

That's a question for Tim. :)

Bob

Cow_woC

unread,
Nov 17, 2007, 5:35:32 PM11/17/07
to google-guice
On Nov 16, 5:11 pm, Brian Pontarelli <br...@pontarelli.com> wrote:
> However, since Guice is JDK 5, it uses the new JMM, which tossed all that nonsense.

I'm not sure I understand. Are you saying that under the new JMM the
constructor is guaranteed to finish executing before the object is
published? Secondly, I'm still not sure I understand how one would
safely publish an object using the JMM mentioned here:
http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.5

If the constructor is not guaranteed to finish executing before the
field value is changed then how can you possibly safely publish? Would
the following code work?

MyObject foo = new MyObject();
myField = foo;

Thanks,
Gili

Dhanji R. Prasanna

unread,
Nov 18, 2007, 7:33:15 AM11/18/07
to google...@googlegroups.com
On Nov 18, 2007 8:35 AM, Cow_woC <gili.t...@gmail.com> wrote:
>
> On Nov 16, 5:11 pm, Brian Pontarelli <br...@pontarelli.com> wrote:
> > However, since Guice is JDK 5, it uses the new JMM, which tossed all that nonsense.
>
> I'm not sure I understand. Are you saying that under the new JMM the
> constructor is guaranteed to finish executing before the object is
> published?

An object is guaranteed to be fully *initialized* before its reference
is published. It is misleading to think of the constructor as having
"finished" in the case of multiple threads because non-final members
are not necessarily memory-coherent (i.e. visible to all threads). The
constructor could "finish", a reference to the object be published,
and yet threads may not see the same values for non-final fields of
the object (even those that are set in the ctor).

>
> If the constructor is not guaranteed to finish executing before the
> field value is changed then how can you possibly safely publish? Would
> the following code work?
>
> MyObject foo = new MyObject();
> myField = foo;

This is guaranteed by lexical ordering. A single thread going thru
this code will fully initialize and return from MyObject() before
assigning the value to myField.

The issue being debated here is publication to myField in the
constructed object (i.e. the one myField is a member of). If myField
is declared final, and your code above is in its ctor (and is the
*only* code in the ctor), then it is safe to assume foo is published
to myField when the ctor completes, i.e. the reference to the
enclosing object is published and visible to ALL threads.

For counter example,

class Enclosing {
private MyObject myField;
private final MyObject myFinalField;

Enclosing() {
myField = new MyObject();
myFinalField = myField;
}
}

Two threads receiving a reference to the same instance of Enclosing
may see different values of myField, even though its value is assigned
in the ctor, i.e. there is no guarantee made for non-final fields. But
ALL threads will see the correct value for myFinalField from a valid
reference to Enclosing.

Dhanji.

Tim Peierls

unread,
Nov 18, 2007, 2:43:07 PM11/18/07
to google-guice
On Nov 17, 12:28 pm, "Bob Lee" <crazy...@crazybob.org> wrote:
> > And just to get my head straight here: this one happens-before edge
> > between the thread setting the object, and the thread getting the
> > object, will also ensure a metaphorical "deep synch" of the entire
> > object, making all fields (final and non, from the constructor, up and
> > including every change up to that edge) be visible to the other thread,
> > also including the entire graph from that object of?
>
> > .. one happens-before edge will ensure all that?

I'm uncomfortable with the "deep synch" metaphor because it makes it
sound as though the happens-before edge is the agent of visibility --
as if a happens-before edge were a tangible entity that somehow *does*
something. Something has to be done, of course -- probably by flushing
a cache -- but the JMM doesn't specify that, only that any write to a
field is guaranteed to be visible to a subsequent read to that same
field iff there is a happens-before edge between the write and the
read.

But yes, the JMM says (for example) that if you write a value to a non-
final, non-volatile field inside some data structure, then write a
reference to the root of that data structure to a volatile, then in
another thread read that volatile and then read the original field's
value ... if you do that, you'll get back the value you wrote at the
beginning, assuming no one else writes that field along the way.

class C {
int f;
static volatile theC;
}

In Thread 1:
C c = new C();
c.f = 42; // 1. plain write
C.theC = c; // 2. volatile write

In Thread 2:
C c = C.theC; // 3. volatile read
int i = c.f; // 4. plain read
// i is 42 assuming no other writes

There are happens-before edges between 1 and 2 (program order), 2 and
3 (volatile write with subsquent read in a different thread), and 3
and 4 (program order), and happens-before is transitive, so there's an
edge between 1 and 4.

(Note that you have to be careful here if the field is reached by
following intermediate references; the visibility of each intermediate
reference has to be considered, too.)

You can use this kind of analysis to satisfy yourself of the
correctness of some difficult code, but no one would want to have to
think in those terms all the time. We tried in Java Concurrency in
Practice to encourage slightly a less precise but hopefully more
helpful and natural nomenclature.

--tim

Endre Stølsvik

unread,
Nov 18, 2007, 9:12:12 PM11/18/07
to google...@googlegroups.com
Thanks Tim,

You're really helping me out here!

> But yes, the JMM says (for example) that if you write a value to a non-
> final, non-volatile field inside some data structure, then write a
> reference to the root of that data structure to a volatile, then in
> another thread read that volatile and then read the original field's
> value ... if you do that, you'll get back the value you wrote at the
> beginning, assuming no one else writes that field along the way.
>
> class C {
> int f;
> static volatile theC;
> }
>
> In Thread 1:
> C c = new C();
> c.f = 42; // 1. plain write
> C.theC = c; // 2. volatile write
>
> In Thread 2:
> C c = C.theC; // 3. volatile read
> int i = c.f; // 4. plain read
> // i is 42 assuming no other writes
>
> There are happens-before edges between 1 and 2 (program order), 2 and
> 3 (volatile write with subsquent read in a different thread), and 3
> and 4 (program order), and happens-before is transitive, so there's an
> edge between 1 and 4.

Okay, this was a nice, clear explanation!

Now, for the final little honing: what about unrelated, so to speak,
writes and reads across a happens-before edge? See, I realize I might
have over-complicated the problem in the question you answered: What
about all other variables in heap-memory, in regard to a happens-before
edge, not within the object in question (nor in its descendant graph)?

I'll try to exemplify, hoping that I don't actually introduce more
complications than the question entails:

class C {
int i;
}

Class Sc {
static C c;
}

Class Svol {
static volatile int vol;
}

Thread 1:
Sc.c = new C();
Sc.c.i = 1; // 1. plain write to h-b "unrelated" field
Svol.vol = 1; // 2. volatile write

Thread 2:
int ivol, ic;
ivol = Svol.vol; // 3. volatile read
ic = Sc.c.i; // 4. plain read of the h-b unrelated field

Basically, does the transitiveness of the happens-before carry across
_all_ heap memory in the system? Will, as long as two threads create any
single happens-before relationship (more precisely, a
/synchronized-with/ edge?), any writes to whatever data before the
synchronized-with edge, in source thread 1, be visible to all reads
after the synchronized-with edge, in destination thread 2?

From the earlier JMM, this I actually how I understood the text, but
I've never gotten to ask the elite about it.

I understand that the JMM part of JLS 2 wasn't precise enough, but I
found the concepts there way easier to grasp, even with its six
different terms for 'read' and 'write'!

Chap 17, Threads and Locks:
http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#30206

" Each thread has a working memory, in which it may keep copies of
the values of variables from the main memory that is shared between all
threads. To access a shared variable, a thread usually first obtains a
lock and flushes its working memory. This guarantees that shared values
will thereafter be loaded from the shared main memory to the threads
working memory. When a thread unlocks a lock it guarantees the values it
holds in its working memory will be written back to the main memory".

Chap 17.6, Rules about the Interaction of Locks and Variables:
http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28325

" Let T be any thread, let V be any variable, and let L be any lock. ...
* Between an assign action by T on V and a subsequent unlock action
by T on L, a store action by T on V must intervene; moreover, the write
action corresponding to that store must precede the unlock action, as
seen by main memory. (Less formally: if a thread is to perform an unlock
action on any lock, it must first copy all assigned values in its
working memory back out to main memory.)
* Between a lock action by T on L and a subsequent use or store
action by T on a variable V, an assign or load action on V must
intervene; moreover, if it is a load action, then the read action
corresponding to that load must follow the lock action, as seen by main
memory. (Less formally: a lock action acts as if it flushes all
variables from the thread's working memory; before use they must be
assigned or loaded from main memory.) "

Ah, I just love those "less formally"s!

So, it seems to me that _whatever_ happens in one thread A before a
synchronizes-with relationship with thread B, will be visible to B after
the synchronized-with?

Why I've never been quite sure, is that I find this so .. wasteful? It
seems to me that it would have been much better, performance and
optimalization wise, to require the programmer to declare every single
field that one possibly wanted to be synchronized when in a
synchronized-with relationship. For example by tagging the field with
the 'volatile' modifier (This is what it means in C, isn't it?). Then
the much more elaborate volatile-modifier as we know it in java would
instead be called 'synched-volatile' or something. In that way, the JVM
could have cached way more aggressively - or rather, not having to
bother about all the flushing of caches on every single synch-edge. This
might be where the famed "escape analysis" come in (?), but there must
be many situations where it will be infeasible for the JVM to look
endlessly far ahead to see whether any thread ever will happen to read a
field in question, thus it will have to flush the value (or not use a
register for its workings) "just in case".

>
> (Note that you have to be careful here if the field is reached by
> following intermediate references; the visibility of each intermediate
> reference has to be considered, too.)

This comment, which I didn't quite get, might be what I'm really asking
about!

Thanks a lot,
Endre.

Tim Peierls

unread,
Nov 19, 2007, 9:17:43 AM11/19/07
to google-guice
On Nov 18, 9:12 pm, Endre Stølsvik <stols...@gmail.com> wrote:
> You're really helping me out here!

Glad to be of assistance! But let me urge you once again to get and
read a copy of Java Concurrency in Practice -- http://jcip.net is a
good starting point. Also see the Wikipedia article on the JMM,
particularly the links to Bill Pugh's pages on it.


> Now, for the final little honing: what about unrelated, so to speak,
> writes and reads across a happens-before edge?

Same thing. I just picked an example to match what your question.
There need be no special relationship between the field whose
visibility is being reasoned about and the happens-before edge. But
see below for a caveat.


> class C {
> int i;
> }
>
> Class Sc {
> static C c;
> }
>
> Class Svol {
> static volatile int vol;
> }
>
> Thread 1:
> Sc.c = new C();
> Sc.c.i = 1; // 1. plain write to h-b "unrelated" field
> Svol.vol = 1; // 2. volatile write
>
> Thread 2:
> int ivol, ic;
> ivol = Svol.vol; // 3. volatile read
> ic = Sc.c.i; // 4. plain read of the h-b unrelated field

This example is flawed. In the line marked 4, there are two reads,
Sc.c and (Sc.c).i, and both of them are plain. There is no guarantee
that Thread 2 sees the same value of Sc.c that Thread 1 is using. If
you made Sc.c final, say, then yes, if thread 2 in line 3 reads the
value written in line 2, then it will also read in line 4 the value
written in line 1.


> Basically, does the transitiveness of the happens-before carry across
> _all_ heap memory in the system? Will, as long as two threads create any
> single happens-before relationship (more precisely, a
> /synchronized-with/ edge?), any writes to whatever data before the
> synchronized-with edge, in source thread 1, be visible to all reads
> after the synchronized-with edge, in destination thread 2?

Those seem like different questions, and I'm uncomfortable answering
either of them with a simple yes or no. I'll repeat that if there is a
happens-before edge between the write of a field and a subsequent read
of the field, then the read sees the value written. Just be sure that
you're reading the same field that you're writing -- in your example,
you weren't necessarily reading the same int that you wrote.


> > (Note that you have to be careful here if the field is reached by
> > following intermediate references; the visibility of each intermediate
> > reference has to be considered, too.)
>
> This comment, which I didn't quite get, might be what I'm really asking about!

Your code fragment above is an example of how the visibility of an
intermediate reference (Sc.c) needs to be considered.


> From the earlier JMM, this I actually how I understood the text, but
> I've never gotten to ask the elite about it.

I'm not a member of the elite, and I still get things wrong, but I
urge you not to put too much stock in the earlier JMM -- it's broken.


> Why I've never been quite sure, is that I find this so .. wasteful?

You should definitely check out Bill Pugh's pages on this.

This has strayed way OT, so perhaps we should call it quits on this
thread.

--tim

Endre Stølsvik

unread,
Nov 19, 2007, 10:02:51 AM11/19/07
to google...@googlegroups.com
Tim Peierls wrote:
> On Nov 18, 9:12 pm, Endre Stølsvik <stols...@gmail.com> wrote:
>> You're really helping me out here!
>
> Glad to be of assistance! But let me urge you once again to get and
> read a copy of Java Concurrency in Practice -- http://jcip.net is a
> good starting point. Also see the Wikipedia article on the JMM,
> particularly the links to Bill Pugh's pages on it.

Thanks again.

I've got the book - I even urge people to buy it in the message in this
thread dated 2007-11-15 16:08!

The problem is that I've read quite a bit - and either I still haven't
comprehended some minute detail, or .. I am right!

>
>
>> Now, for the final little honing: what about unrelated, so to speak,
>> writes and reads across a happens-before edge?
>
> Same thing. I just picked an example to match what your question.
> There need be no special relationship between the field whose
> visibility is being reasoned about and the happens-before edge. But
> see below for a caveat.

I still feel you're stating two different things all the time!! See,
either _all_ writes, anywhere ("... let V be any variable ..."), from a
Thread 1 before a h-b edge will be visible to Thread 2 after the h-b
edge, or only the writes in relation to the specific lock (" ... let L
be any lock ...") that created the edge will be visible to Thread 2
(whatever "in relation to" should mean). And I actually believe the
former, since the latter seems way complicated.

I do not understand your distinction that is discussed in the next
section (with the "field-traversal" problem).

Do note that I acknowledge that it is quite possibly me that's totally
broken here!

>
>
>> class C {
>> int i;
>> }
>>
>> Class Sc {
>> static C c;
>> }
>>
>> Class Svol {
>> static volatile int vol;
>> }
>>
>> Thread 1:
>> Sc.c = new C();
>> Sc.c.i = 1; // 1. plain write to h-b "unrelated" field
>> Svol.vol = 1; // 2. volatile write
>>
>> Thread 2:
>> int ivol, ic;
>> ivol = Svol.vol; // 3. volatile read
>> ic = Sc.c.i; // 4. plain read of the h-b unrelated field
>
> This example is flawed. In the line marked 4, there are two reads,
> Sc.c and (Sc.c).i, and both of them are plain. There is no guarantee
> that Thread 2 sees the same value of Sc.c that Thread 1 is using. If
> you made Sc.c final, say, then yes, if thread 2 in line 3 reads the
> value written in line 2, then it will also read in line 4 the value
> written in line 1.

See, that's why I had this reference - to really make sure that the
synched and unsynced write/reads were totally different - and according
to you, I got to the point.

So if this example really is broken, then my understanding is broken.

WHY is this different? Why, given the program ordering in the two
threads, with the synched-edge in between, can there be a difference in
what Thread 1 and Thread 2 sees as C.c?

>
>
>> Basically, does the transitiveness of the happens-before carry across
>> _all_ heap memory in the system? Will, as long as two threads create any
>> single happens-before relationship (more precisely, a
>> /synchronized-with/ edge?), any writes to whatever data before the
>> synchronized-with edge, in source thread 1, be visible to all reads
>> after the synchronized-with edge, in destination thread 2?
>
> Those seem like different questions, and I'm uncomfortable answering
> either of them with a simple yes or no. I'll repeat that if there is a
> happens-before edge between the write of a field and a subsequent read
> of the field, then the read sees the value written. Just be sure that
> you're reading the same field that you're writing -- in your example,
> you weren't necessarily reading the same int that you wrote.

Those weren't supposed to be different questions! Both are supposed to
be nuances of the same thing: point to the exact problem at hand, which
you do latch on to in the above and next section.

>
>
>>> (Note that you have to be careful here if the field is reached by
>>> following intermediate references; the visibility of each intermediate
>>> reference has to be considered, too.)
>> This comment, which I didn't quite get, might be what I'm really asking about!
>
> Your code fragment above is an example of how the visibility of an
> intermediate reference (Sc.c) needs to be considered.

Exactly. And I'm lost!

>
>
>> From the earlier JMM, this I actually how I understood the text, but
>> I've never gotten to ask the elite about it.
>
> I'm not a member of the elite, and I still get things wrong, but I
> urge you not to put too much stock in the earlier JMM -- it's broken.

It's wasn't _that_ broken, was it? The big problem was the volatile
wording (which made the double checked locking not strictly possible),
and some specifics on the initialization of final fields (which is the
example of "/usr/tmp".substring(4), which may end up as both "/usr" and
"/tmp" in the old model).

The new specification supposedly fixes all that, but I think the new
wordings are extremely technical, and do not make for easy
comprehension! Also, it shies away from using those nice "less formally"
sections - which technically probably is a good thing, but makes it much
more useless for many more developers, I can imagine.

But then again, if my understanding of this actually is broken, as you
basically state above, then it wasn't easy to comprehend before either,
at least not for me!

>
>
>> Why I've never been quite sure, is that I find this so .. wasteful?
>
> You should definitely check out Bill Pugh's pages on this.

I will.

>
> This has strayed way OT, so perhaps we should call it quits on this
> thread.

It has become a bit OT, yes. But maybe others actually are finding it
interesting nonetheless - and maybe later developers will find the
discussion by googling! A discussion often clears up things much better
than "a concise explanation", since it will have to highlight the
problem from many different angles.

It is maybe a bit sad that this discussion didn't come up on the
advanced-java list instead.

Thanks again,
Endre.

Brian Pontarelli

unread,
Nov 19, 2007, 11:27:33 AM11/19/07
to google...@googlegroups.com

> For counter example,
>
> class Enclosing {
> private MyObject myField;
> private final MyObject myFinalField;
>
> Enclosing() {
> myField = new MyObject();
> myFinalField = myField;
> }
> }
>
> Two threads receiving a reference to the same instance of Enclosing
> may see different values of myField, even though its value is assigned
> in the ctor, i.e. there is no guarantee made for non-final fields. But
> ALL threads will see the correct value for myFinalField from a valid
> reference to Enclosing.
>
Yeah, that's what I was gonna say. If you want your objects to be solid,
final variables are the way to go. Also, as to my original comment, my
main point was that the concepts of load and store were tossed from the
new JMM. Prior to version 3, the JMM had lots of information about
thread local storage and how thread local storages was transferred to
main memory. That language has all been removed in favor of happens
before semantics and shared memory.

-bp

Tim Peierls

unread,
Nov 20, 2007, 8:04:45 AM11/20/07
to google-guice
On Nov 19, 10:02 am, Endre Stølsvik <stols...@gmail.com> wrote:
> >> Thread 1:
> >> Sc.c = new C();
> >> Sc.c.i = 1; // 1. plain write to h-b "unrelated" field
> >> Svol.vol = 1; // 2. volatile write
>
> >> Thread 2:
> >> int ivol, ic;
> >> ivol = Svol.vol; // 3. volatile read
> >> ic = Sc.c.i; // 4. plain read of the h-b unrelated field
>
> > This example is flawed.

Whoops -- no, it isn't. Duh. (I told you I still get things wrong!)


> See, that's why I had this reference - to really make sure that the
> synched and unsynced write/reads were totally different - and according
> to you, I got to the point.
>
> So if this example really is broken, then my understanding is broken.

Your understanding is correct -- I was wrong. Sorry for confusion.

--tim

Brian Pontarelli

unread,
Nov 20, 2007, 11:24:24 AM11/20/07
to google...@googlegroups.com
Tim Peierls wrote:
On Nov 19, 10:02 am, Endre Stølsvik <stols...@gmail.com> wrote:
  
Thread 1:
   Sc.c = new C();
   Sc.c.i = 1;    // 1. plain write to h-b "unrelated" field
   Svol.vol = 1;  // 2. volatile write
        
Thread 2:
   int ivol, ic;
   ivol = Svol.vol;  // 3. volatile read
   ic = Sc.c.i;      // 4. plain read of the h-b unrelated field
        
This example is flawed.
      
Whoops -- no, it isn't. Duh. (I told you I still get things wrong!)
  
Hmmmmm... Perhaps some off then because this still looks flawed to me. Thread 3 can change Sc.c at any point and not touch the volatile. Therefore, the processor caches can be out of sync WRT Sc.c correct?

-bp

Endre Stølsvik

unread,
Nov 20, 2007, 12:01:29 PM11/20/07
to google...@googlegroups.com

There is no thread 3 specified in the example, hence there is no thread 3.

Endre.

Brian Pontarelli

unread,
Nov 20, 2007, 1:35:15 PM11/20/07
to google...@googlegroups.com
True, but as a multi-threaded example it does have that weakness. As a pure clean room example with only two threads it is fine. Once you throw in more, things get tricky. I generally try to avoid clean room examples and tend to favor examples with X threads.


-bp

Endre Stølsvik

unread,
Nov 20, 2007, 2:20:24 PM11/20/07
to google...@googlegroups.com

Just to argue this to the bitter end: I don't get the point. If I've
made an application where two threads are communicating with each other,
and that's how I made it, no "thread 3" will suddenly come barging in,
changing variables under my nose!

If one had to think like that for everything, one could not get any code
to work: No matter how one organized one's code in regard to
synchronization, there would always be races where your annoying mystery
Thread Three could sneak in, changing everything under one's feet!

But that's luckily not quite how it works. The threads are mine: I spawn
them and set them to do specific tasks. They will obey my orders - or
else I'll debug them!

Endre.

Brian Pontarelli

unread,
Nov 20, 2007, 7:17:56 PM11/20/07
to google...@googlegroups.com
Had a feeling you were going to respond with something like that ;)

Just as background, I come from big companies like Orbitz and BEA with loads of developers. If you want something safe, you better assume any number of threads. That was really my point. I'm certain that you can code a closed situation, but I never code in that manner and code for broader handling, because someone might come in one day and toss in thread 3 or thread 30 and then you are in trouble. Plus, I do mostly frameworking, so I'm usually dealing with multiple rather than a fixed number of threads. But, that's me and I'm perfectly certain that you capable of coding a closed system. I'd just make sure that everything is private or package protected to ensure no one messes things up.


-bp

Tim Peierls

unread,
Nov 21, 2007, 12:22:21 PM11/21/07
to google-guice
The point is not that other threads couldn't have data races and screw
things up -- of course they could, even if the code in the example
were concatenated into a single method body in a single thread. Data
races mean your code is broken, and it isn't useful to reason about
the correctness of broken code.

The happens-before relation is a characterization of when the
guarantees about visibility that we take for granted in the sequential
world actually still hold in a multi-threaded setting. Nothing
stronger or more magical than that.

But I urge anyone reading this not to conclude that it is a good idea
to build intricate concurrency policies around the visibility
guarantees of volatile. It's very hard to document such policies in a
way that later readers will understand quickly. Far better to use
something like the annotations introduced in Java Concurrency in
Practice, http://jcip.net/annotations/doc/index.html -- they're simple
and not burdensome and they have the additional advantage of support
from FindBugs.

--tim

On Nov 20, 11:24 am, Brian Pontarelli <br...@pontarelli.com> wrote:
> Tim Peierls wrote:On Nov 19, 10:02 am,
> > > Endre Stølsvik<stols...@gmail.com>wrote:
> > > Thread 1: Sc.c = new C();
> > > > Sc.c.i = 1;
> > > > // 1. plain write to h-b "unrelated" field
> > > > Svol.vol = 1; // 2. volatile writeThread 2:
Reply all
Reply to author
Forward
0 new messages