Class Constants - pros and cons

Magnus Warker

Jul 25, 2010, 1:14:44 AM
Hi,

In the past I used to declare my constants like this, e.g. the colors of a
chess board:

public static int COL_WHITE = 1;
public static int COL_BLACK = 2;

Now, I find class constants very useful:

public final class Color
{
public static final Color WHITE = new Color();
public static final Color BLACK = new Color();
...
}

The major advantage for me is that I can refer to the constants via the class
name, e.g. Color.WHITE.

However, I found that I lose the ability to evaluate these constants in
switch statements:

switch(color)
{
case Color.WHITE:
...

Instead, I have to use cascading if-statements:

if (color==Color.WHITE)
{
}
else
if (color==Color.BLACK)
{
}
else
...

This is a major drawback in my opinion.

Have I missed something? How do you do that?

Thanks
Magnus

Peter Duniho

Jul 25, 2010, 1:56:30 AM
Magnus Warker wrote:
> Hi,
>
> in the past I used to declare my constants as this, e. g. the colors of a
> chess board:
>
> public static int COL_WHITE = 1;
> public static int COL_BLACK = 2;
>
> Now, I find class constants very useful:
>
> public final class Color
> {
> public static final Color WHITE = new Color();
> public static final Color BLACK = new Color();
> ...
> }
>
> The major advantage for me is that I can use the constants over the classes
> name, e. g. Color.WHITE.
>
> However, I found that I lose the ability to evaluate these constants in
> switch statements:
> [...]

> Have I missed something? How do you do that?

Have you thought about using enums instead? If enums aren't solving
your need, it would be helpful if you could explain why.

Pete

Lew

Jul 25, 2010, 1:57:57 AM
Magnus Warker wrote:
> in the past I used to declare my constants as this, e. g. the colors of a
> chess board:
>
> public static int COL_WHITE = 1;
> public static int COL_BLACK = 2;

Those should have been 'final'.

> Now, I find class constants very useful:
>
> public final class Color
> {
> public static final Color WHITE = new Color();
> public static final Color BLACK = new Color();
> ...
> }
>
> The major advantage for me is that I can use the constants over the classes
> name, e. g. Color.WHITE.

Strictly speaking, in Java terms those aren't "constants" but "final
variables". To be "constants", they'd have to be initialized with a
compile-time constant expression, per
<http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#10931>
and
<http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#5313>
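
A minimal sketch of that distinction, assuming hypothetical names (`Piece`, `WHITE_MARKER`): a `static final int` initialized with a literal is a compile-time constant and may be used as a `case` label, while a `static final` reference to a newly constructed object is only a final variable.

```java
// Hypothetical names; only the constant-expression rule itself comes from the JLS.
final class Piece {
    // Compile-time constants: final primitives initialized with constant expressions.
    static final int COL_WHITE = 1;
    static final int COL_BLACK = 2;

    // A final variable, but not a constant: 'new Object()' is not a constant
    // expression, so 'case WHITE_MARKER:' would be rejected by the compiler.
    static final Object WHITE_MARKER = new Object();

    static String name(int color) {
        switch (color) {
            case COL_WHITE: return "white"; // allowed: the label is a constant expression
            case COL_BLACK: return "black";
            default:        return "unknown";
        }
    }
}
```

Dropping `final` from `COL_WHITE` would make the first `case` label fail to compile as well, which is one practical reason for the remark above.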

> However, I found that I lose the ability to evaluate these constants in
> switch statements:
>
> switch(color)
> {
> case Color.WHITE:
> ...
>
> Instead, I have to use cascading if-statements:
>
> if (color==Color.WHITE)
> {
> }
> else
> if (color==Color.BLACK)
> {
> }
> else
> ...
>
> This is a major drawback in my opinion.
>
> Have I missed something? How do you do that?

Make 'Color' an enum.

--
Lew

Magnus Warker

Jul 25, 2010, 3:33:07 AM
It's the readability of the code.

With constant classes I can write something like this:

public void setColor (Color color)
{
if (color == Color.WHITE)
...
}

This is not possible with enums, since an enum type must be defined in a
class:

public void setColor (SomeClass.Color color)
{
if (color == SomeClass.Color.WHITE)
...
}

I think, it would be ok, if I needed the enum only in SomeClass.

But what about constants that are needed in many classes?

I saw that constant classes are used widely in the Java API, e.g.
java.awt.PageAttributes.MediaType. But I also saw constants declared as
final ints anyway.

Thanks
Magnus

Alan Gutierrez

Jul 25, 2010, 4:48:49 AM
Magnus Warker wrote:

> This is not possible with enums, since an enum type must be defined in a
> class:

Someone is about to pop in and give you chapter and verse of JLS that
tells you that's not so, but I'll just tell you that's not so.

--
Alan Gutierrez - al...@blogometer.com - http://twitter.com/bigeasy

Jean-Baptiste Nizet

Jul 25, 2010, 4:51:05 AM
Magnus Warker wrote:

> It's the readability of the code.
>
> With constant classes I can write something like this:
>
> public void setColor (Color color)
> {
> if (color == Color.WHITE)
> ...
> }
>
> This is not possible with enums, since an enum type must be defined in a
> class:
>
> public void setColor (SomeClass.Color color)
> {
> if (color == SomeClass.Color.WHITE)
> ...
> }
>

No. Read more about enums. Enums are classes, and can be top-level
classes. You define them, like regular classes, in their own .java file:

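// Color.java
package com.foo.bar;

public enum Color {
    WHITE,
    BLACK;
}

// SomeClass.java (same package, so Color needs no import)
package com.foo.bar;

public class SomeClass {
    private Color color;

    public void setColor(Color color) {
        this.color = color;
        // Enum constants are unique instances, so == is safe here.
        if (color == Color.WHITE) {
            // ...
        }
        // Inside a switch, case labels use the unqualified constant name.
        switch (color) {
            case WHITE:
                // ...
        }
    }
}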

> I think, it would be ok, if I needed the enum only in SomeClass.
>
> But what about constants that are needed in many classes?
>

You define them in their own .java file

> I saw that constant classes are used widely in the java api, e. g.
> java.awt.PageAttributes.MediaType. But I also saw constants declared as
> final ints anyway.
>

Yes, because enums only appeared in Java 5, and lots of classes had
already been written before that, and thus used final ints or Strings
instead.

Lew

Jul 25, 2010, 10:39:33 AM
Magnus Warker wrote:

Don't top-post!

>> It's the readability of the code.
>>
>> With constant classes I can write something like this:
>>
>> public void setColor (Color color)
>> {
>> if (color == Color.WHITE)

No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.

>> ...
>> }
>>
>> This is not possible with enums, since an enum type must be defined in a
>> class:
>>
>> public void setColor (SomeClass.Color color)
>> {
>> if (color == SomeClass.Color.WHITE)
>> ...
>> }

And how is that less readable? Besides, it's wrong. 'Color' would *be* the
enum, so you'd still have 'Color.WHITE', unless 'Color' were a nested class,
but then it would be in the constant String scenario, too.

Further besides, with an enum you wouldn't say
if ( color.equals( Color.WHITE ))
{
foo( something );
}
or
if ( color == Color.WHITE )
{
foo( something );
}
you'd say
color.foo( something );

Jean-Baptiste Nizet wrote:
> No. Read more about enums. Enums are classes, and can be top-level
> classes. You define them, as regular classes, in their own .java file :
>
> // Color.java
> package com.foo.bar;
>
> public enum Color {
> WHITE,
> BLACK;
> }
>
> // SomeClass.java
> public void setColor(Color color) {
> this.color = color;
> if (color == Color.WHITE) {
> // ...
> }
> switch (color) {
> case WHITE :
> // ...
> }
> }

Magnus Warker wrote:
>> I think, it would be ok, if I needed the enum only in SomeClass.
>>
>> But what about constants that are needed in many classes?

??

Have them refer to the enum.

Jean-Baptiste Nizet wrote:
> You define them in their own .java file

Magnus Warker wrote:
>> I saw that constant classes are used widely in the java api, e. g.
>> java.awt.PageAttributes.MediaType. But I also saw constants declared as

Jean-Baptiste Nizet wrote:
> Yes, because enums only appeared in Java 5, and lots of classes had
> already been written before that, and thus used final ints or Strings
> instead.

Peter Duniho wrote:


>>> Have you thought about using enums instead? If enums aren't solving
>>> your need, it would be helpful if you could explain why.

That would still be helpful.

ints and Strings are not typesafe. enums are. Use enums.

If you expect a constant String 'GlueConstants.CHOIX' equal to "foobar" and
pass 'ShoeConstants.CHOIX' equal to "fubar" instead, the compiler won't
complain. If you pass the wrong enum it will.
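
To make the type-safety point concrete, here is a small sketch. The class names `GlueConstants` and `ShoeConstants` come from the paragraph above; `Glue`, `Shoe` and `Workshop` are hypothetical names added for illustration.

```java
// String constants: both are plain Strings as far as the compiler is concerned.
final class GlueConstants { static final String CHOIX = "foobar"; }
final class ShoeConstants { static final String CHOIX = "fubar"; }

// Enum equivalents: two distinct types that cannot be mixed up.
enum Glue { CHOIX }
enum Shoe { CHOIX }

final class Workshop {
    // Accepts any String, so passing ShoeConstants.CHOIX compiles and only
    // misbehaves at run time.
    static boolean isGlueChoice(String choice) {
        return GlueConstants.CHOIX.equals(choice);
    }

    // Accepts only Glue; passing Shoe.CHOIX here is a compile-time error.
    static boolean isGlueChoice(Glue choice) {
        return choice == Glue.CHOIX;
    }
}
```

A call such as `Workshop.isGlueChoice(ShoeConstants.CHOIX)` compiles and quietly returns false, whereas `Workshop.isGlueChoice(Shoe.CHOIX)` is rejected by the compiler.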

Since enums are classes, they can contain behavior. That means you won't need
if-chains nor case constructs to select behavior; just invoke the method
directly from the enum constant itself and voilà!
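
As a rough illustration of behavior living in the enum, the sketch below uses a hypothetical `ChessColor` enum with a hypothetical `opponent()` method; neither name appears elsewhere in this thread.

```java
// Each constant supplies its own implementation of the abstract method,
// so callers need neither an if-chain nor a switch.
enum ChessColor {
    WHITE {
        @Override ChessColor opponent() { return BLACK; }
    },
    BLACK {
        @Override ChessColor opponent() { return WHITE; }
    };

    abstract ChessColor opponent();
}
```

A caller then writes `color.opponent()` and the right body is selected by ordinary dynamic dispatch.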

--
Lew
Don't quote sigs, such as this one, either.

jebblue

Jul 25, 2010, 12:47:09 PM
On Sun, 25 Jul 2010 10:39:33 -0400, Lew wrote:

> Further besides, with an enum you wouldn't say
> if ( color.equals( Color.WHITE ))
> {
> foo( something );
> }
> or
> if ( color == Color.WHITE )
> {
> foo( something );
> }
> you'd say
> color.foo( something );

> Since enums are classes, they can contain behavior. That means you
> won't need if-chains nor case constructs to select behavior; just invoke
> the method directly from the enum constant itself and voilà!

Sometimes I use plain enums with no specific value assigned to the elements
and sometimes I add a getter, getValue() to enable specific values but I
don't extend an enum (from an OOP perspective) beyond acting as an enum.
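
For reference, the "plain enum plus getter" style described here might look like the following sketch. `BoardColor` is a hypothetical name, and the values 1 and 2 simply echo the int constants from the first post.

```java
// An enum that carries a fixed value per constant and nothing else.
enum BoardColor {
    WHITE(1),
    BLACK(2);

    private final int value;

    BoardColor(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }
}
```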

If foo() has no relationship to color, and bar() and stuff() also need
to execute when color.equals(Color.WHITE), why extend an enum to be
more than it needs to be? That's what a regular class is for.

Sorry if I may have missed your point.

--
// This is my opinion.

Joshua Cranmer

Jul 25, 2010, 12:52:07 PM
On 07/25/2010 10:39 AM, Lew wrote:
>>> It's the readability of the code.
>>>
>>> With constant classes I can write something like this:
>>>
>>> public void setColor (Color color)
>>> {
>>> if (color == Color.WHITE)
>
> No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.

To be pedantic, it would not be a bug if the Color class were written in
a certain way, i.e., so that there are no Color instances other than those
held in Color's static fields.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Lew

Jul 25, 2010, 3:03:17 PM
Magnus Warker wrote:
>>>> It's the readability of the code.
>>>>
>>>> With constant classes I can write something like this:
>>>>
>>>> public void setColor (Color color)
>>>> {
>>>> if (color == Color.WHITE)

Lew wrote:
>> No. That would be a bug. You'd write
>> 'if ( color.equals( Color.WHITE ) )'.

Joshua Cranmer wrote:
> To be pedantic, it would not be a bug if the Color class were written in
> a certain way, i.e., there is no other Color instances other than those
> in Color's static instance variables.

You are wrong, this once. (In your case, an extremely rare occurrence.)

If the static variables are Strings, it's impossible to enforce that every
client assigns 'color' the exact same instance, and therefore impossible to
guarantee that == will work. It's all too easy to accidentally push in a
different constant String from some other class, as outlined in my other
response, or by other means to create a different instance of a String than
you intend, even with the correct value. So yes, to be pedantic, using ==
would be a bug waiting to happen, as would using String constants when enum
constants are so much safer (and guaranteed to work with == ).

--
Lew

Tom Anderson

Jul 26, 2010, 8:01:33 AM

I won't argue with you about Strings.

But Joshua was talking about using instances of Color, where those
instances are singletons (well, flyweights is probably the right term when
there are several of them), exposed in static final fields on Color, and
the class is written in a certain way, which i take to mean having a
private constructor, not creating any instances other than those in the
statics, and implementing readResolve. In that case, how can there be
pairs of instances for which .equals is true and == isn't? Colour doesn't
make such instances, no other class can make such instances directly,
serialization won't, and sun.misc.Unsafe is a foul.

Indeed, this is exactly what enums do, so why do you think classes can't?
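
A sketch of such a class, under the assumptions listed above (private constructor, instances only in static final fields, `readResolve`). The name `PieceColor` and the `roundTrip` helper are hypothetical and exist only to demonstrate that identity survives serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

final class PieceColor implements Serializable {
    private static final long serialVersionUID = 1L;

    static final PieceColor WHITE = new PieceColor("WHITE");
    static final PieceColor BLACK = new PieceColor("BLACK");

    private final String name;

    // Private constructor: no other class can create instances.
    private PieceColor(String name) {
        this.name = name;
    }

    // Deserialization builds a fresh object; swap it for the canonical instance.
    private Object readResolve() throws ObjectStreamException {
        return "WHITE".equals(name) ? WHITE : BLACK;
    }

    @Override
    public String toString() {
        return name;
    }

    // Hypothetical helper: serialize and deserialize a value in memory.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, `PieceColor.roundTrip(PieceColor.WHITE) == PieceColor.WHITE` holds, so == and .equals agree for every reachable instance.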

tom

--
If goods don't cross borders, troops will. -- Fr

Lew

Jul 26, 2010, 8:58:27 AM
Tom Anderson wrote:
> I won't argue with you about Strings.
>
> But Joshua was talking about using instances of Color, where those
> instances are singletons (well, flyweights is probably the right term
> when there are several of them), exposed in static final fields on
> Color, and the class is written in a certain way, which i take to mean
> having a private constructor, not creating any instances other than
> those in the statics, and implementing readResolve. In that case, how
> can there be pairs of instances for which .equals is true and == isn't?
> Colour doesn't make such instances, no other class can make such
> instances directly, serialization won't, and sun.misc.Unsafe is a foul.
>
> Indeed, this is exactly what enums do, so why do you think classes can't?

We're talking apples and oranges here. I made a mistake and it's my fault. I
was focused on String constants and should not have used 'Color' instances as
an example, but 'Foo'. In my mind I had conflated the examples and was
imagining a fictitious 'Color' class with Strings not 'Color' instances, and
that was where I went wrong.

You and Joshua are, of course, correct.

--
Lew

Tom McGlynn

Jul 26, 2010, 9:37:12 AM
On Jul 26, 8:01 am, Tom Anderson <t...@urchin.earth.li> wrote:

...


> But Joshua was talking about using instances of Color, where those
> instances are singletons (well, flyweights is probably the right term when
> there are several of them), exposed in static final fields on Color, and

...
While I agree with Tom's main point here, I am dubious about his
suggestion for what the static instances should be called
in a class that only creates a unique closed set of such instances.

I don't think flyweights is the right word. For me flyweights
are classes where part of the state is externalized for some
purpose. This is orthogonal to the concept of singletons.
E.g., suppose I were running a simulation of galaxy mergers
of two 100-million-star galaxies. Stars differ only in position,
velocity and mass. Rather than creating 200 million Star objects
I might create a combination flyweight/singleton Star where each
method call includes an index that is used to find the mutable
state in a few external arrays. At least in some versions of Java this
could be a very useful optimization since we only need to create
of order 10 objects rather than 10^8.
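
A minimal sketch of the idea, assuming a two-dimensional model and hypothetical method names: a single shared `Star` object whose methods take an index into parallel arrays holding the per-star state.

```java
// One shared object; the mutable state of star i lives at index i of each array.
final class Star {
    private final double[] x, y;    // position
    private final double[] vx, vy;  // velocity
    private final double[] mass;

    Star(int count) {
        x = new double[count];
        y = new double[count];
        vx = new double[count];
        vy = new double[count];
        mass = new double[count];
    }

    void set(int i, double px, double py, double velX, double velY, double m) {
        x[i] = px;
        y[i] = py;
        vx[i] = velX;
        vy[i] = velY;
        mass[i] = m;
    }

    // Moves star i along its velocity for a time step dt (no forces, for brevity).
    void drift(int i, double dt) {
        x[i] += vx[i] * dt;
        y[i] += vy[i] * dt;
    }

    double x(int i) { return x[i]; }
    double y(int i) { return y[i]; }

    double kineticEnergy(int i) {
        return 0.5 * mass[i] * (vx[i] * vx[i] + vy[i] * vy[i]);
    }
}
```

Only a handful of objects exist (the `Star` and its arrays) regardless of how many stars are simulated.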

So is there a better word than flyweights for the extension of a
singleton to a set with cardinality > 1?


Tom McGlynn

Lew

Jul 26, 2010, 11:31:07 AM
Tom McGlynn wrote:
> E.g., suppose I were running a simulation of galaxy mergers
> of two 100-million-star galaxies.  Stars differ only in position,
> velocity and mass.  Rather than creating 200 million Star objects
> I might create a combination flyweight/singleton Star where each
> method call includes an index that is used to find the mutable
> state in a few external arrays. At least in some versions of Java this
> could be a very useful optimization since we only need to create
> of order 10 objects rather than 10^8.

Except that that "mutable state in a few external arrays" still has to
maintain state for order 10^8 objects, coordinating position, velocity
(relative to ...?) and mass. So you really aren't reducing very much,
except simplicity in the programming model and protection against
error.

--
Lew

Andreas Leitgeb

Jul 26, 2010, 12:29:32 PM

About 2.4GB are not much?
(200 mill * (8bytes plain Object + 4bytes for some ref to each))

Of course that needs to be taken in relation to the perhaps 8GB of RAM
still needed for the 5 doubles per star: about 33% of payload would be
"packaging" costs with a separate instance for each star!

I for myself might choose the perhaps less cpu-efficient (due to all
the repeated indexing and for the defied locality) and also less simple
programming model, if it allowed me to solve larger problems with the
available RAM.
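
The memory figures in this post can be checked with a short calculation. The byte sizes below are the ones assumed in the post (8 bytes per plain object, 4 bytes per reference, 5 doubles per star); real JVMs differ, and `StarMemory` is a hypothetical name.

```java
final class StarMemory {
    static final long STARS = 200_000_000L;      // two galaxies of 100 million stars
    static final long OBJECT_BYTES = 8;          // assumed size of a plain object
    static final long REFERENCE_BYTES = 4;       // assumed size of one reference
    static final long DOUBLES_PER_STAR = 5;      // as stated in the post

    // Per-object "packaging" cost of one instance per star.
    static long overheadBytes() {
        return STARS * (OBJECT_BYTES + REFERENCE_BYTES);
    }

    // Payload that is needed either way.
    static long payloadBytes() {
        return STARS * DOUBLES_PER_STAR * Double.BYTES;
    }

    static double overheadRatio() {
        return (double) overheadBytes() / payloadBytes();
    }
}
```

Under these assumptions the overhead is 2.4 GB against 8 GB of payload, a ratio of 0.30, which is in the neighborhood of the "about 33%" quoted.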

PS: Without further tricks, each single simulation step would probably
take much too long anyway, if for each star its interaction with each
other star needs to be calculated... But then it was only a thought
experiment in the first place ("... suppose I were running ...")

Roedy Green

Jul 26, 2010, 1:18:17 PM
On Sun, 25 Jul 2010 07:14:44 +0200, Magnus Warker
<mag...@mailinator.com> wrote, quoted or indirectly quoted someone who
said :

>
> public static int COL_WHITE = 1;
> public static int COL_BLACK = 2;

These are neatly done with enums. Then you can have accessor methods
to get at associated properties such as Color.

See http://mindprod.com/jgloss/enum.html for examples.
--
Roedy Green Canadian Mind Products
http://mindprod.com

You encapsulate not just to save typing, but more importantly, to make it easy and safe to change the code later, since you then need change the logic in only one place. Without it, you might fail to change the logic in all the places it occurs.

Tom Anderson

Jul 26, 2010, 1:33:11 PM
On Mon, 26 Jul 2010, Tom McGlynn wrote:

> On Jul 26, 8:01 am, Tom Anderson <t...@urchin.earth.li> wrote:
>
>> But Joshua was talking about using instances of Color, where those
>> instances are singletons (well, flyweights is probably the right term
>> when there are several of them), exposed in static final fields on
>> Color, and
>

> While I agree with Tom's main point here,I am dubious about his
> suggestion for what the static instances should be called in a class
> that only creates a unique closed set of such instances.
>
> I don't think flyweights is the right word. For me flyweights are
> classes where part of the state is externalized for some purpose. This
> is orthogonal to the concept of singletons. E.g., suppose I were running
> a simulation of galaxy mergers of two 100-million-star galaxies. Stars
> differ only in position, velocity and mass. Rather than creating 200
> million Star objects I might create a combination flyweight/singleton
> Star where each method call includes an index that is used to find the
> mutable state in a few external arrays.

I am 90% sure that is absolutely not how 'flyweight' is defined in the
Gang of Four book, from which its use in the programming vernacular
derives. If you want to use a different definition, then that's fine, but
you are of course wrong.

> So is there a better word than flyweights for the extension of a
> singleton to a set with cardinality > 1?

Multipleton.

More seriously, enumeration.

tom

--
Remember Sammy Jankis.

Tom Anderson

Jul 26, 2010, 1:33:39 PM

Ah, i thought it was probably something like that.

Lew

Jul 26, 2010, 1:37:59 PM
Lew wrote:
>> Except that that "mutable state in a few external arrays" still has to
>> maintain state for order 10^8 objects, coordinating position, velocity
>> (relative to ...?) and mass.  So you really aren't reducing very much,
>> except simplicity in the programming model and protection against
>> error.
>

Andreas Leitgeb wrote:
> About 2.4GB are not much?
>  (200 mill * (8bytes plain Object + 4bytes for some ref to each))
>

No, it isn't. Around USD 75 worth of RAM.

> Of course that needs to be taken in relation to the perhaps 8GB of RAM
> still needed for the 5 doubles per star: about 33% of payload would be
> "packaging" costs with a separate instance for each star!
>
> I for myself might choose the perhaps less cpu-efficient (due to all
> the repeated indexing and for the defied locality) and also less simple
> programming model, if it allowed me to solve larger problems with the
> available RAM.
>

With that much data to manage, I'd go with the straightforward object
model and a database or the $75 worth of RAM chips. Or both.

Complicated code is more expensive than memory.

And what about when the model changes, and you want to track star age,
brightness, color, classification, planets, name, temperature,
galactic quadrant, ...? With parallel arrays the complexity and risk
of bugs just goes up and up. With an object model, the overhead of
maintaining that model becomes less and less significant, but the
complexity holds roughly steady.

Saving a single memory-stick's worth of RAM is a false economy.

--
Lew

Andreas Leitgeb

Jul 27, 2010, 6:55:56 AM
Lew <l...@lewscanon.com> wrote:
> Lew wrote:
>>> Except that that "mutable state in a few external arrays" still has to
>>> maintain state for order 10^8 objects, coordinating position, velocity
>>> (relative to ...?) and mass.  So you really aren't reducing very much,
>>> except simplicity in the programming model and protection against
>>> error.
> Andreas Leitgeb wrote:
>> About 2.4GB are not much?
>>  (200 mill * (8bytes plain Object + 4bytes for some ref to each))
> No, it isn't. Around USD 75 worth of RAM.

Except, if the machine is already RAM-stuffed to its limits...

Even if the machine weren't yet fully stocked with RAM, buying more RAM
*and* using the arrays kludge (yes, that's what it is, after all) would
allow even larger galaxies to be simulated.

>> I for myself might choose the perhaps less cpu-efficient (due to all
>> the repeated indexing and for the defied locality) and also less simple
>> programming model, if it allowed me to solve larger problems with the
>> available RAM.
> With that much data to manage, I'd go with the straightforward object
> model and a database or the $75 worth of RAM chips. Or both.

On some deeper level, a relational DB seems to actually use the "separate
arrays" approach, too. Otherwise I cannot explain the relatively low cost
of adding another column to a table of 100 million entries already in it.

> And what about when the model changes, and you want to track star age,
> brightness, color, classification, planets, name, temperature,
> galactic quadrant, ...? With parallel arrays the complexity and risk
> of bugs just goes up and up. With an object model, the overhead of
> maintaining that model becomes less and less significant, but the
> complexity holds roughly steady.

100% agree to these points.

It's like fixing something with duct tape. The result doesn't look very
good, but it may last for a particular use that otherwise would have
required a redesign from scratch and possibly costly further resources.
(Changing objects to separate arrays *may* still take less cost and effort
than switching to some database.)

If collisions and break-ups of stars were also simulated (thus a varying
number of them), then suddenly duct tape wouldn't fix it anymore, anyway...

Lew

Jul 27, 2010, 8:00:03 AM
Andreas Leitgeb wrote:
> On some deeper level, a relational DB seems to actually use the "separate
> arrays" approach, too. Otherwise I cannot explain the relatively low cost
> of adding another column to a table of 100 million entries already in it.

There's a big difference between a database with tens of thousands, maybe far
more, man-hours of theory, development, testing, user feedback, optimization
efforts, commercial competition and evolution behind it, and an ad-hoc use of
in-memory arrays by a solo programmer.

A database system is far, far more than a simple "separate arrays" approach.
There are B[+]-trees, caches, indexes, search algorithms, stored procedures,
etc., etc., etc. Your comment is like saying that "on some deeper level" a
steel-and-glass skyscraper is like the treehouse you built for your kid in the
back yard.

--
Lew

Tom McGlynn

Jul 27, 2010, 9:53:42 AM


Thanks Tom for responding to my real question about the vocabulary
rather than the example. That was just intended to clarify a
distinction between a singleton and my idea of a flyweight.

Here's a bit of what the GOF has to say about flyweights. (Page 196
in my version)....


"A flyweight is a shared object that can be used in multiple contexts
simultaneously. The flyweight acts as an independent object in each
context--it's indistinguishable from an instance of the object that's
not shared.... The key concept here is the distinction between
intrinsic and extrinsic state. Intrinsic state is stored in the
flyweight. It consists of information that's independent of the
flyweight's context, thereby making it shareable. Extrinsic state
depends on and varies with the flyweight's context and therefore can't
be shared. Client objects are responsible for passing extrinsic state
to the flyweight when it needs it."

That's reasonably close to what I had in mind. In my simple example
stars may share some common state (e.g., age), but the information
about the position and velocity is extrinsic and supplied when the
object is used. In a more realistic example I might have multiple
flyweights for different types of stars.

Getting back to my original concern, I don't think enumeration is a
good word for the concept either. Enumerations are often used for an
implementation of the basis set -- favored in Java by special syntax.
However the word enumeration strongly suggests a list. In general
the set of special values may have a non-list relationship (e.g., they
could form a hierarchy). I like the phrase 'basis set' I used above
but that suggests that other elements can be generated by combining
the elements of the basis so it's not really appropriate either.

Regards,
Tom McGlynn

Alan Gutierrez

Jul 27, 2010, 1:34:27 PM

In other words, nobody ever got fired for buying IBM.

Alan Gutierrez

Jul 27, 2010, 2:11:56 PM
Andreas Leitgeb wrote:
> Lew <l...@lewscanon.com> wrote:
>> Lew wrote:
>>>> Except that that "mutable state in a few external arrays" still has to
>>>> maintain state for order 10^8 objects, coordinating position, velocity
>>>> (relative to ...?) and mass. So you really aren't reducing very much,
>>>> except simplicity in the programming model and protection against
>>>> error.
>> Andreas Leitgeb wrote:
>>> About 2.4GB are not much?
>>> (200 mill * (8bytes plain Object + 4bytes for some ref to each))
>> No, it isn't. Around USD 75 worth of RAM.
>
> Except, if the machine is already RAM-stuffed to its limits...
>
> Even if the machine wasn't yet fully RAM'ed, then buying more RAM
> *and* using the arrays-kludge(yes, that's it, afterall) would allow
> even larger galaxies to be simulated.

The "RAM is cheaper than programmer time" argument is useful to salt the
tail of the newbie who seeks to dive down every micro-optimization
rabbit hole that he comes across on the path to the problems that truly
deserve such intense consideration. You have to admire the moxie of the
newbie who wants to catenate last name first as fast as possible, but
you explain to them that there are plenty of dragons to slay further
down the road.

It is not a good argument for someone who brings a problem that is truly
limited by available memory. Memory management is an appropriate
consideration for the problem. Memory management is the problem.

Memory procurement is the non-programmer solution. Throw money at it.
Scale up rather than scaling out, because we can scale up with cash, but
scaling out requires programmers who understand algorithms.

You're right that scaling up hits a foreseeable limit. I like to have
the limitations of my program be unforeseeable. That is, if I'm going to
read something into memory, say, every person in the world who would
loan money to me personally without asking questions, I'd like to know
that hitting the limits of the finite resource employed on a
contemporary computer system correlates to a situation in reality that is
unimaginable.

Moore's Law does not excuse brute force.

Which is why I am similarly taken aback to hear RAM prices quoted for
something that has obvious solutions in plain old Java.

>>> I for myself might choose the perhaps less cpu-efficient (due to all
>>> the repeated indexing and for the defied locality) and also less simple
>>> programming model, if it allowed me to solve larger problems with the
>>> available RAM.
>> With that much data to manage, I'd go with the straightforward object
>> model and a database or the $75 worth of RAM chips. Or both.
>
> On some deeper level, a relational DB seems to actually use the "separate
> arrays" approach, too. Otherwise I cannot explain the relatively low cost
> of adding another column to a table of 100 million entries already in it.

On some deeper level, a relational database through an object relational
mapping layer will be paging information in and out of memory, on and
off of disk, as you need it. That is the feature you need to address
your memory problem.

Lately, I've been mucking about with `MappedByteBuffer`, so I imagine
that for your (hypothetical) problem of modeling the Galaxy, you would keep
the primitives you describe in the `MappedByteBuffer` and create objects
from them as needed. This is not `Flyweight` to my mind. In a flyweight you
keep objects that map to a finite set of values, and these values are
assembled into a larger structure in an infinite number of permutations.
These atomic components exist within the larger structure, but they are
reused. Interned `String` is a flyweight to my mind.

I'm not sure what the pattern is for the short term objectification of a
record, but that is a lot of what Hibernate is about. Making objecty
that which is stringy, just long enough for you to do your CRUD in the
security of your type-safe world.

>> And what about when the model changes, and you want to track star age,
>> brightness, color, classification, planets, name, temperature,
>> galactic quadrant, ...? With parallel arrays the complexity and risk
>> of bugs just goes up and up. With an object model, the overhead of
>> maintaining that model becomes less and less significant, but the
>> complexity holds roughly steady.
>
> 100% agree to these points.

You create a `Star` object that can read the information from a
`MappedByteBuffer` at a particular index, and you can simply change the
`read` and `write` method of the star.
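
A rough sketch of such a facade, assuming fixed-length records of five doubles (x, y, vx, vy, mass). `StarRecord` and its methods are hypothetical names. The code takes a plain `ByteBuffer`; a `MappedByteBuffer` obtained from `FileChannel.map` is a `ByteBuffer` too and could be passed in unchanged.

```java
import java.nio.ByteBuffer;

// A short-lived view onto one fixed-length record inside a buffer.
final class StarRecord {
    static final int RECORD_BYTES = 5 * Double.BYTES; // x, y, vx, vy, mass

    private final ByteBuffer buffer;
    private final int offset;

    StarRecord(ByteBuffer buffer, int index) {
        this.buffer = buffer;
        // A single buffer is indexed by int, so very large data sets would
        // need several buffers; that bookkeeping is left out of this sketch.
        this.offset = index * RECORD_BYTES;
    }

    // Absolute reads: these do not move the buffer's position.
    double x()    { return buffer.getDouble(offset); }
    double y()    { return buffer.getDouble(offset + Double.BYTES); }
    double mass() { return buffer.getDouble(offset + 4 * Double.BYTES); }

    // Absolute writes of the whole record.
    void write(double x, double y, double vx, double vy, double mass) {
        buffer.putDouble(offset, x);
        buffer.putDouble(offset + Double.BYTES, y);
        buffer.putDouble(offset + 2 * Double.BYTES, vx);
        buffer.putDouble(offset + 3 * Double.BYTES, vy);
        buffer.putDouble(offset + 4 * Double.BYTES, mass);
    }
}
```

If the model later gains a field, only `RECORD_BYTES` and the read and write methods change; callers keep working with `StarRecord`.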

You've reached down to the deeper level of the ORM+RDBMS stack and
extracted the only design pattern you need to address the problem of
reading the Universe into memory.

Martin Gregorie

Jul 27, 2010, 2:58:25 PM
On Tue, 27 Jul 2010 12:34:27 -0500, Alan Gutierrez wrote:

>
> In other words, nobody ever got fired for buying IBM.
>

Regardless of what you might think of their business methods, and in the
past they didn't exactly smell of roses, their software quality control
and their hardware build quality are both hard to beat. I've used S/38
and AS/400 quite a bit and never found bugs in their system software or
lost work time due to hardware problems.

For elegant systems design ICL had them beat hands down, but although ICL
quality was OK by IT standards the IBM kit was more reliable.

IME anyway.


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |

Lew

Jul 27, 2010, 8:31:06 PM
Alan Gutierrez wrote:
> The "RAM is cheaper than programmer time" argument is useful to salt the
> tail of the newbie who seeks to dive down every micro-optimization
> rabbit hole that he comes across on the path to the problems that truly
> deserve such intense consideration. You have to admire the moxie of the
> newbie who wants to catenate last name first as fast as possible, but
> you explain to them that there are plenty of dragons to slay further
> down the road.
>
> It is not a good argument for someone who brings a problem that is truly
> limited by available memory. Memory management is an appropriate
> consideration for the problem. Memory management is the problem.
>
> Memory procurement is the non-programmer solution. Throw money at it.
> Scale up rather than scaling out, because we can scale up with cash, but
> scaling out requires programmers who understand algorithms.
>
> You're right that scaling up hits a foreseeable limit. I like to have
> the limitations of my program be unforeseeable. That is, if I'm going to
> read something into memory, say, every person in the world who would
> loan money to me personally without asking questions, I'd like to know
> that hitting the limits of the finite resource employed on a
> contemporary computer system correlates to a situation in reality that
> is unimaginable.
>
> Moore's Law does not excuse brute force.
>
> Which is why I am similarly taken aback to hear RAM prices quoted for
> something that has obvious solutions in plain old Java.

I'm pretty surprised to hear a clean object model described as "brute force",
but OK. The point of a spirited discussion is to expose all sides of an issue.

I'd go with clean design first, which to my mind an object model is, then play
around with non-expandable, hard-to-maintain, bug-prone parallel-array
solutions if the situation truly demanded it, but I just don't see that demand
in the scenario under discussion.

--
Lew

Alan Gutierrez

Jul 27, 2010, 9:10:01 PM

The scenario under discussion is, I want to do something that will reach
the limits of system memory. Your solution is to procure memory. My
solution is to use virtual memory.

Again, it seems to me that `MappedByteBuffer` and a bunch of little
facades to the contents of the `MappedByteBuffer` is a preferred
solution that respects memory usage. The design is as expandable,
easy-to-maintain and bug free as a great big array of objects, without
having to think much about memory management at all.

I don't know where "parallel" arrays come into play in the problem
described. I'm imagining that, if the records consist entirely of
numeric values, that you can treat them as fixed length records.

Alan Gutierrez

Jul 27, 2010, 9:15:06 PM

I wasn't really picking on IBM.

I was addressing the fallacy of the appeal to authority. The argument
that a monolithic system contains institutionalized knowledge that is
superior to any other solution offered to a problem that the monolithic
system could conceivably address.

Lew

Jul 27, 2010, 9:25:44 PM
Alan Gutierrez wrote:
> The scenario under discussion is, I want to do something that will reach
> the limits of system memory. Your solution is procure memory. My
> solution is to use virtual memory.
>
> Again, it seems to me that `MappedByteBuffer` and a bunch of little
> facades to the contents of the `MappedByteBuffer` is a preferred
> solution that respects memory usage. The design is as expandable,
> easy-to-maintain and bug free as a great big array of objects, without
> having to think much about memory management at all.

I like that idea.

> I don't know where "parallel" arrays come into play in the problem

Did you read this thread? Like, say, yesterday, when Tom McGlynn wrote:
>>> E.g., suppose I were running a simulation of galaxy mergers
>>> of two 100-million-star galaxies. Stars differ only in position,
>>> velocity and mass. Rather than creating 200 million Star objects
>>> I might create a combination flyweight/singleton Star where each

>>> method call includes an index that is used to find the mutable
>>> state in a few external arrays.

Alan Gutierrez wrote:
> described. I'm imagining that, if the records consist entirely of
> numeric values, that you can treat them as fixed length records.

--
Lew

Alan Gutierrez

Jul 27, 2010, 10:03:32 PM
Lew wrote:
> Alan Gutierrez wrote:
>> The scenario under discussion is, I want to do something that will reach
>> the limits of system memory. Your solution is procure memory. My
>> solution is to use virtual memory.
>>
>> Again, it seems to me that `MappedByteBuffer` and a bunch of little
>> facades to the contents of the `MappedByteBuffer` is a preferred
>> solution that respects memory usage. The design is as expandable,
>> easy-to-maintain and bug free as a great big array of objects, without
>> having to think much about memory management at all.
>
> I like that idea.

Oh, yeah! Well another thing mister... You, I, uh, but... Wait...

Well, golly gee. Thanks.

I'd run off to write some code to illustrate my point.

package comp.lang.java.programmer;

import java.nio.ByteBuffer;

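public interface ElementIO<T> {
    // `index` is a byte offset into the buffer, not an element number;
    // `BigList` multiplies the element number by `getRecordLength()`.
    public void write(ByteBuffer bytes, int index, T item);
    public T read(ByteBuffer bytes, int index);
    public int getRecordLength();
}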

package comp.lang.java.programmer;

import java.nio.MappedByteBuffer;
import java.util.AbstractList;

public class BigList<T> extends AbstractList<T> {
    private final ElementIO<T> io;

    private final MappedByteBuffer bytes;

    private int size;

    public BigList(ElementIO<T> io, MappedByteBuffer bytes, int size) {
        this.io = io;
        this.bytes = bytes;
        this.size = size;
    }

    // result is not `==` to value `set` so only use element type that
    // defines `equals` (and `hashCode`).
    @Override
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException();
        }
        return io.read(bytes, index * io.getRecordLength());
    }

    @Override
    public T set(int index, T item) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException();
        }
        T result = get(index);
        io.write(bytes, index * io.getRecordLength(), item);
        return result;
    }

    @Override
    public void add(int index, T element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException();
        }
        size++;
        // shift elements [index, size - 2] up one slot, last one first
        for (int i = size - 2; i >= index; i--) {
            set(i + 1, get(i));
        }
        set(index, element);
    }

    // and `remove` and the like, but of course only `get`, `set`
    // and `add` to the very end can be counted on to be performant.

    @Override
    public int size() {
        return size;
    }
}

Create the above with however much `MappedByteBuffer` you need for your
Universe. Define `ElementIO` to read and write your `Star` type. Each
time you read a `Star` in `ElementIO` you do mint a new `Star`, so that
is like Flyweight in some way, but seems like a little `Bridge` or
`Adapter`.

If you shut down soft and record the size, you can reopen the list. If
you change `Star` you need to update `ElementIO` and rebuild your
list, but probably not your code that references `Star` or the
`BigList`.
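
To make "define `ElementIO` to read and write your `Star` type" concrete, here is a minimal sketch. The `Star` class, its four fields and the name `StarIO` are assumptions made up for illustration; the interface is repeated only so the sketch compiles on its own.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

// Same shape as the ElementIO interface above, repeated so this compiles alone.
interface ElementIO<T> {
    void write(ByteBuffer bytes, int index, T item);
    T read(ByteBuffer bytes, int index);
    int getRecordLength();
}

// Hypothetical immutable value type; the field set is an assumption.
final class Star {
    final double x, y, z, mass;

    Star(double x, double y, double z, double mass) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.mass = mass;
    }

    // Value equality, because every read mints a fresh instance.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Star)) {
            return false;
        }
        Star s = (Star) o;
        return Double.compare(x, s.x) == 0 && Double.compare(y, s.y) == 0
                && Double.compare(z, s.z) == 0 && Double.compare(mass, s.mass) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y, z, mass);
    }
}

// Fixed-length record of four doubles. `index` is a byte offset,
// matching the way BigList calls read and write.
final class StarIO implements ElementIO<Star> {
    static final int RECORD_LENGTH = 4 * Double.BYTES;

    @Override
    public void write(ByteBuffer bytes, int index, Star s) {
        bytes.putDouble(index, s.x);
        bytes.putDouble(index + Double.BYTES, s.y);
        bytes.putDouble(index + 2 * Double.BYTES, s.z);
        bytes.putDouble(index + 3 * Double.BYTES, s.mass);
    }

    @Override
    public Star read(ByteBuffer bytes, int index) {
        return new Star(
                bytes.getDouble(index),
                bytes.getDouble(index + Double.BYTES),
                bytes.getDouble(index + 2 * Double.BYTES),
                bytes.getDouble(index + 3 * Double.BYTES));
    }

    @Override
    public int getRecordLength() {
        return RECORD_LENGTH;
    }
}
```

A `MappedByteBuffer` is a `ByteBuffer`, so the same calls apply. One caveat: buffer offsets are `int`, so a single buffer tops out just under 2 GB, and 200 million 32-byte records would have to be spread over several buffers.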

>> I don't know where "parallel" arrays come into play in the problem
>
> Did you read this thread? Like, say, yesterday, when Tom McGlynn wrote:
>>>> E.g., suppose I were running a simulation of galaxy mergers
>>>> of two 100-million-star galaxies. Stars differ only in position,
>>>> velocity and mass. Rather than creating 200 million Star objects
>>>> I might create a combination flyweight/singleton Star where each
>>>> method call includes an index that is used to find the mutable
>>>> state in a few external arrays.

I see it now. Looking for the word parallel in the long thread didn't
find it for me, but that's what is described here. That does sound a bit
fragile.

Anyway, it seems like there is a middle ground between ORM+RDBMS and
everything in memory. My hobby horse. (Rock, rock, rock.)

Andreas Leitgeb

Jul 28, 2010, 4:49:18 AM
Lew <no...@lewscanon.com> wrote:
> I'd go with clean design first, which to my mind an object model is,
> then play around with non-expandable, hard-to-maintain, bug-prone
> parallel-array solutions if the situation truly demanded it,

Not sure, whether this "if the situation truly demanded it" is actually
an "if (false)" for you, but in case it isn't, then we reached agreement.

We might still disagree for certain real situations, though ;-)

Martin Gregorie

Jul 28, 2010, 6:48:11 AM
On Tue, 27 Jul 2010 20:15:06 -0500, Alan Gutierrez wrote:

> I wasn't really picking on IBM.
>

Fair point. The 'nobody got fired for buying IBM' hit my reaction button.
It had rather dire connotations in the past, as in 'if you DON'T buy IBM,
our senior execs will visit your senior execs and you *will* be fired and
put on our black list'. I knew one or two people whose bosses had
received that visit when they bought 3rd party disks.

> I was addressing the fallacy of the appeal to authority. The argument
> that a monolithic system contains institutionalized knowledge that is
> superior to any other solution offered to a problem that the monolithic
> system could conceivably address.
>

Sure: a myth that's perpetrated by said monoliths and bought into by
their adherents: it saves the adherents from having to think.

Tom McGlynn

Jul 28, 2010, 7:23:59 AM
On Jul 27, 9:25 pm, Lew <no...@lewscanon.com> wrote:

> Alan Gutierrez wrote:
> Did you read this thread? Like, say, yesterday, when Tom McGlynn wrote:
>
> >>> E.g., suppose I were running a simulation of galaxy mergers
> >>> of two 100-million-star galaxies. Stars differ only in position,
> >>> velocity and mass. Rather than creating 200 million Star objects
> >>> I might create a combination flyweight/singleton Star where each
> >>> method call includes an index that is used to find the mutable
> >>> state in a few external arrays.
> Alan Gutierrez wrote:
> > described. I'm imagining that, if the records consist entirely of
> > numeric values, that you can treat them as fixed length records.
>


I'm a little intrigued by the discussion of the appropriate choices for
the architecture for an n-body calculation by a group which likely has
little experience, if any, in the field. Note that this has been an area
of continuous study for substantially longer than the concept of
relational databases has existed: the first N-body calculations by
digital computers were made in the 1950's. My own experience here is
woefully out of date, but below are a couple of reasons why I might
consider an architecture similar to what I gave as an illustration
earlier. The motivation for that example was to illustrate the
orthogonality of my understanding of singleton and flyweight, but there
could be reasons to go this route. E.g.,

Direct n-body calculations need to compute the distance between pairs of
objects. The distances between nearby objects need to be calculated
orders of magnitude more frequently than between distant objects. If the
data can be organized such that stars that are nearby in [simulated]
space tend to be nearby in memory, then cache misses may be substantially
reduced. This can improve performance by an order of magnitude or more.
In Java the only structure you have available that allows for managing
this (since nearby pairs change with time) is a primitive array. Java
gives no way, as far as I know, to manage the memory locations of
distinct objects.

Since the actual n-body calculation will often have been highly
optimized in some other language, the role of Java code in an n-body
system may be to provide initial conditions to, show the status of, or
analyze the results of the calculation. Communication with the core
calculation might use JNI, shared memory, or other I/O techniques. In
each of these the fact that one-dimensional primitive arrays share a
common model between languages makes them an attractive way of passing
the data.


Note that I'm not saying that this approach must or even should be used:
just that it can make sense in realistic circumstances. However
personally -- and given the GOF's endorsement there is some support more
broadly -- I don't see the use of this kind of simple flyweight as a
particularly odious approach.
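
For readers who skipped the earlier illustration, the arrangement I had in mind looks roughly like the sketch below. The names `StarField` and `StarFlyweight` and the choice of fields are invented for this post, and nothing here is tuned for a real simulation.

```java
// Mutable state lives in parallel primitive arrays: element i of each
// array belongs to star i. All names here are hypothetical.
final class StarField {
    final double[] x;
    final double[] y;
    final double[] z;
    final double[] mass;

    StarField(int n) {
        x = new double[n];
        y = new double[n];
        z = new double[n];
        mass = new double[n];
    }
}

// One shared, stateless-per-star object; every call names the star by index.
final class StarFlyweight {
    private final StarField field;

    StarFlyweight(StarField field) {
        this.field = field;
    }

    double mass(int i) {
        return field.mass[i];
    }

    // Euclidean distance between stars i and j.
    double distance(int i, int j) {
        double dx = field.x[i] - field.x[j];
        double dy = field.y[i] - field.y[j];
        double dz = field.z[i] - field.z[j];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Displace star i by the given offsets.
    void move(int i, double dx, double dy, double dz) {
        field.x[i] += dx;
        field.y[i] += dy;
        field.z[i] += dz;
    }
}
```

The arrays are also what you would hand across JNI or write to shared memory, which is the interoperability argument above.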


Regards,
Tom McGlynn

Lew

Jul 28, 2010, 8:58:41 AM

Yeah, because people with multi-million dollar/euro/yuan budgets never, ever
think about how they spend their money, and there's just no chance that IBM
got where it is by not delivering what they promise for mission-critical systems.

I thought this was supposed to be a group of intelligent, educated, technical
people.

--
Lew

Martin Gregorie

Jul 28, 2010, 9:41:11 AM

I wasn't even thinking about the big shots (and remember the 360/195 that
never was?) - more when back in the day the Big Blue SEs were considered
to be the gods of system design and implementation.

Robert Klemme

Jul 28, 2010, 10:45:21 AM
On 25.07.2010 16:39, Lew wrote:
> Magnus Warker a écrit :
>
> Don't top-post!

>
>>> It's the readability of the code.
>>>
>>> With constant classes I can write something like this:
>>>
>>> public void setColor (Color color)
>>> {
>>> if (color == Color.WHITE)
>
> No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.

That depends on the rest of Color's class definition (whether there are
public constructors, whether the class is serializable and whether
custom deserialization is in place - all stuff someone who makes this an
enum does not have to take care of manually btw). For enums (whether as
language construct or properly implemented although this is a bad idea
since Java 5 IMHO) I would rather use "==" here because it is more
efficient and stands out visually.

> Since enums are classes, they can contain behavior. That means you won't
> need if-chains nor case constructs to select behavior; just invoke the
> method directly from the enum constant itself and voilà!

One can even have custom code *per enum value* which makes implementing
state patterns a breeze. See

http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html
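
A minimal sketch of what per-value code can look like, using the chess colours from the start of the thread. The methods `opponent` and `pawnDirection` are invented for illustration and are not from anyone's actual code.

```java
// Each constant supplies its own body for the abstract methods, so callers
// never need an if-chain or a switch on the colour.
enum Color {
    WHITE {
        @Override
        Color opponent() {
            return BLACK;
        }

        @Override
        int pawnDirection() {
            return 1;
        }
    },
    BLACK {
        @Override
        Color opponent() {
            return WHITE;
        }

        @Override
        int pawnDirection() {
            return -1;
        }
    };

    // The side that moves next; effectively the state transition.
    abstract Color opponent();

    // Direction in which this side's pawns advance along the ranks.
    abstract int pawnDirection();
}
```

With this in place `color.opponent()` gives the side to move next, and comparing with `==` stays safe because there is exactly one instance per constant.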

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Lew

Jul 28, 2010, 12:13:32 PM
On Jul 28, 10:45 am, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 25.07.2010 16:39, Lew wrote:
>
> > Magnus Warker a écrit :
>
> > Don't top-post!
>
> >>> It's the readability of the code.
>
> >>> With constant classes I can write something like this:
>
> >>> public void setColor (Color color)
> >>> {
> >>> if (color == Color.WHITE)
>
> > No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.
>
> That depends on the rest of Color's class definition (whether there are
> public constructors, whether the class is serializable and whether
> custom deserialization is in place - all stuff someone who makes this an
> enum does not have to take care of manually btw).  For enums (whether as
> language construct or properly implemented although this is a bad idea
> since Java 5 IMHO) I would rather use "==" here because it is more
> efficient and stands out visually.
>

Yeah, I was corrected on that some time upthread and already
acknowledged the error. I had it in my head that we were discussing
String constants rather than final instances of the 'Color' class.

As for using == for enums, that is guaranteed to work by the language
and is best practice.

As for writing type-safe enumeration classes that are not enums,
there's nothing wrong with that if you are in one of those corner use
cases where Java features don't quite give you what you want. The
only one I can think of is inheritance from an enumeration. However,
I agree with you in 99.44% of cases - it's almost always bad to extend
an enumeration and Java enums pretty much always do enough to get the
job done. So when one is tempted to write a type-safe enumeration
class that is not an enum, one is almost certainly making a design
mistake and an implementation faux-pas.

> One can even have custom code *per enum value* which makes implementing
> state patterns a breeze.

I have not so far encountered a real-life situation where a 'switch'
on an enum value is cleaner or more desirable than using enum
polymorphism. For state machines I'm more likely to use a Map (or
EnumMap) to look up a consequent state than a switch. Have any of you
all found good use cases for a 'switch' on an enum?
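
By way of illustration, the EnumMap lookup I mean is roughly the following. The `Light` enum and `LightMachine` class are made up for this post; a real state machine would likely key on an event as well as the current state.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical states for a toy state machine.
enum Light { RED, GREEN, AMBER }

final class LightMachine {
    // Transition table: current state -> consequent state.
    private static final Map<Light, Light> NEXT = new EnumMap<>(Light.class);

    static {
        NEXT.put(Light.RED, Light.GREEN);
        NEXT.put(Light.GREEN, Light.AMBER);
        NEXT.put(Light.AMBER, Light.RED);
    }

    private LightMachine() {
    }

    // Look up the state that follows `current`; no switch required.
    static Light next(Light current) {
        return NEXT.get(current);
    }
}
```

The table is data, so it can be built from configuration or tested in isolation, which is part of why I prefer it to a 'switch'.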

--
Lew

Robert Klemme

Jul 29, 2010, 1:31:38 AM
On 07/28/2010 06:13 PM, Lew wrote:
> On Jul 28, 10:45 am, Robert Klemme<shortcut...@googlemail.com> wrote:
>> On 25.07.2010 16:39, Lew wrote:
>>
>>> Magnus Warker a écrit :
>>
>>>>> It's the readability of the code.
>>
>>>>> With constant classes I can write something like this:
>>
>>>>> public void setColor (Color color)
>>>>> {
>>>>> if (color == Color.WHITE)
>>
>>> No. That would be a bug. You'd write 'if ( color.equals( Color.WHITE ) )'.
>>
>> That depends on the rest of Color's class definition (whether there are
>> public constructors, whether the class is serializable and whether
>> custom deserialization is in place - all stuff someone who makes this an
>> enum does not have to take care of manually btw). For enums (whether as
>> language construct or properly implemented although this is a bad idea
>> since Java 5 IMHO) I would rather use "==" here because it is more
>> efficient and stands out visually.
>
> Yeah, I was corrected on that some time upthread and already
> acknowledged the error. I had it in my head that we were discussing
> String constants rather than final instances of the 'Color' class.

I hadn't made it through all branches of the thread. I should have read
to the end before posting. Sorry for the additional noise.

> As for using == for enums, that is guaranteed to work by the language
> and is best practice.
>
> As for writing type-safe enumeration classes that are not enums,
> there's nothing wrong with that if you are in one of those corner use
> cases where Java features don't quite give you what you want. The
> only one I can think of is inheritance from an enumeration. However,
> I agree with you in 99.44% of cases - it's almost always bad to extend
> an enumeration and Java enums pretty much always do enough to get the
> job done. So when one is tempted to write a type-safe enumeration
> class that is not an enum, one is almost certainly making a design
> mistake and an implementation faux-pas.
>
>> One can even have custom code *per enum value* which makes implementing
>> state patterns a breeze.
>
> I have not so far encountered a real-life situation where a 'switch'
> on an enum value is cleaner or more desirable than using enum
> polymorphism. For state machines I'm more likely to use a Map (or
> EnumMap) to look up a consequent state than a switch. Have any of you
> all found good use cases for a 'switch' on an enum?

Personally I cannot remember a switched usage of enum. The only reasons
to do it that come to mind right now are laziness and special
environments (e.g. where you must reduce the number of classes defined
for resource reasons, maybe Java mobile). But then again, you would
probably rather use ints instead of an enum type...

Tom Anderson

Jul 29, 2010, 3:49:59 AM
On Wed, 28 Jul 2010, Lew wrote:

>> One can even have custom code *per enum value* which makes implementing
>> state patterns a breeze.
>
> I have not so far encountered a real-life situation where a 'switch' on
> an enum value is cleaner or more desirable than using enum polymorphism.
> For state machines I'm more likely to use a Map (or EnumMap) to look up
> a consequent state than a switch. Have any of you all found good use
> cases for a 'switch' on an enum?

We've done it once. We have some internationalisation code where,
approximating wildly, items can be internationalised by locale or by
country (i.e. shared across all languages in a country - prices are the
classic example). We have an enum called something like LocalisationType
with values LOCALE and COUNTRY to identify which is being done. We do have
some polymorphism around it (not actually in the enum, although for the
purposes of this story, that's not interesting), related to the core
business of finding out the right localisation key (locale code or country
code) and resolving the item value for it.

But we also have other bits of code which are not core functionality which
need to do different things for locale- and country-keyed items. The one
that springs to mind is a locale copying utility - if you're copying fr_FR
into CA to create fr_CA (obviously, only as a starting point for manual
editing), you want to copy locale-mapped items (which will be
French-language text) but not location-mapped items (which will be prices
in euros and so on). We could have put a shouldCopyWhenCopyingLocale()
method on the enum, or even a copyIntoNewLocale() method which did nothing
in the location case, but this seemed like polluting the enum with
behaviour that belonged in the copier. So we put a switch in the copier
instead.

There's probably a more OO-clean way to break the decision up, but this
was simple and worked, so that was good enough for us.
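
Stripped of everything real, the shape of it is something like the sketch below. The names are approximations invented for this post, not our actual code.

```java
// Hypothetical stand-in for the enum described above.
enum LocalisationType { LOCALE, COUNTRY }

final class LocaleCopier {
    private LocaleCopier() {
    }

    // The copy decision stays in the copier instead of becoming a method
    // on the enum, so the enum is not polluted with copier behaviour.
    static boolean shouldCopy(LocalisationType type) {
        switch (type) {
            case LOCALE:
                return true;   // language-specific text, worth copying
            case COUNTRY:
                return false;  // prices and the like belong to the old country
            default:
                throw new AssertionError("unhandled type: " + type);
        }
    }
}
```

The `default` branch is there so that adding a third value fails loudly rather than silently falling through.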

tom

--
Know who said that? Fucking Terrorvision, that's who. -- D

Tom Anderson

Jul 29, 2010, 7:20:21 AM
On Tue, 27 Jul 2010, Tom McGlynn wrote:

> On Jul 26, 1:33 pm, Tom Anderson <t...@urchin.earth.li> wrote:
>> On Mon, 26 Jul 2010, Tom McGlynn wrote:
>>> On Jul 26, 8:01 am, Tom Anderson <t...@urchin.earth.li> wrote:
>>
>>>> But Joshua was talking about using instances of Color, where those
>>>> instances are singletons (well, flyweights is probably the right term
>>>> when there are several of them)
>>>

>>> I don't think flyweights is the right word.  For me flyweights are
>>> classes where part of the state is externalized for some purpose. This
>>> is orthogonal to the concept of singletons. E.g., suppose I were
>>> running a simulation of galaxy mergers of two 100-million-star
>>> galaxies.  Stars differ only in position, velocity and mass.  Rather
>>> than creating 200 million Star objects I might create a combination
>>> flyweight/singleton Star where each method call includes an index that
>>> is used to find the mutable state in a few external arrays.
>>
>> I am 90% sure that is absolutely not how 'flyweight' is defined in the
>> Gang of Four book
>

> Here's a bit of what the GOF has to say about flyweights. (Page 196
> in my version)....
>
> "A flyweight is a shared object that can be used in multiple contexts
> simultaneously. The flyweight acts as an independent object in each
> context--it's indistinguishable from an instance of the object that's
> not shared.... The key concept here is the distinction between intrinsic
> and extrinsic state. Intrinsic state is stored in the flyweight. It
> consists of information that's independent of the flyweight's context,
> thereby making it shareable. Extrinsic state depends on and varies with
> the flyweight's context and therefore can't be shared. Client objects
> are responsible for passing extrinsic state to the flyweight when it
> needs it."
>
> That's reasonably close to what I had in mind.

Yes, point taken. I'm still not happy with your usage, though.

IIRC, the example in GoF is of a Character class in a word processor. So,
a block of text is a sequence of Character objects. Each has properties
like width, height, vowelness, etc and methods like paintOnScreen. But
because every lowercase q behaves much the same as every other lowercase
q, rather than having a separate instance for every letter in the text, we
have one for every distinct letter.

The extrinsic state in this example is the position in the text, the
typeface, the style applied to the paragraph, etc. Certainly, things that
are not stored in the Character. But also not things that intrinsically
belong in the Character anyway; rather, things inherited from enclosing
objects.
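
A rough sketch of that arrangement, from memory and not the book's code; `Glyph`, `GlyphFactory` and `describe` are names invented here:

```java
import java.util.HashMap;
import java.util.Map;

// One shared instance per distinct character.
final class Glyph {
    private final char symbol; // intrinsic state, identical for every use

    Glyph(char symbol) {
        this.symbol = symbol;
    }

    // Extrinsic state (where it sits, how big it is) is passed in by the
    // caller on each call and never stored in the glyph.
    String describe(int column, int pointSize) {
        return symbol + "@" + column + "/" + pointSize + "pt";
    }
}

final class GlyphFactory {
    private final Map<Character, Glyph> cache = new HashMap<>();

    // Returns the single shared Glyph for `c`, creating it on first use.
    Glyph get(char c) {
        return cache.computeIfAbsent(c, k -> new Glyph(k));
    }
}
```

Every lowercase q in the document is the same object; only the arguments to `describe` differ from one occurrence to the next.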

Whereas in your case, the array offset *is* something intrinsic to the
Star. If it had been something else, say the coordinates of the centre of
mass of the local cluster, then i'd agree that that was Flyweightish. But
i'm not so sure about the array index.

It might well be that i have an over-narrow idea of what a Flyweight is.

> Getting back to my original concern, I don't think enumeration is a good
> word for the concept either. Enumerations are often used for an
> implementation of the basis set -- favored in Java by special syntax.
> However the word enumeration strongly suggests a list. In general the
> set of special values may have a non-list relationship (e.g., they could
> form a hierarchy). I like the phrase 'basis set' I used above but that
> suggests that other elements can be generated by combining the elements
> of the basis so it's not really appropriate either.

I can't think of a good word for this. Do we need one? What are some
examples of this pattern in the wild?

tom

--
If you're going to print crazy, ridiculous things, you might as well
make them extra crazy. -- Mark Rein

Tom Anderson

Jul 29, 2010, 7:29:38 AM
On Wed, 28 Jul 2010, Tom McGlynn wrote:

> I'm a little intrigued by the discussion of the appropriate choices for
> the architecture for an n-body calculation by a group which likely has
> little experience if any in the field.
>

> Direct n-body calculations need to compute the distance between pairs of
> objects. The distances between nearby objects need to be calculated
> orders of magnitude more frequently than between distant objects. If
> the data can be organized such that nearby in [simulated] space stars
> tend to be nearby in memory, then cache misses may be substantially
> reduced. This can improve performance by an order of magnitude or more.
> In Java the only structure you have available that allows for managing
> this (since nearby pairs change with time) is a primitive array. Java
> gives no way, as far as I know, to manage the memory locations of
> distinct objects.

True, although it doesn't prevent cache locality - whereas the parallel
arrays approach immediately rules out locality of the coordinates of a
single star, because they'll be in different arrays. If you want locality,
you'd have to pack the values of all three coordinates into one big array,
which of course is possible.
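
A minimal sketch of the packed layout, with an invented class name and accessors, assuming three coordinates per star stored back to back:

```java
// Interleaved layout: x, y, z of star i sit next to each other at
// offsets 3*i, 3*i + 1 and 3*i + 2 of one array.
final class PackedPositions {
    private static final int STRIDE = 3;
    private final double[] xyz;

    PackedPositions(int n) {
        xyz = new double[n * STRIDE];
    }

    void set(int i, double x, double y, double z) {
        int base = i * STRIDE;
        xyz[base] = x;
        xyz[base + 1] = y;
        xyz[base + 2] = z;
    }

    double x(int i) {
        return xyz[i * STRIDE];
    }

    double y(int i) {
        return xyz[i * STRIDE + 1];
    }

    double z(int i) {
        return xyz[i * STRIDE + 2];
    }

    // Squared distance between stars i and j; avoids the sqrt when only
    // comparisons are needed.
    double distanceSquared(int i, int j) {
        double dx = x(i) - x(j);
        double dy = y(i) - y(j);
        double dz = z(i) - z(j);
        return dx * dx + dy * dy + dz * dz;
    }
}
```

Reading one star now touches 24 contiguous bytes rather than three separate arrays, and neighbouring indices stay neighbours in memory.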

The dual of the fact that java doesn't let you control locality is that
JVMs are free to control it. There is research going back at least ten
years now into allocation and GC strategies that improve locality. Indeed,
for some popular kinds of collectors, locality is a standard side-effect -
any moving collector where objects are moved in a depth-first traversal of
(some subgraph of) the object graph will tend to put objects shortly after
some other object that refers to them, and thus also close to objects that
are also referred to by that object. It may not help enormously for
mesh-structured object graphs, but it works pretty well for the trees that
are common in real life. If these Stars are held in an octree, for
example, we might expect decent locality.

Tom McGlynn

Jul 29, 2010, 8:51:33 AM

Putting everything in a single array is certainly fine. It doesn't
change the basic flyweight idea here.

Since in most n-body codes stars are never destroyed, garbage
collection as such may not come into play much. I'd be curious how
much storage reallocation is done in current JVMs where there is very
little creation or destruction of objects. Of course if one built a
changeable star hierarchy then one would be continually creating and
destroying branch nodes of the hierarchy (as stars move) even though
the leaf nodes would be immortal. Perhaps the churn of branch nodes
would be enough for the gc to move the leafs appropriately.

It is certainly possible that a clever enough JVM could address this
automatically and efficiently. The n-body code may have the advantage
in that it can do things predictively rather than reactively, but a
system approach would likely be able to adapt to changes in the local
environment better.

But I'm not trying to persuade people that using flyweights in the
sense that I suggested in my example is necessarily the right thing to
do in all circumstances, merely to note that it may be a rational or
even desirable approach in some. That example was given to give a
concrete realization of an object that simultaneously implemented
flyweight and singleton, not for its intrinsic merit, but I was a
little bemused by the reaction to it.

Regards,
Tom McGlynn

Lew

Jul 29, 2010, 9:00:43 AM
Tom McGlynn wrote:
> But I'm not trying to persuade people that using flyweights in the
> sense that I suggested in my example is necessarily the right thing to
> do in all circumstances, merely to note that it may be a rational or
> even desirable approach in some. That example was given to give a
> concrete realization of an object that simultaneously implemented
> flyweight and singleton, not for its intrinsic merit but I was a
> little bemused by reaction to it.

When dealing with 200 M stars, one might be tempted to use a multi-threaded
approach. Sharing a singleton among threads introduces the complexity and
overhead of synchronization. The straightforward object approach, combined
with appropriate caching (the buffer approach or DBMS approach) simplifies
concurrent implementations.

--
Lew

Tom McGlynn

Jul 29, 2010, 9:24:56 AM

Hmmm.... The GOF has the location of the letter as extrinsic state,
while I'm suggesting the location of the star. Seems pretty
comparable. My use of the array index is simply the way I supply the
extrinsic information; it's not intrinsic to a given Star [and in fact
the index of the same 'Star' might change during the simulation, since
the array is likely to be resorted continually in the process]. I
wonder if the sticking point is the lack of any internal state. But the
GOF notes that the Flyweight is particularly applicable when "Most
object state can be made extrinsic". This doesn't mean that Star isn't
a real class. We can have lots of methods in the Star class, e.g.,
distanceFromNeighbor(i), move(), accelerate(), force(), ...

....back to what word to use...


>
> I can't think of a good word for this. Do we need one? What are some
> examples of this pattern in the wild?
>

I was inspired to start this subthread by the sense that something was
missing given your use of flyweight for this concept. As for
examples:

Enumerations are one of course. Another might be the states in finite
state machines.

I think the concept of "An a priori known and unalterable set of
instances" comes up fairly commonly in code and it might be useful to
be able to convey that quickly. If that's true, perhaps the first step
is to agree about what the concept is and then worry about the word.


Regards,
Tom McGlynn

Tom Anderson

Jul 29, 2010, 9:25:15 AM
On Thu, 29 Jul 2010, Lew wrote:

> Tom McGlynn wrote:
>
>> But I'm not trying to persuade people that using flyweights in the
>> sense that I suggested in my example is necessarily the right thing to
>> do in all circumstances, merely to note that it may be a rational or
>> even desirable approach in some. That example was given to give a
>> concrete realization of an object that simultaneously implemented
>> flyweight and singleton, not for its intrinsic merit but I was a little
>> bemused by reaction to it.
>
> When dealing with 200 M stars, one might be tempted to use a
> multi-threaded approach. Sharing a singleton among threads introduces
> the complexity and overhead of synchronization.

McGlynn's flyweights are immutable, so sharing them between threads should
be fine.

The mutable state lives in the parallel arrays, and access to those would
need to be controlled. *That* could be tricky, because there's no natural
object to lock on if you want to access a given row. You certainly can't
do it with a flyweight, because the same flyweight instance will be in use
by other threads to represent entirely different stars!

> The straightforward object approach, combined with appropriate caching
> (the buffer approach or DBMS approach) simplifies concurrent
> implementations.

Hmm. I don't think Star-level locking would be a good idea, regardless of
how Stars work. You'd want to partition larger units between threads.
Given that you'll have to build a mechanism for controlling concurrency at
that level anyway, whether you have real stars or flyweights might not
matter very much.
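
Something like the following sketch, where each task owns a disjoint range of indices and so no per-star lock is needed. The class name, the flat `pos`/`vel` arrays and the trivial update rule are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PartitionedUpdate {
    private PartitionedUpdate() {
    }

    // Advance every position by velocity * dt. Each task writes only to its
    // own index range, so the tasks never contend for the same element.
    static void step(double[] pos, double[] vel, double dt, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            int chunk = (pos.length + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(pos.length, t * chunk);
                final int to = Math.min(pos.length, from + chunk);
                pending.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        pos[i] += vel[i] * dt;
                    }
                }));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait for the chunk; its writes are visible afterwards
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

A real force calculation reads other partitions' positions as well, so it would need a barrier between the read phase and the write phase; that coordination is the same whether the rows are real Stars or flyweights.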

tom

--
The art of medicine consists in amusing the patient while nature cures
the disease. -- Voltaire

Alan Gutierrez

Jul 29, 2010, 10:39:27 AM

I don't think the state of the `Star` can be extracted from the `Star`.
The idea behind the Word Processor example is that you can cache the
font size in an object along with the character code itself, and have an
object that can be reused and reset into the document at any location,
and then participate in a `Composite` pattern, where the characters
participate as tiny graphical objects.

But this implies that 11pt Helvetica 'C' is one object that is reused.
I'm assuming that each of these `Star` objects will have entirely
different values, so `Flyweight` does not apply much at all.

I've called this a tiny `Adaptor` because you're going to take a
`MappedByteBuffer` or parallel arrays of primitives, or some other
structure that stores the state of the object, and when you need an
object, create a temporary wrapper around the state of a `Star` stored
at a particular index.

Tom McGlynn

unread,
Jul 29, 2010, 12:34:47 PM7/29/10
to

By the GOF's definition an Adapter is used to convert one interface to
another. So given what I would call a FlyWeight style interface

interface IndexedStar {
    double[] getPosition(int i);
}

then if you want to have a
interface NonIndexedStar {
    double[] getPosition();
}

you could create an adapter class

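// Adapts the indexed interface to the non-indexed one for a single star.
class StarAdapter implements NonIndexedStar {
    private final int index;
    private final IndexedStar base;

    StarAdapter(int i, IndexedStar star) {
        index = i;
        base = star;
    }

    public double[] getPosition() {
        return base.getPosition(index);
    }
}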

[I'm not suggesting this is a good way to go, just trying
to clarify what the terms mean to me.]

Why is the position of a star any more 'intrinsic' to the star, than
the position of a character is to the character? I think the
difference in perception comes from the fact that as this toy problem
has been set up, there is no internal state for the star. Suppose we
make the problem a little more complex. We have 25 classes of star:
5 ages x 5 spectral types with 4,000,000 of each. Now there are 25
flyweight instances which have different masses, temperatures,
brightness, color, metallicity, magnetic fields, whatever... If I
want to generate an image of the simulation at some time, I need to
use the internal state of the flyweights to generate the image just as
we need the actual patterns of each glyph to generate a page of text.
If this isn't a flyweight, what's missing? If it is, note that I
could be using an identical mechanism to store the position/
velocity... as before.
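
A minimal sketch of that 25-class variant, with intrinsic state in shared flyweights and per-star state in parallel arrays. All names here (`StarClass`, `Galaxy`, `place`, `render`) and the brightness values are illustrative assumptions, not code from the simulation discussed above.

```java
// Hypothetical sketch: 25 shared flyweights carry the intrinsic state,
// while per-star (extrinsic) state sits in parallel arrays.
final class StarClass {
    final int age;           // 0..4
    final int spectralType;  // 0..4
    final double brightness; // placeholder value, not real astrophysics

    StarClass(int age, int spectralType, double brightness) {
        this.age = age;
        this.spectralType = spectralType;
        this.brightness = brightness;
    }
}

final class Galaxy {
    private final StarClass[] classes = new StarClass[25];
    private final byte[] classIndex; // which of the 25 flyweights each star uses
    private final double[] x;
    private final double[] y;

    Galaxy(int stars) {
        for (int age = 0; age < 5; age++) {
            for (int type = 0; type < 5; type++) {
                classes[age * 5 + type] = new StarClass(age, type, 1.0 + age * 5 + type);
            }
        }
        classIndex = new byte[stars];
        x = new double[stars];
        y = new double[stars];
    }

    void place(int i, int age, int type, double px, double py) {
        classIndex[i] = (byte) (age * 5 + type);
        x[i] = px;
        y[i] = py;
    }

    // Many stars resolve to the same shared StarClass instance.
    StarClass starClass(int i) {
        return classes[classIndex[i]];
    }

    // Rendering combines intrinsic state (brightness) with extrinsic state (position).
    String render(int i) {
        return "star at (" + x[i] + ", " + y[i] + ") brightness " + starClass(i).brightness;
    }
}
```

With 100,000,000 stars this costs one byte per star for the class index plus the position columns, and only 25 `StarClass` objects in total.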

So I think the issue is that people are unfamiliar with FlyWeights
with little internal state but perhaps I'm missing some more essential
difference.

Maybe this goes back to your thoughts on this being an adapter, which
I'll recast in a positive way: a flyweight with no internal state can
be used as an adapter to give an object-oriented interface to
elements of arrays or other structures.

Regards,
Tom McGlynn

Alan Gutierrez

unread,
Jul 31, 2010, 9:06:39 PM7/31/10
to

On another branch of this thread, I wrote some code that actually
compiles, that described a `BigList` that had an `ElementIO`. I'm going
to put it here again for your reference. Hope no one minds.

package comp.lang.java.programmer;

import java.nio.ByteBuffer;

package comp.lang.java.programmer;

import java.nio.MappedByteBuffer;
import java.util.AbstractList;

private final MappedByteBuffer bytes;

private int size;

With these generics we could build an implementation of `StarIO`.

public class StarIO implements ElementIO<Star> {
    // Writes the two position components at the given byte offset.
    public void write(ByteBuffer bytes, int index, Star item) {
        bytes.putDouble(index, item.getPosition()[0]);
        bytes.putDouble(index + (Double.SIZE / Byte.SIZE),
                item.getPosition()[1]);
    }

    // Builds a new Star from the two doubles stored at the given byte offset.
    public Star read(ByteBuffer bytes, int index) {
        double[] position = new double[] {
            bytes.getDouble(index),
            bytes.getDouble(index + (Double.SIZE / Byte.SIZE))
        };
        return new Star(position);
    }

    // Size of one record in bytes: two doubles.
    public int getRecordLength() {
        return (Double.SIZE / Byte.SIZE) * 2;
    }
}

Usage is then like one big list. When you write, you don't actually put
the `Star` in an array, you write out the values. When you read you
create a new `Star` read from the underlying `MappedByteBuffer`.
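
To make that usage concrete, here is a self-contained sketch of how the pieces might fit together. The bodies of `ElementIO`, `BigList`, and `Star` below are assumptions inferred from how `StarIO` uses them, and a plain `ByteBuffer` stands in for the `MappedByteBuffer` so the example runs without a file.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;

// Assumed shape of the interface, inferred from StarIO.
interface ElementIO<T> {
    void write(ByteBuffer bytes, int offset, T item);
    T read(ByteBuffer bytes, int offset);
    int getRecordLength();
}

// Assumed minimal Star: an immutable two-component position.
final class Star {
    private final double[] position;
    Star(double[] position) { this.position = position.clone(); }
    double[] getPosition() { return position.clone(); }
}

final class StarIO implements ElementIO<Star> {
    public void write(ByteBuffer bytes, int offset, Star item) {
        double[] p = item.getPosition();
        bytes.putDouble(offset, p[0]);
        bytes.putDouble(offset + Double.BYTES, p[1]);
    }
    public Star read(ByteBuffer bytes, int offset) {
        return new Star(new double[] {
            bytes.getDouble(offset), bytes.getDouble(offset + Double.BYTES) });
    }
    public int getRecordLength() { return 2 * Double.BYTES; }
}

// Hypothetical list that stores records in the buffer, never the objects themselves.
final class BigList<T> extends AbstractList<T> {
    private final ElementIO<T> io;
    private final ByteBuffer bytes;
    private int size;

    BigList(ElementIO<T> io, ByteBuffer bytes) {
        this.io = io;
        this.bytes = bytes;
    }

    @Override public T get(int i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        return io.read(bytes, i * io.getRecordLength()); // fresh object per read
    }

    @Override public boolean add(T item) {
        io.write(bytes, size * io.getRecordLength(), item); // values only
        size++;
        modCount++;
        return true;
    }

    @Override public int size() { return size; }
}
```

With these definitions, `new BigList<>(new StarIO(), ByteBuffer.allocate(64))` holds up to four stars, and two calls to `get(0)` return two distinct `Star` objects carrying the same values.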

Therefore, to my mind, the `Star` is an Adaptor for the
`MappedByteBuffer` where the strategy for an Adaptor is to:

Convert the interface of a class into another interface clients expect.
Adaptor lets classes work together that couldn't otherwise because of
incompatible interfaces.

Whereas the Flyweight strategy is to:

Use sharing to support large numbers of fine grained objects efficiently.

The solution we are discussing addresses the problem of supporting large
numbers of fine-grained objects efficiently, but not through sharing.

Thus, there's an opportunity to name a new pattern, if you'd like.

I am not arguing semantics, I don't think, but really trying to
understand what patterns are in play. The interesting concept that makes
one think of flyweight is that the Adaptor is short-lived, which is why,
when you say Flyweight, the name seems apropos, but I believe there's a
pattern here that needs its own name.

This is the pattern that is used by any of the ORM tools. Create a short
lived typed object around a string or binary data for the sake of the
client.

Tom McGlynn

unread,
Aug 3, 2010, 12:02:19 AM8/3/10
to
On Jul 31, 9:06 pm, Alan Gutierrez <a...@blogometer.com> wrote:


> public class StarIO implements ElementIO<Star> {
> public void write(ByteBuffer bytes, int index, Star item) {
> bytes.putDouble(index, item.getPosition[0]);
> bytes.putDouble(index + (Double.SIZE / Byte.SIZE),
> item.getPosition[0]);
> }
>
> public Star read(ByteBuffer bytes, int index) {
> double[] position = new double[] {
> bytes.getLong(index),
> bytes.getLong(index + (Double.SIZE / Byte.SIZE))
> };
> return new Star(position);
> }
>
> public int getRecordLength() {

While this might be a fine way to implement things, I don't think Stars
created this way are FlyWeights. It looks like all of a Star's state is
internal, so they are just standard objects. By definition (at least
the GOF's) FlyWeights have external state. Not that you have to use
FlyWeights, but it was in trying to illustrate them that I brought up
the example.


> Usage is then like one big list. When you write, you don't actually put
> the `Star` in an array, you write out the values. When you read you
> create a new `Star` read from the underlying `MappedByteBuffer`.
>
> Therefore, to my mind, the `Star` is an Adaptor for the
> `MappedByteBuffer` where the strategy for an Adaptor is to:
>
> Convert the interface of a class into another interface clients expect.
> Adaptor lets classes work together that couldn't otherwise because of
> incompatible interfaces.

For me an Adaptor class doesn't change the semantics of the interface
but just the details of the implementation. E.g., one interface has
    double getX(); double getY(); double getZ();
and the other has
    double[] getPosition();
They both have the idea of "get the position", but differ in the
implementation details. An Adapter bridges the difference and allows a
class expecting objects of the first type to use the second.
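
A minimal sketch of such an Adapter, assuming the two interfaces are named `XyzStar` and `PositionStar` (both names, and the adapter's, are made up here):

```java
// The interface the client code expects.
interface XyzStar {
    double getX();
    double getY();
    double getZ();
}

// The interface the existing objects actually provide.
interface PositionStar {
    double[] getPosition();
}

// Same semantics ("get the position"), different shape: the adapter only translates.
final class PositionToXyzAdapter implements XyzStar {
    private final PositionStar adaptee;

    PositionToXyzAdapter(PositionStar adaptee) {
        this.adaptee = adaptee;
    }

    public double getX() { return adaptee.getPosition()[0]; }
    public double getY() { return adaptee.getPosition()[1]; }
    public double getZ() { return adaptee.getPosition()[2]; }
}
```

A client written against `XyzStar` can then be handed `new PositionToXyzAdapter(star)` for any `PositionStar` without either side changing.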

In your example I see our Star as an implementation of the
MappedByteBuffer, but I'm not sure which -- if any -- of the GOF
patterns the relationship of Star and MappedByteBuffer represents for me.


>
> Whereas the Flyweight strategy is to:
>
> Use sharing to support large numbers of fine grained objects efficiently.
>
> The solution we are discussing address the problem of supporting large
> numbers of fine grained efficiently, but not through sharing.

Not sure what you are saying here.

My original approach had sharing.

There was a single actual star instance that's shared by each logical
star. How much more sharing can you get! My guess is that it's the
fact that sharing is taken to the limit that causes some of the
discomfort people evinced here. They don't like a FlyWeight that uses
only a single actual instance.
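
As a rough illustration of sharing taken to that limit, here is a sketch of a flyweight with exactly one instance and no internal state, where every logical star is identified only by an index into externally held arrays. The names `StarFlyweight`, `INSTANCE`, `move`, and `distanceSquared` are hypothetical and not taken from the earlier example in this thread.

```java
// Hypothetical sketch: one shared instance stands in for every logical star.
final class StarFlyweight {
    static final StarFlyweight INSTANCE = new StarFlyweight();

    private StarFlyweight() {
        // Singleton: no other instances can be created.
    }

    // All state is extrinsic: the caller supplies the arrays and the star's index.
    void move(double[] x, double[] y, int i, double dx, double dy) {
        x[i] += dx;
        y[i] += dy;
    }

    // Squared distance of star i from the origin.
    double distanceSquared(double[] x, double[] y, int i) {
        return x[i] * x[i] + y[i] * y[i];
    }
}
```

Here the same object is both a Singleton (one instance exists) and a FlyWeight (that instance is shared by every logical star), which are two independent properties.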

But you're right that you don't need FlyWeights to address the issue.
Your approach using an external cache might well work fine, though I'd
be concerned if I had to create an object on every access. Even the
relatively efficient object creation that Java now has needs to be
amortized over a fair bit of computation. Perhaps your approach could
be called something like 'lazy transient instantiation with an external
cache'. But it wouldn't have made my original point of demonstrating
the orthogonality of FlyWeights and Singletons.

>
> Thus, there's an opportunity to name a new pattern, if you'd like.
>
> I am not arguing semantics, I don't think, but really trying to
> understand what patterns are in play. The interesting concept that makes
> one think of flyweight is that the Adaptor is short-lived, which is why,
> when you say Flyweight, the name seems apropos, but I believe there's a
> pattern here that needs its own name.

For me FlyWeights need to share external state. They need to be
'lighter' than an equivalent 'normal' object would be.
My guess is that a lot of this has to do with the relationship you
want to emphasize. Given implementations will weave multiple patterns
together. By picking out one aspect we may emphasize one pattern.


>
> This is the pattern that is used by any of the ORM tools. Create a short
> lived typed object around a string or binary data for the sake of the
> client.
>

It's always more interesting when other people's views differ from
one's own (though perhaps less gratifying), and I've found the
discussion very intriguing. It's always useful to reread the source
text.

Regards,
Tom
