
Best way to make a really big ImmutableMap


fishtoprecords

Dec 13, 2009, 6:47:53 PM
to Google Collections Library - users list
I've got a class that works fine, but it's too big to make the javadoc
compiler happy. It runs out of memory, thinking it's recursing forever.

private static final ImmutableMap<String, Float> names =
new ImmutableMap.Builder<String, Float>()
.put("rebecca", 1.0000f)
.put("cynthia", 0.9830f)
.put("emily", 0.9533f)
///

There are about 2000 lines of entries, and while the compiler has no
problem, javadoc blows up. I'm sure that some combination of
-Jmxvoodoo settings will let it work, but I was taking a more
brute-force approach.

I chop up the big list into smaller parts, and want to then combine
the partitions into one really big ImmutableMap.

So assume I have name1, name2, name3, ... and want to do something like:

public static final ImmutableMap<String, Float> bigListOfNames =
ImmutableMap.addAll(name1).addAll(name2).... name12...,builder().

But I can't find the right function, set of functions, etc.

What's the best way to do this?


limpb...@gmail.com

Dec 13, 2009, 10:18:17 PM
to Google Collections Library - users list
On Dec 13, 3:47 pm, fishtoprecords <pat22...@gmail.com> wrote:
> I've got a class that works fine, but it's too big to make the javadoc
> compiler happy. It runs out of memory, thinking it's recursing forever.

Do you have any success when you remove the method chaining?

private static final ImmutableMap<String, Float> names;
static {
    ImmutableMap.Builder<String, Float> builder = ImmutableMap.builder();
    builder.put("rebecca", 1.0000f);
    builder.put("cynthia", 0.9830f);
    ...
    names = builder.build();
}

Pat Farrell

Dec 13, 2009, 11:13:41 PM
to limpb...@gmail.com, Google Collections Library - users list
> Do you have any success when you remove the method chaining?
>
> private static final ImmutableMap<String, Float> names;
> static {
>     ImmutableMap.Builder<String, Float> builder = ImmutableMap.builder();
>     builder.put("rebecca", 1.0000f);
>     builder.put("cynthia", 0.9830f);
>     ...
>     names = builder.build();
> }


I didn't think to try that.
I did have success breaking the 2000 entry list into 200 lines at a time, but then I had to use

private static HashMap<String, Float> tempMap = new HashMap<String, Float>();
static {
    tempMap.putAll(name1);
    tempMap.putAll(name3);
    tempMap.putAll(name5);
    tempMap.putAll(name7);
}
public static final ImmutableMap<String, Float> names = ImmutableMap.copyOf(tempMap);

which strikes me as a really ugly hack, forcing a non-immutable map into existence just for a second or two.

One could argue that anything that is 2000 lines long is just too long, but it's very fixed.
(It's the US Social Security Administration's list of popular boy and girl names for the past 60 years, rank ordered.
At least the historical data is never going to change, so it begs for an ImmutableMap.)
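
One way to avoid leaving a mutable map in a static field is to do the merging inside a static helper method, so the temporary map is only a local variable. With the library itself, ImmutableMap.Builder has a putAll(Map) method, so the same shape should work with a builder in place of the temporary map. The sketch below uses only the JDK (Collections.unmodifiableMap) as a stand-in for ImmutableMap; the class name NameChunks, the chunk methods, and their contents are illustrative assumptions, not the poster's actual code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder class; chunk names and contents are illustrative only.
final class NameChunks {

    /** Combined, read-only map. No mutable map is ever stored in a field. */
    static final Map<String, Float> NAMES = merge(chunk1(), chunk2());

    private NameChunks() {}

    // In the real class each chunk would hold a couple of hundred entries.
    private static Map<String, Float> chunk1() {
        Map<String, Float> m = new LinkedHashMap<>();
        m.put("rebecca", 1.0000f);
        m.put("cynthia", 0.9830f);
        return m;
    }

    private static Map<String, Float> chunk2() {
        Map<String, Float> m = new LinkedHashMap<>();
        m.put("emily", 0.9533f);
        return m;
    }

    // The temporary map lives only inside this method, then gets wrapped.
    @SafeVarargs
    private static Map<String, Float> merge(Map<String, Float>... chunks) {
        Map<String, Float> all = new LinkedHashMap<>();
        for (Map<String, Float> chunk : chunks) {
            all.putAll(chunk);
        }
        return Collections.unmodifiableMap(all);
    }
}
```

Splitting the entries across several small methods also keeps each individual method short, which may matter for tools that struggle with one very long initializer.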

Willi Schönborn

Dec 14, 2009, 4:11:35 AM
to Google Collections Library - users list
In my opinion it begs for a database.
> --
> Google Collections Library - users list
> http://groups.google.com/group/google-collections-dev?hl=en
>
> To unsubscribe, send email to:
> google-collections...@googlegroups.com

Youssef Mohammed

Dec 14, 2009, 6:00:03 AM
to Google Collections Library - users list

> In my opinion it begs for a database.

+1 for that but even if you are not using a db, this is a clear duplicate. You should refactor your code. Data would go into a file (xml or csv) and your method would be only a couple of lines of code. 
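
As a rough illustration of the "couple of lines" idea, here is a minimal sketch that reads name,rank pairs from CSV text into a read-only map. It uses only the JDK; the class name NameFileLoader and the one-pair-per-line file layout are assumptions made for this example. With the library, the last step could be ImmutableMap.copyOf(names) instead of Collections.unmodifiableMap.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical loader; assumes one "name,rank" pair per line.
final class NameFileLoader {

    private NameFileLoader() {}

    static Map<String, Float> load(Reader source) {
        Map<String, Float> names = new LinkedHashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // tolerate blank lines
                }
                String[] parts = line.split(",", 2);
                if (parts.length != 2) {
                    throw new IllegalArgumentException("Malformed line: " + line);
                }
                // parseFloat rejects non-numeric ranks with NumberFormatException.
                names.put(parts[0].trim(), Float.parseFloat(parts[1].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Collections.unmodifiableMap(names);
    }
}
```

In an application the Reader would typically wrap a classpath resource, so the data file ships inside the jar alongside the code.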


 

Pat Farrell

Dec 14, 2009, 12:46:43 PM
to Willi Schönborn, Google Collections Library - users list

> In my opinion it begs for a database.


You are, of course, entitled to any opinion you want, but why do you suggest this? The data is fixed, it will never change, putting it in the source code eliminates all the concerns about losing the DBMS, JDBC, someone changing it, etc.

The data is really an ImmutableMap. Seems natural to me to keep it in one. 

Favio DeMarco

Dec 14, 2009, 1:11:12 PM
to Google Collections Library - users list
I also think a db, or a file (yaml, xml, properties, etc.), is a better place.
But your answer makes me think: why an ImmutableMap and not an Enum?
"An enum type is a type whose fields consist of a _fixed set of constants_."
http://java.sun.com/docs/books/tutorial/java/javaOO/enum.html
I'm just curious.
You can use Enum.valueOf().getRank() to get a value like you do with
the Map. And you lose the iterators, that's a downside.
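
To make the enum idea concrete, here is a minimal sketch with the three names quoted earlier in the thread. The enum name PopularName and the helper method of() are assumptions made for illustration; getRank() follows the accessor name used above.

```java
import java.util.Locale;

// Hypothetical enum; only the three names quoted in the thread are shown.
enum PopularName {
    REBECCA(1.0000f),
    CYNTHIA(0.9830f),
    EMILY(0.9533f);

    private final float rank;

    PopularName(float rank) {
        this.rank = rank;
    }

    float getRank() {
        return rank;
    }

    /** Case-insensitive lookup, since the map in the thread used lowercase keys. */
    static PopularName of(String name) {
        // valueOf throws IllegalArgumentException for an unknown name.
        return valueOf(name.toUpperCase(Locale.ROOT));
    }
}
```

A lookup then reads PopularName.valueOf("REBECCA").getRank(), or PopularName.of("rebecca").getRank() with the helper. One difference from a map is that an unknown name throws instead of returning null.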


Willi Schönborn

Dec 14, 2009, 1:24:34 PM
to Google Collections Library - users list
Favio DeMarco wrote:
> I also think a db, or a file (yaml, xml, properties, etc.), is a better place.
> But your answer makes me think: why an ImmutableMap and not an Enum?
> "An enum type is a type whose fields consist of a _fixed set of constants_."
> http://java.sun.com/docs/books/tutorial/java/javaOO/enum.html
> I'm just curious.
> You can use Enum.valueOf().getRank() to get a value like you do with
> the Map. And you lose the iterators, that's a downside.
>
> 2009/12/14 Pat Farrell <pat2...@gmail.com>:
>
>>> In my opinion it begs for a database.
>>>
>>>
>> You are, of course, entitled to any opinion you want, but why do you suggest
>> this?
That's just personal preference. I am working on a project right now
which was started by another company, and the old developer thought it
would be a good idea to hardcode all country codes inside a class. You
might think that's also fixed data, but that's not true. So I might be
biased ;)
Nevertheless, after thinking about it, I would also suggest an enum or
a file.
Regarding the iterator feature, you could easily write an adapter using
Enum.values(), Enum.valueOf(), etc. to get the keys/values.
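
One possible shape for such an adapter is sketched below: the enum builds a read-only map view of itself once, from values(), so callers can still iterate over keys and values. The names RankedName and asMap() are assumptions made for this example, and the map is a plain JDK unmodifiable map rather than an ImmutableMap.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical enum with a map-style adapter; entries are illustrative only.
enum RankedName {
    REBECCA(1.0000f),
    CYNTHIA(0.9830f),
    EMILY(0.9533f);

    // Built once, after all constants exist; keys are lowercased to match
    // the map keys used earlier in the thread.
    private static final Map<String, Float> AS_MAP = buildMap();

    private final float rank;

    RankedName(float rank) {
        this.rank = rank;
    }

    float getRank() {
        return rank;
    }

    /** Read-only view of all names and ranks, in declaration order. */
    static Map<String, Float> asMap() {
        return AS_MAP;
    }

    private static Map<String, Float> buildMap() {
        Map<String, Float> m = new LinkedHashMap<>();
        for (RankedName n : values()) {
            m.put(n.name().toLowerCase(Locale.ROOT), n.getRank());
        }
        return Collections.unmodifiableMap(m);
    }
}
```

Iterating RankedName.asMap().entrySet() then gives the same keys and values a hand-written map would.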

Greetings

Nikolas Everett

Dec 14, 2009, 1:25:42 PM
to Favio DeMarco, Google Collections Library - users list
If the application doesn't already use a DB, it'd be silly to add one. If you had 200,000 names instead of 2,000, it'd start to really call for a DB. Holding and loading all 200,000 names in memory might be a bad idea. Maybe. Eh.

Having a source file with all the names would just be cumbersome when you have to change them.  I'm not so much thinking of when the data itself changes as when you want to store something else about it.  Maybe you'll want absolute counts one day.  I dunno.  I'd shove all the names in a file if I were you.

It'd be fun to have a script rip through a file containing all the names and build a Java source file as part of the build process. I'm not sure how useful it would be compared to just storing the file as a resource, but it's kind of neat nonetheless. I'm probably just itching for an excuse to use Maven's scala:script.
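
A build-time generator along those lines could look roughly like the sketch below, written in plain Java rather than as a Maven script. It turns name,rank lines into the text of a class that fills an ImmutableMap.Builder without method chaining. The class name NameSourceGenerator, the generated class layout, and the assumption that names contain only plain letters (so no string escaping is needed) are all illustrative.

```java
import java.util.List;

// Hypothetical generator; assumes "name,rank" lines and names without
// quotes or backslashes, so no escaping is done.
final class NameSourceGenerator {

    private NameSourceGenerator() {}

    static String generate(String className, List<String> csvLines) {
        StringBuilder src = new StringBuilder();
        src.append("import com.google.common.collect.ImmutableMap;\n\n");
        src.append("final class ").append(className).append(" {\n");
        src.append("    static final ImmutableMap<String, Float> NAMES;\n");
        src.append("    static {\n");
        src.append("        ImmutableMap.Builder<String, Float> b = ImmutableMap.builder();\n");
        for (String line : csvLines) {
            String[] parts = line.split(",", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            String name = parts[0].trim();
            String rank = parts[1].trim();
            Float.parseFloat(rank); // validate only; the original text is emitted
            // One statement per entry, not one long chained expression.
            src.append("        b.put(\"").append(name).append("\", ")
               .append(rank).append("f);\n");
        }
        src.append("        NAMES = b.build();\n");
        src.append("    }\n");
        src.append("}\n");
        return src.toString();
    }
}
```

The returned string would be written to a .java file in a generated-sources directory before compilation; how that step is wired into the build is left open here.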

Out of curiosity, what YAML parser are people using for Java?

fishtoprecords

Dec 14, 2009, 5:26:55 PM
to Google Collections Library - users list
> +1 for that but even if you are not using a db, this is a clear duplicate.
> You should refactor your code. Data would go into a file (xml or csv) and
> your method would be only a couple of lines of code.

So you are saying that XML or CSV is better than Java? It's all just
text, and there is nothing even vaguely type-safe about CSV.

I don't buy this at all. I see nothing duplicative about having
immutable data. Once you have that, the choice of file format is an
engineering decision, or perhaps personal preference.

Going from the fairly clean code in the Java class to XML would be
much more verbose and error-prone, and a step backwards IMHO.

My class has no real code in it, nothing executable. It's all just
data.

This is moving off into philosophy or theology, not addressing the
fundamental question. But so far, the idea that a Relational DBMS or
XML is better is pretty much ungrounded.

Johan Van den Neste

Dec 15, 2009, 3:41:53 AM
to fishtoprecords, Google Collections Library - users list
Personally I think a DB increases your maintenance cost by a lot. If
the data doesn't change or doesn't change very often, you'll be
wasting a lot of time. If, on the other hand, non-programmers need to
update the data, or it needs to be updateable at runtime, then it
might be a plausible solution.

I'd keep it simple if simple is all you need in the foreseeable
future. I understand the urge to plan everything for the unknown, but
it is not always necessary. Consider the cost.

--
Johan Van den Neste

Pat Farrell

Dec 15, 2009, 12:04:57 PM
to Johan Van den Neste, Google Collections Library - users list
On Tue, Dec 15, 2009 at 3:41 AM, Johan Van den Neste <jvdn...@gmail.com> wrote:
> If, on the other hand, non-programmers need to
> update the data, or it needs to be updateable at runtime, then it
> might be a plausible solution.


Agreed. I don't see that the other alternatives help much with any potential non-programmers editing it (which is not in my design space). I don't consider XML to be easy for mortals to edit. For most applications, if you expect "users" to change stuff, then you have to have a GUI editor with at least CRUD capabilities.

Again, we are not addressing my original topic question: I'd like a cleaner way than a static HashMap that gets copied into an ImmutableMap.

Kevin Bourrillion

Dec 15, 2009, 12:26:14 PM
to Pat Farrell, Johan Van den Neste, Google Collections Library - users list
On Tue, Dec 15, 2009 at 9:04 AM, Pat Farrell <pat2...@gmail.com> wrote:

> Again, we are not addressing my original topic question: I'd like a cleaner way than a static HashMap that gets copied into an ImmutableMap.

I thought Jesse solved that, in the first response of the thread?  Don't use chaining, but do everything else the same.

Incidentally, in my humble opinion, a database is the right choice for data that needs to change transactionally; configuration is a good choice for settings you might need to change without requiring a full QA cycle and push.  (That implies they're changes that would be *safe* to make without a full QA cycle.)  Code is perfect for everything else.


--
Kevin Bourrillion @ Google
internal:  http://go/javalibraries
external: guava-libraries.googlecode.com

fishtoprecords

Dec 15, 2009, 1:04:32 PM
to Google Collections Library - users list
>>> Again, we are not addressing my original topic question,
> I thought Jesse solved that, in the first response of the thread?  Don't use
> chaining, but do everything else the same.

Sorry, I had not tested that, and got distracted by the other
discussion.
I did test it and it works fine. Thanks, Jesse.


>  Code is perfect for everything else.

I like using Java because it has expressions. And occasionally it's
nice to be able to have more than just fixed text.