MultiKey has several possible uses, most of which we've tried to
support with more specialized classes.
1. map.put(new MultiKey(url, today), latencyHistogram);
Here we're storing the latency distribution for a given URL on a given
day. The keys are more or less independent. That's kind of a vague
statement, and I can't come up with a better one, so I'll give two
ideas that I hope will convey the flavor of what I mean by
"independent keys": First, if I know we have data for
http://www.google.com and I know we have data for Tuesday, we might
well have data for http://www.google.com on Tuesday. Second, I might
be interested in "all latency distributions for http://www.google.com,
regardless of day" or "all latency distributions today, regardless of
URL."
For this kind of data, we have the Table class:
Table<Keyword, LocalDate, LatencyHistogram>
Here it makes sense to ask for "all rows" (rowKeySet(); "all
keywords") or "all columns" (columnKeySet(); "all dates"). It also
makes sense to ask for either "all values for keyword 'guava'"
(row(guava)) or "all values for today" (column(today)). Also,
Table<Keyword, LocalDate, LatencyHistogram> is much more explicit than
Map<MultiKey, LatencyHistogram>, especially if you end up with
multiple MultiKey-keyed maps.
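To make the row and column views concrete, here is a rough JDK-only sketch that approximates a two-key table with nested maps. It is not Guava's Table and makes no claim about how Table is implemented; the class name TwoKeyTable is hypothetical, and its methods only mirror the row(...) and column(...) calls mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a two-key table; NOT Guava's Table.
final class TwoKeyTable<R, C, V> {
  // Outer map: row key -> (column key -> value).
  private final Map<R, Map<C, V>> rows = new LinkedHashMap<>();

  void put(R rowKey, C columnKey, V value) {
    rows.computeIfAbsent(rowKey, k -> new LinkedHashMap<>()).put(columnKey, value);
  }

  // All values for one row key, keyed by column (returned as a copy).
  Map<C, V> row(R rowKey) {
    Map<C, V> row = rows.get(rowKey);
    return row == null ? new LinkedHashMap<>() : new LinkedHashMap<>(row);
  }

  // All values for one column key, keyed by row. In this sketch it scans every row.
  Map<R, V> column(C columnKey) {
    Map<R, V> result = new LinkedHashMap<>();
    for (Map.Entry<R, Map<C, V>> e : rows.entrySet()) {
      V value = e.getValue().get(columnKey);
      if (value != null) {  // assumes null values are never stored
        result.put(e.getKey(), value);
      }
    }
    return result;
  }
}
```

With a Map keyed by MultiKey, both questions would require walking every entry and picking the key apart by hand.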
2. map.put(new MultiKey(userName, jobName), jobDetails);
Here we still have a fixed number of keys, but there's a clearer
relationship between them. There might be two users running jobs with
the same name, but it's probably not very interesting to see all users
who happened to name one of their jobs "collector." There's a clear
hierarchy: user, then job.
Table can still be useful here. HashBasedTable and TreeBasedTable
aren't stored in memory as 2D arrays, so it's not wasteful for one row
to have values for columns A, B, and C and another row to have values
for columns D, E, and F. Table again provides you with the
ability to ask for all users or all jobs for a given user, something
that you can't easily do with MultiKey or other solutions.
However, a lot of times it makes sense to simply define your own class:
import static com.google.common.base.Preconditions.checkNotNull;

public final class JobId {
  private final String userName;
  private final String jobName;

  public JobId(String userName, String jobName) {
    this.userName = checkNotNull(userName);
    this.jobName = checkNotNull(jobName);
  }

  public String getUserName() {
    return userName;
  }

  public String getJobName() {
    return jobName;
  }

  // Two JobIds are equal when both components are equal.
  @Override
  public boolean equals(Object o) {
    if (o instanceof JobId) {
      JobId other = (JobId) o;
      return userName.equals(other.userName)
          && jobName.equals(other.jobName);
    }
    return false;
  }

  // Must stay consistent with equals for use as a HashMap key.
  @Override
  public int hashCode() {
    return userName.hashCode() * 37 + jobName.hashCode();
  }
}
Java makes this much harder than it ought to be, but it took me maybe
2 minutes, even without an IDE to autogenerate much of it for me.
As in my first example, we benefit here from a clearer type name:
Map<JobId, JobDetails> vs. Map<MultiKey, JobDetails>. We also gain
the ability to use JobId elsewhere, like as a *value* to a Map.
(True, we could have a Map<Port, MultiKey> if we really wanted, but if
the <userName, jobName> identifier is coming up that often, it's
probably time to acknowledge it as a first-class concept and give it
its own type.) Nominal types like these can also help cut down on
methods with parameter types <String, String, String, String> (src
user, src job, dest user, dest job), replacing them with <JobId,
JobId> (src job, dest job).
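As a small illustration of the parameter-list point, the sketch below contrasts the two signatures. It uses a trimmed-down JobId (no accessors, equals, or hashCode), and the JobCopier class and its describeCopy methods are hypothetical names invented for this example.

```java
import java.util.Objects;

// Trimmed-down JobId for illustration only.
final class JobId {
  final String userName;
  final String jobName;

  JobId(String userName, String jobName) {
    this.userName = Objects.requireNonNull(userName);
    this.jobName = Objects.requireNonNull(jobName);
  }

  @Override
  public String toString() {
    return userName + "/" + jobName;
  }
}

// Hypothetical caller-facing API.
final class JobCopier {
  // Four Strings: the compiler cannot tell if a caller swaps srcJob and destUser.
  static String describeCopy(String srcUser, String srcJob, String destUser, String destJob) {
    return srcUser + "/" + srcJob + " -> " + destUser + "/" + destJob;
  }

  // Two JobIds: each argument is one complete identifier.
  static String describeCopy(JobId src, JobId dest) {
    return src + " -> " + dest;
  }
}
```

Both overloads produce the same string for the same inputs; the difference is only in how easy the call is to get wrong.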
3. map.put(pathComponents, fileMetadata);
This is somewhat contrived because I couldn't come up with a
non-contrived example, but it's conceivable that we'd store ["etc",
"passwd"] instead of "/etc/passwd" for some reason or other. Here we
have keys of arbitrary length, so we can't use the fixed-key-count
Table or custom-class solutions. If the key type is really an
arbitrarily sized series of items, we need another solution.
Luckily, we have a word for "arbitrarily sized series of items" in Java: List.
Map<ImmutableList<PathComponent>, FileMetadata>
List provides us with access to the components, should we require it,
including its usual methods like subList. Additionally, we can
specialize the element type (here, PathComponent, or more likely plain
"String") instead of MultiKey's "Object."
There are cases that these three approaches can't quite cover, like a
3D Table-like structure. But MultiKey is even less flexible. (Think
of it this way: If nothing else, you can always substitute
ImmutableList<?> for MultiKey.)
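To illustrate the list-as-key idea with nothing but the JDK, the sketch below uses an unmodifiable List<String> where the text above uses ImmutableList<PathComponent>. The PathKeys class and its key and parent helpers are hypothetical names for this example. It relies on the documented contract that List equality and hash codes depend only on the elements and their order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helpers for list-valued map keys.
final class PathKeys {
  // Defensive, unmodifiable copy so a key cannot change after insertion.
  static List<String> key(String... components) {
    return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(components)));
  }

  // The parent path is a prefix of the component list (path must be non-empty).
  static List<String> parent(List<String> path) {
    return path.subList(0, path.size() - 1);
  }

  // Stores a value under ["etc", "passwd"] and looks it up with a different List instance.
  static String lookupDemo() {
    Map<List<String>, String> metadata = new HashMap<>();
    metadata.put(key("etc", "passwd"), "rw-r--r--");
    return metadata.get(Arrays.asList("etc", "passwd"));
  }
}
```

The lookup succeeds even though the two lists are different objects of different implementation classes, because List equality is by content.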
One more advantage I'd meant to include: If you ever need to add a new
dimension to a key (maybe it's OK for a user to run two jobs with the
same name as long as they're in different VMs), none of your Maps need
to change. You simply update the JobId class to include a vmId field,
and only at the edges of the system, where you initially create the
JobId objects, do you need to change your code. And the compiler can
help you, unlike in the MultiKey case.
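A minimal sketch of that change, assuming vmId is a String (the type isn't specified above) and using a hypothetical JobRegistry class as the code that holds the Map:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// JobId after gaining a third component. The String type of vmId is an assumption.
final class JobId {
  private final String userName;
  private final String jobName;
  private final String vmId;

  JobId(String userName, String jobName, String vmId) {
    this.userName = Objects.requireNonNull(userName);
    this.jobName = Objects.requireNonNull(jobName);
    this.vmId = Objects.requireNonNull(vmId);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof JobId)) {
      return false;
    }
    JobId other = (JobId) o;
    return userName.equals(other.userName)
        && jobName.equals(other.jobName)
        && vmId.equals(other.vmId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(userName, jobName, vmId);
  }
}

// Hypothetical interior code: nothing here mentions vmId.
final class JobRegistry {
  // Same declaration as before the key gained a dimension.
  private final Map<JobId, String> detailsByJob = new HashMap<>();

  void register(JobId id, String details) {
    detailsByJob.put(id, details);
  }

  String lookup(JobId id) {
    return detailsByJob.get(id);
  }
}
```

Any call site still using the old two-argument constructor stops compiling, which is how the compiler points out the places that need updating.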
> 2. map.put(new MultiKey(userName, jobName), jobDetails);
> ...
> However, a lot of times it makes sense to simply define your own class:
> ...
> Java makes this much harder than it ought to be, but it took me maybe
> 2 minutes, even without an IDE to autogenerate much of it for me.
This is the crux of the matter, and there's certainly no good option,
just various bad ones. The intermediate option, which I've carefully
avoided mention of, is of course Pair<A, B> (and perhaps Triple<A, B,
C>, etc.). These classes are... contentious :) We've also got some
internal, reflection-based value-type magic, which may some day see
the light of day.
http://groups.google.com/group/guava-discuss/browse_thread/thread/185e10c81bb482c2/bbbf4f0f1dcffaf9
You can probably tell that I'm not a big fan of MultiKey and Pair, but
I'll admit that my dislike for them probably goes beyond what they
objectively deserve. They've probably become symbolic to me of the
conflict between hacky, short-term solutions and overengineered,
long-term solutions. And I'm nothing if not an overengineer.
>> 3. map.put(pathComponents, fileMetadata);
>>
>> Map<ImmutableList<PathComponent>, FileMetadata>
>
> I can't use this one since the items that make up the key are not from the same
> class (and I don't want to cast everything to Object).
For the closest possible equivalent to MultiKey, declare the key type
as ImmutableList<?> to avoid having to cast your arguments or
explicitly specify ImmutableList.<Object>of(...):
Map<ImmutableList<?>, FileMetadata> metadata = newHashMap();
metadata.put(ImmutableList.of("etc", "passwd"), defaultMetadata); // fine
True story: I was pair programming (basically just watching), and the
need came up to use a pair of two things as a map key, and
unfortunately both were strings too. So, Pair.of(thingy1, thingy2) of
course! I mentioned my objection, but let it go, since there were more
important things to focus on than the drudgery of writing the
boilerplate of a specific class. (But my "protest" was probably
communicated indirectly, since I kept asking which was the
first string and which was the second!) There were only a few mentions
of this Pair<String, String>. I left, and the next day I reviewed the
code. Now there were 22 appearances of Pair<String, String> in a class
with a body of ~200 lines! So I said that now we really ought to create
a new type here, and lo and behold, the type was created... and it was
a subclass of Pair<String, String> :)
> --
> guava-...@googlegroups.com.
> http://groups.google.com/group/guava-discuss?hl=en
> unsubscribe: guava-discus...@googlegroups.com
>
> This list is for discussion; for help, post to Stack Overflow instead:
> http://stackoverflow.com/questions/ask
> Use the tag "guava".
>
Vg
We do this kind of thing all the time, and Project Lombok makes those classes 5 lines instead of 25.
Nik
I'd like to avoid adding code just for a key class. It might be 2 minutes to implement, but about 25 more lines of code that are just noise.
I am a big fan of Pair<F, S>. The worst use case of Pair<F, S> is precisely when both F and S are of the same type. It's pretty creative to subclass Pair in that case. But you can't always come up with better names than first-second, first-last, previous-new.
Still, when you really just want to pair two different types of object, Pair<String, SomePojo> will be much more explicit than StringSomePojo.java about what is going on. For the reader, the Pair makes it clear that this is a pair, just a pair, with no hidden wisdom.
The reader will eventually get there too with StringSomePojo.java, but only after scanning those extra 25 lines once or twice, in fear of hidden wisdom, as usual.
People are (or should be) more interested in readability, which is not something easily described by LoC.
Less code than what? More readable than what? How about
Value value = myHashMap.get(
keyFromObj(obj));
Looks nicer to me.
--tim
I agree with you completely, Tim -- but the magnitude of those 25 lines demands a better solution be invented. Pair certainly ain't it. Lombok has its ups and downs (handwave). Our internal reflection-based utility is workable, but if you start to use it a lot the performance tax is very real. We still need something better....