Broken Map serialization in Guava 10.0

已查看 509 次
跳至第一个未读帖子

Niraj Tolia

未读,
2011年9月30日 13:49:262011/9/30
收件人 guava-discuss
While trying to upgrade to Guava 10.0, I noticed that a Map (generated
using MapMaker) that used to serialize just fine with r09 no longer
does. I get exceptions similar to the following with the sample code
included below. I understand that expireAfterAccess() and
evictionListener() is deprecated but was breaking previously supported
functionality intentional or an accident?

Cheers,
Niraj

java.io.NotSerializableException: com.google.common.collect.MapMaker$1
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1164)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
at SerializationTest.testMapSerialization(SerializationTest.java:175)

Code:
private enum EvictionListener implements
MapEvictionListener<Long, String> {
/** The enum that be used as an eviction listener. */
EVICTION_LISTENER;

@Override
public void onEviction(final Long key, final String value) {
System.out.println(value);
}
}

@Test
public void testMapSerialization() throws Exception {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
final Map<Long, String> map = new MapMaker()
.expireAfterAccess(5, TimeUnit.SECONDS)
.evictionListener(EvictionListener.EVICTION_LISTENER)
.makeMap();
oos.writeObject(map);
}

Charles Fry

未读,
2011年9月30日 14:35:122011/9/30
收件人 Niraj Tolia、guava-discuss
As documented expireAfterAccess will be deleted in release 11.0. While it was clearly not intentional to break serialization, I think the most prudent step from your end would be to simply migrate to CacheBuilder. Would that work for you?

Charles

Niraj Tolia

未读,
2011年9月30日 14:44:032011/9/30
收件人 Charles Fry、guava-discuss
On Fri, Sep 30, 2011 at 11:35 AM, Charles Fry <f...@google.com> wrote:
> As documented expireAfterAccess will be deleted in release 11.0. While it
> was clearly not intentional to break serialization, I think the most prudent
> step from your end would be to simply migrate to CacheBuilder. Would that
> work for you?

Hi Charles,

Unfortunately, no. I have already migrated to CacheBuilder wherever
possible but, as I had mentioned in an earlier thread, the lack of
Cache.asMap().put() is a blocker in this particular case.

Also, do we have a timeline for when 11.0 might be out? If I
understand correctly, .put() should be supported there, right?

Cheers,
Niraj

Charles Fry

未读,
2011年9月30日 14:46:022011/9/30
收件人 Niraj Tolia、guava-discuss
Also, do we have a timeline for when 11.0 might be out? If I
understand correctly, .put() should be supported there, right?

Yes, that is the current plan.

Charles 

Ted Yu

未读,
2011年9月30日 14:47:432011/9/30
收件人 Charles Fry、Niraj Tolia、guava-discuss
Looking forward to 11.0.
We need it too.

Niraj Tolia

未读,
2011年9月30日 15:00:262011/9/30
收件人 Charles Fry、guava-discuss

And, do we have a potential release date?

Niraj

> Charles
>

Kevin Bourrillion

未读,
2011年9月30日 15:03:072011/9/30
收件人 Niraj Tolia、Charles Fry、guava-discuss
End of year.

We basically aim for the end of every quarter.  We just skipped a quarter with this last one, on account of the utter horror that was getting our tests released. (sure hope you all get some use out of being able to access and run them. :-))



> Charles
>




--
Kevin Bourrillion @ Google
Java Core Libraries Team
http://guava-libraries.googlecode.com

Charles Fry

未读,
2011年9月30日 15:07:102011/9/30
收件人 Niraj Tolia、guava-discuss
The other alternative is for us to do a patch release, which we've never done before, but which we could do now that we've revved our revision numbering scheme.

Would you be interested in putting together a patch and an associated test which I could then roll into release 10.1?

Sorry for the trouble, in any case.

Charles

Niraj Tolia

未读,
2011年9月30日 16:15:032011/9/30
收件人 Kevin Bourrillion、Charles Fry、guava-discuss
On Fri, Sep 30, 2011 at 12:03 PM, Kevin Bourrillion <kev...@google.com> wrote:
> End of year.
> We basically aim for the end of every quarter.  We just skipped a quarter
> with this last one, on account of the utter horror that was getting our
> tests released. (sure hope you all get some use out of being able to access
> and run them. :-))

I, for one, definitely do. However, are 6 test failures expected against HEAD?

Cheers,
Niraj

Niraj Tolia

未读,
2011年9月30日 16:16:152011/9/30
收件人 Charles Fry、guava-discuss
On Fri, Sep 30, 2011 at 12:07 PM, Charles Fry <f...@google.com> wrote:
> The other alternative is for us to do a patch release, which we've never
> done before, but which we could do now that we've revved our revision
> numbering scheme.
> Would you be interested in putting together a patch and an associated test
> which I could then roll into release 10.1?

Let me see if I can quickly identify what caused the regression. If I
can track it down, I will be more than happy to contribute a patch +
unit tests back.

> Sorry for the trouble, in any case.

No worries.

peter

未读,
2011年9月30日 17:47:062011/9/30
收件人 guava-discuss
The best way to handle serialisation evolution is with a serialisation
proxy, since this allows you to use only the public api of your class,
eg constructors.

The next best way is to use writeObject and readObject methods to
serialise your data structures in their simplest form and rebuild them
during deserialisation.

Doing so allows you to change your implementation. Don't accept the
default serialised form of your objects, design for evolution instead.

Luckily on this occasion, these guys are prepared to maintain backward
compatibility for you.

Kevin Bourrillion

未读,
2011年9月30日 17:58:052011/9/30
收件人 peter、guava-discuss
Well hang on; we're unbreaking the serialization, but we don't promise any backward compatibility for serialization; never have, never will.



Niraj Tolia

未读,
2011年9月30日 18:32:462011/9/30
收件人 Charles Fry、guava-discuss
Hi Charles,

Before I go ahead and spend more time on this, would a patch along the following lines be acceptable to you guys? My personal unit tests seem to pass but I need to add tests to Guava and figure out the right codestyle before sending the completed patch out. Note that this is against HEAD.

Also, how would you like me to submit this? Happy to do whatever works best for you guys and, while an email is less overhead for me, I know you are trying to figure out the public submission process.

Re: Kevin's point made in a later email. No, we are not interested in backwards compatibility for serialization.

Cheers,
Niraj

From 5a27f71f2433b66cb203bc3c5fe2187c3b9128d4 Mon Sep 17 00:00:00 2001
From: Niraj Tolia <nto...@maginatics.com>
Date: Fri, 30 Sep 2011 15:21:18 -0700
Subject: [PATCH] Encapsulate MapMapker's removal listener.

This re-enables serialization if the original removal listener was
serializable to begin with.
---
 guava/src/com/google/common/collect/MapMaker.java |   32 +++++++++++++++-----
 1 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/guava/src/com/google/common/collect/MapMaker.java b/guava/src/com/google/common/collect/MapMaker.java
index 53c377a..2cd8e69 100644
--- a/guava/src/com/google/common/collect/MapMaker.java
+++ b/guava/src/com/google/common/collect/MapMaker.java
@@ -28,6 +28,7 @@ import com.google.common.base.Equivalences;
 import com.google.common.base.Function;
 import com.google.common.base.Objects;
 import com.google.common.base.Ticker;
+import com.google.common.collect.MapEvictionListener;
 import com.google.common.collect.ComputingConcurrentHashMap.ComputingMapAdapter;
 import com.google.common.collect.CustomConcurrentHashMap.Strength;
 
@@ -569,6 +570,28 @@ public final class MapMaker extends GenericMapMaker<Object, Object> {
   }
 
   /**
+   * Class to hold the MapEvictionListener. Enables serialization if the
+   * eviction listener is serializable to begin with.
+   */
+  static final class MapMakerRemovalListener<K, V> implements
+      RemovalListener<K, V>, Serializable {
+    private static final long serialVersionUID = 0;
+     
+    private final MapEvictionListener<K, V> listener;
+     
+    public MapMakerRemovalListener(final MapEvictionListener<K, V> listener) {
+      this.listener = listener;
+    }
+  
+    @Override
+    public void onRemoval(final RemovalNotification<K, V> notification) {
+      if (notification.wasEvicted()) {
+        listener.onEviction(notification.getKey(), notification.getValue());
+      }
+    }
+  }

+  /**
    * Specifies a listener instance, which all maps built using this {@code MapMaker} will notify
    * each time an entry is evicted.
    *
@@ -614,14 +637,7 @@ public final class MapMaker extends GenericMapMaker<Object, Object> {
     // safely limiting the kinds of maps this can produce
     @SuppressWarnings("unchecked")
     GenericMapMaker<K, V> me = (GenericMapMaker<K, V>) this;
-    me.removalListener = new RemovalListener<K, V>() {
-      @Override
-      public void onRemoval(RemovalNotification<K, V> notification) {
-        if (notification.wasEvicted()) {
-          listener.onEviction(notification.getKey(), notification.getValue());
-        }
-      }
-    };
+    me.removalListener = new MapMakerRemovalListener<K, V>(listener);
     useCustomMap = true;
     return me;
   }
--



On Fri, Sep 30, 2011 at 12:07 PM, Charles Fry <f...@google.com> wrote:

Charles Fry

未读,
2011年9月30日 18:40:192011/9/30
收件人 Niraj Tolia、guava-discuss
Yeah, something like that sounds great.

The easiest way to contribute a patch is to clone our repository in code.google.com (there is a button to do that), make the change there, and then send me a link.

Thanks for digging into this, and sorry for the breakage.

Charles

peter

未读,
2011年9月30日 23:49:322011/9/30
收件人 guava-discuss
Implementing Serializable and not maintaining backward compatibility,
means it will break again.

Why not instead deprecate serialisation, giving developers time to
refactor it out?

If your not ready to implement Serializable because your classes are
still evolving, or you don't want to be locked in, don't do it at
all. Developers can determine their own way of serialising the data
and it doesn't get broken later.


On Oct 1, 7:58 am, Kevin Bourrillion <kev...@google.com> wrote:
> Well hang on; we're un*breaking* the serialization, but we don't promise any
> > To get help:http://stackoverflow.com/questions/ask(use the tag "guava")

Sam Berlin

未读,
2011年10月1日 00:01:352011/10/1
收件人 peter、guava-discuss
The problem as I understood it with serialization and this release is that some MapMaker maps aren't serializing at all (due to something in the pipeline not implementing Serializable).  That's a very different problem than not maintaining backward compatibility.  Many developers find it useful to have transient serializability (if you don't consider that an oxymoron!).  Serialization like this is useful for things like session stores (communicating the session state back & forth), or for multi-process jobs.  It isn't useful (and isn't intended to be useful) for long-term storage that stays persisted between different revisions of the library or application.

sam

Kevin Bourrillion

未读,
2011年10月1日 00:15:082011/10/1
收件人 Sam Berlin、peter、guava-discuss
We use serialization of these classes in various Google apps, so we're not going to be deprecating it any time soon.  But we always deserialize with the same version of the code we serialize with.  Trying to provide for cross-version compatibility made things a hundred times more difficult and we gave up on it before Guava even started.

cowwoc

未读,
2011年10月1日 00:30:502011/10/1
收件人 guava-...@googlegroups.com

    You're not the first to note the difficulty. Sounds like an interesting puzzle to solve :)

    Is this problem similar to database schema migration? Could we solve it using a mechanism similar to MyBatis Migrations or Flyway?

Gili

Niraj Tolia

未读,
2011年10月1日 04:06:082011/10/1
收件人 Charles Fry、guava-discuss
OK. Initial patch up at https://code.google.com/r/ntolia-guava-lib/source/detail?r=53aa3d7d5e87ac55ae39b8d6acf91880dc9e9e9e. Feedback welcome.

Note that I had to change a file name to enable MapMaker tests. Also, not all tests pass for me. With the renamed test file, I get 5 consistent test failures and an additional 2 that are non-deterministic. However, the unit test I added seems to pass. The failures from the last run are documented below.

Cheers,
Niraj

Test Failures:
  testDefault(com.google.common.base.StopwatchTest): null
  testNullStats(com.google.common.cache.ComputingCacheTest): null
  testPut_populated(com.google.common.cache.PopulatedCachesTest): expected same:<com.google.common.cache.CustomConcurrentHashMap$WeakExpirableEvictableEntry@4f823fdd> was not:<null>
  testRemove_nullPresent[ForwardingSortedMap[SafeTreeMap] with natural comparator and standard implementations [collection size: several]](com.google.common.collect.testing.testers.MapRemoveTester): remove(null) should return the associated value expected:<February> but was:<null>
  testRemove_nullPresent[ForwardingSortedMap[SafeTreeMap] with natural comparator and standard implementations [collection size: several]](com.google.common.collect.testing.testers.MapRemoveTester): remove(null) should return the associated value expected:<February> but was:<null>
  testNullParameters(com.google.common.collect.MapMakerTest): No exception thrown from evictionListener([interface com.google.common.collect.MapEvictionListener])[null] for class com.google.common.collect.MapMaker

Peter

未读,
2011年10月1日 09:01:082011/10/1
收件人 Kevin Bourrillion、Sam Berlin、guava-discuss

Hmm interesting.

You use serialisation for temporary persistence.

I use serialisation for distributed computing, so I hadn't considered it like that before. I also understand the difficulties of cross version compatibility, so know what you mean.

There's a neat solution to this difficult problem, for package private classes that implement a public interface and share a builder.

In fact it's so flexible, you might even like to consider using it for temporary persistence.

First make all your package private implementations able to produce their builder, so that the builder.build() produces an identical copy of the implementation.

Then make the builder a serialisation proxy, each implementation uses writeReplace to serialise the builder instead. The builder's build method is later called during deserialisation.

So the serialised form of all your implementations is the builder.

Have the builder serialise its fields and deserialise them in writeObject and readObject methods. More fields can be added in later versions, or unused ones set to null.

Then have the builder choose the most appropriate implementation based on the data.

So now when you serialise then deserialise, you could end up with a totally different implementation class chosen by the builder, but also more appropriate than the original.

In a distributed envirnoment, this allows you to derserialise objects where the object's class isn't available, but instead use the most appropriate class available instead.

Because all implementations share a common public interface, the client's none the wiser.

Your now free to replace your implementation without breaking serialisation.

I can provide some example code if you like.

Cheers,

Peter.



----- Original message -----
> We *use* serialization of these classes in various Google apps, so we're not

> going to be deprecating it any time soon.  But we always deserialize with
> the same version of the code we serialize with.  Trying to provide for

> cross-version compatibility made things a *hundred* times more difficult and

Jared Levy

未读,
2011年10月1日 11:04:312011/10/1
收件人 Peter、Kevin Bourrillion、Sam Berlin、guava-discuss
Your builder approach is basically a customized serialized form. Many Guava classes already serialized forms of that sort.

The underlying problem is still there: ensuring that the serialized forms are compatible between all Guava versions. That was a goal of mine when working towards Google Collections 1.0, but I abandoned that goal after realizing its difficulty. Implementing and testing cross-version compatibility wasn't worth the effort.

Peter

未读,
2011年10月1日 20:50:062011/10/1
收件人 Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com

Dropping backwards compatibility, while maintaining forward evolution compatibility is an acceptable compromise, it removes the heart ache and tie in of serial state. Future builders have knowledge of earlier versions, so can use logic instead of relying on serial form.

With the builder approach, you keep all non transient field names, even when they're no longer in use for evolution compatibility.

Because the builder takes responsibility for serialisation, implementation class private state is never published, they're only ever created using constructors (they can even have an inheritance hierarchy) so all invariants can be checked or defensive copies made and final fields used, to guard against stolen references from deserialisation attacks.

You can completely remove a class in a future version and the builder can substitute it with another.

Remember it's only the builder's non transient fields that must be preserved, you can change the logic as much as you like, remove add methods etc.

It helps to use a static class revision field in your builder and an object field that indicates the revision of the builder used to serialise state.

I can offer some assistance if you're interested.


Cheers,

Peter.

----- Original message -----
> Your builder approach is basically a customized serialized form. Many Guava
> classes already serialized forms of that sort.
>
> The underlying problem is still there: ensuring that the serialized forms
> are compatible between all Guava versions. That was a goal of mine when
> working towards Google Collections 1.0, but I abandoned that goal after
> realizing its difficulty. Implementing and testing cross-version
> compatibility wasn't worth the effort.
>
>
> On Sat, Oct 1, 2011 at 6:01 AM, Peter <ji...@zeus.net.au> wrote:
>

> > **

Louis Wasserman

未读,
2011年10月2日 18:03:232011/10/2
收件人 Peter、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
+1 on putting a patch for this in 10.1.
-1 on making any guarantee, now or ever, as to serialization compatibility between Guava releases.

Charles Fry

未读,
2011年10月3日 08:23:052011/10/3
收件人 Niraj Tolia、guava-discuss
Thank you for that patch! I added a missing checkNotNull and submitted to the release branch. We're exploring a few other additions to that branch, which should then be released this week as 10.1.

cheers,
Charles

Peter Firmstone

未读,
2011年10月3日 17:21:572011/10/3
收件人 Louis Wasserman、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
Looking at ImmutableMap, you already use a very similar pattern, the
SerializedFrom is used to create a builder, which is then used to
re-construct the deserialised object from the SerializedForm which all
subclasses use.

It's almost there, you almost solved it, however the implementation is
still locked into SerializedForm. If you re-factored it and moved
SerializedFrom to be the Builder's SerializedFrom, then you remove the
dependency.

When you use the builder as the serialized form, if you don't like the
serialized form of the builder, create a new builder (with a new class
name). The builder is responsible for the creation of your
ImmutableMap, and serialization is akin to a constructor, the remote jvm
doesn't deserialise ImmutableMap.SerializedForm, it deserialises the
Builder, which creates ImmutableMap.

Have a depreciation period for the old builder, then an old version of
your ImmutableMap sends an old builder, via serialization, it still
creates an ImmutableMap, during deserialisation. Then when a later
version sends you the new builder, it too creates an ImmutableMap during
deserialization. You can of course now support two completely different
and separate serial forms of your classes.

Then when you resend both ImmutableMap instances (via serialization),
only the new builder instance is sent.

This is forward or evolution compatible. You don't always have to
replace the builder, but it's nice to know the option's there.

Now if someone want's backward compatibility as well, the tricky part,
provided your new builder uses a new class name which doesn't collide
with any classes in the old name space. They can use an RMI codebase
annotation, the code is then dynamically downloaded to the jvm that has
the old version of the library, it loads the class file for the new
builder and builds an ImmutableMap.

This makes it possible to perform a hot upgrades, transferring state
from one jvm to another, then if it something goes wrong, roll back to
the known working state.

In a distributed environment, where you can't guarantee the recipient
has the same class version, you'd treat these classes as though they
were not serialiseable, which is no great hassle, it just means that if
you use dependency injection, you're going to have to add the dependency
in order to rebuild it at the remote end, rather than rely on
serialising the field directly.

The dependency injection could be fixed by creating wrapper used to
encapsulate the dependency, that becomes responsible for serialisation.

Distributed programming is an order of magnitude more difficult, but I'm
working out ways to make it easier, fitting within the limitations of
the jvm.

We have a test framework that makes it possible to test this type of
serialisation, but I can see why you gave up on serialisation compatibility.

Cheers,

Peter.

Louis Wasserman wrote:
> +1 on putting a patch for this in 10.1.
> -1 on making any guarantee, now or ever, as to serialization
> compatibility between Guava releases.
>
> Louis Wasserman

> wasserm...@gmail.com <mailto:wasserm...@gmail.com>

Charles Fry

未读,
2011年10月3日 17:32:072011/10/3
收件人 Peter Firmstone、Louis Wasserman、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
More to the point this is not something we need, so we lack justification to spend time working on it...

On Mon, Oct 3, 2011 at 17:21, Peter Firmstone <ji...@zeus.net.au> wrote:
Looking at ImmutableMap, you already use a very similar pattern, the SerializedFrom is used to create a builder, which is then used to re-construct the deserialised object from the SerializedForm which all subclasses use.

It's almost there, you almost solved it, however the implementation is still locked into SerializedForm.  If you re-factored it and moved SerializedFrom to be the Builder's SerializedFrom, then you remove the dependency.

When you use the builder as the serialized form, if you don't like the serialized form of the builder, create a new builder (with a new class name).  The builder is responsible for the creation of your ImmutableMap, and serialization is akin to a constructor, the remote jvm doesn't deserialise ImmutableMap.SerializedForm, it deserialises the Builder, which creates ImmutableMap.

Have a depreciation period for the old builder, then an old version of your ImmutableMap sends an old builder, via serialization, it still creates an ImmutableMap, during deserialisation.  Then when a later version sends you the new builder, it too creates an ImmutableMap during deserialization.  You can of course now support two completely different and separate serial forms of your classes.

Then when you resend both ImmutableMap instances (via serialization), only the new builder instance is sent.

This is forward or evolution compatible.  You don't always have to replace the builder, but it's nice to know the option's there.

Now if someone want's backward compatibility as well, the tricky part, provided your new builder uses a new class name which doesn't collide with any classes in the old name space. They can use an RMI codebase annotation, the code is then dynamically downloaded to the jvm that has the old version of the library, it loads the class file for the new builder and builds an ImmutableMap.

This makes it possible to perform a hot upgrades, transferring state from one jvm to another, then if it something goes wrong, roll back to the known working state.

In a distributed environment, where you can't guarantee the recipient has the same class version, you'd treat these classes as though they were not serialiseable, which is no great hassle, it just means that if you use dependency injection, you're going to have to add the dependency in order to rebuild it at the remote end, rather than rely on serialising the field directly.

The dependency injection could be fixed by creating wrapper used to encapsulate the dependency, that becomes responsible for serialisation.

Distributed programming is an order of magnitude more difficult, but I'm working out ways to make it easier, fitting within the limitations of the jvm.

We have a test framework that makes it possible to test this type of serialisation, but I can see why you gave up on serialisation compatibility.

Cheers,

Peter.

Louis Wasserman wrote:
+1 on putting a patch for this in 10.1.
-1 on making any guarantee, now or ever, as to serialization compatibility between Guava releases.

Louis Wasserman

http://profiles.google.com/wasserman.louis



On Sat, Oct 1, 2011 at 7:50 PM, Peter <ji...@zeus.net.au <mailto:ji...@zeus.net.au>> wrote:

   Dropping backwards compatibility, while maintaining forward
   evolution compatibility is an acceptable compromise, it removes
   the heart ache and tie in of serial state. Future builders have
   knowledge of earlier versions, so can use logic instead of relying
   on serial form.

   With the builder approach, you keep all non transient field names,
   even when they're no longer in use for evolution compatibility.

   Because the builder takes responsibility for serialisation,
   implementation class private state is never published, they're
   only ever created using constructors (they can even have an
   inheritance hierarchy) so all invariants can be checked or
   defensive copies made and final fields used, to guard against
   stolen references from deserialisation attacks.

   You can completely remove a class in a future version and the
   builder can substitute it with another.

   Remember it's only the builder's non transient fields that must be
   preserved, you can change the logic as much as you like, remove
   add methods etc.

   It helps to use a static class revision field in your builder and
   an object field that indicates the revision of the builder used to
   serialise state.


Peter Firmstone

未读,
2011年10月3日 19:03:212011/10/3
收件人 Charles Fry、Louis Wasserman、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
And you don't seem to be open to the idea of someone else making a
contribution...

...you're generous enough to put your code out there.

I can see by the code that there are some smart developers involved.

So there appears to be two options, if you want to use Guava in
distributed code, or non temporarily Serialize:

1. Use one version of Guava only.
2. Or if you only want to use a few classes, copy them to a new name
space, fix the serialization and monitor and back port any fixes.

Cheers,

Peter.

> wasserm...@gmail.com <mailto:wasserm...@gmail.com>
> <mailto:wasserm...@gmail.com
> <mailto:wasserm...@gmail.com>>


> http://profiles.google.com/wasserman.louis
>
>
>
> On Sat, Oct 1, 2011 at 7:50 PM, Peter <ji...@zeus.net.au

> <mailto:ji...@zeus.net.au> <mailto:ji...@zeus.net.au

> guava-...@googlegroups.com <mailto:guava-...@googlegroups.com>

Louis Wasserman

未读,
2011年10月3日 19:40:042011/10/3
收件人 Peter Firmstone、Charles Fry、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
In all fairness, making this change would require that Guava continue to support version-compatible serialization for all eternity -- even if Peter is writing the original patch, this would require significant work on the part of the Guava maintainers in the future.

I think it's perfectly reasonable to reject that burden if the payoff doesn't seem worthwhile.

Finally, if you're using Guava in a distributed system, you can upgrade Guava versions whenever you like so long as every part of your system is using the same Guava version.  (Google uses these tools for its Java-based projects, too, and manages just fine.)
       <mailto:wasserman.louis@gmail.com

       <mailto:wasserman.louis@gmail.com>>

Peter Firmstone

未读,
2011年10月3日 20:36:432011/10/3
收件人 Louis Wasserman、Charles Fry、Jared Levy、guava-...@googlegroups.com、kev...@google.com、sbe...@gmail.com
Louis Wasserman wrote:
> In all fairness, making this change would require that Guava continue
> to support version-compatible serialization for all eternity -- even
> if Peter is writing the original patch, this would require significant
> work on the part of the Guava maintainers in the future.

Not really, it's actually no work to change the class if your builder
hasn't changed, since serialisation is no longer it's concern. Although
the builder implementations should be package private and separate (not
a static class), so the client doesn't end up depending on the
implementation, then you can change it at will.

You can also completely remove a class, leave the package private
builder implementation behind and have it create another class instead
that implements the same interface or abstract class, and the client
doesn't even know, he/she's just been upgraded to a new implementation.
Then when the new implementation is serialised it uses a new builder
implementation, then after a few versions, remove the old builder, only
support upgrades to the next release version if you want, this reduces
the testing burden and allows a migration path.

For testing, because the new serial form is in a new builder, you just
test each builder deserialises the expected implementation, yes it's
easier to test too, you don't need the old classes, because the old
builders fields haven't changed, just their build target. Serialisation
is decoupled.

Or your existing implementations can change builder, it's just about
decoupling dependencies, you'd do it for any other code would you not?
Didn't Google invent Guice?

Don't really know what all the fuss is about, open your minds. Even if
you don't want to take a contribution for whatever reason (after all
it's your project), there's no harm learning something from outside, if
we share knowledge we all grow, we all have our strengths and weaknesses
or individual experiences, you can teach me something new tomorrow.

>
> I think it's perfectly reasonable to reject that burden if the payoff
> doesn't seem worthwhile.
>
> Finally, if you're using Guava in a distributed system, you can
> upgrade Guava versions whenever you like so long as every part of your

> system is using the /same/ Guava version. (Google uses these tools

> for its Java-based projects, too, and manages just fine.)

That KO's our users, it's not my choice, they just never shut down their
clusters, they perform hot upgrades.

Anyway I've found my solution, thanks for humoring me.

Cheers,

Peter.
>
> Louis Wasserman
> wasserm...@gmail.com <mailto:wasserm...@gmail.com>


> http://profiles.google.com/wasserman.louis
>
>
> On Mon, Oct 3, 2011 at 6:03 PM, Peter Firmstone <ji...@zeus.net.au
> <mailto:ji...@zeus.net.au>> wrote:
>
> And you don't seem to be open to the idea of someone else making a
> contribution...
>
> ...you're generous enough to put your code out there.
>
> I can see by the code that there are some smart developers involved.
>
> So there appears to be two options, if you want to use Guava in
> distributed code, or non temporarily Serialize:
>
> 1. Use one version of Guava only.
> 2. Or if you only want to use a few classes, copy them to a new name
> space, fix the serialization and monitor and back port any fixes.
>
> Cheers,
>
> Peter.
>
> Charles Fry wrote:
>
> More to the point this is not something we need, so we lack
> justification to spend time working on it...
>
> On Mon, Oct 3, 2011 at 17:21, Peter Firmstone
> <ji...@zeus.net.au <mailto:ji...@zeus.net.au>

> <mailto:wasserm...@gmail.com>

> http://profiles.google.com/wasserman.louis
>
>
>
> On Sat, Oct 1, 2011 at 7:50 PM, Peter <ji...@zeus.net.au
> <mailto:ji...@zeus.net.au>
> <mailto:ji...@zeus.net.au <mailto:ji...@zeus.net.au>>
> <mailto:ji...@zeus.net.au <mailto:ji...@zeus.net.au>
>

> <mailto:ji...@zeus.net.au <mailto:ji...@zeus.net.au>>>>

> <mailto:guava-...@googlegroups.com>
> <mailto:guava-...@googlegroups.com
> <mailto:guava-...@googlegroups.com>>

Sam Berlin

未读,
2011年10月3日 21:12:342011/10/3
收件人 Peter Firmstone、Louis Wasserman、Charles Fry、Jared Levy、guava-...@googlegroups.com、kev...@google.com
If you have an existing solution coded & available, it never hurts to take a look at it and see how/if it can fit into Guava.  (I, of course, don't speak for the Guava team itself... more out of my own curiosity.)
 sam

On Mon, Oct 3, 2011 at 8:36 PM, Peter Firmstone <ji...@zeus.net.au> wrote:
Louis Wasserman wrote:
In all fairness, making this change would require that Guava continue to support version-compatible serialization for all eternity -- even if Peter is writing the original patch, this would require significant work on the part of the Guava maintainers in the future.

Not really, it's actually no work to change the class if your builder hasn't changed, since serialisation is no longer it's concern.  Although the builder implementations should be package private and separate (not a static class), so the client doesn't end up depending on the implementation, then you can change it at will.

You can also completely remove a class, leave the package private builder implementation behind and have it create another class instead that implements the same interface or abstract class, and the client doesn't even know, he/she's just been upgraded to a new implementation.  Then when the new implementation is serialised it uses a new builder implementation, then after a few versions, remove the old builder, only support upgrades to the next release version if you want, this reduces the testing burden and allows a migration path.

For testing, because the new serial form is in a new builder, you just test each builder deserialises the expected implementation, yes it's easier to test too, you don't need the old classes, because the old builders fields haven't changed, just their build target. Serialisation is decoupled.

Or your existing implementations can change builder, it's just about decoupling dependencies, you'd do it for any other code would you not?  Didn't Google invent Guice?

Don't really know what all the fuss is about, open your minds.  Even if you don't want to take a contribution for whatever reason (after all it's your project), there's no harm learning something from outside, if we share knowledge we all grow, we all have our strengths and weaknesses or individual experiences, you can teach me something new tomorrow.


I think it's perfectly reasonable to reject that burden if the payoff doesn't seem worthwhile.

Finally, if you're using Guava in a distributed system, you can upgrade Guava versions whenever you like so long as every part of your system is using the /same/ Guava version.  (Google uses these tools for its Java-based projects, too, and manages just fine.)
That KO's our users, it's not my choice, they just never shut down their clusters, they perform hot upgrades.

Anyway I've found my solution, thanks for humoring me.

Cheers,

Peter.
       <mailto:wasserman.louis@gmail.com>
       <mailto:wasserman.louis@gmail.com
       <mailto:wasserman.louis@gmail.com>>
              <mailto:wasserman.louis@gmail.com
       <mailto:wasserman.louis@gmail.com>

              <mailto:wasserman.louis@gmail.com
       <mailto:wasserman.louis@gmail.com>>>
              <mailto:ji...@zeus.net.au <mailto:ji...@zeus.net.au>>>>
       <mailto:guava-discuss@googlegroups.com>
       <mailto:guava-discuss@googlegroups.com

       <mailto:guava-discuss@googlegroups.com>>

Sam Berlin

未读,
2011年10月3日 22:28:262011/10/3
收件人 Peter Firmstone、guava-...@googlegroups.com
I'm leery of any long-term serialization pattern that relies upon Java serialization in any way.  You're stuck with package & class names, unable to ever refactor.  (See: https://github.com/wiregit/wirecode/blob/master/components/common/src/main/java/org/limewire/util/ConverterObjectInputStream.java for some tricks I've used to get around this long ago, but I wouldn't wish that strategy upon anyone.  Also see the various classes in https://github.com/wiregit/wirecode/tree/master/components/gnutella-core/src/main/java/com/limegroup/gnutella/downloader/serial/conversion if you really want the complete picture.)  There's also signification complications with adding or removing variables, or even worse: changing the type signature of a variable.

A practical halfway approach is to use Java serialization, but stick with a plain & simple SerialMemento class (that can never, ever change its name or package) with a single HashMap field.  The writeReplace method would return the memento, and the memento's readResolve method would construct the appropriate implementation (having to know about & convert every possible serialized form from every release, ever).

sam

On Mon, Oct 3, 2011 at 9:51 PM, Peter Firmstone <ji...@zeus.net.au> wrote:
Sam Berlin wrote:
If you have an existing solution coded & available, it never hurts to take a look at it and see how/if it can fit into Guava.  (I, of course, don't speak for the Guava team itself... more out of my own curiosity.)
 sam

Nothing for Guava at the moment, but to see the pattern in use go to:

http://svn.apache.org/viewvc/river/jtsk/skunk/peterConcurrentPolicy/src/org/apache/river/api/security/

Then have a look at PermissionGrantBuilderImp.java

In this case there are several implementations that may be returned by the builder, some classes aren't serializable, so the builder substitutes a NullPermissionGrant instead, which stops null pointer exceptions or class cast exceptions etc..

All of these implementation classes are hidden behind a PermissionGrant interface and a PermissionGrantBuilder abstract class.

I can substitute this builder implementation at any time, changing the serial form, without causing breakage, by just leaving the old builder where it is.

Those with a keen eye might notice that the builder doesn't do much defensive checking of it's invariants and uses an almost default serial form.  Before any permission grants are accepted, the caller must authenticate, the PermissionGrant implementations perform defensive copying and check the invariants.  Once you've got the hang of it, implementing builder serialisation is almost a no brainer.

Cheers,

Peter.

Bob Lee

未读,
2011年10月6日 14:33:352011/10/6
收件人 peter、guava-discuss
On Fri, Sep 30, 2011 at 2:47 PM, peter <ji...@zeus.net.au> wrote:
The best way to handle serialisation evolution is with a serialisation
proxy, since this allows you to use only the public api of your class,
eg constructors.

The next best way is to use writeObject  and readObject methods to
serialise your data structures in their simplest form and rebuild them
during deserialisation.

MapMaker's serialization logic is already sophisticated and even noteworthy: http://is.gd/KELdpR

As you can see, it uses the serialization proxy pattern *and* write/readObject. What's more, it's the first and only usage of the serialization proxy pattern I'm aware of that successfully works around readResolve's broken behavior in the presence of circular dependencies. From the serialization spec:

Note - The readResolve method is not invoked on the object until the object is fully constructed, so any references to this object in its object graph will not be updated to the new object nominated by readResolve. However, during the serialization of an object with the writeReplace method, all references to the original object in the replacement object's object graph are replaced with references to the replacement object. Therefore in cases where an object being serialized nominates a replacement object whose object graph has a reference to the original object, deserialization will result in an incorrect graph of objects. Furthermore, if the reference types of the object being read (nominated by writeReplace) and the original object are not compatible, the construction of the object graph will raise a ClassCastException.

With MapMaker, the serialization proxy also implements ConcurrentMap. If the proxy doesn't get replaced due to a circular dependency, it will simply forward to the deserialized map instance. It introduces a layer of indirection when you deserialize a circular reference, but its behavior is otherwise indistinguishable to the user.

At this point, I doubt the map's serialized form will change in an incompatible way, but I strongly recommend against using serialization for long term persistence. Serialized data is incredibly difficult to migrate, and you end up accumulating and carrying around legacy design cruft for the lifetime of your data.
 
Bob

Bob Lee

未读,
2011年10月6日 14:36:162011/10/6
收件人 Niraj Tolia、Charles Fry、guava-discuss
On Sat, Oct 1, 2011 at 1:06 AM, Niraj Tolia <nto...@maginatics.com> wrote:
There's still a bug here (not introduced by your patch). Each time you serialize and deserialize the map, the removal listener will get wrapped again. When you combine this with the proxy I mentioned in my other email, the levels of indirection will add up quickly!

The serialization proxy should unwrap the listener before serializing it.

Bob 

Peter Firmstone

未读,
2011年10月6日 16:24:192011/10/6
收件人 guava-...@googlegroups.com
Bob Lee wrote:
> On Fri, Sep 30, 2011 at 2:47 PM, peter <ji...@zeus.net.au
> <mailto:ji...@zeus.net.au>> wrote:
>
> The best way to handle serialisation evolution is with a serialisation
> proxy, since this allows you to use only the public api of your class,
> eg constructors.
>
> The next best way is to use writeObject and readObject methods to
> serialise your data structures in their simplest form and rebuild them
> during deserialisation.
>
>
> MapMaker's serialization logic is already sophisticated and even
> noteworthy: http://is.gd/KELdpR
>
> As you can see, it uses the serialization proxy pattern *and*
> write/readObject. What's more, it's the first and only usage of the
> serialization proxy pattern I'm aware of that successfully works
> around readResolve's broken behavior </> in the presence of circular
> dependencies. From the serialization spec:
>
> *Note - *The |readResolve| method is not invoked on the object

> until the object is fully constructed, so any references to this
> object in its object graph will not be updated to the new object
> nominated by |readResolve|. However, during the serialization of

> an object with the |writeReplace |method, all references to the
> original object in the replacement object's object graph are
> replaced with references to the replacement object. Therefore in
> cases where an object being serialized nominates a replacement
> object whose object graph has a reference to the original object,
> deserialization will result in an incorrect graph of objects.
> Furthermore, if the reference types of the object being read
> (nominated by |writeReplace|) and the original object are not

> compatible, the construction of the object graph will raise
> a |ClassCastException|.
>
>
> With MapMaker, the serialization proxy also implements ConcurrentMap.
> If the proxy doesn't get replaced due to a circular dependency, it
> will simply forward to the deserialized map instance. It introduces a
> layer of indirection when you deserialize a circular reference, but
> its behavior is otherwise indistinguishable to the user.
>
> At this point, I doubt the map's serialized form will change in an
> incompatible way, but I strongly recommend against using serialization
> for long term persistence. Serialized data is incredibly difficult to
> migrate, and you end up accumulating and carrying around legacy design
> cruft for the lifetime of your data.
>
> Bob
>
Yes, it's already very good, you've also made a very valid point about
circular references, thank-you and this is also the first time I've seen
someone implement a workaround.

I agree with you about long term persistence, in a distributed
environment, we need a little more compatibility than one version, not
indefinite though, but at least upgradeable.

Summing up the existing good design points:

1. CustomConcurrentHashMap is package private, the client code will
be expecting a ConcurrentMap, making it replaceable / substitutable.
2. The serialization proxy uses a builder, allowing the builder to
substitute CustomConcurrentHashMap, with another implementation if
necessary.
3. The serialization proxy also implements ConcurrentMap, fixing
circular reference issues with readResolve().
4. The serialization proxy delegates to the new ConcurrentMap, built
during deserialization, so any circular references that were not
updated by readResolve() will work as expected. - very clever.
5. The serialization proxy has it's own inheritance hierarchy- good
design.


Its' also worth noting that this wasn't designed with upgrade
flexibility in mind, to create the ultimate serialization pattern (for
distributed computing) based on this design, would only require very
minor refactoring:

1. Move the static internal serialization proxy classes out of
CustomConcurrentHashMap, as package private classes, decoupling
them from CustomConcurrentHashMap.
2. Provide a static factory method on AbstractSerializationProxy to
create a new serialization proxy instance, have
CustomConcurrentHashMap call this instead of constructing the
serialization proxy directly.
3. Rename AbstractSerializationProxy to
AbstractSerializationProxyBuilder, and remove all non transient
state, but provide builder methods to set state in the
serialization proxy. Include a protected method to set a delegate
to maintain the readResolve() circular reference fix.
4. Have the serialization proxy implement all the builder methods and
carry the serialized state.
5. in the readResolve() method, the build() method is called, which
builds the ConcurrentMap, and sets delegation for circular references.


Then you can change the from MapMaker to CacheBuilder at some point in
the future and retrieve a ConcurrentMap implementation from Cache instead.

You can add a new serialization proxy, to change your serialized form
and leave the existing (although it will only get used by older
serialized forms for compatibility).

Then serialization can be utilised whilst performing hot upgrades.

In fact, I like the circular reference solution so much I'd like to
document it on our wiki, provided you're happy for me to do so?

Who can I attribute the readResolve work around to?

http://wiki.apache.org/river/Serialization

Thanks & Regards,

Peter.

Bob Lee

未读,
2011年10月6日 17:04:252011/10/6
收件人 Peter Firmstone、guava-...@googlegroups.com
I agree with your points about decoupling the serialization proxy, and would do that if/when we decided maintaining compatibility was important. Actually, I think CCHM should go in its own private package where it can be shared by CacheBuilder and MapMaker.

On Thu, Oct 6, 2011 at 1:24 PM, Peter Firmstone <ji...@zeus.net.au> wrote:
In fact, I like the circular reference solution so much I'd like to
document it on our wiki, provided you're happy for me to do so?

Sure!
 
Who can I attribute the readResolve work around to?

Me. :-)

Bob

Peter Firmstone

未读,
2011年10月6日 17:39:232011/10/6
收件人 Bob Lee、guava-...@googlegroups.com
Bob Lee wrote:
> I agree with your points about decoupling the serialization proxy, and
> would do that if/when we decided maintaining compatibility was important.

That's basically all the Serialization Builder is, a decoupled proxy, it
uses abstract builder methods to avoid coupling, there are probably
other ways to de-couple, I haven't thought that far... If you think of
any, let me know?

If you guys decide to maintain compatibility, let me know if you want a
hand.

> Actually, I think CCHM should go in its own private package where it
> can be shared by CacheBuilder and MapMaker.
>
> On Thu, Oct 6, 2011 at 1:24 PM, Peter Firmstone <ji...@zeus.net.au
> <mailto:ji...@zeus.net.au>> wrote:
>
> In fact, I like the circular reference solution so much I'd like to
> document it on our wiki, provided you're happy for me to do so?
>
>
> Sure!
>
>
> Who can I attribute the readResolve work around to?
>
>
> Me. :-)
>
> Bob
>

Cool, shall I link a blog, or just a name?

Cheers,

Peter.

Peter Firmstone

未读,
2011年10月6日 20:25:142011/10/6
收件人 Bob Lee、guava-...@googlegroups.com
I've updated our wiki page:

http://wiki.apache.org/river/Serialization

If you've got some time, some peer review would be much appreciated,
along with any suggestions for improvement.

Once implemented, you can even use dependency injection / configuration
to decide which serial versions should be in use by default. You keep
the compatibility with the last version, so when you're performing a hot
upgrade in a distributed environment that's using serialization, you
hold off using the new serialized form until the upgrade is complete,
which may take a few months, but the whole system runs reliability and
business isn't disrupted.

So the upgrade admin could say ok we're hot upgrading from Guava version
10 to 11, set serial form at Guava 10 until the upgrade is complete
(hypothetically of course), then we'll switch to 11. There could be a
number of similar upgrades with other libraries.

Being "The DI King", some guidance or suggestions would be much appreciated?

I'm a committer at river.apache.org, which is an evolution of Jini,
we've still got some problems to solve with security, memory isolation
and class loading, but we're getting closer to solving some of the long
standing hairy problems with distributed computing and Java.

Cheers,

Peter.

Charles Fry

未读,
2011年10月7日 19:06:482011/10/7
收件人 Bob Lee、Niraj Tolia、guava-discuss
There's still a bug here (not introduced by your patch). Each time you serialize and deserialize the map, the removal listener will get wrapped again. When you combine this with the proxy I mentioned in my other email, the levels of indirection will add up quickly!

The serialization proxy should unwrap the listener before serializing it.

Actually I think it's correct since evictionListener does the wrapping, but deserialization calls removalListener.

Niraj Tolia

未读,
2011年10月8日 11:56:292011/10/8
收件人 Charles Fry、Bob Lee、guava-discuss

I seem to have missed the earlier email from Bob. Anyway, I took a
very quick look at the serialization proxy code but, as you guys know
the code better, I will let you figure out if there really is a bug
and would be happy to work on a revised patch.

However, as Charles mentioned, if no bug is present, is there anything
else blocking the next point-release of Guava?

Cheers,
Niraj

Bob Lee

未读,
2011年10月8日 12:13:292011/10/8
收件人 Charles Fry、Niraj Tolia、guava-discuss
On Fri, Oct 7, 2011 at 4:06 PM, Charles Fry <f...@google.com> wrote:
Actually I think it's correct since evictionListener does the wrapping, but deserialization calls removalListener.

Nice. I should have looked more closely.

Bob

Charles Fry

未读,
2011年10月10日 10:08:192011/10/10
收件人 Niraj Tolia、Bob Lee、guava-discuss
However, as Charles mentioned, if no bug is present, is there anything
else blocking the next point-release of Guava?

Nope, it's on its way.

Charles
回复全部
回复作者
转发
0 个新帖子