On 28.09.2015 at 04:32, 'Osvaldo Doederlein' via guava-discuss wrote:
> On Sat, Sep 26, 2015 at 3:28 AM, Joachim Durchholz <j...@durchholz.org> wrote:
>
>> On 26.09.2015 at 01:30, 'Osvaldo Doederlein' via guava-discuss wrote:
>>
>> There are scenarios where it's different.
>> E.g. serializing a stream of versions of a large copy-on-write object
>> graph.
>> This could be used for backups, or for failover-to-standby scenarios, in
>> situations where a rare loss of work from a limited period of time is
>> preferable to regular loss of availability (say, game servers, or compute
>> servers that feed a continuous but interruptible process).
>
> Well the best solution for that is usually protobuf :) But if you need to
> support non-proto objects which contain sharing and cycles, you can still
> customize Java serialization and often achieve significant advantages in
> both speed and stream size compared to default serialization.
Well, I have yet to hear anybody recommend doing that in anger.
Default serialization isn't good, and adapting it is awkward and
error-prone - there's a reason why people still build serialization
libraries for Java :-)
> Avoiding the
> cost of sharing/canonicalization when you don't need it (or doing it
> manually) is one of many tricks that help optimize serialization.
> go/effectivejava has a whole chapter on Java Serialization, you need to
> check this if your use of serialization is sufficiently heavy or
> sophisticated that you're running into problems like OOME and considering
> solutions like customizing ObjectOutputStream.
Heh.
Well, the project that's sparking my interest in these issues is
more-or-less shelved, so I'm in no hurry to decide anything.
Kryo has been looking best to me: fast, bloat-avoiding, can deal with
sharing (and consequently cycles), comes with ready-made options for
most of the optimizing tricks one would want to do.
I can't vouch for it since I haven't used it in anger yet, but at
least the project is getting its priorities right.
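For reference, the sharing trade-off mentioned in the quoted text can be
shown with plain java.io (this is not Kryo). A minimal sketch, assuming a
hypothetical class name SharingDemo and a small int[] as the payload:
writeObject() remembers every object it has written so that repeats become
back-references, while writeUnshared() writes the object as new each time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class; the name and payload are illustration only.
public class SharingDemo {

    /** Writes the same array twice, reads both back, and reports whether
     *  the two deserialized references point to one instance. */
    public static boolean sameInstanceAfterRoundTrip(boolean shared)
            throws IOException, ClassNotFoundException {
        int[] payload = {1, 2, 3};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < 2; i++) {
                if (shared) {
                    // Second write is emitted as a back-reference to the first.
                    out.writeObject(payload);
                } else {
                    // Always written as a new object; identity is not tracked.
                    out.writeUnshared(payload);
                }
            }
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Object first = in.readObject();
            Object second = in.readObject();
            return first == second;
        }
    }
}
```

With shared writes the reader gets one instance back twice; with unshared
writes it gets two independent copies. The price of the unshared path is
that object identity is lost. For long-lived streams, calling
ObjectOutputStream.reset() between top-level objects is the other standard
way to keep the stream from holding on to everything it has written.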
>>> The PL
>>> research community has put automatic object serialization in the list of
>>> bad ideas some two decades ago,
>>
>> Is there a reference to that?
>> I've been loosely following that community, but that particular conclusion
>> slipped by me. I'd like to check what assumptions this conclusion was bound
>> to, and whether they still hold today.
>
> That's a hard question. :) I remember reading some compendium about either
> OODBs or GC that lambasted serialization, that was in 1999~2000 while doing
> my MSc. There's a good paper specific to Java that I could recall and find,
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.6548&rep=rep1&type=pdf.
Ah I see.
It's going through several issues, of which some exist at the JVM level,
some in Java, some just in Serializable. Some points aren't actually a
problem specific to serialization; e.g. section 8 is essentially a
rehash of the well-known fact that requiring read consistency requires
immutability, copy-on-write, or locking.
I think the issues with Serializable can be solved by using a different
library (Kryo and Protobuf have been mentioned).
The other issues need to be worked around, and that seriously limits the
usefulness of serialization in Java.
> But the most important signal is that Serialization is a dead research
> subject; nobody works on this stuff anymore, neither in academia nor in the
> industry.
Yeah, getting the semantics of serialization right does require some
serious thinking and design.
Issues I have encountered involve:
* Dealing with once-per-JVM stuff: singletons, static data, file handles.
* Mutability. You can't serialize a mutable object to another JVM and
expect the semantics to remain unchanged. You need to serialize a proxy
instead.
* Class/data versioning. You'll need conversion code to upgrade
old-class data to new classes, and that code would need to be validated
because it's really easy to hide nasty bugs in that.
* Bytecode generation at runtime combines mutability and class/data
versioning issues.
Much of that is hard or impossible to do in Java as it is, so I agree
it's mostly dead in Java.
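To make the first point concrete: for singletons, Serializable does at
least offer a hook. A minimal sketch, assuming a hypothetical class
Registry that must exist once per JVM: the readResolve() method lets the
class swap the freshly deserialized copy for the canonical instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo; Registry stands in for any once-per-JVM object.
public class SingletonDemo {

    public static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        public static final Registry INSTANCE = new Registry();

        private Registry() {}

        // Invoked by ObjectInputStream after the object has been read;
        // whatever is returned here replaces the deserialized copy.
        private Object readResolve() {
            return INSTANCE;
        }
    }

    /** Serializes an object to a byte array and reads it back. */
    public static Object roundTrip(Object o)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

Without readResolve(), the round trip would produce a second Registry
instance despite the private constructor. This only covers the singleton
case; static data and file handles get no comparable help.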
I'm somewhat surprised that programming language research has given up on
this. All of the points above have a good solution, with the
exception of mutability if you have a proxy and an unreliable
authoritative object; even that is a bit surprising, since PL researchers
tend to dislike mutability anyway.
> Java's serialization API, and its implementation, have not changed in
> years, it's a dead-end technology so nobody is interested in investing in it
> even if there are still some opportunities for improvement.
Heh. I certainly wouldn't want to invest in Serializable myself.