RR: IsSerializable vs Serializable


Emily Crutcher

Jan 18, 2007, 12:48:54 PM
to Google Web Toolkit Contributors

There has been quite a bit of discussion around allowing users to share certain simple objects, such as Data Transfer Objects (DTOs). The first barrier that a typical user runs into is that GWT cannot process the Serializable interface. An additional consequence of not supporting Serializable is that JRE classes that are designed to be serializable, such as Number and RuntimeException, are difficult to use with GWT RPC, as they require special field serializers.

 

This RR first outlines two solutions that would enable the sharing of server and client code and then describes some additional RPC support that would be needed if either solution is implemented.

 

RPC Design Goals

For context, we list here the current design goals for RPC.


-    Only translatable objects are sent down the wire
-    Simplicity/ease of use. Specifically, no external proxy tools.
-    Asynchronous operation
-    JavaScript code side impact should be minimal
-    Smallest wire format possible
-    Fast encode/decode of objects
-    Polymorphic support

  

In both solutions below, custom field serializers would still be respected and IsSerializable would be deprecated.

 

 

 

Removing IsSerializable, Solution 1:

Allow classes to implement the Serializable interface.  GWT would ignore this interface.  

 

All translatable objects would potentially be serializable.

 

The following algorithm would be used to determine the types that are actually included as serializable types per module:

 

Initial serializable set is composed of all types specified in all methods of any Service interface reachable by the module's entry point.

 

 Repeat until stable

        For every type in the serializable set, add all of that type's subtypes.

        For every type in the serializable set, add all that type's non-transient field types

        Report an error if java.lang.Object has been added to the serializable set.            
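For illustration, the fixed-point computation above can be sketched in plain Java. This is only a sketch under stated assumptions: `TypeInfo`, `SerializableSetSketch`, and the string-keyed type model are hypothetical stand-ins for whatever type information the GWT compiler actually exposes, and a worklist is used in place of the literal "repeat until stable" loop (it reaches the same closure).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for compiler type information (not a GWT class). */
final class TypeInfo {
    final List<String> subtypes;               // direct subtypes of the type
    final List<String> nonTransientFieldTypes; // declared types of non-transient fields

    TypeInfo(List<String> subtypes, List<String> nonTransientFieldTypes) {
        this.subtypes = subtypes;
        this.nonTransientFieldTypes = nonTransientFieldTypes;
    }
}

final class SerializableSetSketch {
    /** Closes the seed types under "add subtypes" and "add non-transient field types". */
    static Set<String> compute(Set<String> serviceTypes, Map<String, TypeInfo> model) {
        Set<String> serializable = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(serviceTypes);
        // A worklist reaches the same fixed point as "repeat until stable".
        while (!pending.isEmpty()) {
            String type = pending.pop();
            if ("java.lang.Object".equals(type)) {
                throw new IllegalStateException(
                    "java.lang.Object was added to the serializable set");
            }
            if (!serializable.add(type)) {
                continue; // already expanded
            }
            TypeInfo info = model.get(type);
            if (info != null) {
                pending.addAll(info.subtypes);
                pending.addAll(info.nonTransientFieldTypes);
            }
        }
        return serializable;
    }

    public static void main(String[] args) {
        // Shape has a subtype Circle, and Circle has a non-transient Point field.
        Map<String, TypeInfo> model = Map.of(
            "Shape", new TypeInfo(List.of("Circle"), List.of()),
            "Circle", new TypeInfo(List.of(), List.of("Point")));
        System.out.println(compute(Set.of("Shape"), model));
    }
}
```

Seeding the computation with a single service type `Shape` pulls in `Circle` (a subtype) and then `Point` (a field type of `Circle`), which is the kind of growth the Cons below are concerned about.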

 

 

Note: gwt.typeArgs can change the GWT-perceived type signature of classes. For instance, an ArrayList annotated with gwt.typeArgs as ArrayList<Integer> would add only ArrayList, Integer, and their subtypes to the serializable set of types, regardless of the actual class structure.
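As a reminder of what such an annotation looks like, here is a small sketch assuming the GWT 1.x Javadoc-tag form of gwt.typeArgs. `InventoryService` and its methods are hypothetical; a real service interface would also extend com.google.gwt.user.client.rpc.RemoteService, which is left out here so the snippet compiles without GWT on the classpath.

```java
import java.util.ArrayList;

/**
 * Hypothetical service interface used only for illustration.
 * A real GWT 1.x service would extend RemoteService.
 */
interface InventoryService {
    /**
     * The Java signature uses a raw ArrayList; the Javadoc tag tells the
     * RPC generator that the elements are Integers.
     *
     * @gwt.typeArgs <java.lang.Integer>
     */
    ArrayList getQuantities();

    /**
     * For a parameter, the tag names the parameter before the type argument.
     *
     * @gwt.typeArgs quantities <java.lang.Integer>
     */
    void setQuantities(ArrayList quantities);
}
```

With these tags, the seed types contributed by the two methods would be ArrayList and Integer rather than ArrayList and everything an untyped list could hold.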

 

  

Pros

Allows users to freely include JRE types in their service interface declarations

 

Users would not need to modify their classes to implement either Serializable or IsSerializable

 

 

Cons

Code bloat: As the user cannot limit the classes considered for inclusion, it may cause code bloat by adding too many classes to the JavaScript executable.

 

Runtime exceptions. It would be relatively easy for the algorithm above to process a type for serialization that was never intended by the user to be serialized, such as Widgets, Elements, or XML objects. Unless the type was included in a blacklist, this would not be detected until runtime, when the exception would be thrown into server-side logs.

 

 

 

Removing IsSerializable, Solution 2:

Use Serializable instead of IsSerializable. RPC's use of Serializable would not support the full Serializable contract, because RPC would not respect the writeObject(ObjectOutputStream) and readObject(ObjectInputStream) methods.

 

Instead of including all types reachable from a service interface as serializable, only those implementing Serializable would be included.
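A minimal sketch of that extra filter, assuming the candidate types are available as `Class` objects (the RPC generator works on source-level type information, so `SerializableFilterSketch` and `keepSerializable` are hypothetical names used only for illustration):

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

final class SerializableFilterSketch {
    /** Keeps only the reachable classes that implement java.io.Serializable. */
    static List<Class<?>> keepSerializable(List<Class<?>> reachable) {
        List<Class<?>> kept = new ArrayList<>();
        for (Class<?> candidate : reachable) {
            if (Serializable.class.isAssignableFrom(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Integer and Date implement Serializable; Thread does not.
        List<Class<?>> reachable = List.<Class<?>>of(Integer.class, Thread.class, Date.class);
        System.out.println(keepSerializable(reachable));
    }
}
```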

 

 

Design decisions

Why not support writeObject(ObjectOutputStream)/readObject(ObjectInputStream) ?

 

The primary reason is that there are methods and fields on these streams that we never plan to support, for example annotateProxyClass, useProtocolVersion, and write(byte[] buf), so we cannot actually conform to the contract of ObjectOutputStream or ObjectInputStream.
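To make the gap concrete, here is a sketch of a class that relies on the hooks in question. `Temperature` and `CustomHooksSketch` are hypothetical. Standard JVM serialization calls the private writeObject/readObject methods, so the round trip below preserves the value; a serializer that only walks non-transient fields, as RPC is described as doing, would presumably never see `celsius`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical DTO whose only state is written by custom hooks. */
final class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;

    transient double celsius; // skipped by default serialization, written by hand below

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Called reflectively by ObjectOutputStream.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt((int) Math.round(celsius * 10)); // store tenths of a degree
    }

    // Called reflectively by ObjectInputStream.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        celsius = in.readInt() / 10.0;
    }
}

final class CustomHooksSketch {
    /** Serializes and deserializes the value with standard JVM serialization. */
    static Temperature roundTrip(Temperature value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Temperature) in.readObject();
        }
    }
}
```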

 

 

Pros

Allows users to include any JRE or Data Transfer Object that is translatable and implements Serializable

 

Should be relatively intuitive to users who are familiar with Serializable

 

 

Cons

A user could accidentally increase code bloat, though probably not as much as Solution 1. For example, a user who passed in a Number to an RPC method would automatically include the code for Integer, Long, Float, and Double in their client-side code.

 

A user still must mark classes that are going to be used by RPC as Serializable.

 

The full Serializable contract is not respected because readObject/writeObject methods are never used.

 

 

 

Allowing readObject/writeObject

It would be frustrating for users to have objects that are in all other ways translatable but that cannot be translated because they include readObject/writeObject serialization methods. It is theoretically possible to hack the GWT compiler to skip these methods. Of course, the compiler hacker would then have to take a very long bath.

 

We do not want to support this functionality in the general case, so using annotations, for instance, to indicate a non-translatable method body is not currently considered a viable solution. Similarly, we do not want to imply that InputStreams or OutputStreams are supported, so we would not want to include definitions of those methods in the GWT JRE.

 

 

Minimizing code bloat

Allow the user to specify either a blacklist or a whitelist to give more control over exactly what types are serializable. If a whitelist is included, those types and any subtypes of those types are considered serializable. If a blacklist is included, those types and any subtypes would be excluded. An error would be generated if there are service interface methods that cannot be fulfilled with the given whitelist or blacklist.
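One way the subtype-aware matching could work is sketched below. `TypeFilterSketch` is a hypothetical name, `Class` objects stand in for the compiler's type model, and letting the blacklist win over the whitelist is an assumption of this sketch, not something the proposal specifies.

```java
import java.util.List;

final class TypeFilterSketch {
    private final List<Class<?>> whitelist; // empty means "no whitelist given"
    private final List<Class<?>> blacklist; // empty means "no blacklist given"

    TypeFilterSketch(List<Class<?>> whitelist, List<Class<?>> blacklist) {
        this.whitelist = whitelist;
        this.blacklist = blacklist;
    }

    /** A type is allowed if no blacklist root covers it and, when a whitelist exists, one of its roots does. */
    boolean allows(Class<?> type) {
        if (coveredBy(blacklist, type)) {
            return false; // assumption: the blacklist takes precedence
        }
        return whitelist.isEmpty() || coveredBy(whitelist, type);
    }

    /** True if the type is one of the roots or a subtype of one of them. */
    private static boolean coveredBy(List<Class<?>> roots, Class<?> type) {
        for (Class<?> root : roots) {
            if (root.isAssignableFrom(type)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, whitelisting `Number` would admit `Integer` but not `String`, and blacklisting `Exception` would also exclude `RuntimeException`.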

 

Unexpected Subtypes

Once a system such as Hibernate or JDO is used to produce DTO objects, that subsystem can assign subtypes to any field of the DTO. Some common examples of this problem are Hibernate's tendency to use proxies that are subclasses of the actual DTO objects and JDBC's tendency to return java.sql.Date objects instead of java.util.Date.

 


--
"There are only 10 types of people in the world: Those who understand binary, and those who don't"

Dan Morrill

Jan 18, 2007, 1:53:42 PM
to Google-Web-Tool...@googlegroups.com

Good stuff, Emily!  I'm sure we've got a lot of users who will be very interested in this thread. :)

I dislike violating the semantics of the Serializable contract.  I like the freedom (from a code perspective) implicit in not having a serializable marker at all, but I am very concerned by the code bloat issue you mention.

So tentatively I favor option 1), but with the following addition:  why don't we require users to specify serializable types externally to the .java source file?  (Alternatively, merely allow them to, so that they can choose to use the feature if they notice code bloat in practice.)

Here is what I am thinking.  Currently, we already require users to tell us something about their Java source code, via a route external to the source files:  namely, whether a class is translatable or not.  This is accomplished via the <source> tag in the Module.gwt.xml file.  The serializability of a class is simply an additional meta-info tag about the class.  ("It is translatable, and further it is serializable.")

Would it be practical to provide a hook in Module.gwt.xml to mark packages containing serializable code?  Something like:

<module>
    <inherits name="com.google.gwt.user.User"/>
    <entry-point class="com.company.app.client.EntryPoint"/>
    <source path="client"/>
    <source path="dto"/>
    <serializable path="dto"/>
</module>

I admit that I don't have a lot of in-depth knowledge of the guts of the serialization stuff, so please forgive if the above makes no sense. :)

- Dan

Bogdan Alexandru Costea

Jan 18, 2007, 2:23:36 PM
to Google Web Toolkit Contributors
I'm in support of 1 also.
IsSerializable is a nuisance, as it forces you to create an extra copy
of your model to use as DTO, if you don't want to tie your server-side
code to GWT.

I think that Dan's proposal is excellent, of creating a serializable
"white list", that the developer maintains.

On compilation, a nice message should appear that instructs the
developer to include his serializable model package in the *.gwt.xml
file.

Bogdan

Scott Blum

Jan 18, 2007, 7:30:39 PM
to Google-Web-Tool...@googlegroups.com
On 1/18/07, Emily Crutcher <e...@google.com> wrote:
> Cons [for Solution 1, Inferring Serializable Types]
>
> Code bloat: As the user cannot limit the classes considered for inclusion,
> it may cause code bloat by adding too many classes to the javascript
> executable.
>
> Runtime exceptions. It would be relatively easy for the algorithm above to
> process a type for serialization which was never intended by the user to be
> serialized, such as Widgets, Elements, XML objects. Unless the type was
> included in a blacklist, this would not be detected until runtime, where the
> exception would be thrown into server side logs.

I'm a big fan of the whitelist/blacklist idea to allow the developer
to minimize code size as much as possible. In fact, I'm about to
argue that pushing whitelist/blacklist would most often lead to less
code bloat.

I think the right developer experience involves a feedback cycle.
1) Create a service interface
2) Compile your app
3) A list of all serializable types is produced (*)
4) Developer inspects the output list and adjusts whitelist/blacklist
5) Repeat 2-4 until happy.

Some advantages to this approach include module inheritability of
whitelist/blacklist. For example, we could blacklist all of the
Widgets, because they are inherently not serializable. This would
force a developer to either specifically whitelist them (and accept
the consequences) or mark transient any fields in serializable types
that reference Widgets. It's also a lot easier to allow a developer
to use metadata to control this than to potentially force them to add
"Serializable" to objects that may be part of someone else's library.

*) The other great side effect is this list of serializable types
produced can be used server side to guarantee the server doesn't try
to send any types across that the client cannot handle. The error is
logged on the server rather than caught on the client. This "list"
could either be something like an XML file, or it could actually be a
set of FooSerializer.java classes that become server-side
serialization implementation classes; thus allowing fast server-side
serialization without even having to use reflection, potentially. Or
both. :)

> Unexpected Subtypes
>
> Once a system such as Hibernate or JDO is used to produce DTO objects, then
> that subsystem can assign subtypes to any field of the DTO. Some common
> examples of this problem is Hibernate's tendency to use proxies that are
> subclasses of the actual DTO objects and JDBC's tendency to return
> java.sql.Date objects instead of java.util.Date.

I think this also ties into the idea that the compile process can
generate metadata about what is serializable. With that metadata, it
is easy to see how the following scenario plays out:

1) Server tries to send a java.sql.Date.
2) Serialization detects that java.sql.Date is not serializable to the
client, but the superclass java.util.Date is.
3) Server serializes as java.util.Date but logs a warning
4) Developer sees the warning and adds metadata to map java.sql.Date
-> java.util.Date.
5) Server no longer complains.
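Steps 2 and 3 of that scenario amount to walking up the superclass chain until a type the client knows about is found. A rough sketch follows; `SupertypeFallbackSketch` and the `clientTypes` set are hypothetical, and the set stands in for the metadata the compile step would generate.

```java
import java.util.Set;

final class SupertypeFallbackSketch {
    /**
     * Returns the closest class in the superclass chain that the client can
     * deserialize, or null if there is none. The caller would log a warning
     * whenever the result differs from the runtime type.
     */
    static Class<?> resolve(Class<?> runtimeType, Set<Class<?>> clientTypes) {
        for (Class<?> c = runtimeType; c != null; c = c.getSuperclass()) {
            if (clientTypes.contains(c)) {
                return c;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<Class<?>> clientTypes = Set.<Class<?>>of(java.util.Date.class);
        Class<?> target = resolve(java.sql.Date.class, clientTypes);
        if (target != java.sql.Date.class) {
            System.err.println("warning: serializing java.sql.Date as " + target.getName());
        }
    }
}
```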

Scott

Dan Morrill

Jan 18, 2007, 7:40:37 PM
to Google-Web-Tool...@googlegroups.com

I have one comment, below...

On 1/18/07, Scott Blum <sco...@google.com> wrote:
> I think the right developer experience involves a feedback cycle.
> 1) Create a service interface
> 2) Compile your app
> 3) A list of all serializable types is produced (*)
> 4) Developer inspects the output list and adjusts whitelist/blacklist
> 5) Repeat 2-4 until happy.

Running the compiler to get this data seems painful, particularly for users who use primarily hosted mode.  Can there be a utility provided to do this scan for the user?

- Dan

Scott Blum

Jan 18, 2007, 7:44:20 PM
to Google-Web-Tool...@googlegroups.com
Actually, hitting "refresh" in hosted mode works for step 2 as well, because Generators run both in hosted mode and during a compile.

Emily Crutcher

Jan 19, 2007, 7:12:09 AM
to Google-Web-Tool...@googlegroups.com
We could also take the whitelist/blacklist idea one step further by creating an editor for it off of our hosted mode that is only included when the RPC module is active. That would make it harder for people to avoid looking at the serializable types, and would therefore address some of my concerns.

Bruce Johnson

Jan 19, 2007, 8:39:38 AM
to Google-Web-Tool...@googlegroups.com
I prefer #1.


On 1/18/07, Dan Morrill <morr...@google.com> wrote:

> I dislike violating the semantics of the Serializable contract.

Agreement. IMO, it's just too much of a minefield for developers to quasi-support Serializable without being able to do something at least reasonable with readObject/writeObject/etc. (which we can't).

> So tentatively I favor option 1), but with the following addition:  why don't we require users to specify serializable types externally to the .java source file?

We considered this originally and decided (and I still strongly believe) that it's way too much of a hassle to have to *always* explicitly name the serializable classes. Requiring it could discourage rich type hierarchy design, because every new subclass would require a corresponding change in .gwt.xml. This crosses the line into meta-data hell, and that's why we decided to infer all the classes automatically. I'm *not* saying that we shouldn't support whitelist/blacklist optimizations, but they should be just that: optimizations, and not in the critical path.

> (Alternatively, merely allow them to, so that they can choose to use the feature if they notice code bloat in practice.)

We could break implementation of this idea into iterations, so that we could proceed wisely as we add each new bit of functionality:

1) Serialize everything as described under #1 without whitelist or blacklist support
2) Find a way to make it really easy to inform the developer exactly which classes are getting pulled in, like some sort of compile report (or maybe just a good tree logger entry)
3) Add blacklist support such that any type mentioned in the module (and its subtypes) would not be a candidate for serialization.
4) If blacklist support alone isn't good enough, add whitelist support so that you can explicitly allow any types you want (and their subtypes, implicitly).

Even if we just do iteration (1) above, I wouldn't say that I'm "worried" about code bloat yet because I'd like to think the compiler can do a good enough job optimizing that it wouldn't be devastating to casual users.

> Here is what I am thinking.  Currently, we already require users to tell us something about their Java source code, via a route external to the source files:  namely, whether a class is translatable or not.  This is accomplished via the <source> tag in the Module.gwt.xml file.  The serializability of a class is simply an additional meta-info tag about the class.  ("It is translatable, and further it is serializable.")
>
> Would it be practical to provide a hook in Module.gwt.xml to mark packages containing serializable code?  Something like:
>
> <module>
>     <inherits name="com.google.gwt.user.User"/>
>     <entry-point class="com.company.app.client.EntryPoint"/>
>     <source path="client"/>
>     <source path="dto"/>
>     <serializable path="dto"/>
> </module>

This formulation may have potential, but it may not be flexible enough, since you might want to have several (or one large) serializable domain models under the same package even though you only want to use different subsets in different GWT modules. IOW, maybe specifying whitelists/blacklists via types would be more precise. And by whitelisting a base class, you implicitly whitelist its subclasses, so you can still easily whitelist a family of types without undue config.

-- Bruce

Bruce Johnson

Jan 19, 2007, 8:43:43 AM
to Google-Web-Tool...@googlegroups.com
On 1/19/07, Emily Crutcher <e...@google.com> wrote:
> We could also take the whitelist/blacklist idea one step further by creating an editor for it off of our hosted mode that is only included when the RPC module is active. That would make it harder for people to avoid looking at the serializable types, and would therefore address some of my concerns.

I think this comment shows why we don't want to strictly require whitelists/blacklists for RPC: it complicates the simple case. Already, we're talking about creating additional software just to support a concept that ultimately is very low-level and (probably) only matters when you start thinking about optimizations -- which should be much later in the cycle.

Hosted mode should be all about being able to quickly get your work started without thinking too hard. I'd rather we spend time thinking about ways to make hosted mode require *less* thought and setup rather than more.

Miguel Méndez

Jan 19, 2007, 9:38:27 AM
to Google-Web-Tool...@googlegroups.com
I agree with Bruce on this one.  The whitelist/blacklist component should only serve to tune the RPC system but it should not be a requirement to get it up and running.
--
Miguel

Emily Crutcher

Jan 19, 2007, 10:43:59 AM
to Google-Web-Tool...@googlegroups.com
Why would an optional RPC class inclusion editor make it harder to use hosted mode? The use case I'm particularly worried about in hosted mode is one where the user implements an RPC interface, starts up his/her program, and the server immediately crashes because a client-only class was sent to the server. It seems like useful debugging to be able to look at the RPC class inclusion editor and figure out what classes were included and immediately exclude them. If the server doesn't crash and the user doesn't care what classes are included, the user wouldn't click on the button displaying the editor.

Scott Blum

Jan 19, 2007, 11:02:50 AM
to Google-Web-Tool...@googlegroups.com
On 1/19/07, Bruce Johnson <br...@google.com> wrote:
> I think this comment shows why we don't want to strictly require
> whitelists/blacklists for RPC: it complicates the simple case. Already,
> we're talking about creating additional software just to support a concept
> that ultimately is very low-level and (probably) only matters when you start
> thinking about optimizations -- which should be much later in the cycle.

Actually, I totally agree with this, though I didn't say it. The user
experience should be that it just "works". I do think that w/b/lists
are a necessary part of the feature though, if only to head off at the
pass the "GWT RPC made my app 500k!" complaint.

Funny thought I had about this feature: even if we go with solution #1
a user could actually whitelist java.io.Serializable (and things that
implement it) to achieve the same effect as #2. :)

Scott

Bogdan Alexandru Costea

Jan 19, 2007, 12:37:26 PM
to Google Web Toolkit Contributors

It gives you the freedom to optimize your app and at the same time
allows one to shoot himself in the foot. Perfect.

Bogdan

Bruce Johnson

Jan 19, 2007, 6:50:08 PM
to Google-Web-Tool...@googlegroups.com
On 1/19/07, Bogdan Alexandru Costea <abco...@gmail.com> wrote:

> It gives you the freedom to optimize your app and at the same time
> allows one to shoot himself in the foot. Perfect.

With great freedom comes great responsibility :-)

To get some closure on this (yay!!!)... does anyone think we should not go with #1 without *requiring* w/b lists, and then, perhaps in a subsequent iteration, add w/b list support?

Bruce Johnson

Jan 19, 2007, 6:55:53 PM
to Google-Web-Tool...@googlegroups.com
On 1/19/07, Emily Crutcher <e...@google.com> wrote:
> Why would an optional RPC class inclusion editor make it harder to use hosted mode? The use case I'm particularly worried about in hosted mode is one where the user implements an RPC interface, starts up his/her program, and the server immediately crashes because a client-only class was sent to the server. It seems like useful debugging to be able to look at the RPC class inclusion editor and figure out what classes were included and immediately exclude them. If the server doesn't crash and the user doesn't care what classes are included, the user wouldn't click on the button displaying the editor.

I just meant that I'm wary of a feature design decision that makes you feel like you need to write additional software just to make the use of that feature manageable. If we automatically determine serializability by default (which I strongly advocate), then 98% of the time you don't ever need to think about this issue in hosted mode at all -- thus no new hosted mode feature need be added. Only when you move into the "let's optimize this for production" phase do you need to start thinking about w/b lists.

Sandy McArthur

Jan 19, 2007, 9:00:38 PM
to Google Web Toolkit Contributors
Bruce Johnson wrote:
> To get some closure on this (yay!!!)... does anyone think we should not go
> with #1 without *requiring* w/b lists, and then, perhaps in a subsequent
> iteration, add w/b list support?

I think #1 is the way to go.

I'm not sure w/b lists will help that much. I think it would be better
if GWT's RPC provided an optional way to specify type converters or
normalizers. These could provide the necessary meta data so there isn't
any extra code bloat for unused types and deal with unexpected
subtypes.

More common than the unexpected subtypes from Hibernate or
java.util/sql.Date is from collections. The result from
ArrayList.subList is not a subtype of ArrayList, it's a
java.util.SubList or java.util.RandomAccessSubList but I still want the
subList sent as an ArrayList across RPC. Also a type normalizer could
allow translations between heavy weight server side classes and light
weight client side bean classes.

Anyway, is the new RPC going to replace the existing RPC implementation
or live in another package? Would it be worth developing this as an
external lib to GWT so people can download it and try it out easily?

Scott Blum

Jan 19, 2007, 10:18:11 PM
to Google-Web-Tool...@googlegroups.com
On 1/19/07, Sandy McArthur <sand...@gmail.com> wrote:
> More common than the unexpected subtypes from Hibernate or
> java.util/sql.Date is from collections. The result from
> ArrayList.subList is not a subtype of ArrayList, it's a
> java.util.SubList or java.util.RandomAccessSubList but I still want the
> subList sent as an ArrayList across RPC. Also a type normalizer could
> allow translations between heavy weight server side classes and light
> weight client side bean classes.

These are great use cases.  I'm a little unclear though on your idea of type converters.  Could you go into a little more detail?

> Anyway, is the new RPC going to replace the existing RPC implementation
> or live in another package? Would it be worth developing this as an
> external lib to GWT so people can download it and try it out easily?

I think the answers to both of these questions are "yes".  We will want to support backwards compatibility, so we'll need to keep the existing RPC implementation.  But we'll also want the freedom to figure out the best solution.  IMHO, if at the very end we arrive at a "best solution" and find out it could easily be backwards compatible with the existing RPC, then we could consider replacing the old stuff.  Disagreements welcome. :)

Scott

Sandy McArthur

Jan 20, 2007, 10:26:07 AM
to Google Web Toolkit Contributors
Scott Blum wrote:
> I'm a little unclear though on your idea of type
> converters. Could you go into a little more detail?

No. :) The converter idea came to me as I was writing about normalizing
the subLists and I hadn't thought it through. Having slept on it, I can
say it wouldn't work in a compatible way with the current RPC because
of strong typing. In the end it wouldn't provide any benefit to the
programmer over converting object types manually and would likely be
more confusing.

A normalizer wouldn't have the typing problem. Most of the time it
would just create a new instance and copy the data over. Eg: pulling
values from a proxy object created by hibernate and stuffing the values
into a fresh instance. In the case of collections or anywhere there is
an interface and a lot of sibling implementations the normalizer would
create an instance of the RPC friendly implementation and copy data
into it. This could allow some conversion to go on too but it would be
limited to type compatible conversions.

My speculation on how it'd work is that the RPC servlet would take the
return type if it wasn't a primitive and look for a normalizer in a map
for the called method's return type. If found it would pass the
instance to a normalizer. Then where the rpc servlet grabs fields it
would repeat the normalizing process for the type each field was
declared as.
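A rough sketch of that lookup, keyed by declared type, might look like the following. `NormalizerSketch`, `register`, and `normalize` are hypothetical names; nothing like this exists in the current RPC servlet, and the recursion over fields is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class NormalizerSketch {
    // Declared type -> function that copies a value into an RPC-friendly instance.
    private final Map<Class<?>, UnaryOperator<Object>> normalizers = new HashMap<>();

    void register(Class<?> declaredType, UnaryOperator<Object> normalizer) {
        normalizers.put(declaredType, normalizer);
    }

    /** Applies the normalizer registered for the declared type, if there is one. */
    Object normalize(Class<?> declaredType, Object value) {
        UnaryOperator<Object> normalizer = normalizers.get(declaredType);
        if (normalizer == null || value == null) {
            return value; // nothing registered: pass the value through unchanged
        }
        return normalizer.apply(value);
    }

    public static void main(String[] args) {
        NormalizerSketch sketch = new NormalizerSketch();
        // Anything declared as List is copied into a plain ArrayList.
        sketch.register(List.class, v -> new ArrayList<Object>((List<?>) v));

        List<String> view = new ArrayList<>(List.of("a", "b", "c")).subList(0, 2);
        Object sent = sketch.normalize(List.class, view);
        System.out.println(view.getClass() + " -> " + sent.getClass());
    }
}
```

The subList view is not an ArrayList, but the registered normalizer copies it into one before it would be handed to the serializer, which is the collection case raised earlier in the thread.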

Scott Blum

Jan 20, 2007, 2:33:37 PM
to Google-Web-Tool...@googlegroups.com
Then I think we're on the same page, if I can elaborate on what you wrote with "The RPC servlet knows what types should be normalized as what because it is initialized with metadata output from the RPC code generation process." :)