Circular References Implementation Status


Arthur Blake

Sep 30, 2007, 12:07:17 AM
to jabsor...@googlegroups.com
All of the following is a bit esoteric, but I'd encourage anyone interested in the details of the circular references implementation to read it, think about it and comment if they have any issues or insight-- because circular references will change the future of the library (perhaps profoundly.)

I've finally found some time this weekend to devote to finishing up the circular references implementation that we have all been excited about.

For background info see:  http://code.google.com/p/jabsorb/issues/detail?id=6

I have come to the conclusion that the "fixups" phase that is central to making circular references work needs to be done slightly differently than was originally proposed.

While seemingly elegant and cute, the "eval" that takes place to apply fixups has a few drawbacks:

1. It's not standard JSON itself; it's a String of executable JavaScript. This was one part of the scheme that made it elegant in the first place, but it's only elegant on the JS side. It means that an extra parser will be required on the Java side. Admittedly it's a pretty simple parser, because it's just parsing assignment expressions, but this in itself would be unnecessarily complex and might make it harder to adopt this in other JSON-RPC implementations (we want to try to submit this as a feature in the JSON-RPC protocol itself).

2. It might be seen by some as insecure, in that the eval can run any old executable code (eval is evil), even though we're just using it to run a series of assignment statements to fix up the circular references. In other words, it's a little more powerful than we need. Some might say, "who cares, we parse the JSON this way anyway," but there are JS parsers that can read in the JSON without evaling it, for use in security-minded situations, and we might want to switch to one at some point to alleviate those concerns. So why add an extra eval for running the assignment fixups if there is a simple alternative way to do it?
   
I came up with a way to represent these fixups as just arrays of arrays of references, rather than assignment statements. Instead of using eval, the fixups can then be applied with some simple looping operations, which don't have the drawback of running any executable code (they can only apply the fixups to the object graph, so it's more secure).

It has the nice side benefit of representing the data more compactly, with less duplication, because the root object reference doesn't need to be supplied over and over. The double escaping that takes place in the first scheme is also avoided, which makes the data stream smaller still.

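The core of the change is to represent each location in the object graph as an array of references (property names or array indexes) giving the path from the root to that object. A fixup is then a pair of such paths: the path to the object being fixed up, followed by the path to the original object that it should point to. An empty path refers to the root object itself.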

It's perhaps easier to describe this with an example:

eval way:

  "fixups": "r[\"beanB\"][\"beanA\"]=r;"

new way:

  "fixups": [[
    [
      "beanB",
      "beanA"
    ],
    []
  ]],

Notice that the array method carries exactly the same information; it's just transferred as arrays, and it doesn't have to redundantly store "r" or the extra escaped quote marks, as the eval way does. The second one looks bigger only because it's pretty printed, which is another nice side benefit for debugging situations (in the first case it's just one big string, which can't be pretty printed).
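To make the "simple looping operations" concrete, here is a minimal sketch of how the array-style fixups could be applied on the JS side. The function names (resolvePath, applyFixups) and the sample object are made up for illustration and are not taken from the jabsorb source; the only thing assumed from the proposal is that each fixup is a [path to fix, path to original] pair and that an empty path means the root.

```javascript
// Hypothetical helper: follow a path of property names / array indexes
// down from the root object. An empty path returns the root itself.
function resolvePath(root, path) {
  let node = root;
  for (const key of path) {
    node = node[key];
  }
  return node;
}

// Apply each [targetPath, originalPath] pair with plain assignments.
// No eval is involved, so nothing but the object graph can be touched.
function applyFixups(root, fixups) {
  for (const [targetPath, originalPath] of fixups) {
    if (targetPath.length === 0) {
      throw new Error("a fixup cannot replace the root object");
    }
    const parent = resolvePath(root, targetPath.slice(0, -1));
    const lastKey = targetPath[targetPath.length - 1];
    parent[lastKey] = resolvePath(root, originalPath);
  }
  return root;
}

// The example from this message: beanB.beanA should point back at the root.
const received = { beanB: { name: "b" } };
applyFixups(received, [[["beanB", "beanA"], []]]);
```

After the call, received.beanB.beanA is the same object as received, which is what the eval of r["beanB"]["beanA"]=r would have produced.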

It's debatable whether it's more readable or not.  I'd argue that both schemes are pretty unreadable when you get into examples with large amounts of fixups.

I have gotten all this circular reference stuff working for 3 out of the 4 serializing cases (both sending and receiving on the JS side, and marshalling on the Java side).
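For the sending direction, the fixup pairs could be collected during an ordinary depth-first walk of the object graph. The following is only a sketch under my own assumptions, not the code that is in the library: marshallWithFixups, the OMITTED marker and the returned { tree, fixups } shape are hypothetical names, and it treats every repeated reference the same way, whether or not it actually forms a cycle.

```javascript
// Hypothetical marker for "this slot will be restored by a fixup".
const OMITTED = Symbol("omitted");

// Copy an object graph into a plain tree, recording a fixup pair
// [path to fix, path to original] for every object seen a second time.
function marshallWithFixups(root) {
  const firstSeenAt = new Map(); // object -> path where it was first met
  const fixups = [];

  function walk(value, path) {
    if (value === null || typeof value !== "object") {
      return value; // primitives pass straight through
    }
    if (firstSeenAt.has(value)) {
      fixups.push([path, firstSeenAt.get(value)]);
      return OMITTED;
    }
    firstSeenAt.set(value, path);
    const isArray = Array.isArray(value);
    const copy = isArray ? [] : {};
    for (const key of Object.keys(value)) {
      const ref = isArray ? Number(key) : key; // int for arrays, String otherwise
      const child = walk(value[key], path.concat([ref]));
      if (child !== OMITTED) {
        copy[ref] = child;
      } else if (isArray) {
        copy[ref] = null; // keep array positions stable
      }
    }
    return copy;
  }

  return { tree: walk(root, []), fixups: fixups };
}

// beanA and beanB refer to each other, as in the example earlier.
const beanA = { name: "a" };
const beanB = { name: "b", beanA: beanA };
beanA.beanB = beanB;
const marshalled = marshallWithFixups(beanA);
```

Here marshalled.tree can be passed to JSON.stringify safely, and marshalled.fixups holds the single pair [["beanB", "beanA"], []].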

The only part left to finish is the unmarshalling in Java. It is turning out to be the most challenging part to get working, and I think it will require an extra method in the Serializer interface (a drawback, but I think a necessary one because of the Java/JavaScript impedance mismatch).

Here's how the algorithm will work:

1. Store all objects in SerializerState on the first pass (as is now done in the marshall case for circular references), but don't generate fixups like the marshall case does.

2. Also store, as part of each ProcessedObject, the instance of the Serializer that was used to process that object. Multiple Serializers can potentially serialize the same types of objects, so in the fixup phase we need to know which serializer was used for each object, especially with custom serializers.
   
3. Add an extra method to the Serializer interface, called "fixup", that knows how to apply a fixup in the fixup phase. Essentially, it knows how to attach a child element to the type of element produced by the serializer, given the child element reference (which can be either an int or a String) and the Object to attach as a child at that reference. The fixup can throw an UnmarshallException if an attempt is made to fix up something that shouldn't be fixed up (for example, adding a String-based key to an array-based object).
   
4. Now you can execute the unmarshall fixup phase, which has the ability to "walk" the processed objects stored in the SerializerState in step 1 in order to apply the fixups.
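The type check described in step 3 might look roughly like the sketch below. It is written in JavaScript purely to illustrate the idea (the real thing would be a Java method on each Serializer), and attachChild and UnmarshallError are hypothetical stand-ins, not existing jabsorb names.

```javascript
// Hypothetical stand-in for the UnmarshallException mentioned in step 3.
class UnmarshallError extends Error {}

// Attach a child to a parent, given a reference that is either an int
// (for array-like parents) or a String (for bean/map-like parents).
function attachChild(parent, ref, child) {
  if (Array.isArray(parent)) {
    if (!Number.isInteger(ref)) {
      throw new UnmarshallError("array-based object needs an int reference");
    }
    parent[ref] = child;
  } else if (parent !== null && typeof parent === "object") {
    if (typeof ref !== "string") {
      throw new UnmarshallError("keyed object needs a String reference");
    }
    parent[ref] = child;
  } else {
    throw new UnmarshallError("cannot attach a child to a non-container value");
  }
}

// Usage: a String key on a bean is fine, a String key on an array is rejected.
const bean = {};
attachChild(bean, "beanA", bean);
```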
   

I hope to have all of the above working very soon, and a test release that is the current 1.1.1 release plus just circular references (1.2.circtest1) so people can begin to play with it and see how well it all works in practice.

A

Arthur Blake

Oct 3, 2007, 9:22:31 PM
to jabsor...@googlegroups.com
Just another update: luckily, William had a lot to say about this to me offline, and convinced me that a new method in the Serializer interface really isn't necessary :) (see 3. in my previous message.)  I just had to fix up the existing serializers a bit to account for the base object before recursing on down the chain, so it would have a way to stop the recursion...  It really was pretty simple when all was said and done.

I now have circular references completely working and passing all my new circular reference unit tests!

I'm just cleaning up the code a bit before I check it in to the trunk.

Then I'll make a test build for people to try out--

I'm just curious how many people want this / are waiting for it .. can you respond to this message?  I especially need people to help me test it out well and wring out the bugs...

Please respond-- if you need/want or can't wait for circular references!

thanks
A

arthur...@gmail.com

Oct 5, 2007, 5:38:03 PM
to jabsorb-user
Yet another update! Circular References are completely working and
checked in to the trunk.
William is reviewing my code and kicking the tires a little bit before
we make a test release for people to play with.
I encourage anyone else who is interested to check out the trunk and
review my code....
