Re: [json] Cycles

Kris Zyp

Oct 28, 2008, 12:34:01 AM
to js...@yahoogroups.com, restfu...@googlegroups.com
This is great to see; obviously I am pleased to see further adoption of reference-capable JSON. I did want to make a few comments:

While I had originally specified "$" as the reference to the root of the current JSON document, following the lead of JSONPath, I have since begun to think that "#" is a better choice. The reason is that full JSON referencing (formerly known as JSPON; JSPON was an earlier attempt at a full CRUD protocol, but it is now essentially superseded by HTTP + JSON + JSON referencing) is intended to be exactly analogous to the semantics of hyperlinks, with resolution following the rules of relative URIs, essentially allowing JSON to act as RESTful hypermedia when JSON referencing is applied.

Path-based referencing is intended to define reference targets based upon position in the current JSON document, and it integrates well with relative URLs, forming a subset that is purely internal. Path-based referencing is the subset of JSON referencing that Crockford has implemented in cycle.js. However, when path-based referencing is considered as a proper subset of full relative-URI-based hyperlinking, it is actually more semantically correct to prefix a path-based reference with "#". The "#" symbol is specifically designated for referencing parts of the currently loaded document. Just as we use the hash part of URLs in HTML to navigate to different parts of a page, we can use it to resolve/dereference different parts of the current JSON document/object graph.

By using the "#" symbol to refer to the root of the JSON document, path-based referencing can act as an exact subset of relative URI referencing, where the references implicitly indicate that their targets are internal. Referencing implementations like cycle.js can act on this exact subset without any further relative URI knowledge, while implementations that are integrated with URI resolving and retrieval capabilities can act on the full set of JSON references without incompatibility.
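To make the subset relationship concrete, here is a minimal sketch of how an implementation might tell internal references from external ones. It assumes only the ordinary URI convention that "#" introduces a fragment; the function name `classifyRef` and the sample reference values are hypothetical and are not part of cycle.js or Dojo.

```javascript
// Hypothetical helper: split a $ref value into its document part and its
// fragment part, using the URI convention that "#" starts the fragment.
function classifyRef(ref) {
    var hash = ref.indexOf("#");
    var documentPart = hash === -1 ? ref : ref.slice(0, hash);
    var fragment = hash === -1 ? null : ref.slice(hash + 1);
    return {
        // An empty document part before "#" means "the current document".
        internal: hash !== -1 && documentPart === "",
        document: documentPart,
        fragment: fragment
    };
}

// A path-only implementation would act on the first kind and leave the
// others to an implementation that can resolve and retrieve URIs.
console.log(classifyRef("#foo"));
console.log(classifyRef("http://example.com/some-doc#foo"));
console.log(classifyRef("/Person/1"));
```

With a "$" root there is no such syntactic test: a path-only implementation and a URI-aware one would have to agree on a separate rule for recognizing internal targets.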

For further information on JSON referencing (with path and id-based referencing), I originally described it here: http://www.json.com/2007/10/19/json-referencing-proposal-and-library/, with some more examples here: http://www.sitepen.com/blog/2008/06/17/json-referencing-in-dojo/. This is all with the original "$" notation for the root. The impetus for using "#" came from recent discussions with the RESTful JSON group (http://groups.google.com/group/restful-json, and cc'ing them on this email), and my resultant proposal for RESTful JSON interaction (http://www.json.com/specifications/json-resources/). I haven't updated these blog posts or http://jspon.org with the "#" because these discussions were quite recent, and actually I would love to get feedback from this group on the best approach. Right now Dojo's implementation of JSON referencing uses the "#" notation. However, I would certainly be willing to change it per the consensus of the community.

Also, I am certainly not asking for a change in names, but I have been hesitant to use the term cycle/circular only because I think multiple referencing is every bit as valuable as circular referencing. Multiple referencing is covered by cycle.js as well; cycles are a subset of the reference-dependent data structures that are enabled by cycle.js. For example:

var b = {foo: "bar"};
var c = {d: b, e: b};
JSON.stringify(decycle(c)) ->
'{"d":{"foo":"bar"},"e":{"$ref":"$[\"d\"]"}}'
And
var newC = retrocycle(decycle(c));
newC.d === newC.e // -> true (identity is preserved)

Just didn't want you to sell yourself short on the capabilities of this library; it does more than its name suggests.
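For readers who want to see how multiple references and cycles fall out of the same mechanism, here is a simplified sketch of the technique. It is not the real cycle.js: the names `decycleSketch` and `retrocycleSketch` are hypothetical, only plain objects and arrays are handled, and any object with a string "$ref" member is assumed to be a reference.

```javascript
// Deep-copy a value, replacing every recurrence of an already-seen object
// with {"$ref": path}, where path looks like $["key"] or $[0].
function decycleSketch(value) {
    var seen = [];   // objects already visited
    var paths = [];  // path at which each object was first seen

    function walk(node, path) {
        if (typeof node !== "object" || node === null) {
            return node; // primitives are copied as-is
        }
        var i = seen.indexOf(node);
        if (i !== -1) {
            return {$ref: paths[i]}; // recurrence: emit a reference
        }
        seen.push(node);
        paths.push(path);
        var isArray = Array.isArray(node);
        var copy = isArray ? [] : {};
        Object.keys(node).forEach(function (key) {
            var step = isArray ? key : JSON.stringify(key);
            copy[key] = walk(node[key], path + "[" + step + "]");
        });
        return copy;
    }
    return walk(value, "$");
}

// Modify a parsed structure in place, replacing {"$ref": path} members
// with the object found at that path.
function retrocycleSketch(root) {
    // Matches one path step: [0] or ["key"]
    var step = /\[(\d+|"(?:[^"\\]|\\.)*")\]/g;

    function resolve(path) {
        var target = root;
        var m;
        step.lastIndex = 0;
        while ((m = step.exec(path)) !== null) {
            target = target[JSON.parse(m[1])];
        }
        return target;
    }

    function walk(node) {
        Object.keys(node).forEach(function (key) {
            var child = node[key];
            if (typeof child !== "object" || child === null) {
                return;
            }
            if (typeof child.$ref === "string") {
                node[key] = resolve(child.$ref);
            } else {
                walk(child);
            }
        });
    }
    if (typeof root === "object" && root !== null) {
        walk(root);
    }
    return root;
}

// Multiple references: b is shared, not circular.
var b = {foo: "bar"};
var c = {d: b, e: b};
var text = JSON.stringify(decycleSketch(c));
var newC = retrocycleSketch(JSON.parse(text));
console.log(text);
console.log(newC.d === newC.e); // true

// A genuine cycle goes through exactly the same code path.
var loop = [];
loop[0] = loop;
var newLoop = retrocycleSketch(JSON.parse(JSON.stringify(decycleSketch(loop))));
console.log(newLoop[0] === newLoop); // true
```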

Anyway, once again, I think this library looks great. It is exciting to see the growing potential for ubiquitous interchange of rich data structures with a common technique for referencing implemented by this library, Dojo, Persevere, and hopefully others.
Thanks,
Kris

Douglas Crockford wrote:

JSON is not able to directly represent cyclical structures. However,
with some simple transformation, it can. This is demonstrated by two
JavaScript functions, decycle(value) and retrocycle(value).

decycle produces a deep copy of a value, except that recurrences are
replaced with JSPON notations. retrocycle modifies an object by
replacing JSPON notations, restoring cycles. The result of decycle can
be given to an encoder (such as JSON.stringify). The result of a JSON
decoder (such as JSON.parse) can be given to retrocycle.

These functions use a subset of Kris Zyp's JSPON, which uses a subset
of Stefan Goessner's JSONPath.

See http://www.JSON.org/cycle.js

Manuel Simoni

Oct 29, 2008, 1:30:56 PM
to restfu...@googlegroups.com
On Tue, Oct 28, 2008 at 5:34 AM, Kris Zyp <kri...@gmail.com> wrote:
> This is great to see, obviously I am pleased to see further adoption of
> reference-capable JSON.

Just a quick thought --

Reading your posts made me realize how great a JSON hyperlinking
solution would be.

(I haven't read all of your past proposals, so excuse me if these
points have already been discussed.)

I see two requirements for a solution, analogous to HTML hyperlinking:

1) Standard ways for putting links into a document.
Compare HTML's <a href> and <link href>.
These are needed so that an agent can discover all outgoing links
in any document, irrespective of its content.

2) Standard way for targeting and "marking up" subelements within a document.
Compare HTML's # combined with <a name> and <a id>

So, by simply copying HTML's prior art, a *complete strawman* proposal
could look like this:

1. Everywhere a $ref member occurs its value must be a URL.
{ "$ref": "http://example.com/some-doc#foo" }

2. Everywhere a $id member occurs its value must be an ID token.
{ "$id": "foo" }
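As a rough illustration of how little processing the two rules demand, here is a minimal sketch that handles only the same-document case. The function names `indexIds` and `linkLocalRefs` and the sample document are hypothetical; references that point at other documents are left untouched for a URL-aware agent to handle.

```javascript
// Rule 2: collect every object that carries a "$id" token.
function indexIds(node, index) {
    if (typeof node !== "object" || node === null) {
        return index;
    }
    if (typeof node.$id === "string") {
        index[node.$id] = node;
    }
    Object.keys(node).forEach(function (key) {
        indexIds(node[key], index);
    });
    return index;
}

// Rule 1, restricted to "#token" references into the current document.
function linkLocalRefs(node, index) {
    Object.keys(node).forEach(function (key) {
        var child = node[key];
        if (typeof child !== "object" || child === null) {
            return;
        }
        var ref = child.$ref;
        if (typeof ref === "string" && ref.charAt(0) === "#" &&
                Object.prototype.hasOwnProperty.call(index, ref.slice(1))) {
            node[key] = index[ref.slice(1)];
        } else {
            linkLocalRefs(child, index);
        }
    });
}

var doc = JSON.parse(
    '{"author":{"$id":"foo","name":"Ann"},' +
    '"editor":{"$ref":"#foo"},' +
    '"publisher":{"$ref":"http://example.com/some-doc#foo"}}'
);
linkLocalRefs(doc, indexIds(doc, {}));
console.log(doc.editor === doc.author); // true
console.log(doc.publisher.$ref);        // external link is left as-is
```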

I think demanding more complex processing from agents (e.g. a path
expression language) would strongly hinder the adoption of any JSON
hyperlinking proposal.

Thanks,
Manuel

Kris Zyp

Oct 29, 2008, 1:53:53 PM
to restfu...@googlegroups.com

> Just a quick thought --
>
> Reading your posts made me realize how great a JSON hyperlinking
> solution would be.
>
> (I haven't read all of your past proposals, so excuse me if these
> points have already been discussed.)
>
> I see two requirements for a solution, analogous to HTML hyperlinking:
>
> 1) Standard ways for putting links into a document.
> Compare HTML's <a href> and <link href>.
> These are needed so that an agent can discover all outgoing links
> in any document, irrespective of its content.
>
> 2) Standard way for targetting and "marking up" subelements within a document.
> Compare HTML's # combined with <a name> and <a id>
>
Absolutely, that is exactly the intent of JSON referencing.

> So, by simply copying HTML's prior art, a *complete strawman* proposal
> could look like this:
>
> 1. Everywhere a $ref member occurs it's value must be a URL.
> { "$ref": "http://example.com/some-doc#foo" }
>
>
Yes, that is exactly how JSON referencing works.

> 2. Everywhere a $id member occurs its value must be a ID token.
> { "$id": "foo" }
>
That is the general idea. However, while I have also thought it would be
great to have one standard identity property (I had proposed "id", just
a slightly different spelling than your "$id"), in order to facilitate
adoption and utilize existing JSON data with minimal modification, I
think it is advantageous to support a user-defined identity attribute. In
the JSON resources proposal I had suggested a content type parameter. One
could also have an envelope that defines the identity property (it seems
this group would lean towards that approach, not sure about other JSON
folks). Also, I think it is beneficial to define the identity as a real
URI location (same resolution rules as $ref) rather than a fragment
target, so that sub-objects can be treated as their own resources, which
is important for efficient updating. Therefore it would be more
analogous to the HTML <base> tag than the <a name> tag.
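Here is a minimal sketch of what a user-defined identity property resolved as a URI could look like. It is an illustration under stated assumptions, not the Dojo or Persevere implementation: the function name `indexByIdentity`, the base URL, and the sample data are hypothetical, and it relies on the standard `URL` constructor found in current browsers and Node.js.

```javascript
// Index every object that has the caller-chosen identity property,
// keyed by that identity resolved against a base URL.
function indexByIdentity(node, idProperty, baseUrl, index) {
    if (typeof node !== "object" || node === null) {
        return index;
    }
    var id = node[idProperty];
    if (typeof id === "string" || typeof id === "number") {
        // Same relative-URI resolution that a $ref value would get.
        index[new URL(String(id), baseUrl).href] = node;
    }
    Object.keys(node).forEach(function (key) {
        indexByIdentity(node[key], idProperty, baseUrl, index);
    });
    return index;
}

// Existing data that already uses "id"; no extra properties are added.
var person = {id: "1", name: "Ann", spouse: {id: "2", name: "Bob"}};
var byUri = indexByIdentity(person, "id", "http://example.com/Person/", {});

console.log(Object.keys(byUri));
console.log(byUri["http://example.com/Person/2"] === person.spouse); // true
```

Because each sub-object ends up with its own absolute URI, an update to the spouse object could be addressed to that URI directly rather than to the enclosing document.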

> I think demanding more complex processing from agents (e.g. a path
> expression language) would strongly hinder the adoption of any JSON
> hyperlinking proposal.
>

That's a reasonable concern, but I believe path-based
linking/referencing is important because it is critical for describing
arbitrary object topologies (like cycles) and can be less intrusive. For
example, if you had an array with a circular reference, this would be
impossible to describe with id-based referencing, since an array can't
define an identity in JSON:
{"myArrayWithACycle":[{"$ref":"#myArrayWithACycle"}]}
Also, in my experience, a lot of people have expressed a preference
for path-based referencing so that they don't have to assign ids to all
objects (and sub-objects) that might be targets of references. Also, one
could actually use path-based referencing to refer to primitive values
(like strings), which could have space-saving implications. Finally, the
implementation I have done in JavaScript (available in Dojo in
dojox.json.ref) probably only has a few hundred bytes of extra code to
handle paths, so it really isn't that painful.
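To give a sense of the size of such path handling, here is a minimal sketch. It is not the dojox.json.ref code: the function name `resolvePathRefs` is hypothetical, and the path syntax is assumed to be "#" followed by dot-separated property names, which is enough for the example above.

```javascript
// Replace {"$ref": "#a.b"} members in place with the value found by
// walking the named properties from the root ("#" alone is the root).
function resolvePathRefs(root) {
    function lookup(ref) {
        var path = ref.slice(1);
        var target = root;
        if (path !== "") {
            path.split(".").forEach(function (name) {
                target = target[name];
            });
        }
        return target;
    }

    function walk(node) {
        Object.keys(node).forEach(function (key) {
            var child = node[key];
            if (typeof child !== "object" || child === null) {
                return;
            }
            if (typeof child.$ref === "string" && child.$ref.charAt(0) === "#") {
                node[key] = lookup(child.$ref);
            } else {
                walk(child);
            }
        });
    }

    walk(root);
    return root;
}

// The array cycle: the target is an array, which cannot carry an id.
var cyc = resolvePathRefs(
    JSON.parse('{"myArrayWithACycle":[{"$ref":"#myArrayWithACycle"}]}')
);
console.log(cyc.myArrayWithACycle[0] === cyc.myArrayWithACycle); // true

// A reference to a primitive value.
var prim = resolvePathRefs(
    JSON.parse('{"motto":"a fairly long string","copy":{"$ref":"#motto"}}')
);
console.log(prim.copy); // "a fairly long string"
```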


Thanks,
Kris

Manuel Simoni

Oct 30, 2008, 9:52:46 AM
to restfu...@googlegroups.com
On Wed, Oct 29, 2008 at 6:53 PM, Kris Zyp <kri...@gmail.com> wrote:
>> So, by simply copying HTML's prior art, a *complete strawman* proposal
>> could look like this:
>>
>> 1. Everywhere a $ref member occurs it's value must be a URL.
>> { "$ref": "http://example.com/some-doc#foo" }
>>
>>
> Yes, that is exactly how JSON referencing works.
>> 2. Everywhere a $id member occurs its value must be a ID token.
>> { "$id": "foo" }
>>
> That is the general idea. However, while I have also thought it would be
> great to have one standard identity property (I had proposed "id", just
> a little different spelling than your "$id"), but in order to facilitate
> adoption and utilize existing JSON data with minimal modification, I
> think it advantageous to support user-defined identity attribute.

What if the "IDs" used for targeting elements were decoupled from the
user-defined IDs for "business objects"? IOW, people would add "$id"
(for making nested objects targetable) in addition to whatever
identifying scheme they were already employing (including none).

I think this would avoid a whole can of worms. (For example, imagine
the complexity if HTML's <a name> anchors had to work with arbitrary
attributes besides @name.)

> [...]


> Also, I think it is beneficial to define the identity as a real
> URI location (same resolution rules as $ref) rather than a fragment
> target, so that sub-objects can be treated as their own resources, which
> is important for efficient updating. Therefore it would be more
> analogous to the HTML<base> tage than <a name> tag.

As I said before, I believe that for a JSON referencing effort to be
successful, it has to be of utmost simplicity and follow HTML's
footsteps. I am not entirely sure of this, but for now it's my working
assumption.

What you are trying to achieve with clearly identifiable subresources
I see as a separate, orthogonal effort to a referencing solution. Once
a referencing solution were in place (and subelements of a JSON
structure could be targeted), only then can one think further about
clearly identifiable subresources, which are very much a separate
topic from "mere" referencing in my opinion. (They are more of an
"object model"/media type thing, imho).

IOW, I don't think there's a case for deviating from HTML's prior art
and requiring "<a name>" = "$id" anchors be full URIs. The <a name>
and $id IDs should be mere identifiers used for selecting a
subresource *included in another resource's representation* (exactly
the same as, say, a list of microformatted hcard entries inside an HTML
web page). The $ids only need to make sense in the context of the
larger document they are embedded in.

>> I think demanding more complex processing from agents (e.g. a path
>> expression language) would strongly hinder the adoption of any JSON
>> hyperlinking proposal.
>>
> That's a reasonable concern, but I believe path-based
> linking/referencing is important because it is critical for describing
> arbitrary object topologies (like cycles) and can be less intrusive. For
> example, if you had an array with a circular reference, this would be
> impossible to describe with id-based referencing since an array can't
> define an identity in JSON:
> {"myArrayWithACycle":[{"$ref":"#myArrayWithACycle"}]}
> This can not be described with id-based referencing.

Right, but serializing object graphs is an issue that again I see as
orthogonal to Web-scale hyperlinking. Sure, being able to serialize
arbitrary object graphs is a nice and much needed facility, but it is
much less widely applicable than general hyperlinking, and again I
think adding it to a generic JSON referencing proposal would hinder
its adoption (e.g. many people have never thought before about
serializing object graphs with cycles.)

> Also from my
> experience, there are a lot of people that have expressed a preference
> to path-based referencing so that don't have to assign ids to all
> objects (and sub-objects) that might be targets of references. Also, one
> could actually use path-based referencing to refer to primitive values
> (like strings), which could have space-saving implications. Finally, the
> implementation I have done in JavaScript (available in Dojo in
> dojox.json.ref) probably only has a few hundred bytes of extra code to
> handle paths, so it really isn't that painful.

Of course it would be nice if we could select HTML page subelements
with XPath expressions after #. But I speculate that putting that into
the HTML spec would have led to trouble:

Even if the library for evaluating such expressions is small, it still
has to be implemented by *every* participating agent, and without such
a library, the references are essentially *undecipherable*. This means
that if I want to whip up a spider for downloading a web of JSON
documents in programming language X, which doesn't have such a
library, I will first need to spend time on the library.

Compare this to an HTML-like hyperlinking solution, without path
expressions, where you only need to implement a URL resolution
mechanism (with some kind of "<base>" support), which you need anyway.
As for having to assign $id to all elements that should be targetable,
I think that's a price that we simply may have to pay for making them
addressable.
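The spider case can be sketched in a few lines when only URL resolution is required. This is an illustration, not part of any existing proposal: the function name `collectLinks`, the base URL, and the sample document are hypothetical, and the standard `URL` constructor from current browsers and Node.js stands in for the resolution mechanism.

```javascript
// Walk any JSON value and return every outgoing $ref as an absolute URL,
// without needing to understand the rest of the document.
function collectLinks(node, baseUrl, links) {
    if (typeof node !== "object" || node === null) {
        return links;
    }
    if (typeof node.$ref === "string") {
        links.push(new URL(node.$ref, baseUrl).href);
    }
    Object.keys(node).forEach(function (key) {
        collectLinks(node[key], baseUrl, links);
    });
    return links;
}

var page = {
    a: {$ref: "http://example.com/some-doc#foo"},
    b: [{$ref: "other-doc"}],
    c: {$ref: "#foo"}
};
var links = collectLinks(page, "http://example.com/docs/index", []);
console.log(links);
// [ "http://example.com/some-doc#foo",
//   "http://example.com/docs/other-doc",
//   "http://example.com/docs/index#foo" ]
```

The spider can fetch the document part of each link without interpreting the fragment at all; only an agent that wants the specific subelement needs to understand what follows the "#".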

(It's obvious that we are looking from different angles at the
problem, and I hope I am not boring you with my repeated references to
HTML. Anyways, thanks for the interesting discussion!)

Best regards,
Manuel

Kris Zyp

Oct 30, 2008, 11:39:12 AM
to restfu...@googlegroups.com, js...@yahoogroups.com

(putting js...@yahoogroups.com back on the cc in case anyone is
interested over there)

Manuel Simoni wrote:
> On Wed, Oct 29, 2008 at 6:53 PM, Kris Zyp <kri...@gmail.com> wrote:
>
>>> So, by simply copying HTML's prior art, a *complete strawman* proposal
>>> could look like this:
>>>
>>> 1. Everywhere a $ref member occurs it's value must be a URL.
>>> { "$ref": "http://example.com/some-doc#foo" }
>>>
>>>
>>>
>> Yes, that is exactly how JSON referencing works.
>>
>>> 2. Everywhere a $id member occurs its value must be a ID token.
>>> { "$id": "foo" }
>>>
>>>
>> That is the general idea. However, while I have also thought it would be
>> great to have one standard identity property (I had proposed "id", just
>> a little different spelling than your "$id"), but in order to facilitate
>> adoption and utilize existing JSON data with minimal modification, I
>> think it advantageous to support user-defined identity attribute.
>>
>
> What if the "IDs" used for targetting elements were decoupled from the
> user-defined IDs for "business objects"? IOW, people would add "$id"
> (for making nested objects targettable) in addition to whatever
> identifying scheme they were already employing (including none).
>
> I think this would avoid a whole can of worms. (For example, imagine
> the complexity if HTML's <a name> anchors had to work with arbitrary
> attributes besides @name.)
>
My point is that if a user already has a scheme for identifying
sub-objects, why should we force them to add yet more properties to their
JSON to make it referenceable if it is already referenceable? If a user
already has URI-referenceable subobjects, we can leverage that by simply
understanding the mapping (by knowing the identity property), rather
than creating new targeting schemes. I want to minimize the complexity
that we foist on users. Adding properties is costly as well, encroaching
on the domain of available properties for user purposes.


>> [...]
>> Also, I think it is beneficial to define the identity as a real
>> URI location (same resolution rules as $ref) rather than a fragment
>> target, so that sub-objects can be treated as their own resources, which
>> is important for efficient updating. Therefore it would be more
>> analogous to the HTML<base> tage than <a name> tag.
>>
>
> As I said before, I believe that for a JSON referencing effort to be
> successful, it has to be of utmost simplicity and follow HTML's
> footsteps. I am not entirely sure of this, but for now it's my working
> assumption.
>
>

Of course I agree, with the caveat that HTML is structurally different
than JSON. I certainly want to follow HTML hyperlinking's steps as far
as they make sense in JSON, but not where JSON lends itself to a
different approach.


> What you are trying to achieve with clearly identifyable subresources
> I see as a separate, orthogonal effort to a referencing solution. Once
> a referencing solution were in place (and subelements of a JSON
> structure could be targetted), only then can one think further about
> clearly identifyable subresources, which are very much a separate
> topic from "mere" referencing in my opinion. (They are more of an
> "object model"/media type thing, imho).
>

I certainly want to separate orthogonal concerns, but understanding how
to identify targets so they can be referenced is critical to
referencing. Not thinking about the future is definitely a poor way to
design systems, so we need to think about how to keep consistent designs
for the future so as to minimize complexity for users.


> IOW, I don't think there's a case for deviating from HTML's prior art
> and requiring "<a name>" = "$id" anchors be full URIs. The <a name>
> and $id IDs should be mere identifiers used for selecting a
> subresource *included in another resource's representation* (exactly
> the same as say, a list of microformatted hcard entries inside a HTML
> web page). The $ids only need to make sense in the context of the
> larger document they are embedded in.
>
>

Yes, I understand, this is certainly a reasonable idea, but once again
it does nothing to address the issues I mentioned. How do you reference
an array or a primitive? This is where the HTML-JSON analogy falls apart.
Also, JSON users don't like mixing extra properties into their JSON
data, whereas this doesn't tend to be as much of an issue with HTML
because it is actually written into the format.

>>> I think demanding more complex processing from agents (e.g. a path
>>> expression language) would strongly hinder the adoption of any JSON
>>> hyperlinking proposal.
>>>
>>>
>> That's a reasonable concern, but I believe path-based
>> linking/referencing is important because it is critical for describing
>> arbitrary object topologies (like cycles) and can be less intrusive. For
>> example, if you had an array with a circular reference, this would be
>> impossible to describe with id-based referencing since an array can't
>> define an identity in JSON:
>> {"myArrayWithACycle":[{"$ref":"#myArrayWithACycle"}]}
>> This can not be described with id-based referencing.
>>
>
> Right, but serializing object graphs is an issue that again I see
> orthogonal to Web-scale hyperlinking. Sure, being able to serialize
> arbitrary object graphs is a nice and much needed facility, but it is
> much less widely applicable than general hyperlinking, and again I
> think adding it to a generic JSON referencing proposal would hinder
> its adoption (e.g. many people have never thought before about
> serializing object graphs with cycles.)
>

They are both fundamentally about linking to other entities, whether
they be internal or external. Hyperlinks in HTML do not differentiate
between internal and external links; you actually lose a lot if you
don't keep these consistent, because then you can't link to
fragments/subobjects inside other resources (URI with fragments).
Inconsistent design won't help users here.
I am also curious how you quantified that internal references (like
cycles) are less important than hyperlinking. In my experience working
with the JSON community, circular references have received more
attention and effort. IMO, they are both important.

I certainly appreciate the HTML angle; the application of hyperlinking
to JSON is exactly what I am after as well, and HTML provides valuable
lessons. I just want to make sure we apply it in a way that matches the
structure of JSON (fundamentally much different than HTML/XML). Thank
you too for the great discussion and suggestions!

Thanks,
Kris
