The use of microsyntaxes in JSON-LD


Richard Cyganiak

Oct 18, 2010, 3:11:34 PM
to jso...@googlegroups.com
We all have our personal crystal balls that we use to make guesses
about the future. Everyone's guesses are about as good or as bad as
anyone else's. Nevertheless, I gazed into mine for a while and would
like to share what I saw.


In my opinion, JSON-LD's biggest problem is its reliance on
microsyntaxes.

"<...>" has a special meaning.
"...@..." has a special meaning.
"...^..." has a special meaning.
"_:..." has a special meaning.

This means that mixing some JSON-LD into an existing Javascript app or
JSON API will most likely break stuff, because simple strings suddenly
mean different things.

I think that's a bad idea. A simple string should be a simple string,
even if it happens to contain a character like "@" or "^".

Instead of using microsyntaxes for nodes that are not simple strings,
JSON-LD should either a) use coercion, or b) turn the simple string
into a simple object, e.g.,

{"@":"..."}
{"value":"...", "lang":"..."}
{"value":"...", "type":"..."}
{"_":"..."}

or c) allow both of the above. Type coercion is the neater option in
most cases IMO.
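
To make the concern concrete, here is a minimal sketch of how ordinary application strings could be caught by microsyntax detection. The function name classifyMicrosyntax and its regular expressions are assumptions made for illustration; they only approximate the four forms listed above and are not taken from the JSON-LD draft.

```javascript
// Hypothetical classifier: the patterns below are rough stand-ins for the
// four microsyntaxes named in this message, not the draft's actual rules.
function classifyMicrosyntax(s) {
  if (/^<.*>$/.test(s)) return "iri";           // "<...>"
  if (/^_:/.test(s)) return "bnode";            // "_:..."
  if (/\^/.test(s)) return "typed-literal";     // "...^..."
  if (/@/.test(s)) return "lang-literal";       // "...@..."
  return "plain";
}

// Strings that an existing JSON API might already contain.
const samples = ["hello", "bob@example.com", "2^10 = 1024", "<b>bold</b>"];
for (const s of samples) {
  console.log(JSON.stringify(s), "->", classifyMicrosyntax(s));
}
```

Under these assumed patterns only the first sample is still treated as a simple string.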

Removing the microsyntaxes would, in my subjective view, move JSON-LD
from “doomed to obscurity” territory over to “could actually grow into
something awesome”.

Or putting it another way: I believe that there's only a market for
one “linked data in JSON” syntax really. Initially there might be a
number of proposals, but I'm 100% convinced that someone will discover
the right way of doing this eventually, and will pretty much take all
the mindshare. I'm also 100% convinced that the winning approach will
not do any of that silly microsyntax business. That's because the
microsyntax thing will immediately turn off the average JSON
developer, and he or she is the most important person for the success
of this proposal.

All the best,
Richard

Bradley P. Allen

Oct 18, 2010, 3:37:43 PM
to json-ld
Richard- I'm finding this resonating with me, as I am having to create
almost exactly this kind of object from triple object strings in
json_ld_processor to support being able to reserialize triples into
other formats, per Manu's earlier suggestion. It would be simpler if
the JSON-LD just gave it to me, and as you say conceptually no harder
for the average JS coder. I am sure Manu and Mark would object on the
basis of brevity, but perhaps with simple strings as defaults one
wouldn't lose much on average. - BPA

Manu Sporny

Oct 18, 2010, 11:13:32 PM
to jso...@googlegroups.com
On 10/18/2010 03:11 PM, Richard Cyganiak wrote:
> In my opinion, JSON-LD's biggest problem is its reliance on microsyntaxes.

If that's JSON-LD's biggest problem, then we're in pretty good shape :)

Wait a second... I thought you said JSON-LD's biggest problem was that
the entire approach was flawed:

http://groups.google.com/group/json-ld/browse_thread/thread/743c113ed753087a/a612a3689d28e8d7?hide_quotes=no#msg_3c8d52bf8971465e

:P

> This means that mixing some JSON-LD into an existing Javascript app or
> JSON API will most likely break stuff, because simple strings suddenly
> mean different things.

That's true only if you decide to not use type coercion.

> I think that's a bad idea. A simple string should be a simple string,
> even if it happens to contain a character like "@" or "^".

I disagree, see below:

> Instead of using microsyntaxes for nodes that are not simple strings,
> JSON-LD should either a) use coercion, or b) turn the simple string into
> a simple object, e.g.,
>
> {"@":"..."}
> {"value":"...", "lang":"..."}
> {"value":"...", "type":"..."}
> {"_":"..."}

Ok, so let's assume that we do that - then what is the following triple:

"foo": {"value": "20101018", "type": "xsd:dateTime"}

Is it this?

<> foo "20101018"^^xsd:dateTime

or is it this?

<> foo _:bnode1
_:bnode1 value "20101018"
_:bnode1 type "xsd:dateTime"

The approach above introduces ambiguity to the processing rules - how
are we going to handle that if we accept the proposal above?

> Removing the microsyntaxes would, in my subjective view, move JSON-LD
> from “doomed to obscurity” territory over to “could actually grow into
> something awesome”.

Well, since you put it that way... :P

> Or putting it another way: I believe that there's only a market for one
> “linked data in JSON” syntax really. Initially there might be a number
> of proposals, but I'm 100% convinced that someone will discover the
> right way of doing this eventually, and will pretty much take all the
> mindshare.

Hrm, can't say that I agree - Microformats, Microdata, RDFa...

> I'm also 100% convinced that the winning approach will not do
> any of that silly microsyntax business. That's because the microsyntax
> thing will immediately turn off the average JSON developer, and he or
> she is the most important person for the success of this proposal.

I agree that the average JSON developer is an important person for the
success of this proposal... however, I don't know if they're the "most
important person".

We found JSON-LD to be most useful for RESTful Web services, not
directly programming using JSON-LD constructs. However, we're still
playing around with the format. Our CTO has been complaining that
JSON-LD is still a bit rough around the edges and didn't like the
microsyntax stuff either.

I hadn't intended that our engineers would use JSON-LD directly as a
data structure, but they are... in C++. So, this approach as to /how/ to
use JSON-LD was something I hadn't anticipated. I thought people would
just read JSON-LD serializations in and place the resulting triples into
a triple store to work with it... a number of people have asked why they
can't just work with the data directly. We'll see how the next couple of
weeks play out... perhaps the path forward will become more clear.

I'll try to get some of your suggestions in there, Richard... specifically:

* Type coercion
* Ensuring Zero-edit claim
* Typo fixes in examples "<>"
* Thinking about how JSON-LD and Javascript fit together in more depth.
* The effect of Microsyntaxes on Javascript developers

I don't want to take the Microsyntax stuff out yet because we don't have
a suitable replacement.

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Saving Journalism - The PaySwarm Developer API
http://digitalbazaar.com/2010/09/12/payswarm-api/

Richard Cyganiak

Oct 19, 2010, 5:01:30 AM
to jso...@googlegroups.com

On 19 Oct 2010, at 04:13, Manu Sporny wrote:
> Ok, so let's assume that we do that - then what is the following
> triple:
>
> "foo": {"value": "20101018", "type": "xsd:dateTime"}
>
> Is it this?
>
> <> foo "20101018"^^xsd:dateTime
>
> or is it this?
>
> <> foo _:bnode1
> _:bnode1 value "20101018"
> _:bnode1 type "xsd:dateTime"
>
> The approach above introduces ambiguity to the processing rules - how
> are we going to handle that if we accept the proposal above?

Good point. The representation for a typed literal should perhaps be:

"foo": {"value": "2010-10-18", "@@type": "xsd:date"}

or "__type__" or whatever. The rule could be: If there's a "@@type"
and a "value" key, then parse it to a typed literal. Otherwise to a
blank node with properties. This relies on the assumption that
"@@type" doesn't clash with some random key that might already exist
in someone's JSON.
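
A minimal sketch of that rule, assuming "@@type" as the reserved key (the name is only a placeholder, as said above). The function name interpretObjectValue and the shape of the returned records are invented for illustration.

```javascript
// Sketch of the proposed rule: an object with both "@@type" and "value"
// is a typed literal; any other object is a blank node with properties.
// The result records ({kind: ...}) are hypothetical, not a defined API.
function interpretObjectValue(v) {
  const isObject = typeof v === "object" && v !== null && !Array.isArray(v);
  if (!isObject) return { kind: "plain", value: v };

  const has = (key) => Object.prototype.hasOwnProperty.call(v, key);
  if (has("@@type") && has("value")) {
    return { kind: "typed-literal", value: v["value"], datatype: v["@@type"] };
  }
  return { kind: "blank-node", properties: v };
}

// Typed literal under the rule:
console.log(interpretObjectValue({ "value": "2010-10-18", "@@type": "xsd:date" }).kind);
// An ordinary object that merely has "value" and "type" keys stays a blank node:
console.log(interpretObjectValue({ "value": "20101018", "type": "xsd:dateTime" }).kind);
```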

> I thought people would
> just read JSON-LD serializations in and place the resulting triples
> into
> a triple store to work with it...

If all you want to do is push triples from A to B, then existing RDF
serializations work reasonably well. If that was the only use case,
then I wouldn't be convinced that JSON-LD is necessary at all. And if
you insist on a JSON-based format for that use case, then a
straightforward representation of triples in JSON (for example, array
of objects, each object representing one triple) would be sufficient
to address that use case.
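
For illustration, such a straightforward representation might look like the sketch below. The key names "s", "p", "o", "kind" and "value" are invented here and do not refer to any existing format; indexBySubject is a hypothetical stand-in for loading into a triple store.

```javascript
// Hypothetical layout: an array of objects, each object being one triple.
const triples = [
  { s: "http://example.org/doc",
    p: "http://purl.org/dc/terms/title",
    o: { kind: "literal", value: "Hello" } },
  { s: "http://example.org/doc",
    p: "http://purl.org/dc/terms/creator",
    o: { kind: "iri", value: "http://example.org/alice" } },
];

// Round-trip through JSON text, as a sender and a receiver would.
const received = JSON.parse(JSON.stringify(triples));

// Group predicate/object pairs by subject; a Map stands in for a store.
function indexBySubject(list) {
  const index = new Map();
  for (const { s, p, o } of list) {
    if (!index.has(s)) index.set(s, []);
    index.get(s).push([p, o]);
  }
  return index;
}

const store = indexBySubject(received);
console.log(store.get("http://example.org/doc").length);
```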

> a number of people have asked why they can't just work with the data
> directly.

I'm not surprised. People tend to like JSON as a data representation
format, especially when working with data in the browser. I think
that a lot of people see the potential of bridging between RDF as a
back-end data representation and domain model, and JSON as a front-end
format to glue UIs and APIs onto the domain data.

> I'll try to get some of your suggestions in there, Richard...
> specifically:
>
> * Type coercion
> * Ensuring Zero-edit claim
> * Typo fixes in examples "<>"
> * Thinking about how JSON-LD and Javascript fit together in more
> depth.
> * The effect of Microsyntaxes on Javascript developers

May I suggest adding a few notes to the spec, along these lines:

“This feature is still under discussion and might change in such and
such a way”
“We are considering the addition of another feature here to do X”
“We seek feedback on this”

That might be a good way of getting a better understanding of what
people want.

Best,
Richard

Mark Birbeck

Oct 19, 2010, 5:27:27 AM
to jso...@googlegroups.com
Hi Richard,

On Mon, Oct 18, 2010 at 8:11 PM, Richard Cyganiak <ric...@cyganiak.de> wrote:
> We all have our personal crystal balls that we use to make guesses about the
> future. Everyone's guesses are about as good or as bad as anyone else's.

With respect, that can't be true. :) At some point reality takes over,
and one set of guesses turns out to have been better than another set.

Of course, you might be saying that given that we don't yet know the
outcome of our guesses then as of today we have no way of knowing
which of today's guesses is best. But again I'd disagree; I've
witnessed many nascent technologies over the years which have just
been 'right' and deserved to gain popularity -- and then they have. So
I tend to be quite fussy about which guesses I follow.


> Nevertheless, I gazed into mine for a while and would like to share what I
> saw.

Great.


> In my opinion, JSON-LD's biggest problem is its reliance on microsyntaxes.
>
> "<...>" has a special meaning.
> "...@..." has a special meaning.
> "...^..." has a special meaning.
> "_:..." has a special meaning.

I don't disagree, but I think you're throwing the baby out with the bathwater.

If we see what we're doing here as building a bridge (just as we built
a bridge with RDFa), then we can't say that one side is better than
another, or that adopting one side in preference to another is flawed
and will cause the whole edifice to collapse; instead we have to try
to build from both sides, but be flexible as we do.

Type coercion is not something that just suddenly popped up a few days
ago. It's something that I experimented with in RDFj...it's something
that Jeni Tennison and David Reynolds did some work on as part of the
Linked Data API [1]...it's something that myself and Manu have talked
about in JSON-LD...it's even the core idea in JSON Schemas [2]!

So whilst you are right that it holds the key to the Holy Grail -- to
take unmodified JSON and provide a profile that says how to interpret
it as RDF -- I think you are wrong to say that it should be brought in
at the expense of having a way to create JSON-LD objects with no
profile.

(Especially when there are so many ways it could be done...should we
devise our own syntax? Build on JSON Schemas? It's all to be
determined.)
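
Purely as a sketch of the idea, and not of any agreed syntax: a profile could map keys of unmodified JSON to the kind of RDF node their values stand for. The profile format, the function name coerce and the result records below are all hypothetical.

```javascript
// Hypothetical profile: for each JSON key, how its string value is read.
const profile = {
  homepage: { kind: "iri" },
  created: { kind: "typed-literal", datatype: "xsd:date" },
};

// Apply the profile to plain JSON data without touching the data itself.
function coerce(data, rules) {
  const out = {};
  for (const [key, value] of Object.entries(data)) {
    const hasRule = Object.prototype.hasOwnProperty.call(rules, key);
    if (hasRule && typeof value === "string") {
      out[key] = { ...rules[key], value };
    } else {
      out[key] = { kind: "plain", value };
    }
  }
  return out;
}

// The JSON stays as it was; strings containing "@" remain simple strings.
const data = { name: "bob@example.com", homepage: "http://example.org/bob", created: "2010-10-18" };
console.log(coerce(data, profile));
```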


> This means that mixing some JSON-LD into an existing Javascript app or JSON
> API will most likely break stuff, because simple strings suddenly mean
> different things.

I'm not convinced that we'll get many false positives. I did think
about this a lot when doing RDFj, and concluded that the stricter you
make the rules, the less likely there are to be errors. For example,
to get a URI we must have '<' at the beginning and '>' at the end --
no leading and trailing spaces, no starting with '<' and forgetting to
put '>' at the end, and so on.
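
A strict check along those lines could be sketched as follows. The function name parseAngleIri and the exact regular expression are assumptions for illustration, not the RDFj or JSON-LD rule.

```javascript
// Assumed strict rule: the whole string is "<", a non-empty body without
// whitespace or angle brackets, then ">". Anything else is a plain string.
function parseAngleIri(s) {
  const match = /^<([^<>\s]+)>$/.exec(s);
  return match ? match[1] : null;
}

console.log(parseAngleIri("<http://xyz>"));   // body of the IRI
console.log(parseAngleIri(" <http://xyz>"));  // leading space: null
console.log(parseAngleIri("<http://xyz"));    // missing ">": null
console.log(parseAngleIri("<b>bold</b>"));    // inner brackets: null
```

Even under this assumed rule a short string such as "<b>" would still be read as a URI, so strictness reduces false positives rather than removing them.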

But note also that the false positives (if they exist) only apply when
serialising; if I manipulate the string "<http://xyz>" in my
application then nothing changes when I add a context object or
serialise it as RDF. In other words, I don't see how this can 'break
stuff'.


> I think that's a bad idea. A simple string should be a simple string, even
> if it happens to contain a character like "@" or "^".

In your current JavaScript application it will be...nothing changes.


> Instead of using microsyntaxes for nodes that are not simple strings,
> JSON-LD should either a) use coercion, or b) turn the simple string into a
> simple object, e.g.,
>
> {"@":"..."}
> {"value":"...", "lang":"..."}
> {"value":"...", "type":"..."}
> {"_":"..."}
>
> or c) allow both of the above. Type coercion is the neater option in most
> cases IMO.

Yes. As I said, type coercion has been on the agenda for at least 18
months...you might be tilting at windmills here. :)

But on your point about making the values into JSON objects, that is
*exactly* what I was moving away from when I first started working on
RDFj. As you'll know RDF/JSON [3] takes that approach, and it's fine
for serialising RDF on the wire, but doesn't help when programming in
JavaScript.

(And I'm sure Ian wouldn't disagree with that; he had a particular set
of requirements in mind when creating RDF/JSON, but now with the
growth in interest in JavaScript programming we need to extend this.
Having said that, I often meant to find time to see whether RDF/JSON
syntax could be part of RDFj, and now that there's more of us
interested in JSON-LD, perhaps we should put some effort into that. It
would be great if JSON-LD was backwards-compatible with RDF/JSON.)


> Removing the microsyntaxes would, in my subjective view, move JSON-LD from
> “doomed to obscurity” territory over to “could actually grow into something
> awesome”.

I don't agree. First, there's no way it's going to be doomed to
obscurity -- Manu and myself have been up against worse odds than
this, before. ;)

But second, I do agree with you that type coercion is going to really
set this thing on fire. At the moment I'm not seeing the need to throw
out the basic (current) syntax in order to support the more elaborate
and powerful techniques of type coercion -- I think we can very easily
layer it on top.


> Or putting it another way: I believe that there's only a market for one
> “linked data in JSON” syntax really. Initially there might be a number of
> proposals, but I'm 100% convinced that someone will discover the right way
> of doing this eventually, and will pretty much take all the mindshare. I'm
> also 100% convinced that the winning approach will not do any of that silly
> microsyntax business. That's because the microsyntax thing will immediately
> turn off the average JSON developer, and he or she is the most important
> person for the success of this proposal.

There's a lot in there: you seem to know what the average JSON
developer wants...Manu and myself have somehow accidentally proposed
something that would be of use to JavaScript programmers without
realising that JavaScript programmers are important for its
success...etc.

If you don't mind I won't respond to those points because the same
issues pop up in just about every discussion on standards that I've
ever seen over the years.

At the end of the day you begin with something that is of use to some
people, discover that it is also of use to some other people, find
common ground, write it up, and then rinse and repeat. (RDF/JSON ->
RDFj -> JSON-LD, for example.)

During the course of it you write blog posts, maybe do some sessions
at conferences and so on, and try to win more mindshare.

But if someone comes along and does the job better? Well, that's
great. Ideally we'd get them to do the job better within JSON-LD, and
let it evolve, but if they do it better with a different spec, and
they gain mindshare, then that's that. :) We're all building on each
other's work, after all.

Regards,

Mark

[1] <http://code.google.com/p/linked-data-api/>
[2] <http://tools.ietf.org/html/draft-zyp-json-schema-02>
[3] <http://n2.talis.com/wiki/RDF_JSON_Specification>

--
Mark Birbeck, webBackplane

mark.b...@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)

Richard Cyganiak

Oct 19, 2010, 5:46:38 PM
to jso...@googlegroups.com
Mark,

I'm glad that we're on the same page with regard to type coercion.

Please help me understand the attraction of the microsyntax approach.

On 19 Oct 2010, at 10:27, Mark Birbeck wrote:
> I think you are wrong to say that it should be brought in
> at the expense of having a way to create JSON-LD objects with no
> profile.

I did not say that. I said that in all the cases where JSON-LD uses
"<x>", "_:x", "x^y" or "x@y", it would be better to use JSON objects
instead: {"@":"x"}, {"_":"x"} etc.

> I'm not convinced that we'll get many false positives.

This design strategy is relying on luck. A plain string in an existing
JS app is only rarely misinterpreted -- that's not very comforting.

> But note also that the false positives (if they exist) only apply when
> serialising; if I manipulate the string "<http://xyz>" in my
> application then nothing changes when I add a context object or
> serialise it as RDF. In other words, I don't see how this can 'break
> stuff'.

Things break when you try to process JSON-LD with normal JS code.
Either you check every string against a lot of regexes (and run it
through a function that removes certain backslashes), or you have bugs.

Now you could respond to that in two ways. Either you could say,
“Don't do that, use a JSON-LD library.” Or you could say, “Well you'll
have to accept that your code will have a few bugs and will sometimes
do weird stuff if certain patterns occur in strings.”

The thing is, I believe that if JSON-LD used the object style instead
of microsyntaxes, then you *could* write bug-free code without a
JSON-LD library if you followed some rules (“check that where you expect an
object, you really have an object; and check that where you expect a
string, you really have a string”). Especially the fact that nothing
needs unescaping makes writing correct code much easier.
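
A rough sketch of what those rules amount to in plain JavaScript, assuming the {"@": ...} and {"_": ...} object forms suggested earlier in the thread. The function name readNode and its result records are hypothetical.

```javascript
// Object-style reader: only type checks, no regexes and no unescaping.
function readNode(v) {
  if (typeof v === "string") {
    return { kind: "string", value: v };   // a string is always just a string
  }
  if (typeof v === "object" && v !== null) {
    if (typeof v["@"] === "string") return { kind: "iri", value: v["@"] };
    if (typeof v["_"] === "string") return { kind: "bnode", value: v["_"] };
  }
  throw new TypeError("unexpected node shape");
}

console.log(readNode("<http://xyz>").kind);          // stays a string
console.log(readNode({ "@": "http://xyz" }).kind);   // explicitly an IRI
```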

> But on your point about making the values into JSON objects, that is
> *exactly* what I was moving away from when I first started working on
> RDFj.

Why were you moving away from the object style?

> As you'll know RDF/JSON [3] takes that approach,

Actually RDF/JSON uses a mix of both styles -- microsyntax for
subjects of triples, JSON objects for objects of triples. And it uses
JSON objects for *all* RDF nodes in the object position, including
plain literals and numbers and booleans, which in JSON-LD are
represented as simple JSON literals, and that's great IMO.

> and it's fine
> for serialising RDF on the wire, but doesn't help when programming in
> JavaScript.

How does flattening objects into microsyntaxes help when programming
in JavaScript?

I still don't see the attraction of the microsyntax approach. What's
the advantage of "<x>" over {"@":"x"}?

(My summary so far is this. Disadvantages of the microsyntax style:
microparsing for access to values; requires escaping/unescaping of
plain literals; in-your-face “magic” in the JSON. Disadvantages of the
object style: more characters in the JSON; nesting in the JSON; no
simple as-is printing for debugging; no equality checks with ==. These
are not huge differences, but object style wins for me.)
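
The equality point in that summary can be seen in a few lines. The example.org IRI and the helper name sameIri are made up for illustration.

```javascript
// Microsyntax style: two equal strings compare equal with ==.
const a = "<http://example.org/x>";
const b = "<http://example.org/x>";
console.log(a == b);

// Object style: == compares object identity, so two separate objects
// with the same content are not equal.
const c = { "@": "http://example.org/x" };
const d = { "@": "http://example.org/x" };
console.log(c == d);

// A small helper (hypothetical) restores the comparison.
function sameIri(x, y) {
  return x["@"] === y["@"];
}
console.log(sameIri(c, d));
```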

Best,
Richard
