JSON-LD comments

Richard Cyganiak

Oct 15, 2010, 3:21:12 PM

to jso...@googlegroups.com, Ian Davis, Keith Alexander

All,

I sent the message below to Manu and Mark yesterday night, and Manu
suggested I re-post it here. It's a bit of a late-night semi-rant,
sorry about that. So I'll try to start with something a bit more
coherent.

What I really want to know: What is the use case for JSON-LD?

I think there are two possible use cases:

a) “The server has JSON and the client wants RDF.”
b) “The server has RDF and the client wants JSON.”

An RDF/JSON format can meet either of those use cases, but not both.
I'm afraid that JSON-LD meets neither. It is designed to be simple and
easy to read; it succeeds very well with these goals. But to work well
for a) it really should require almost no changes to existing JSON
APIs; and to work well for b) it really should be trivial to get a
given property of a given resource out of the JSON structure, with one
canonical way of doing it.

Anyway, thanks for getting this discussion started, I think it's an
important one for the future of RDF.

My original comments from yesternight are below.

Best,
Richard

Begin forwarded message:

> I came across JSON-LD tonight.
>
> I haven't really read the spec in detail, these are just some late-night
> comments after skimming it for a few minutes for the first time.
> Want to send this off now while it's fresh in my head.
>
> So here goes
>
> ---
>
> The example where you '#'-declare the FOAF prefix three times for
> three items in a list is scary. Make later items inherit the context
> from the previous one.
>
> Please support something like profiles, where you link to another
> JSON that declares the default context. That way I can use a generic
> parser *and* have my data completely self-descriptive, with a
> minimal change to my JSON.
>
> I hate the thing where I have to do "<this>" to make a URI. I don't
> understand why "a" takes no pointy brackets, but "@" takes them. And
> for properties, can't I just say "http://something" and have the
> default context declare that values of foaf:name are literal but
> foaf:homepage are URIs?
>
> Same for datatypes, can't I say in the default context that dc:date
> takes a xsd:dateTime?
>
> Same for language, can't I say in the default context that the
> default language is 'en', and that 'title' and 'description' should
> be language-tagged while 'name' shouldn't be?
>
> I'd argue that if you don't let me distinguish URIs and literals out-
> of-band in the default context, then you haven't met the "zero edit"
> design goal.
>
> I don't understand why so many examples have "@": "_:bnode1" stuff.
> Doesn't { } automatically create a new bnode if no URI is provided
> via "@"? If not, then I think it should. Ok there might be cases
> where I don't want { } to introduce a new resource, but I think
> those should be considered the exception, so let me say "@": "" or
> something like that to prevent the creation of a new resource.
>
> I'd prefer "_": "bnode1" instead of the "@": "_:bnode1" microsyntax.
>
> Actually I'd be ok with *only* having { } for introducing blank
> nodes. Need to reference that node from somewhere else in the file?
> Sorry you have to use a URI.
>
> Is it really necessary to support multiple contexts in a file? How
> about just one on the lexically first associative array?
>
> There should be an implied "@": "URI of the current file" on the
> root { }
>
> 7.3 says that the URI for "a" needs to be in "<>", but throughout
> the file it usually isn't.
>
> ---
>
> Well I guess I misunderstood the intention of JSON-LD a bit -- I
> thought it was about making “normal” JSON linked data friendly, but
> it seems to be yet another RDF-graph-serialized-in-JSON proposal,
> just pushing it a bit more towards human-readability and making it
> look a bit more like normal JSON.
>
> To be honest I'm sort of scared by all those RDF/JSON proposals. To
> me the entire approach seems flawed. It's just like RDF/XML: RDF/XML
> tried to sort of look like XML, but actually it turns out that you
> can't process it with XML tools because there's a million ways of
> expressing the same graph in RDF/XML, and it teaches the average XML
> developer that RDF is just XML with extra added ugliness.
>
> Here's a dare: Take the “zero edit” thing seriously. Communicate
> everything out of band that could possibly be communicated out of
> band. Don't focus on the requirement to serialize arbitrary RDF
> graphs; those who really need this can live with some ugly extra
> markup. Make it so that we can turn an average off-the-street JSON
> API into linked data by adding "@profile" on the root and maybe
> sprinkling in the odd "@":"some-uri".
>
> Anyway, nice job on the spec writing and http://json-ld.org/
> homepage. JSON-LD definitely beats any of the other RDF/JSON
> proposals to date.
>
> Keep up the good work,
> Richard

Mark Birbeck

Oct 16, 2010, 8:23:15 AM

to jso...@googlegroups.com, Ian Davis, Keith Alexander

Hi Richard,

First, many thanks for your comments...much appreciated.

On Fri, Oct 15, 2010 at 8:21 PM, Richard Cyganiak <ric...@cyganiak.de> wrote:
> All,
>
> I sent the message below to Manu and Mark yesterday night, and Manu
> suggested I re-post it here. It's a bit of a late-night semi-rant, sorry
> about that. So I'll try to start with something a bit more coherent.
>
> What I really want to know: What is the use case for JSON-LD?
>
> I think there are two possible use cases:
>
> a) “The server has JSON and the client wants RDF.”
> b) “The server has RDF and the client wants JSON.”

I think you're pretty much right, although that's still essentially
seeing the issue as one of serialisation. (Although now that I've read
to the end of your email I see that you do see the potential for
JSON-LD as being far beyond these two use-cases.)

I would add a third use-case -- which is the one that motivated the
creation of RDFj [1] -- and that is to project the 'semantic web
stack' into JavaScript programming.

As I'm sure you realise I don't mean by this just being able to
manipulate triples -- I spent too many years as an assembler
programmer to want to waste time that close to the metal! Instead what
I'm getting at is to be able to take features that are usually
associated with the semantic web, such as globally known predicates or
inference rules, and use them in day-to-day software writing.

One way to look at this is to say that we are turning objects into
'semantic objects'. [2]

I'm sure you can think of a ton of examples, but a simple one would be
to add a rule that says any object with a vCard organisation name is a
vCard organisation:

  Forall ?Org (
    If vcard:organization-name(?Org ?Name)
    Then vcard:Organization(?Org)
  )

We might have an object like so:

{ "name": "Google" }

If we want to say that the name property isn't just any old name
property, but it's the vCard organization name property, then we add a
context to the object:

  {
    "#": {
      "name": "http://www.w3.org/2006/vcard/ns#organization-name"
    },
    "name": "Google"
  }

As you can see, everything about the original object is still intact,
even down to any methods that might have been attached to it. All that
has changed is that the object has some metadata that provides a
JSON-LD--aware library with a way to 'interpret' the object.

Now, if our library understands the rule that I gave above -- that an
object with an organisation name is an organisation -- then running
that rule on this object would cause it to become:

  {
    "#": {
      "name": "http://www.w3.org/2006/vcard/ns#organization-name"
    },
    "name": "Google",
    "a": "http://www.w3.org/2006/vcard/ns#Organization"
  }

The point here is that this technique would be useful to any
programmer, regardless of whether they understood the semantic web or
not. (And conversely, for most programmers it's a much easier way in
to the semantic web than triples.)
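
A minimal sketch of how a library might run the rule above over a plain object. The helper name `applyOrganizationRule` and the way it looks up the "#" context are assumptions made for illustration; they are not taken from any JSON-LD or RDFj implementation.

```javascript
// Hypothetical helper, not part of any JSON-LD or RDFj library.
const VCARD = "http://www.w3.org/2006/vcard/ns#";

// If any property of obj is mapped by its "#" context to
// vcard:organization-name, return a shallow copy typed as vcard:Organization.
// Otherwise return obj unchanged.
function applyOrganizationRule(obj) {
  const context = obj["#"] || {};
  const hasOrgName = Object.keys(obj).some(
    (key) => key !== "#" && context[key] === VCARD + "organization-name"
  );
  if (!hasOrgName) {
    return obj;
  }
  return { ...obj, a: VCARD + "Organization" };
}

const google = {
  "#": { name: VCARD + "organization-name" },
  name: "Google",
};
const inferred = applyOrganizationRule(google);
console.log(inferred.a);
```

An object without a context, such as `{ "name": "Google" }`, is left alone by this sketch, since nothing says which kind of name it carries.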

> An RDF/JSON format can meet either of those use cases, but not both. I'm
> afraid that JSON-LD meets neither. It is designed to be simple and easy to
> read; it succeeds very well with these goals. But to work well for a) it
> really should require almost no changes to existing JSON APIs; and to work
> well for b) it really should be trivial to get a given property of a given
> resource out of the JSON structure, with one canonical way of doing it.

More comments on these issues, below.

> Anyway, thanks for getting this discussion started, I think it's an
> important one for the future of RDF.
>
> My original comments from yesternight are below.
>
> Best,
> Richard
>
>
>
> Begin forwarded message:
>
>> I came across JSON-LD tonight.
>>
>> I haven't really read the spec in detail, these are just some late-night
>> comments after skimming it for a few minutes for the first time. Want to
>> send this off now while it's fresh in my head.
>>
>> So here goes
>>
>> ---
>>
>> The example where you '#'-declare the FOAF prefix three times for three
>> items in a list is scary. Make later items inherit the context from the
>> previous one.

Yes, that is definitely part of the 'philosophy'. It's essentially the
same as the hierarchical approach to contexts in RDFa.
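
One way to picture the hierarchical approach, as a sketch only: a context declared on an outer object is passed down to nested objects and list items, so it need not be repeated. The function `expand` and the sample data are hypothetical, and term-to-URI mapping is the only context feature modelled here.

```javascript
// Hypothetical sketch: not taken from the JSON-LD draft or any library.
const FOAF = "http://xmlns.com/foaf/0.1/";

// Replaces mapped keys with their URIs, passing the merged context down to
// nested objects and array items so that it only has to be declared once.
function expand(node, inherited = {}) {
  if (Array.isArray(node)) {
    return node.map((item) => expand(item, inherited));
  }
  if (node === null || typeof node !== "object") {
    return node; // strings, numbers and booleans are left as they are
  }
  // A nested "#" extends or overrides whatever was inherited.
  const context = { ...inherited, ...(node["#"] || {}) };
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === "#") continue;
    out[context[key] || key] = expand(value, context);
  }
  return out;
}

const people = {
  "#": { name: FOAF + "name", knows: FOAF + "knows" },
  name: "Alice",
  knows: [{ name: "Bob" }, { name: "Carol" }],
};
const expanded = expand(people);
console.log(JSON.stringify(expanded, null, 2));
```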

>> Please support something like profiles, where you link to another JSON
>> that declares the default context. That way I can use a generic parser *and*
>> have my data completely self-descriptive, with a minimal change to my JSON.

Again, I can only say 'definitely'!

First, the context is the profile, and its similarity to an RDFa
profile is no accident (hopefully we'll go further in that direction).

Second, one of the problems here is that we're defining a format
without also looking at the associated API, so some things are not
immediately obvious; a key feature of an API that supports JSON-LD
would be to allow *both* a JavaScript object and some context to be
passed in to its functions. This would mean that you could take an
existing application, leave the data structure intact, but still
serialise its data to RDF.
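
A rough sketch of what such an API call could look like, with the context passed alongside the object rather than embedded in it. The name `toNTriples`, its signature, and the example subject URI are assumptions; literal escaping is simplified.

```javascript
// Hypothetical API sketch; the function name and signature are assumptions.
function toNTriples(obj, context, subject) {
  const lines = [];
  for (const [key, value] of Object.entries(obj)) {
    const predicate = context[key];
    if (predicate === undefined) continue; // keys without a mapping are skipped
    // Simplification: every value becomes a plain literal, and JSON string
    // escaping stands in for proper N-Triples escaping.
    lines.push(`<${subject}> <${predicate}> ${JSON.stringify(String(value))} .`);
  }
  return lines;
}

// The application's own data structure stays exactly as it was.
const appData = { name: "Google", internalId: 42 };
const orgContext = {
  name: "http://www.w3.org/2006/vcard/ns#organization-name",
};
const lines = toNTriples(appData, orgContext, "http://example.org/org/1");
console.log(lines.join("\n"));
```

Because the context travels separately, the same object can be serialised under different contexts without being edited.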

>> I hate the thing where I have to do "<this>" to make a URI. I don't
>> understand why "a" takes no pointy brackets, but "@" takes them.

'a' should take pointy brackets.

>> And for
>> properties, can't I just say "http://something" and have the default context
>> declare that values of foaf:name are literal but foaf:homepage are URIs?
>> Same for datatypes, can't I say in the default context that dc:date takes
>> a xsd:dateTime?
>>
>> Same for language, can't I say in the default context that the default
>> language is 'en', and that 'title' and 'description' should be
>> language-tagged while 'name' shouldn't be?

I agree, but I think that's a separate piece of work that can be
layered on top of what we have done here. (It's nothing more than
enriching the context.)
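
To make "enriching the context" concrete, here is a sketch of a context that records, per term, whether values are URIs, typed literals, or language-tagged literals. The field names `uri`, `kind`, `datatype` and `language` are invented for this example and do not come from the current draft.

```javascript
// Hypothetical context shape; none of these field names come from the draft.
const FOAF = "http://xmlns.com/foaf/0.1/";
const DC = "http://purl.org/dc/elements/1.1/";
const XSD = "http://www.w3.org/2001/XMLSchema#";

const richContext = {
  name: { uri: FOAF + "name" },
  homepage: { uri: FOAF + "homepage", kind: "uri" },
  date: { uri: DC + "date", datatype: XSD + "dateTime" },
  title: { uri: DC + "title", language: "en" },
};

// Renders a plain JSON value as an N-Triples-style object term, using only
// what the context says about the key. Escaping is simplified.
function objectTerm(key, value, context) {
  const rule = context[key] || {};
  if (rule.kind === "uri") return `<${value}>`;
  const literal = JSON.stringify(String(value));
  if (rule.datatype) return `${literal}^^<${rule.datatype}>`;
  if (rule.language) return `${literal}@${rule.language}`;
  return literal;
}

console.log(objectTerm("homepage", "http://example.org/", richContext));
console.log(objectTerm("date", "2010-10-15T12:00:00Z", richContext));
console.log(objectTerm("title", "JSON-LD comments", richContext));
console.log(objectTerm("name", "Richard", richContext));
```

With a context like this, the JSON itself carries only plain strings; the distinction between URIs and literals lives entirely out of band.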

>> I'd argue that if you don't let me distinguish URIs and literals
>> out-of-band in the default context, then you haven't met the "zero edit"
>> design goal.

And you'd be right. The only thing I'd say in defence is that the
current design (within which I include the features that are in RDFj
but not yet in JSON-LD) marks a reasonable version 1. Making the
context more powerful should be quite straightforward from here.

>> I don't understand why so many examples have "@": "_:bnode1" stuff.
>> Doesn't { } automatically create a new bnode if no URI is provided via "@"?
>> If not, then I think it should.

Yes, definitely. We'll look at the examples and try to provide
something more realistic.

>> Ok there might be cases where I don't want {
>> } to introduce a new resource, but I think those should be considered the
>> exception, so let me say "@": "" or something like that to prevent the
>> creation of a new resource.

I don't think we're trying to inhibit the creation of a new bnode. I
think the examples are just saying that if you want to make the bnode
identifier explicit (so that it can be referred to elsewhere) then
this is how you do it. But you're right that we need more examples
where the bnode is automatically created.
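
The behaviour described here could be sketched as follows: an object with an explicit "@" keeps that identifier, and any other object gets a fresh blank node label. The allocator and its `_:bN` label scheme are assumptions for illustration only.

```javascript
// Hypothetical sketch of subject assignment; the label scheme is an assumption.
function makeSubjectAllocator() {
  let next = 0;
  return function subjectFor(obj) {
    if (typeof obj["@"] === "string") {
      return obj["@"]; // explicit identifier, e.g. "_:bnode1" or a URI
    }
    // No "@": mint a fresh blank node label. A real implementation would
    // also have to avoid clashing with explicitly written labels.
    return `_:b${next++}`;
  };
}

const subjectFor = makeSubjectAllocator();
const first = subjectFor({ name: "anonymous" });
const named = subjectFor({ "@": "_:bnode1", name: "explicit" });
const second = subjectFor({});
console.log(first, named, second);
```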

>> I'd prefer "_": "bnode1" instead of the "@": "_:bnode1" microsyntax.

Two things on this. First, you'd still need to use "_:a" when you want
to refer to the bnode elsewhere, so it could get confusing. Second,
the idea is to try as much as possible to use 'microsyntaxes' (good
word) that people are familiar with.

>> Actually I'd be ok with *only* having { } for introducing blank nodes.
>> Need to reference that node from somewhere else in the file? Sorry you have
>> to use a URI.

I'm not sure we need to go that far, but I think you are right that
the anonymous bnode technique will be the most common.

>> Is it really necessary to support multiple contexts in a file? How about
>> just one on the lexically first associative array?

Manu and his colleagues did try that approach, since there were
certain scenarios involving very large files in which it gave them
advantages. However, after discussion and experimentation they decided
that they could get most of these advantages from attaching contexts
to objects directly, so we're sticking with that technique.

It's good news that they went through that exercise since it means
that we can be pretty confident that JSON-LD will work at the level of
very large files, all the way down to the single JavaScript object.

(Note also that placing the context into an array makes it more
difficult to implement your earlier observation about the power of
nested contexts, as well as requiring that all JSON-LD objects are
arrays.)

>> There should be an implied "@": "URI of the current file" on the root { }

Yes.

>> 7.3 says that the URI for "a" needs to be in "<>", but throughout the file
>> it usually isn't.

Right...it should be. Thanks.

>> ---
>>
>> Well I guess I misunderstood the intention of JSON-LD a bit -- I thought
>> it was about making “normal” JSON linked data friendly, but it seems to be
>> yet another RDF-graph-serialized-in-JSON proposal, just pushing it a bit
>> more towards human-readability and making it look a bit more like normal
>> JSON.

I think it's just a case of whoever worked on the spec last has
stamped their particular interest on it! I don't think Manu will mind
me saying that his main interest is in streaming very large files of
data; the current version of the spec reflects that. My interest on
the other hand is in 'semantifying' JavaScript programming, and the
RDFj version of the spec from which JSON-LD comes reflects that much
more.

Our task now is to try to beef the spec up so that it fulfils both of
these roles (and more), with plenty of examples to back everything up.

>> To be honest I'm sort of scared by all those RDF/JSON proposals. To me the
>> entire approach seems flawed. It's just like RDF/XML: RDF/XML tried to sort
>> of look like XML, but actually it turns out that you can't process it with
>> XML tools because there's a million ways of expressing the same graph in
>> RDF/XML, and it teaches the average XML developer that RDF is just XML with
>> extra added ugliness.

If we were on daytime TV this would be the point at which I'd start whooping. :)

Of course you are right that things could go wrong, but might I say
two things as riders:

* the main problem with RDF/XML is not that it tries to 'interpret'
free-form XML as RDF, but rather that it imposes a structure on that
XML by way of the notion of 'striping'. In other words, you aren't
actually able to interpret /any/ XML as RDF;

* the motivation for RDFa was that RDF inside HTML/XHTML would help
the semantic web more than RDF/XML.

I believe we got it about right with RDFa, although of course that
doesn't mean we won't mess up with JSON-LD. But with people like
yourself and Ian on this list, the chances of doing so are greatly
reduced. ;)

>> Here's a dare: Take the “zero edit” thing seriously. Communicate everything
>> out of band that could possibly be communicated out of band. Don't focus on
>> the requirement to serialize arbitrary RDF graphs; those who really need
>> this can live with some ugly extra markup. Make it so that we can turn an
>> average off-the-street JSON API into linked data by adding "@profile" on the
>> root and maybe sprinkling in the odd "@":"some-uri".

Agreed...I think you are absolutely spot on. (More whooping...)

I'd add a double-dare to your dare (are double-dares what come next?)
and that is to make it so that JSON-LD is such an easy RDF
serialisation to understand that we end up using it for our examples
back in the RDFa spec and primer, and maybe people even prefer it to
Turtle, N3, etc.!

In fact, why stop there? :) Let's dare ourselves that JSON-LD becomes
so easy to understand that in the future the easiest way to explain
RDF to a programmer will be to begin with a JSON object and then
semantify it. ("Why should I use URIs for my property names?" "Because
then your objects mean the same as someone else's objects." "Why would
I want to use inference?" "Because then the library can take on the
burden of managing large numbers of objects for you." Etc.)

>> Anyway, nice job on the spec writing and http://json-ld.org/ homepage.
>> JSON-LD definitely beats any of the other RDF/JSON proposals to date.

Thanks.

And with the help of you and others who have already offered comments
and joined the list, I reckon we can hone this thing into shape.

Regards,

Mark

[1] <http://code.google.com/p/backplanejs/wiki/Rdfj>
[2] <http://webbackplane.com/mark-birbeck/blog/2009/04/20/rdfj-semantic-objects-in-json>

Richard Cyganiak

Oct 18, 2010, 3:02:18 PM

to jso...@googlegroups.com

Thanks for the detailed response Mark.

Some comments inline.

On 16 Oct 2010, at 13:23, Mark Birbeck wrote:
>> I think there are two possible use cases:
>>
>> a) “The server has JSON and the client wants RDF.”
>> b) “The server has RDF and the client wants JSON.”
>
> I would add a third use-case -- which is the one that motivated the
> creation of RDFj [1] -- and that is to project the 'semantic web
> stack' into JavaScript programming.

That's not really a use case; it's something bigger. The use case
would be something more like:

c) “The JS programmer sees JSON or plain old JS objects, but a JS API
can see it as RDF and apply RDF-like operations (e.g. inference).”

> Second, one of the problems here is that we're defining a format
> without also looking at the associated API, so some things are not
> immediately obvious.

I understand. Developing a format independently from the API has
advantages as well -- people can try different API designs over the
format.

Maybe describing the context format independently from the “object
annotation format” might be useful -- exactly because you'd often use
contexts independently of a JSON file or concrete JS object. So, part
A of the spec describes: “Here's how to write down a context”. Part B
describes: “Here's how to attach extra annotations to JSON structures
and JS objects, using '#' and '@' and similar conventions”.

>>> I'd prefer "_": "bnode1" instead of the "@": "_:bnode1" microsyntax.
>
> Two things on this. First, you'd still need to use "_:a" when you want
> to refer to the bnode elsewhere

Instead of using "_:bnode1", why not use {"_":"bnode1"}, or just
"bnode1" with a type coercion?

> * the main problem with RDF/XML is not that it tries to 'interpret'
> free-form XML as RDF, but rather that it imposes a structure on that
> XML by way of the notion of 'striping'. In other words, you aren't
> actually able to interpret /any/ XML as RDF;

The other main problem with RDF/XML is that there are a gazillion
different ways of expressing the same triple in RDF/XML. The JSON-LD
proposal suffers from that problem as well.

Perhaps this could be addressed through a serialization algorithm that
takes both an RDF graph and a default context as input. This would
ensure a consistent JSON serialization.
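
A sketch of what such an algorithm might look like, under simplifying assumptions: triples are `[subject, predicateUri, value]` arrays with string values, the context maps terms to predicate URIs, and consistency is obtained by sorting subjects, keys and values. The function `serialize` is hypothetical.

```javascript
// Hypothetical sketch; triples are [subject, predicateUri, value] arrays
// with string values only.
function serialize(triples, context) {
  // Invert the context so predicate URIs map back to short terms.
  const termFor = new Map(
    Object.entries(context).map(([term, uri]) => [uri, term])
  );
  const grouped = new Map();
  for (const [subject, predicate, value] of triples) {
    const key = termFor.get(predicate) || predicate;
    if (!grouped.has(subject)) grouped.set(subject, new Map());
    const props = grouped.get(subject);
    if (!props.has(key)) props.set(key, []);
    props.get(key).push(value);
  }
  // Sorting subjects, keys and values makes the output independent of the
  // order in which the triples arrived.
  return [...grouped.keys()].sort().map((subject) => {
    const props = grouped.get(subject);
    const obj = { "@": subject };
    for (const key of [...props.keys()].sort()) {
      const values = [...props.get(key)].sort();
      obj[key] = values.length === 1 ? values[0] : values;
    }
    return obj;
  });
}

const FOAF = "http://xmlns.com/foaf/0.1/";
const foafContext = { name: FOAF + "name", nick: FOAF + "nick" };
const triplesA = [
  ["http://example.org/a", FOAF + "name", "Alice"],
  ["http://example.org/a", FOAF + "nick", "al"],
  ["http://example.org/a", FOAF + "nick", "ally"],
];
const triplesB = [...triplesA].reverse();
const jsonA = JSON.stringify(serialize(triplesA, foafContext));
const jsonB = JSON.stringify(serialize(triplesB, foafContext));
console.log(jsonA === jsonB, jsonA);
```

The point of the sketch is only that the same graph plus the same context yields the same JSON text, so a client can rely on one way of reaching a given property.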

> In fact, why stop there? :) Let's dare ourselves that JSON-LD becomes
> so easy to understand that in the future the easiest way to explain
> RDF to a programmer will be to begin with a JSON object and then
> semantify it. ("Why should I use URIs for my property names?" "Because
> then your objects mean the same as someone else's objects." "Why would
> I want to use inference?" "Because then the library can take on the
> burden of managing large numbers of objects for you." Etc.)

Besides the "URIs for property names" reason you mention above, I'd
name the following two as the most convincing ones:

The ability to coerce a term to a URI is important because that means
the API knows it can interact with the term's value via HTTP, e.g., do
a GET to get a JSON representation of a linked object.

The ability to specify an object's URI with "@" is important because
it means I can modify the object and then do something like
#(myObject).put() to write it back to the server (with an HTTP PUT
call to the object's URI).
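
As a sketch of the write-back idea, the function below only builds a description of the PUT request from the object's "@" value; it performs no network I/O. The name `describePut` and the shape of the returned object are assumptions, and the example URI is made up.

```javascript
// Hypothetical helper: builds a request description, performs no network I/O.
function describePut(obj) {
  const url = obj["@"];
  if (typeof url !== "string") {
    throw new Error("object has no '@' URI to write back to");
  }
  return {
    method: "PUT",
    url,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(obj),
  };
}

const myObject = { "@": "http://example.org/org/1", name: "Google" };
myObject.name = "Google Inc."; // modify locally, then write back
const request = describePut(myObject);
console.log(request.method, request.url);
```

An object without "@" has nowhere to be written back to, which is why the sketch refuses it.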