Use of JSON Schema as a Modeling Language


michae...@gmail.com

Jun 22, 2015, 8:21:45 AM
to json-...@googlegroups.com

Dear JSON Schema Community;

 

On behalf of the OASIS OData technical committee, I would like to understand from the experts whether or not JSON Schema is appropriate for our particular use case.

 

OData is an OASIS standard that defines an interoperable protocol for clients to interact with RESTful services in a completely generic manner.

 

An important part of the protocol is the ability to describe the resources exposed by the service. OData uses an entity-relationship data model for describing these resources, and has defined an XML-based representation of this common data model (known as "CSDL").

 

OData uses a JSON representation for data payloads. Together with the growing popularity of JSON/JavaScript clients, this has led to numerous requests to the OASIS OData Technical Committee (TC) to provide a JSON representation of the data model as well.

 

Given the significant overlap between describing a data model and validating the JSON payload returned from requests against the data model, it is very tempting to use JSON Schema for both. In fact, we've seen other REST APIs, such as the DMTF "Redfish" specification, attempt to use JSON Schema to describe their resource model. However, in attempting to use JSON Schema to describe our data model, we've run into a few issues that have caused us to question whether modeling is an appropriate use of JSON Schema.

 

So our primary question to the community is this:


Is JSON-Schema intended to be used for data modeling, or should we invest in an alternate "JSON Modeling" language that would have first-class representation of concepts such as inheritance and relationships, and from which we could generate JSON Schema for validation?

 

We eagerly await your guidance in this area.


Thanks in advance,

 

Mike Pizzo

Editor, OASIS OData Technical Committee

Jason Desrosiers

Jun 23, 2015, 4:08:17 PM
to json-...@googlegroups.com, michae...@gmail.com
Hi Mike,

JSON-Schema is very useful for data modeling.  In fact, JSON-Schema is more of a data modeling tool than a validation tool.  The validation features are actually quite basic.  One limitation, for example, is the inability to validate a value based on another value.  This means you couldn't enforce something like "the value of `startDate` must be less than the value of `endDate`".  The best JSON-Schema can do is enforce that `startDate` and `endDate` are both dates.
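
A minimal sketch of this limitation, using only the Python standard library. The schema can state that both fields are date strings, while the ordering rule has to live in application code. The field layout, the choice of `"format": "date-time"`, and the helper name `dates_in_order` are assumptions made for illustration.

```python
from datetime import datetime

# What the schema can express: both fields exist and are date-time strings.
# It has no way to relate the value of one field to the value of another.
EVENT_SCHEMA = {
    "type": "object",
    "properties": {
        "startDate": {"type": "string", "format": "date-time"},
        "endDate": {"type": "string", "format": "date-time"},
    },
    "required": ["startDate", "endDate"],
}


def dates_in_order(event: dict) -> bool:
    """Cross-field rule that must be checked outside the schema."""
    start = datetime.fromisoformat(event["startDate"])
    end = datetime.fromisoformat(event["endDate"])
    return start < end
```

An instance with `endDate` before `startDate` would still satisfy `EVENT_SCHEMA`; only the extra check catches it.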

My guess would be that the issues you have run into are more about being new to JSON-Schema than they are about its limitations. If you can provide some detail about what issues you are running into, I'm sure we can help you get past them.  If you do have a situation JSON-Schema can't describe, then it would help us to know about it so we can improve the specification to cover that edge case.

Jason

Simon Heimler

Jun 24, 2015, 5:29:09 AM
to json-...@googlegroups.com, michae...@gmail.com
Dear Mike,

I've used JSON Schema successfully for Modeling.


I'll write a thesis about this approach, trying to conceptualize this "Schema-Driven Development" approach, my experiences with it, its benefits and limits.

So if you're interested, drop me an email!

Best,
Simon

Jason Desrosiers

Jun 24, 2015, 4:41:52 PM
to json-...@googlegroups.com, michae...@gmail.com
Hi Simon,

I checked out Mobo and I like the schema driven approach.  I think it is a great way to go.  I've done some work in that area as well.  However, Mobo uses a highly modified version of JSON Schema.  If Mobo Schema works for you, that's great, but I wanted to point out for Mike's sake that it is not necessary to modify JSON-Schema the way you did for it to be used successfully and effectively for modeling.

Jason

Simon Heimler

Jun 25, 2015, 1:30:32 AM
to json-...@googlegroups.com, michae...@gmail.com
Hello Jason,

yes, you're right - I've extended JSON Schema quite a bit and added my own object orientation feature.

But I wouldn't call it highly modified because the Schema mobo uses internally is 100% valid JSON Schema. Those additions are only optional and useful for the development of very big models. So you could write the model in pure JSON Schema in JSON notation and it should work, too.

Almost all further additions are either domain specific or implementation system specific (SMW). Of course it depends on your use case if the standard JSON Schema properties are sufficient for generating the end system. 

Btw. you could take a look at http://swagger.io/. They also use a JSON Schema based MDE approach.

Best,
Simon


Jason Desrosiers

Jun 25, 2015, 3:05:25 PM
to json-...@googlegroups.com, michae...@gmail.com
Simon,

From what I learned from the Mobo documentation, your changes do not appear to be limited to adding features to JSON Schema.  First of all, you have a list of 13 keywords that are not supported.  Some of these keywords I think are pretty critical when defining JSON Schemas.  So, Mobo Schema really only extends a subset of JSON Schema.  But the real reason I thought the term "highly modified" was appropriate was that some of the examples in the documentation appear to change the structure and semantics of existing JSON Schema keywords.

First Example:
{
    "title": "Location",
    "description": "Location where hardware is deployed",
    "properties": [
        { "$extend": "/field/streetAdress" },
        { "$extend": "/field/streetNumber" },
        { "$extend": "/field/town" },
        { "$extend": "/field/country" }
    ],
    "required": [
        "streetAdress",
        "streetNumber",
        "town"
    ],
    "smw_prefix": {
        "header": 1,
        "wikitext": "Some prefix-description for the location"
    },
    "smw_postfix": {
        "wikitext": "Some postfix-description for the location"
    }
}

In this example, `properties` is an array when JSON Schema requires it to be an object.  How would this work?  Is the extended field required to define the key value?  Or is the key extrapolated from the path?  No matter how you deal with this, it amounts to changing the semantics of `properties` in addition to changing the type.  This certainly wouldn't validate using a standard JSON Schema validator.

Second Example:
title: Location
description: Location where hardware is deployed

items:
  - $extend: /field/streetAdress
  - $extend: /field/streetNumber
  - $extend: /field/town
  - $extend: /field/country

required:
  - streetAdress
  - streetNumber
  - town

I'm not sure how to interpret this one as a JSON Schema.  It contains a keyword that only applies to arrays (items) and a keyword that only applies to objects (required).  There is no type declaration, so I don't know which one it is supposed to be.  Assuming this is intended to define an array, then the `required` keyword would be meaningless.  I'm assuming you included it because you added semantics to the `required` keyword so it applies to arrays.

Correct me if I misunderstood any of this, but these look like backward incompatible changes from pure JSON Schema.  That is why I call it "highly modified".


I am well aware of swagger, but I am not a fan.  First of all, their implementation of JSON Schema is limited which can be frustrating to use for someone like me who knows the spec well.  But, the reason I am not a fan is because Swagger has an RPC structure.  Swagger is just like WSDL and has the same limitations.  I think a much better way to define APIs in a schema driven way is JSON Hyper-Schema + Jsonary.  Jsonary's visuals aren't professionally polished like Swagger, but JSON Hyper-Schema is so much more powerful because it allows us to do HATEOAS like no other JSON based standard I have found.  If you want to truly be RESTful and use JSON, I know of no other standard that is up to the job.

Jason

Simon Heimler

Jun 25, 2015, 4:14:30 PM
to json-...@googlegroups.com
Hi Jason,

true, I've decided to not support some of JSON Schema's features. This is mostly because the end system has very little validation support (if that changes, it would make sense to support more of this). Most of the mobo schema is for defining the model structure.

I've thought about supporting more of JSON Schema. There would be a very elegant way to do this: a library that does JSON Schema Expansion (similar in concept to JSON-LD Expansion: http://www.w3.org/TR/json-ld-api/#expansion).

This would fetch all $refs and "expand" all definitions into the JSON Schema file, so that it is easier to process afterwards. (Do you know what I mean? Difficult to explain in a short way..) Do you know if something like that already exists?

I see that the first example is very outdated and absolutely misleading.

Properties should not be used at all, because code generation depends on the order. The YAML example below is correct. Yes, the "type": "array" is missing, but this is because it is implicit: the object has items. (Mobo automatically adds those implicit properties, but they can be written explicitly too if that's preferred.)

I wasn't aware that "required" is only for properties and not for arrays. If that's true, I have unwittingly broken the JSON Schema in this case. 

I'll take a look at Jsonary! The link http://jsonary.com/ does not work, however. 

Simon

Simon Heimler

Jun 25, 2015, 4:27:35 PM
to json-...@googlegroups.com, michae...@gmail.com
Jason,

you've written:

Some of these keywords I think are pretty critical when defining JSON Schemas

Which keywords do you consider critical, that are missing?

Simon Heimler

Jun 26, 2015, 2:33:57 AM
to json-...@googlegroups.com, michae...@gmail.com
Sorry for so many replies.

I've thought about "required" not working for arrays. That makes absolute sense, especially since my arrays are in fact collections of objects. The required array holds strings with the IDs of those objects.

This is the one problem I've had with using JSON Schema for MDE purposes: From a generation perspective, I have to ensure the correct order, so I have to use the items notation. Since all model parts have their own file, the ID is derived from the file name anyway. Using objects with key/value would duplicate those ID declarations and introduce problems. From a data structure and validation perspective, they should be "properties" (because I need features like "required", and it's more intuitive and easier to use). 

So I've solved this with the compromise that caused your confusion here: In the development model I'll use arrays with the items notation, extract the correct order and convert them to objects with properties notation. The "required" property does refer to properties (which just aren't there yet, but will be). So internally, the JSON Schema is valid.
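
A minimal sketch of the conversion described above, under stated assumptions: it is not mobo's actual code. The function name `items_to_properties` and the `propertyOrder` keyword are hypothetical, and the property ID is taken from the last path segment of `$extend`, standing in for "derived from the file name".

```python
def items_to_properties(dev_model: dict) -> dict:
    """Turn an ordered `items` list into a `properties` object plus an order list."""
    schema = {key: value for key, value in dev_model.items() if key != "items"}
    properties = {}
    order = []
    for item in dev_model.get("items", []):
        # Assumption: the ID is the last segment of the $extend path.
        field_id = item["$extend"].rsplit("/", 1)[-1]
        properties[field_id] = dict(item)
        order.append(field_id)

    # `required` only makes sense once the names exist as properties.
    missing = [name for name in schema.get("required", []) if name not in properties]
    if missing:
        raise ValueError(f"required names without a matching property: {missing}")

    schema["type"] = "object"
    schema["properties"] = properties
    schema["propertyOrder"] = order  # hypothetical keyword that preserves the order
    return schema
```

Applied to the YAML "Location" example from earlier in the thread, this would yield an object schema whose `required` entries refer to real property names.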

Not having to write redundant information is very important for the model development process, imho. Duplicating IDs (or having to write the type: array when it's obvious) are things I'd like to avoid. Those shortcuts are optional, however.

I'm also not sure how far to go with being 100% JSON Schema compliant on the development model side. The generator can easily provide nice, more verbose and valid JSON Schema as an additional result format. Currently I've got no real use case for that, however.

Jason Desrosiers

Jun 26, 2015, 2:06:23 PM
to json-...@googlegroups.com, michae...@gmail.com
Simon,

I suspected that one of those examples was not correct.  I'm glad the `items` one was the correct example.  It makes the most sense and it will validate as a JSON Schema even if the semantics of `required` are different.

You say that the `"type": "array"` is implicit because it has `items`.  Do you mean this is a behavior of Mobo Schema?  Because it is not true of JSON Schema.  If the presence of an array keyword implies that the schema describes an array, then what if there is also a keyword present that describes an object?  Which does it choose?  JSON Schema avoids this problem by never assuming a type.  Check out my explanation of why `type` should always be declared, https://groups.google.com/forum/#!topic/json-schema/HBSXbcQ8B2c.
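
The point that keywords never imply a type can be illustrated with a deliberately tiny validator. This is a sketch that handles only `type` (for objects and arrays), `required`, and single-schema `items`; the name `is_valid` is made up for this example and it is not a substitute for a real JSON Schema validator.

```python
def is_valid(schema: dict, instance) -> bool:
    """Toy validator: each keyword constrains only instances of its own kind."""
    expected = schema.get("type")
    if expected == "object" and not isinstance(instance, dict):
        return False
    if expected == "array" and not isinstance(instance, list):
        return False

    # `required` applies to objects only; other instances pass it trivially.
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False

    # `items` applies to arrays only; other instances pass it trivially.
    if isinstance(instance, list) and isinstance(schema.get("items"), dict):
        if not all(is_valid(schema["items"], element) for element in instance):
            return False

    return True
```

With a schema that has `items` and `required` but no `type`, this toy accepts a number, an object containing the required names, and an array of matching items alike. Adding `"type": "array"` is what narrows it to arrays.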

The missing keywords I would consider critical are `$ref`, `allOf`, `anyOf`, `oneOf`, `not`, and `definitions`.  However, it may be that the functionality these keywords provide is completely covered by your custom extensions `$extend` and `$remove`.

I know of one library specifically that does `$ref` expansion, but I'm sure many if not all of the JSON Schema validators out there do it as well.  https://github.com/geraintluff/jsv4-php#the-schemastore-class

I'm sad that the Jsonary site is down.  I know Geraint has been busy, but I hope he hasn't abandoned the project or anything like that.  It really is a brilliant concept.

Jason

Simon Heimler

Jun 26, 2015, 3:37:18 PM
to json-...@googlegroups.com, michae...@gmail.com
Hello Jason,

the mobo generator has in theory many model2model transformation layers. I'm still in the process of figuring this out, but this is my current idea:

The Development Model is written in JSON / YAML in a JSON-Schema-ish way. Each element of the model is stored in its own file, organized by folders, and can be versioned. It is possible to write the model 100% JSON Schema conformant, but it is also possible to take some shortcuts (like omitting the type). It would also allow for features like internationalization, etc.

The type is internally a required property. Mobo will assume type: array for every object that has an items property and type: object if properties is present. If neither is present and no type is given, it'll throw a schema validation error because it fails against the internal meta-schema.
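
The inference rule just described could be sketched as follows. This is an illustration, not mobo's implementation: the function name `infer_type` is hypothetical, and checking `items` before `properties` when both are present is an assumption the description above does not settle.

```python
def infer_type(schema: dict) -> dict:
    """Return a copy of the schema with an explicit `type`, inferring it if absent."""
    if "type" in schema:
        return dict(schema)
    if "items" in schema:  # assumption: `items` wins if both keywords are present
        return {**schema, "type": "array"}
    if "properties" in schema:
        return {**schema, "type": "object"}
    # Mirrors the described failure against the internal meta-schema.
    raise ValueError("no `type` given and none can be inferred")
```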

That Development model is then transformed by a compatibility layer, which upgrades older models to the latest standard. Then there's the expansion layer, where inheritance ($extend) is applied, missing but implicit properties are added, and in general additional metadata is generated. This is also the place where the many schemas are linked together into several big schemas.

After this, the model is 100% JSON Schema conformant and no longer contains those special ($-prefixed) keywords like $extend.

From there on, it would be logical to have an additional layer that transforms all generic, domain-specific properties to an end-system equivalent. E.g.: The recommended array will be transformed so that each field that is required gets a "mandatoryInput" property.

This is the last stage of the JSON Schema model. Now there's a model2text transformation, using a templating engine (currently Handlebars.js) and some additional helper functions that are injected into the templating engine.

The resulting code can be uploaded in real-time to the target system.

Of course there are several steps, where validation happens, on syntax, schema and semantic level.

Yes, $extend and $remove replace all of the keywords you've mentioned and are easier to use. $ref does not even define the inheritance behavior, so I couldn't rely on it anyway. $extend supports single and multiple inheritance and allows fine-tuning the merging behavior of arrays.

Thanks for the link - I'll look into it. Basically it only needs to resolve $refs both internally (usually definitions) and externally (URLs) and return a completely resolved JSON Schema without $refs.
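
A minimal sketch of such an expansion step, restricted to internal references (`#/...` JSON Pointers). External URLs, cyclic references, and properties that happen to be named `$ref` are deliberately not handled, and the function names are made up for this example.

```python
def resolve_pointer(root, ref: str):
    """Resolve an internal reference such as '#/definitions/town'."""
    if not ref.startswith("#"):
        raise ValueError("only internal references are handled in this sketch")
    node = root
    for token in ref[1:].split("/"):
        if token == "":
            continue
        # JSON Pointer escapes: ~1 stands for '/', ~0 stands for '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def expand_refs(node, root):
    """Return a copy of `node` in which every {"$ref": ...} object is replaced."""
    if isinstance(node, dict):
        if "$ref" in node:
            # The whole object is replaced; sibling keywords are dropped.
            return expand_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: expand_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_refs(value, root) for value in node]
    return node
```

Calling `expand_refs(schema, schema)` returns a schema without `$ref` entries. A cyclic reference would recurse forever here, so a real implementation needs a guard for that case.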

Is there some documentation left on Jsonary, besides the GitHub Readme?

Simon

Jason Desrosiers

Jun 27, 2015, 11:06:53 PM
to json-...@googlegroups.com, michae...@gmail.com
That sounds like a solid architectural approach.  I hope it works out for you.

I'm confused about your comments on `$ref` and inheritance behavior.  I read this in your documentation as well.  I can't think of any way `$ref` is related to inheritance or merging behavior.  There can be only one `$ref` in an object and if `$ref` is present, all other fields are supposed to be ignored.  So, there is never anything to merge or inherit.  An object that contains a `$ref` property should be replaced with the object `$ref` refers to.  It is `allOf`, `anyOf`, `oneOf`, and `not` that define how these sub-schemas relate to each other, but still there is no merging involved.

I'm not seeing any Jsonary documentation out there (other than the github README) now that the site is down.  You can always try emailing Geraint Luff (https://github.com/geraintluff).  He is the developer of Jsonary as well as the author of the JSON Hyper-Schema specification and the developer of the `SchemaStore` library I pointed you to (which should be exactly what you are looking for with regard to `$ref` expansion).

In short, Jsonary is a JavaScript library for working with JSON data described by JSON Schemas and JSON Hyper-Schemas.  Using this library, you can build a generic JSON Browser that will allow you to browse any API described by JSON Hyper-Schema.  This generic JSON Browser is analogous to Swagger UI.  I have a deployment of the latest JSON browser that Jsonary ships with available at http://json-browser.s3-website-us-west-1.amazonaws.com/ that you can check out, but it isn't very illustrative without an example API to point to.  I've been thinking about building a simple demo API just to show people how it works.  I haven't gotten around to it yet.

Jason

Simon Heimler

Jun 28, 2015, 2:39:29 AM
to json-...@googlegroups.com, michae...@gmail.com
Hello Jason,

well, I've noticed that the tv4 library does implement inheritance. But since it's not part of the specification, I can't rely on it. 
It's a very useful feature, however, since you can reference/inherit a part of the model and at the same time overwrite some parts of it:

- $extend: /smw_template/NetworkPrinterHeader
  showForm: true

The use of allOf, anyOf, oneOf and not is very flexible and useful for schema validation purposes. For creating a simple, DRY, object-oriented JSON Schema, it's unnecessarily complicated and the wrong approach in my case anyway. 

OK, yes, that link isn't very helpful on its own.

This approach sounds like the Hydra API approach (based on JSON-LD): http://www.markus-lanthaler.com/hydra/ ? It's also possible to describe existing APIs and make them machine-interpretable.

Kind regards,
Simon

Jason Desrosiers

Jun 29, 2015, 1:11:41 AM
to json-...@googlegroups.com, michae...@gmail.com
Simon,

I haven't used tv4, but you are right that if it allows you to use `$ref` as a kind of inheritance operator, it is non-standard behavior.

Honestly, I don't really see what `$extend` and `$remove` add to JSON Schema other than a little syntactic sugar.

For example ...
{
    "$extend": "/smw_template/NetworkPrinterHeader",
    "showForm": true
}

... can be expressed in pure JSON Schema as ...
{
    "allOf": [{ "$ref": "/smw_template/NetworkPrinterHeader" }],
    "showForm": true
}

Admittedly, the second is more verbose, but I don't think it would be fair to call it unnecessarily complicated.  I have found `allOf`, `anyOf`, `oneOf`, and `not` plenty flexible, so I can't agree with you on that point.
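
The rewrite from the first form to the second is mechanical, as this sketch suggests. The function name is made up, and accepting a list of parents for multiple inheritance is an assumption about the `$extend` syntax, not something taken from the Mobo documentation.

```python
def extend_to_all_of(schema: dict) -> dict:
    """Rewrite a custom `$extend` keyword into a standard `allOf` of `$ref`s."""
    if "$extend" not in schema:
        return dict(schema)

    rewritten = {key: value for key, value in schema.items() if key != "$extend"}
    parents = schema["$extend"]
    if isinstance(parents, str):
        parents = [parents]  # assumption: a list would mean multiple inheritance

    references = [{"$ref": parent} for parent in parents]
    # Keep any allOf entries the schema already had.
    rewritten["allOf"] = references + list(schema.get("allOf", []))
    return rewritten
```

Note that `allOf` is a conjunction: an instance must satisfy the referenced schema and the local keywords together. It does not let the local schema override or remove anything from the parent, so this rewrite covers extension but not overriding.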

Thanks for the link about Hydra! I looked into it briefly and it does sound like it is similar in concept to what I was talking about.  I'm looking forward to reading more about it.

Jason

Simon Heimler

Jun 29, 2015, 2:45:12 AM
to json-...@googlegroups.com, michae...@gmail.com
Hello Jason,

the difference from $extend to $ref is:
  • Syntactic sugar, indeed. When you have to use $extend a lot, it is very nice that it works so easily (it saves two blocks/indentation levels each time). I think the name is also more intuitive.
  • I can clearly define the inheritance behaviour in my own implementation. This is not as simple as it sounds, at least when it comes to arrays. (Do you overwrite them? Append, prepend? What about duplicates?)
  • After $extend is resolved, it disappears and is replaced. So the actual JSON Schema does not contain any $extends. That way I can use every JSON Schema library without having to worry about them supporting an unofficial/undocumented feature.
And I had to introduce some new keywords anyway, like $abstract. This doesn't make sense for validation purposes at all, but is central for object oriented modeling.
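
The array questions raised in the list above (overwrite, append, prepend, duplicates) can be made concrete with a small sketch. The strategy names and the `unique` flag are hypothetical and are not mobo's actual options.

```python
def merge_arrays(parent: list, child: list, strategy: str = "append", unique: bool = True) -> list:
    """Combine an inherited array with the child's array using an explicit strategy."""
    if strategy == "overwrite":
        merged = list(child)
    elif strategy == "append":
        merged = list(parent) + list(child)
    elif strategy == "prepend":
        merged = list(child) + list(parent)
    else:
        raise ValueError(f"unknown merge strategy: {strategy}")

    if unique:
        # Drop duplicates while keeping the first occurrence of each value.
        deduplicated = []
        for value in merged:
            if value not in deduplicated:
                deduplicated.append(value)
        merged = deduplicated
    return merged
```

For a `required` array, for instance, `append` with `unique=True` keeps the parent's required names and adds the child's without repeating any of them.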

If I've understood anyOf, allOf, etc. correctly, they also mostly make sense for validation purposes. For modeling they are somewhat confusing or even misleading. At least my current target system supports no feature that could make use of that additional information.

Thanks for the discussion! This is very helpful for me. 

Regarding the Hydra API:
Markus Lanthaler has written his doctoral thesis about it (and a few papers):
He also evaluates a few different technologies, like JSON Schema.

Simon

Jason Desrosiers

Jun 29, 2015, 10:14:42 PM
to json-...@googlegroups.com, michae...@gmail.com
I've enjoyed the conversation as well.  I'm especially thankful that you introduced me to Hydra.  I hadn't found any alternative to JSON Hyper-Schema for doing real HATEOAS in JSON until you showed me this.  Now I know Hydra can do it too and I'm excited to learn more about it.

Jason
...

Simon Heimler

Jun 30, 2015, 5:20:04 AM
to json-...@googlegroups.com
Glad to hear that! I found the Hydra approach very interesting too. 

It is not a schema-based approach however, but a semantics / ontology based one. Much more powerful, but also a good bit more complicated ;)


Ralf Handl

Jul 6, 2015, 8:31:20 AM
to json-...@googlegroups.com, michae...@gmail.com

Hi Jason,

I'm working together with Mike Pizzo on the OASIS OData technical committee.

 

Some of the problems we face are around the primitive types.

 

OData's set of primitive types originates from common database and programming language data types, see http://docs.oasis-open.org/odata/odata/v4.0/errata02/os/complete/part3-csdl/odata-v4.0-errata02-os-part3-csdl-complete.html#_Toc406397943, e.g. Edm.Int64, Edm.Date, or Edm.Double. The corresponding JSON Schema primitive types are typically broader in scope, and the OData primitive types imply additional restrictions on the serialization or recommended internal representation for data consumers.

 

This could be represented via new format attributes, e.g. "date" for OData's Edm.Date primitive type, and we've seen that e.g. Swagger goes in that direction, defining a few additional format attributes that match our needs ("date", "int64", "int32").

 

Is this the right way to go? If yes: with "neutral" attribute names, e.g. "duration" for duration values, matching the predefined ones in style, or with "distinctive" names, e.g. "xs:dayTimeDuration", to avoid possible conflicts with other parties adding format attributes?

 

Thanks in advance!

 

Ralf Handl

Co-Chair, OASIS OData Technical Committee

Jason Desrosiers

Jul 8, 2015, 3:19:58 PM
to json-...@googlegroups.com, michae...@gmail.com
Hi Ralf,

This requirement seems odd to me.  I don't know your system or your needs, so I'll assume you have a good reason for needing this. In order to express this constraint, you will need to extend the spec in some way.

You could use `format` like Swagger has done.  Since you are defining the extension, you can do it any way you like, but using `format` is not my favorite idea for a couple of reasons.  The biggest reason is that you would be co-opting the `format` field for something that has nothing to do with the format of a string.  What you really want is something more like a deserialization directive.  The other reason I'm not excited about this option is that in the JSON Schema spec, `format` only applies to strings.  If, for example, you use `format` to define your various Int types, you would be changing the specification of `format` to also apply to `integer`s.  When extending the spec, I think it is best to not change any existing behaviors.  Extensions should only add.

To avoid the issues with the Swagger approach, you could introduce a new keyword that describes deserialization requirements.  JSON Schema validators should ignore keywords they don't know, so you can freely add any new keywords you need without compromising your ability to use your favorite JSON Schema validator with that schema.  This is the option I would suggest.

Extensions are very rare, so conflicts are not likely.  But even if conflicts were impossible, I would still suggest namespacing all of your extensions.  This includes `format` extensions as well as schema keyword extensions.  If nothing else, it makes it clear to readers of the schema that they are extensions and may need processing in addition to a standard JSON Schema.
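
A sketch of what a namespaced extension keyword might look like next to the standard keywords. The keyword name `x-odata-type` is hypothetical, and the mapping is simplified: the `Edm.Date` pattern covers only plain four-digit years, and special `Edm.Double` values are ignored.

```python
# Standard JSON Schema constraints that approximate a few OData primitive types.
EDM_TO_JSON_SCHEMA = {
    "Edm.Int32": {"type": "integer", "minimum": -2**31, "maximum": 2**31 - 1},
    "Edm.Int64": {"type": "integer", "minimum": -2**63, "maximum": 2**63 - 1},
    "Edm.Double": {"type": "number"},
    "Edm.Date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
}

EXTENSION_KEYWORD = "x-odata-type"  # hypothetical namespaced keyword


def schema_for(edm_type: str) -> dict:
    """Standard constraints plus an extension keyword naming the exact OData type."""
    fragment = dict(EDM_TO_JSON_SCHEMA[edm_type])
    fragment[EXTENSION_KEYWORD] = edm_type
    return fragment


def strip_extensions(node, prefix: str = "x-odata-"):
    """Return a copy without the namespaced keywords, i.e. a plain JSON Schema."""
    if isinstance(node, dict):
        return {
            key: strip_extensions(value, prefix)
            for key, value in node.items()
            if not key.startswith(prefix)
        }
    if isinstance(node, list):
        return [strip_extensions(value, prefix) for value in node]
    return node
```

A validator that ignores unknown keywords would still check the range and pattern, while an OData-aware consumer could read `x-odata-type` to recover the original type without guessing from the constraints.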

I hope this was helpful.

Jason

ralf....@sap.com

Jul 9, 2015, 1:04:46 PM
to json-...@googlegroups.com, michae...@gmail.com

Hi Jason,

 

I'm working together with Mike Pizzo on the OASIS OData standard.

 

When mapping OData entity data models into JSON Schema, we basically face two categories of problems:

  • OData concepts that don't have a counterpart in JSON Schema
  • OData concepts that can be translated into JSON Schema but are hard to recognize and translate back into OData concepts

 

The obvious solution to both problems is to introduce additional keywords to represent the OData concepts, but:

  • We have to hope that OData-agnostic JSON Schema consumers gracefully ignore the additional keywords
  • If they ignore the missing net new concepts, we get less out of these generic JSON Schema consumers than we would like, especially from validators
  • Redundant representation of overlapping concepts bloats the schema documents

 

Missing concepts:

  • Inheritance
  • Abstract structured types
  • Keys for structured types
  • Relationships between keyed structured types

 

Hard-to-recognize concepts

  • Nullable
  • Primitive types
  • Primitive type facets

 

I'll start a separate thread for each of these topics and am looking forward to feedback and recommendations.

 

Thanks in advance!

 

Ralf Handl

Co-Chair, OASIS OData Technical Committee




Jason Desrosiers

Jul 9, 2015, 9:35:40 PM
to json-...@googlegroups.com, michae...@gmail.com
Ralf,

I certainly have some opinions on most of these issues. I will look for the separate threads and post my comments there.

Jason