In another thread, the subject of extending schemas came up, and someone said: "There is some kind of inheritance, Jim, but not as we know it". This is not the first time I've heard this kind of sentiment - generally, people want to specify "additionalProperties":false in one schema, but then define additional properties in another schema that extends it.
My view is that the "extends"/"allOf" behaviour is completely appropriate, and is inheritance in the purest sense. Any other behaviour added in the name of "inheritance" would actually break the principles of inheritance instead.
I think the way inheritance works is important, and I think that the desire for different behaviour is misguided - however, I want to make sure I haven't missed something, so I want to have a discussion about it. Let me start by laying out my case:
"B extends A"
The core of inheritance is that "B extends A" means that "Every valid B is also a valid A". So, wherever you use an A, you should also be able to use a B.
This is actually how schema extension with "extends"/"allOf" works - if B extends A, then any data that is a valid B is also a valid A.
If the schema for A defines two keys ("key1" and "key2"), and then says "additionalProperties": false, then that is a declaration that any object that contains other keys than those two is not a valid A. The only reason you should need to specify that is if the presence of other properties actually breaks something.
If the presence of other properties actually breaks something, then being able to define new properties in schema B makes no sense - if a piece of data actually used those properties, then it would no longer be a valid A, and it would not be able to be used in place of an A. It would break the fundamental rule of inheritance.
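To make the argument above concrete, here is a minimal sketch in Python. The `is_valid` function is a toy written for this illustration, not a real JSON Schema validator: it only understands "type": "string", "properties", "additionalProperties": false and "allOf". The schema contents and the "bProp" name are assumptions chosen to match the example in the text.

```python
# Toy validator: handles only "type": "string", "properties",
# "additionalProperties": false and "allOf". Not a real implementation.

def is_valid(instance, schema):
    """Return True if `instance` satisfies this tiny subset of `schema`."""
    # Every schema listed in "allOf" must accept the instance as well.
    for sub in schema.get("allOf", []):
        if not is_valid(instance, sub):
            return False
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        for key, value in instance.items():
            if key in props:
                if not is_valid(value, props[key]):
                    return False
            elif schema.get("additionalProperties") is False:
                # Only this schema's own "properties" are consulted.
                return False
    return True


SCHEMA_A = {
    "properties": {"key1": {"type": "string"}, "key2": {"type": "string"}},
    "additionalProperties": False,
}

# Hypothetical schema B: extends A and tries to define a new property.
SCHEMA_B = {
    "allOf": [SCHEMA_A],
    "properties": {"bProp": {"type": "string"}},
}

plain = {"key1": "x", "key2": "y"}
with_b_prop = {"key1": "x", "key2": "y", "bProp": "z"}

print(is_valid(plain, SCHEMA_B))        # accepted by both A and B
print(is_valid(with_b_prop, SCHEMA_A))  # rejected: A forbids extra keys
print(is_valid(with_b_prop, SCHEMA_B))  # rejected too, because B implies A
```

In this sketch, defining "bProp" in B has no useful effect: any data that actually uses it stops being a valid A, and therefore cannot be a valid B either.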
"Inflating" data
The most common case I've heard of where people seem to want to specify "additionalProperties":false but still define additional properties later is when they have a set of decoders that "inflate" JSON documents into native objects.
Say you have two schemas: SchemaA and SchemaB. You also have two classes in whatever language you're using: ClassA and ClassB. They both have a static method inflateFromJson() which returns objects of the corresponding class. SchemaA specifies all the constraints required such that ClassA::inflateFromJson() will not encounter an error, and the same for SchemaB/ClassB.
So - why would SchemaA ever need to specify "additionalProperties":false? The answer is "because the ClassA::inflateFromJson() function would throw an error if there are unexpected properties".
But that's not an explanation - why was inflateFromJson() written like that? In what other programming scenario do unknown extensions break things? If I have a function that's expecting a ClassA object, and I give it a ClassB object, there's no way that it should throw an exception saying "I did some introspection on the object, and I don't recognise all of these methods!".
If "B extends A" in both worlds, then you absolutely should be able to give some SchemaB JSON data to ClassA::inflateFromJson() and obtain a ClassA object without any problems.
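As a rough sketch of what a tolerant inflater could look like, assuming Python and the hypothetical keys "key1", "key2" and "bProp" (none of these names come from a real library; `inflate_from_json` stands in for the inflateFromJson() method described above):

```python
import json


class ClassA:
    """Hypothetical native class corresponding to SchemaA."""

    def __init__(self, key1, key2):
        self.key1 = key1
        self.key2 = key2

    @classmethod
    def inflate_from_json(cls, text):
        data = json.loads(text)
        # Read only the keys ClassA knows about; ignore anything else.
        return cls(data["key1"], data["key2"])


class ClassB(ClassA):
    """Hypothetical subclass corresponding to SchemaB."""

    def __init__(self, key1, key2, b_prop):
        super().__init__(key1, key2)
        self.b_prop = b_prop

    @classmethod
    def inflate_from_json(cls, text):
        data = json.loads(text)
        return cls(data["key1"], data["key2"], data["bProp"])


schema_b_json = '{"key1": "x", "key2": "y", "bProp": "z"}'

a = ClassA.inflate_from_json(schema_b_json)  # works: "bProp" is ignored
b = ClassB.inflate_from_json(schema_b_json)

print(a.key1, a.key2)
print(b.b_prop)
```

Written this way, ClassA's inflater has no reason to fail on unknown properties, so SchemaA would not need "additionalProperties": false.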
Not everyone feels the same way, though, so I wonder what I'm missing. Thoughts?
You would use additionalProperties = false when, well, you have no way of handling additional properties. For instance, if I'm accepting JSON data and storing it in a database, my JSON schema contains metadata about which column and table to store the data in. By specifying additionalProperties: false, I'm declaring that I have no way of handling the incoming data with unknown properties.
However, if there is no way to extend a schema in a way that creates supersets (by deleting definitions as well as adding new restrictions -- and defining new properties, when the old schema has additionalProperties:false, effectively deletes old properties), then specifying this sort of behavior is impossible.
You're overriding old schemas: "Instead of using A's schema for the 'bProp' property (which must not exist, because it is not in properties or patternProperties and additionalProperties is false), use this schema instead."
We should look more closely into the semantics and introduce new properties if necessary, types of extends properties that are defined to only create subsets (which is 'allOf'), and properties which may re-define properties based off of another schema (which should be 'extends').
Well, it certainly got people confused, nobody can argue against that.
On my project alone, I can recall offhand two issues that were opened
about "extends". Each time, I had to explain that this was
not a bug ;)
This problem is so frequent that I tried and wrote some (unfinished)
rules for what I coined as "schema merging":
https://github.com/json-schema/json-schema/wiki/Schema-merging-rules
I see this as a good idea, personally, although some will not agree
with me. But it certainly addresses this problem.
Wrong, see above.
I mean, I'm not sure what goal your theoretical algorithm solves. If I want to use an old schema, except change 'maxItems' to a smaller or larger value, this algorithm, because it is commutative, only allows me to go in one direction. I'll be able to make maxItems bigger but not smaller, or vice-versa. This isn't very useful.
, but perhaps there is a better term, maybe 'implements'. It's worth giving some effort towards, but I wouldn't be too concerned about choosing the right word for the job, as long as it's well-defined and internally consistent. For instance, I haven't seen anyone get confused that, even though an ECMAScript Array is also an Object, a JSON array is never an object.
I mean, is there any evidence to suggest that the v3 draft definition confused anyone more than some alternative term would have? If multiple inheritance isn't often used, it's because often you plain don't need it. But when it is used I don't see anyone getting confused that its use of 'extends' creates a union, not an intersection, of the inherited classes.
All of the other similar terms I can think of have stronger connotations - 'parent' (implies a relationship), 'inherits' (where something is inherited there must be some parent), 'derivedFrom' (maybe), 'reuse' (not really applicable to schemas) , 'base' (URIs). So I think I'll argue for using 'extends'. And typically, you use an 'extends' keyword in a programming language to define new functionality, new methods or properties. Obviously, JSON doesn't have methods, the only thing 'adding new functionality' could mean is that you expand the range of legal values, or at least better define them in some capacity (which may even mean a mix of allowing new instance values and restricting others).
And perhaps I've been programming PHP too long but in C++ I believe you may define new properties in a subclass that aren't accessible from the superclass. Is the legal range of values for an instance of the subclass not a superset of the legal range of values for an instance of the superclass? (The fact that you can't use an instance of the superclass where an instance of the subclass is called for is immaterial.)
Austin Wright.
Um. No? Every legal value for an instance of the subclass is a legal value for an instance of the superclass, but the inverse is not true. So the legal range of values for a subclass is a subset of the legal range of values for the superclass.
In terms of the set of possible values, classical inheritance defines a subset.
On Sat, Jan 19, 2013 at 11:38 AM, Geraint (David) <gerai...@gmail.com> wrote:
> "reuse with patches" is dead on. Nicely put!
>
> In fact, I think "patches" would be exactly the right keyword to describe
> this behaviour.
>
Ick... There is also JSON Patch coming along, I believe using this as
a keyword could cause potential confusion.
Yes, people expect that. No, this is not how extends works, and draft
v3 wording tells that although you really need to read carefully.
Anyway. I don't think discussing the theory for hours on end will be of
benefit at this point. It is clearly the case that people who didn't
pay close attention to what the draft said have misunderstood extends,
and the next step from here on is to devise a solution which is
technically sound.
Actually, it is a subset. Take a pointer of the superclass
Super *p;
You can not only assign pointers to instances of Super, but also pointers to instances of Sub, and of any other class that inherits from Super, say Sub2.
On the other hand this pointer
Sub *q;
does not accept pointers to Super or Sub2 instances (unless Sub2 is also a subclass of Sub) so it restricts the values to a subset.
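The same point can be sketched in Python, using `isinstance` checks in place of the typed C++ pointers above. The class names follow the post; the code is only an illustration.

```python
class Super:
    pass


class Sub(Super):
    pass


class Sub2(Super):
    pass


objects = [Super(), Sub(), Sub2()]

# Objects usable where a Super is expected, and where a Sub is expected.
usable_as_super = [type(o).__name__ for o in objects if isinstance(o, Super)]
usable_as_sub = [type(o).__name__ for o in objects if isinstance(o, Sub)]

print(usable_as_super)  # ['Super', 'Sub', 'Sub2']
print(usable_as_sub)    # ['Sub']
```

Everything that can be used as a Sub can also be used as a Super, but not the other way around, so the values acceptable as Sub form a subset.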
OK, can we just please stop using vocabulary related to "extends"?
This keyword has polluted, and continues to pollute, people's minds,
especially those who are seasoned with OO programming. It is to the
point that I feel twitchy each time I read these words in the context
of JSON Schema, given the time I, and others, have spent "dispelling
the myth".
We need to find another vocabulary. "Inheritance", "overriding",
"reuse", whatever. But please, nothing "extends"-related :p
Adding an additional property to a schema can have both effects, if "additionalProperties" is false, it will reduce the value set, because you add a restriction on the property value that was not there before. However if "additionalProperties" is true, it will increase the value set, because now also values with this new property are allowed.
On Sunday, January 20, 2013 12:56:17 AM UTC-7, Heinrich Nirschl wrote:
> Adding an additional property to a schema can have both effects, if "additionalProperties" is false, it will reduce the value set, because you add a restriction on the property value that was not there before. However if "additionalProperties" is true, it will increase the value set, because now also values with this new property are allowed.
You can do either in OO models, sometimes, too. If you want to define that a property defined in the superclass has a smaller range in the subclass, often you can do that.
I'm still not clear why you want this feature. "additionalProperties":false is like a final class in Java - instead of trying to invent a way to nullify the "final" keyword, you should really be looking at your model, because something is obviously wrong there.
I agree there's going to be some confusion if this overrides behavior were called with an "extends" property, if only because that term is defined differently in v3. But that's no reason to not fix the definition with something more in-line with what people are expecting. So long as there is no better term.
Well, maybe not. Maybe "supplement"? idk, that's not strong enough a word.
Yeah - that's creating a subset of the possible values. But with your proposed idea of "schema overriding" or whatever, you can do the equivalent of defining a property in the subclass that has a wider range of values than the super-class - and that is not possible in OO models.
Austin - can I just ask what definition you are using for "value range"? Because I don't see how a value range for a subclass can ever be a superset of a value range for the superclass.
The JSON Schema model of inheritance is similar to the OO idea of "duck typing" - you judge by the values/properties/methods present, not the actual definitions.
But even with that model, the "value range" for a subclass (i.e. the set of possible values that can convincingly pass as instances of the subclass) is a subset of the "value range" for the superclass.
"Every instance of Super may be read into an instance of Sub"?
How would you de-serialize (3, 4) into a Sub? To be a valid Sub, the third value *must* have a defined value.
Or to put it another way, there exists a serialized data structure that is a valid Sub, but not a valid Super, but no such structure the other way around. Every instance of Super may be read into an instance of Sub, but an instance of Sub can not be unserialized and read into a Super instance.
So after this rather lengthy discussion... I'm still a bit confused...

This is what I got out of this:
- "extends" doesn't really do inheritance... and everyone hates the term.
- "additionalProperties": false is somewhat like the Java "final" keyword on a class.
- "type" sort of does the same thing as "extends" but is really a union or conjunction (almost as bad as "friends" in C++).
- all of this is reworked in v4 with a bit more clarity.

Now following the recommendation from another thread, from fge I believe, I did something like this:

foo.json
{
  "id": "http://example.com/foo.json",
  "properties": {
    "foo": {"type": "string"}
  }
}

bar.json
{
  "id": "http://example.com/bar.json",
  "extends": {"$ref": "http://example.com/foo.json"},
  "properties": {
    "bar": { "type": "string" }
  },
  "additionalProperties": false
}

Should this permit an instance of bar like { "foo": "hello", "bar": "world" } but not { "foo": "hello", "bar": "world", "name": "bob" }? That's what I want to do... and would expect.
The validator I'm using (python jsonschema https://github.com/Julian/jsonschema) doesn't seem to honor the inherited properties. If there's a better python solution I'm open to change. However, I'm trying to determine if this is a bug or a feature of v3. From what the dialogue suggests it seems like a feature, for the reason that bar can't apply the restriction to foo because of a 'lack of visibility to foo', which I'd call BS on since there's a reference right there to foo in bar.
See above for how it works.
The reason it works like this is simple: what if you have a schema:

base.json
{
  "type": "object",
  "properties": {
    "firstName": {"type": "string", ...},
    "middleName": {"type": "string", ...},
    "lastName": {"type": "string", ...}
  },
  "required": ["firstName", "lastName"]
}

and you want to restrict it to a stricter format - you want "firstName" and "lastName", but no other properties. In this case, you can do this:

strict.json
{
  "allOf": [{"$ref": "base.json"}],
  "properties": {
    "firstName": {},
    "lastName": {}
  },
  "additionalProperties": false
}
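A small Python sketch of how base.json and strict.json interact, under some assumptions: `is_valid` is a toy checker written for this illustration (it handles only "type", "required" as a list, "properties", "additionalProperties": false and "allOf"), the "$ref" is replaced by inlining the base schema, and the parts elided with "..." in base.json are simply left out.

```python
# Toy checker for a small subset of draft v4 keywords; not a real validator.

def is_valid(instance, schema):
    # Every schema in "allOf" must accept the instance too.
    for sub in schema.get("allOf", []):
        if not is_valid(instance, sub):
            return False
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return False
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                return False
        props = schema.get("properties", {})
        for key, value in instance.items():
            if key in props:
                if not is_valid(value, props[key]):
                    return False
            elif schema.get("additionalProperties") is False:
                # Judged against this schema's own "properties" only.
                return False
    return True


BASE = {
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "middleName": {"type": "string"},
        "lastName": {"type": "string"},
    },
    "required": ["firstName", "lastName"],
}

# BASE is inlined here where strict.json uses {"$ref": "base.json"}.
STRICT = {
    "allOf": [BASE],
    "properties": {"firstName": {}, "lastName": {}},
    "additionalProperties": False,
}

short_name = {"firstName": "Ada", "lastName": "Lovelace"}
full_name = {"firstName": "Ada", "middleName": "King", "lastName": "Lovelace"}

print(is_valid(short_name, STRICT))  # accepted
print(is_valid(full_name, BASE))     # accepted by the base schema
print(is_valid(full_name, STRICT))   # rejected: "middleName" is additional
```

Because "additionalProperties" only looks at the "properties" declared next to it, the strict schema can exclude "middleName" even though the base schema defines it.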
Hello,
On Fri, Jan 25, 2013 at 4:45 AM, Jim Klo <jim...@sri.com> wrote:
[snip]"description": "Abstract Resource Data",[snip]
"properties": {
"doc_type": {
"type": "string",
"enum": ["resource_data"],
"required": true
}
},"patternProperties" : {
"X_.*": { "type":"any" }
Warning: regexes in JSON Schema are not anchored. If you want to match
properties starting with "X_", write "^X_." as a regex, no need for
more.
The way this regex is written, it will match, for instance, "AX_"
(since ".*" can match the empty string)
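A quick way to see the difference, sketched with Python's `re` module. JSON Schema specifies ECMA 262 regexes rather than Python ones, but for these two simple patterns the behaviour should be the same. The property names are made up for the example.

```python
import re

names = ["X_custom", "AX_", "X_", "doc_type"]

# re.search looks for a match anywhere in the string, which is how an
# unanchored pattern behaves.
unanchored = [n for n in names if re.search(r"X_.*", n)]
anchored = [n for n in names if re.search(r"^X_", n)]

print(unanchored)  # ['X_custom', 'AX_', 'X_']
print(anchored)    # ['X_custom', 'X_']
```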
[snip]
in my test... i'm using the following test case to test my schema property
by property:
[
{
"description": "validate doc_type",
"schema":{
"type": "object",
"properties": {
"doc_type": { "$ref":
"file:lr/schema/abstract_resource_data.json#properties/doc_type" }
}
What I don't understand is that the referenced schema from the test has
"required": true.... shouldn't this get caught in the "missing doc_type"
test?
What validator are you using? The behaviour you see here is
unfortunately normal, and is why "required" has been defined as such
in draft v4.
Look at your schema above:
"properties": {
"doc_type": {
"type": "string",
"enum": ["resource_data"],
"required": true
}
If you extract the schema you reference, this gives:
{
"type": "string",
"enum": [ "resource_data" ],
"required": true
}
(by the way: get rid of "type": "string", you don't need it here --
the constraint on "enum" is enough by itself)
And this schema will be evaluated _in its own context_. And in this
context, "required" means nothing. Therefore, the constraint cannot be
enforced.
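One way to picture this, as a rough Python sketch. The two functions are toys written for this illustration and do not reproduce any real validator; `check_value` models the explanation above, where the extracted subschema is only ever evaluated against a value that is already present. `TEST_SCHEMA` stands in for the test schema once the "$ref" has been resolved, and `V4_STYLE` shows the draft v4 form, where "required" is a list on the object schema.

```python
# Draft v3 style subschema, as extracted from the referenced schema.
DOC_TYPE = {"enum": ["resource_data"], "required": True}


def check_value(value, subschema):
    # Evaluated in the subschema's own context: a value already exists,
    # so a "required" flag here has nothing to act on and is ignored.
    if "enum" in subschema:
        return value in subschema["enum"]
    return True


def check_object(instance, schema):
    # Draft v4 style: "required" is a list on the object schema itself.
    for key in schema.get("required", []):
        if key not in instance:
            return False
    for key, sub in schema.get("properties", {}).items():
        if key in instance and not check_value(instance[key], sub):
            return False
    return True


# Stand-in for the test schema after the reference is resolved.
TEST_SCHEMA = {"properties": {"doc_type": DOC_TYPE}}

# Draft v4 style equivalent (an assumption about how one might rewrite it).
V4_STYLE = {
    "properties": {"doc_type": {"enum": ["resource_data"]}},
    "required": ["doc_type"],
}

missing = {}
print(check_object(missing, TEST_SCHEMA))  # True: nothing enforces presence
print(check_object(missing, V4_STYLE))     # False: parent lists it as required
```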