some rules for extends

Martijn Faassen

Nov 4, 2010, 4:52:49 AM

to json-...@googlegroups.com

Hi there,

I spent a bit of time thinking about possible rules for extends, where
the child restricts the parent.

* if the parent defines a simple type, the child cannot define one, except
if the parent type is number and the child type is integer.

* if the parent defines a union type, the child can define a
restricted version of that union type. (a bit annoying if there's a
repeated sub-schema)

* if the parent defines a disallow, the child should form a union with
the parent disallow.

* if the parent defines a minimum, the child can define a greater
minimum. if the parent defines a maximum, the child can define a
smaller maximum. the same for minItems, maxItems, minLength,
maxLength.

* if the parent defines exclusiveMinimum to True, the child cannot
override this. Same with exclusiveMaximum.

* if the parent defines uniqueItems, the child can redefine the
uniqueItems with a list that's a smaller subset.

* if the parent defines a pattern, the child cannot override the
pattern. Theoretically it may be possible to come up with a way to
verify whether a regular expression is a legitimate restriction
of another one, but that sounds too hard.

* if the parent defines a format, the child cannot define a
format. Theoretically formats can be subsets of each other, but we
don't define this.

* if the parent defines a maxDecimal, the child can define a smaller
maxDecimal.

This isn't a complete list; the rules for properties and items in
particular (in interaction with additionalProperties, required, and
so on) are especially complex. I think there are several decisions
you'd need to make about what the behavior should be if you want any
chance of this being consistent across implementations.
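
As a rough illustration of the bound rules in the list above (minimum,
maximum, minItems, maxItems, minLength, maxLength), a static check could
look something like the sketch below. The function name
`bounds_only_restrict` is made up for this example, and treating an equal
bound as acceptable is an assumption. It says nothing about type,
disallow, pattern, or properties.

```python
# Hypothetical helper: covers only the numeric-bound rules, nothing else.
LOWER_BOUNDS = ("minimum", "minItems", "minLength")
UPPER_BOUNDS = ("maximum", "maxItems", "maxLength")


def bounds_only_restrict(parent, child):
    """Return True if no bound redefined by the child is looser than the parent's."""
    for key in LOWER_BOUNDS:
        # A child lower bound must not drop below the parent's.
        if key in parent and key in child and child[key] < parent[key]:
            return False
    for key in UPPER_BOUNDS:
        # A child upper bound must not rise above the parent's.
        if key in parent and key in child and child[key] > parent[key]:
            return False
    return True


parent = {"type": "number", "minimum": 0, "maximum": 100}
print(bounds_only_restrict(parent, {"minimum": 10, "maximum": 50}))  # True
print(bounds_only_restrict(parent, {"minimum": -5}))                 # False
```

A bound that only one of the two schemas defines is never a conflict in
this sketch, since adding a new bound in the child can only restrict.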

Regards,

Martijn

Andi

Nov 4, 2010, 7:54:42 AM

to JSON Schema

Thanks for sharing your thought, Martijn.
But I cannot agree with this. Somehow we need a simple definition. It
is too much work to define rules for every property in the context of
extending.
For me there are two alternatives, which we have discussed in another
thread:

1) extends - validating one schema after another
2) inherits - the extending schema inherits from the extended schema.
Every property besides "id" will be overwritten.

Neither concept needs its own rules for every property, so it is
much clearer for the author.

What do the others think? Do you agree that such specific rules are
too complicated?

Martijn Faassen

Nov 4, 2010, 8:22:44 AM

to json-...@googlegroups.com

Hi there,

On Thu, Nov 4, 2010 at 12:54 PM, Andi <andrea...@gmx.de> wrote:
> Thanks for sharing your thought, Martijn.
> But I cannot agree with this. Somehow we need a simple definition. It
> is too much work to define rules for every property in the context of
> extending.

Well, if we want to make extending as described now work in a way so that it's:

* useful

* follows the rule that extending schemas should only restrict

* is consistent across implementations

then we'll need some rules like this.

I agree that this is a complicated set of rules and that's a sign we
may want to change the requirement altogether. I do think it's
possible to come up with a solid set of rules, however.

> For me there are two alternatives, that we have discussed in another
> thread:
>
> 1) extends - validating one schema after another

In this case the extending schema could conflict with the original
schema in a way so that nothing can validate successfully at all. For
instance, if we require in one schema that an instance value needs to
be a string and the other requires it to be an integer. We don't
require that tools know about such conflicts, though of course a tool
could offer support for this.

If this is to be the rule, we should relax the following language in the spec:

"The inheritance rules are such that any instance that is valid
according to the current schema MUST be valid according to the
referenced schema."

"A schema that extends another schema MAY define additional
properties, constrain existing properties, or add other constraints.
The schema MUST NOT define a constraint that conflicts with an
extended schema such that no instance may satisfy both schemas."

Instead, we could do this:

"The inheritance rules are such that any instance that is valid
according to the current schema SHOULD be valid according to the
referenced schema."

"A schema that extends another schema MAY define additional
properties, constrain existing properties, or add other constraints.
The schema SHOULD NOT define a constraint that conflicts with an
extended schema such that no instance may satisfy both schemas."

with a bit of extra information saying that the minimal implementation
just validates against one schema after another, and that
implementations MAY offer facilities that verify whether one schema is
a true extension of the other.

Regards,

Martijn

Gary Court

Nov 6, 2010, 3:15:08 PM

to json-...@googlegroups.com

On Thu, Nov 4, 2010 at 6:22 AM, Martijn Faassen <faa...@startifact.com> wrote:
> Hi there,
>
> On Thu, Nov 4, 2010 at 12:54 PM, Andi <andrea...@gmx.de> wrote:
>> Thanks for sharing your thought, Martijn.
>> But I cannot agree with this. Somehow we need a simple definition. It
>> is too much work to define rules for every property in the context of
>> extending.
>
> Well, if we want to make extending as described now work in a way so that it's:
>
> * useful
>
> * follows the rule that extending schemas should only restrict
>
> * is consistent across implementations
>
> then we'll need some rules like this.
>
> I agree that this is a complicated set of rules and that's a sign we
> may want to change the requirement altogether. I do think it's
> possible to come up with a solid set of rules, however.

This is only if you want to validate that a child schema is valid in
regards to extending a parent schema. If you don't have this
requirement, then you just need to validate with both the extended
child schema and the parent schema. If one of them throws a validation
error on a valid instance, then the child schema is not valid.

Loosening the restrictions of extends would violate the core reason
behind extends: that the child schema always accepts a subset of the
instances that the parent schema does. Rewording it to MAY would add
general confusion and make the child schema ambiguous.

Martijn Faassen

Nov 12, 2010, 10:39:18 AM

to json-...@googlegroups.com

Hi there,

On Sat, Nov 6, 2010 at 8:15 PM, Gary Court <gary....@gmail.com> wrote:

> This is only if you want to validate that a child schema is valid in
> regards to extending a parent schema.

> If you don't have this
> requirement, then you just need to validate with both the extended
> child schema and the parent schema. If one of them throws a validation
> error on a valid instance, then the child schema is not valid.

If I have a derived schema that I somehow constructed from the base
and the extending schema, I can test it with a single valid instance
and if it says it's invalid, then the constructed schema is not valid.
If it succeeds, I have no guarantee that my merging was done
correctly, as other instances might exist that aren't valid according
to the referenced schema but are valid according to my merger.

But let's try something else. Let's talk about an implementation that
satisfies the requirements minimally:

So, is the following true?

* a minimal implementation of a validator can check against the
constraints declared by the extending schema as well as the base
schema, without any attempt at merging the two.

* if there is an extending schema that declares a constraint so that *no*
instance is valid against both extending schema and base schema ("must
have property foo as opposed to must not have property foo", for
instance), the schema processor should reject any instance as invalid.
(it will do this if the minimal rule is implemented)

* A schema processor must not reject such impossible schemas as
invalid. It may report on such conflicts, but it cannot reject schemas
as invalid themselves, as that would make it impossible to create a
minimal implementation that is compliant.

* it's totally acceptable to have conflicting constraints that reject
*some* instances. For instance, I could declare an optional property
'foo' to be an integer in the parent and string in the child. This
would simply result in any instance with the property 'foo' being
invalid, and would in effect be the extending schema putting an
additional constraint on the base schema so that foo is now not an
allowed property anymore.

* again a schema processor may report on such situations, but it may
not reject such schemas from validation, because a minimal
implementation cannot do so.
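
To make the points above concrete, here is a minimal sketch of a
validator that checks the extending schema and then each extended schema,
with no merging. It is only an illustration: `type_ok` and `validate`
are names invented here, only a few simple types and `properties` are
handled, and `extends` given as a URI is not resolved.

```python
def type_ok(value, type_name):
    """Check a value against a handful of simple type names."""
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "object":
        return isinstance(value, dict)
    return True  # types outside this sketch are not checked


def validate(instance, schema):
    """Check the schema's own constraints, then every schema it extends."""
    if "type" in schema and not type_ok(instance, schema["type"]):
        return False
    if isinstance(instance, dict):
        for name, sub in schema.get("properties", {}).items():
            # Properties are optional here: only check the ones present.
            if name in instance and not validate(instance[name], sub):
                return False
    bases = schema.get("extends", [])
    if isinstance(bases, dict):
        bases = [bases]  # a single schema or an array of schemas
    return all(validate(instance, base) for base in bases)


parent = {"type": "object", "properties": {"foo": {"type": "integer"}}}
child = {"extends": parent, "properties": {"foo": {"type": "string"}}}

print(validate({}, child))            # True
print(validate({"foo": 1}, child))    # False
print(validate({"foo": "x"}, child))  # False
```

The child schema is accepted as-is; the conflict on 'foo' simply means
that no instance carrying 'foo' validates, while instances without it
still do.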

If the above is *not* true, it appears that implementing the spec
correctly would require some careful thought about merging schemas.
(if this is not so, please explain)

If the above is true, I do think a clarification would be helpful.

Here's the draft 03 text:

"The value of this property MUST be another schema or a URI
referencing a schema which will provide a base schema which the
current schema will inherit from. The inheritance rules are such that
any instance that is valid according to the current schema MUST be
valid according to the referenced schema. This MAY also be an array,
in which case, the instance MUST be valid for all the schemas in the
array. A schema that extends another schema MAY define additional
properties, constrain existing properties, or add other constraints.
The schema MUST NOT define a constraint that conflicts with an
extended schema such that no instance may satisfy both schemas."

I think we should add language like this here:

"A validator MAY report such schemas, but MUST NOT reject them.
Instead validators MUST accept such an extending schema for validation
against instances, and validate all instances as invalid against that
schema. At a minimum, implementations of validators SHOULD validate
against the constraints in the referenced schema(s) as well as the
additional constraints in the extending schema."

This brings into question whether the "MUST NOT define a constraint"
language in the original is really what we want. This implies that
certain schemas are not valid. If the spec says the schema should be
valid according to the meta schema, presumably we want implementations
to reject invalid schemas automatically. This implies to me that people
implementing the spec should care about invalid schemas, not just
people writing such schemas. After all, equivalent MUST language about
instances does mean implementers should care. But I think otherwise
the spec stays clear of validating schemas itself, except by means of
a meta-schema.

Regards,

Martijn

Kris Zyp

Nov 12, 2010, 11:43:14 PM

to JSON Schema

On Nov 12, 8:39 am, Martijn Faassen <faas...@startifact.com> wrote:
> Hi there,
>

These rules sound correct, although I am not certain what is meant by
a schema processor. We have provided normative language for an
instance validator to determine if an instance is valid against a
given schema. Whether or not the schema has an empty set of valid
instances (due to conflicting properties) is a separate question. One
could use the language in the spec to write a schema checker that could
verify whether or not a schema will accept any instances, but that
would be a distinct role from an instance validator. However, if a
schema does not have any valid instances (due to conflicts), then an
instance validator will reject all instances anyway, so performing a
check for conflicts would be redundant: conflicting schemas will
already reject instances.
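
For what it's worth, a separate schema checker of the kind mentioned
above might start out like the following sketch. The names `collect` and
`obviously_empty` are hypothetical, only inline `extends` schemas are
followed, and only two kinds of conflict are detected (disjoint simple
types, and a minimum above a maximum). A False result does not prove
that the schema accepts anything.

```python
def collect(schema):
    """Yield the schema and every schema it extends (inline schemas only)."""
    yield schema
    bases = schema.get("extends", [])
    if isinstance(bases, dict):
        bases = [bases]
    for base in bases:
        yield from collect(base)


def obviously_empty(schema):
    """Detect two simple conflicts across a schema and the schemas it extends."""
    chain = list(collect(schema))
    # Union types (arrays) are skipped; only simple type names are compared.
    simple_types = {s["type"] for s in chain if isinstance(s.get("type"), str)}
    simple_types.discard("any")  # "any" never conflicts with another type
    if "number" in simple_types and "integer" in simple_types:
        simple_types.discard("number")  # every integer is also a number
    if len(simple_types) > 1:
        return True
    minimums = [s["minimum"] for s in chain if "minimum" in s]
    maximums = [s["maximum"] for s in chain if "maximum" in s]
    # The effective range is empty when the largest minimum exceeds
    # the smallest maximum.
    return bool(minimums and maximums and max(minimums) > min(maximums))


print(obviously_empty({"type": "integer", "extends": {"type": "string"}}))  # True
print(obviously_empty({"type": "integer", "extends": {"type": "number"}}))  # False
```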

>
> If the above is true, I do think a clarification would be helpful.
>
> Here's the draft 03 text:
>
> "The value of this property MUST be another schema or a URI
> referencing a schema which will provide a base schema which the
> current schema will inherit from. The inheritance rules are such that
> any instance that is valid according to the current schema MUST be
> valid according to the referenced schema. This MAY also be an array,
> in which case, the instance MUST be valid for all the schemas in the
> array. A schema that extends another schema MAY define additional
> properties, constrain existing properties, or add other constraints.
> The schema MUST NOT define a constraint that conflicts with an
> extended schema such that no instance may satisfy both schemas."
>
> I think we should add language like this here:
>
> "A validator MAY report such schemas, but MUST NOT reject them.
> Instead validators MUST accept such an extending schema for validation
> against instances, and validate all instances as invalid against that
> schema. At a minimum, implementations of validators SHOULD validate
> against the constraints in the referenced schema(s) as well as the
> additional constraints in the extending schema."

So the instances will be rejected anyway, but this defines a separate
behavior for validation of the schema prior to validating the
instance?

>
> This brings into question whether the "MUST NOT define a constraint"
> language in the original is really what we want. This implies that
> certain schemas are not valid. If the spec says the schema should be
> valid according to the meta schema, presumably we want implementations
> to reject invalid schemas automatically. This implies to me that people
> implementing the spec should care about invalid schemas, not just
> people writing such schemas. After all, equivalent MUST language about
> instances does mean implementers should care. But I think otherwise
> the spec stays clear of validating schemas itself, except by means of
> a meta-schema.

I am fine with removing the "MUST NOT define a constraint". Should it
be "SHOULD NOT" instead?

Martijn Faassen

Nov 13, 2010, 12:10:40 PM

to json-...@googlegroups.com

Hey,

On Sat, Nov 13, 2010 at 5:43 AM, Kris Zyp <kri...@gmail.com> wrote:

> On Nov 12, 8:39 am, Martijn Faassen <faas...@startifact.com> wrote:

[snip]

> These rules sound correct, although I am not certain what is meant by
> a schema processor.

Sorry, that should read "the instance validator".

> We have provided normative language for an
> instance validator to determine if an instance is valid against a
> given schema. Whether or not the schema has empty set of valid
> instances (due to conflicting properties) is a separate question.

My point is that the spec as it stands makes it seem to me that I
should reject as invalid any schema that validates no instance. This
is because it uses language very similar to how I should validate
instances. This is what I intend to clarify. Instead, the spec should
declare such schemas valid, but they should just reject any instances.

So, in my language I'm trying to explicitly say that there is *no*
validation of the schema itself. If you say "the schema MUST NOT
define.." that language could easily be interpreted as meaning that we
should reject any such schemas. After all if you say elsewhere that a value
(in an instance) MUST be a string, that means the validator should
reject any instances where it's not a string, and here we are making
similar statements about schemas.

> I am fine with removing the "MUST NOT define a constraint". Should it
> be "SHOULD NOT" instead?

I think just removing the whole "MUST NOT define a constraint" section
would be most helpful. I also think we should add language showing
what a minimally conforming implementation should do:

"Conceptually, the behavior of extends can be seen as validating an
instance against all constraints in the extending schema as well as
the extended schema(s). More optimized implementations that merge
schemas are possible, but are not required."

I think by specifying the minimally conforming implementation we not
only help implementers but also help define what the behavior of
extends really is.

One consequence of this would be that an optimized implementation
might give different validation errors than the basic implementation,
but that would be okay as long as they reject and accept the same
instances.
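
As a small sketch of that equivalence, restricted to minimum and maximum
only: a merged schema takes the stricter of each bound, and should then
accept and reject exactly the same values as checking child and parent
one after another. The helper names `merge_bounds` and `check_bounds`
are made up for this example, and exclusiveMinimum/exclusiveMaximum are
ignored.

```python
def merge_bounds(parent, child):
    """Combine minimum/maximum into one schema: the stricter bound wins."""
    merged = {}
    minimums = [s["minimum"] for s in (parent, child) if "minimum" in s]
    maximums = [s["maximum"] for s in (parent, child) if "maximum" in s]
    if minimums:
        merged["minimum"] = max(minimums)
    if maximums:
        merged["maximum"] = min(maximums)
    return merged


def check_bounds(value, schema):
    """Inclusive bound check; a missing bound never rejects."""
    return schema.get("minimum", value) <= value <= schema.get("maximum", value)


parent = {"minimum": 0, "maximum": 100}
child = {"minimum": 10}
merged = merge_bounds(parent, child)

for value in (-1, 5, 10, 50, 101):
    # Basic implementation: validate against child, then parent.
    sequential = check_bounds(value, child) and check_bounds(value, parent)
    # Optimized implementation: validate once against the merged schema.
    assert sequential == check_bounds(value, merged)
```

The two approaches may fail at different points (and so report
different errors), but the set of accepted values is the same.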

That impossible schemas can be created would be considered okay by the
spec, unless we want to specify that an implementation SHOULD warn in
such a case (rejecting them would make them incompatible with
minimally conforming implementations, so they can't do that).

Regards,

Martijn

Kris Zyp

Nov 13, 2010, 1:11:40 PM

to json-...@googlegroups.com

OK, I added that text in place of the mandate on non-conflicts.
Kris

--
Thanks,
Kris
