Problem with "optional" schema attribute


Gary Court

Jun 4, 2010, 7:00:04 PM
to JSON Schema
I've recently been implementing a JSON Schema validator that is fully
revision 2 (http://tools.ietf.org/html/draft-zyp-json-schema-02)
compliant. In my implementation, I have discovered a problem with how
the "optional" attribute is defined:

> This indicates that the instance property in the instance object is optional. This is false by default.

The problem is not the attribute itself, but its default value and the
behavior it causes.

***Different than how all other attributes behave***

Every other schema attribute behaves in such a way that when an
attribute is defined, a restriction is placed on the JSON instance and
a check must be done to ensure that the instance complies with
that rule. If a schema attribute is not defined, then the check does
not need to be done.

The "optional" schema attribute behaves in the opposite way: if
it is not defined, then the restriction "ensure this property exists"
is placed on the instance and the check must be run.

Example:

type - If not defined, don't check the type
properties - If not defined, don't check individual properties for
validation
items - If not defined, don't check items for validation
additionalProperties - If not defined, don't check other properties/
items for validation
...etc...
optional - If not defined, make sure this property exists

Therefore, validators must treat this as a special case: if the
"optional" schema attribute is not defined, they must enforce the
restriction.
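
The asymmetry can be sketched in a few lines of Python. This is only an illustration of the point above, not a real validator: `MISSING`, `TYPE_CHECKS`, and `check_property` are names made up for this example, and only the "type" and "optional" attributes are modeled.

```python
# Hypothetical sketch of revision 2 semantics for a single property schema.
MISSING = object()  # sentinel meaning "the property is absent from the instance"

TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
}

def check_property(value, schema):
    """Return True if `value` (or MISSING) satisfies `schema`."""
    if value is MISSING:
        # Special case: the restriction applies when "optional" is NOT defined.
        return schema.get("optional", False)
    # Every other attribute: the check runs only when the attribute IS defined.
    if "type" in schema and not TYPE_CHECKS[schema["type"]](value):
        return False
    return True
```

Note that "optional" is the only branch whose default (attribute absent) rejects something.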

***The empty schema is not the true super set***

Thought of another way, the empty schema should be the super set of
all possible instances, while schemas with attributes are subsets of
the empty schema. But this is not the case. Example:

{
  a : {},
  b : { optional : true }
}

Schema "a" is the empty schema, while schema "b" is a custom schema.
Under revision 2's definition, schema "b" will accept more JSON
instances than schema "a", the empty schema. Specifically, schema "b"
will accept the undefined JSON instance while schema "a" will not.
Therefore, schema "b" is the true super set, and no other schema can
validate against more items.
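
A small sketch of the superset argument, assuming the revision 2 reading described above. `MISSING` stands in for the undefined instance and `accepts` is a hypothetical helper that models only the "optional" attribute.

```python
MISSING = object()  # hypothetical stand-in for the "undefined" instance

def accepts(schema, value):
    # Revision 2 reading: an absent value is valid only if "optional" is true.
    if value is MISSING:
        return schema.get("optional", False)
    return True  # neither schema below restricts values that are present

samples = [MISSING, None, 0, "text", [], {}]
a = {}                  # the empty schema
b = {"optional": True}  # the "custom" schema

accepted_by_a = [v for v in samples if accepts(a, v)]
accepted_by_b = [v for v in samples if accepts(b, v)]

# b accepts everything a accepts, plus the absent value.
print(len(accepted_by_a), len(accepted_by_b))  # 5 6
```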

***Causes issues with schema extensions***

The behavior of schemas that extend multiple schemas is not
defined very well in the specification. As such, for the following
schema:

{
  properties : {
    a : {
      extends : [
        { optional : true },
        { optional : true }
      ]
    }
  }
}

The following JSON will not validate:

{}

This is due to how the "extends" attribute defines an array value:

> This may also be an array, in which case, the instance must be valid for all the schemas in the array.

The problem is that the schema for "a" does not define the "optional"
attribute, therefore the property "a" is not optional. An array of
extensions only ensures that each schema is valid against the
instance, and does not affect the child schema.
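
The reading described above can be reproduced with a toy validator. This is a sketch under the assumption that "optional" is looked up only on the property's own schema and never inherited through "extends"; `valid` and `MISSING` are hypothetical names, and only "optional", "extends", and "properties" are handled.

```python
MISSING = object()  # sentinel for "property not present in the instance"

def valid(value, schema):
    """Tiny sketch: handles only "optional", "extends" and "properties"."""
    if value is MISSING:
        # Only the schema's OWN "optional" is consulted, not its parents'.
        return schema.get("optional", False)
    parents = schema.get("extends", [])
    if isinstance(parents, dict):
        parents = [parents]
    # The instance must be valid for every schema listed in "extends".
    if not all(valid(value, parent) for parent in parents):
        return False
    if isinstance(value, dict):
        for name, sub in schema.get("properties", {}).items():
            if not valid(value.get(name, MISSING), sub):
                return False
    return True

schema = {"properties": {"a": {"extends": [{"optional": True}, {"optional": True}]}}}
print(valid({}, schema))  # False: "a" itself never declares "optional"
```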

***Solutions***

There are three solutions to correcting this behavior:

1. Change the default empty schema to {"optional":true}
2. Make the default value of "optional" true
3. Rename "optional" to "required"

First off, #1 is out of the picture for obvious reasons. Solution #2
would work, but is against the idea that all boolean properties
default to false.

Solution #3 makes the most sense as it defaults to false and places no
restrictions on the empty schema, allowing the empty schema to accept
every possible JSON value. Optionally, validators can examine a JSON
schema for this attribute to see if it is using a revision 3 schema.
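
A rough sketch of what solution #3 could look like in a validator, assuming "required" defaults to false. Both function names are hypothetical. The detection function is only a heuristic: it would also fire on a schema that declares a property literally named "required".

```python
def accepts_absent_rev3(schema):
    # Proposed semantics: "required" defaults to false,
    # so the empty schema places no restriction on absence.
    return not schema.get("required", False)

def looks_like_rev3(schema):
    """Heuristic only: true if the key "required" appears anywhere in the schema."""
    if isinstance(schema, dict):
        return "required" in schema or any(looks_like_rev3(v) for v in schema.values())
    if isinstance(schema, list):
        return any(looks_like_rev3(v) for v in schema)
    return False
```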

I ask that the editors of the JSON Schema spec please review and
update this schema attribute in the next spec revision.

Thanks!
-Gary Court

Ganesh and Sashi Prasad

Jun 5, 2010, 1:59:15 AM
to json-...@googlegroups.com
It's a good argument and makes a lot of sense to me (option #3 - using "required" instead of "optional"). It may also clean up the grammar and make parsers cleaner and simpler.

The major issue is the amount of existing code that could break. [Of course, this will break less code than reversing the default value of "optional" from false to true (option #2).] On the other hand, major bite-the-bullet changes are best made in the early stages before widespread adoption.

The big pain is ease-of-use. From my experience, most attributes in messages are actually mandatory. Optional attributes tend to be the exception. Forcing designers to type ' "required": true ' after most attributes is an onerous requirement. I suspect that in practice, most designers will define the schema in a shorthand like Orderly (which round-trips to JSON Schema) where the pain could be mitigated through a shorthand for "required", such as an exclamation mark.

That's just my two cents. There may be other major considerations militating against such a change.

Regards,
Ganesh




Kris Zyp

Jun 6, 2010, 12:12:22 AM
to json-...@googlegroups.com
I agree this is a problem, and #3 may be the most reasonable solution.
The only other solution would be to add a type name of "undefined" to
the possible values for the "type" attribute. Optional properties could
be specified by including "undefined" in a union of types, in the same
way that nullability can be defined by adding a "null" type. If a single
type is defined it would still essentially default to required.
Kris
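
The "undefined" type idea could be sketched as follows. This is an illustration only: `MISSING`, `matches_type`, and `check_type` are hypothetical names, only a handful of type names are covered, and the default of "any" is taken from revision 2 and assumed here not to match an absent value.

```python
MISSING = object()  # sentinel for an absent property

def matches_type(value, type_name):
    if type_name == "undefined":
        return value is MISSING
    if value is MISSING:
        return False  # no other type name matches an absent value
    if type_name == "any":
        return True
    if type_name == "null":
        return value is None
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    raise ValueError(f"type not covered by this sketch: {type_name}")

def check_type(value, schema):
    declared = schema.get("type", "any")
    union = declared if isinstance(declared, list) else [declared]
    # A union is satisfied if any one member matches.
    return any(matches_type(value, t) for t in union)
```

With this reading, {"type": ["string", "undefined"]} describes an optional string, while {"type": "string"} on its own stays effectively required.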

--
Thanks,
Kris

Ganesh and Sashi Prasad

Jun 6, 2010, 3:30:13 AM
to json-...@googlegroups.com
> If a single type is defined it would still essentially default to required.

Is this rule necessary? In some cases, empty message bodies are OK. Why not require "required" to be explicitly specified in all cases (even with a single item)?

Regards,
Ganesh

Kris Zyp

Jun 6, 2010, 8:39:33 AM
to json-...@googlegroups.com, Ganesh and Sashi Prasad


On 6/6/2010 1:30 AM, Ganesh and Sashi Prasad wrote:
> > If a single type is defined it would still essentially default to required.
>
> Is this rule necessary?

This isn't any new rule. If you specify a type and the instance value doesn't match, it is not a valid schema.

> In some cases, empty message bodies are OK.

Right, and it is easy to allow empty message bodies.

> Why not require "required" to be explicitly specified in all cases (even with a single item)?

None of the attributes in JSON Schema are required, they are all optional and have a default value. Having "required" be required would be completely inconsistent and defeat the brevity that is possible with JSON Schema.
Kris
-- 
Thanks,
Kris

Gary Court

Jun 6, 2010, 12:13:44 PM
to JSON Schema
Under Kris's alternative suggestion, we would have to add two new
defined types for the "type" attribute:

"undefined" - The value of the instance can be undefined, making it
optional
"all" - (Lack of better name) Same as type ["any", "undefined"]

We would also change the default value of "optional" to true, and
deprecate it in the next spec. (Though rev 3+ validators should add
support for backwards compatibility)

The default value of "type" would be "all", making the empty schema
optional; therefore allowing the empty schema to be the set of all
possible instances.

This is not a bad solution as it gives us some backwards compatibility
with the revision 2 spec. The only time this would not be backwards
compatible is where a schema, assuming "optional" is false, does not
define a "type" attribute.
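
A sketch of how the proposed "all" default might be expanded, to make the backwards-compatibility case concrete. `expand_type` and `allows_absent` are hypothetical helpers, and the deprecated "optional" attribute is not modeled here.

```python
def expand_type(schema):
    """Return the list of type names for a schema under the proposed defaults."""
    declared = schema.get("type", "all")  # proposed default value of "type"
    names = declared if isinstance(declared, list) else [declared]
    expanded = []
    for name in names:
        # "all" is shorthand for the union ["any", "undefined"].
        expanded.extend(["any", "undefined"] if name == "all" else [name])
    return expanded

def allows_absent(schema):
    return "undefined" in expand_type(schema)

print(allows_absent({}))                  # True: differs from revision 2
print(allows_absent({"type": "string"}))  # False: same as revision 2
```

The first call is the incompatible case: a revision 2 schema with no "type" and no "optional" used to be required, and would become optional.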

On Jun 6, 6:39 am, Kris Zyp <kris...@gmail.com> wrote:
> On 6/6/2010 1:30 AM, Ganesh and Sashi Prasad wrote:
>
> > > If a single type is defined it would still essentially default to
> > required.
>
> > Is this rule necessary?
>
> This isn't any new rule. If you specify a type and the instance value
> doesn't match, it is not a valid schema.
>
> > In some cases, empty message bodies are OK.
>
> Right, and it is easy to allow empty message bodies.
>
> > Why not require "required" to be explicitly specified in all cases
> > (even with a single item)?
>
> None of the attributes in JSON Schema are required, they are all
> optional and have a default value. Having "required" be required would
> be completely inconsistent and defeat the brevity that is possible with
> JSON Schema.
> Kris

Matthew W

Jun 7, 2010, 12:24:10 PM
to JSON Schema
Personally my feeling is that 'optional' is an attribute only of a
particular property of an object schema, and not an attribute of the
schema which that property refers to.

I think the way object-property flags are conflated with schema flags
is in part responsible for the confusion here. It would be good to
separate object-property attributes from schema attributes, e.g. like
so:

{
  properties: [
    {name: "foo", optional: true, readonly: true, schema: {type: 'integer', ...}}
  ]
}

Note that this also avoids an awkward problem when you want to refer
to a schema by reference, but need to make it optional. At present you
need to do something like:

{
  properties: {
    foo: {"$ref": "/some/schema", optional: true}
  }
}

Note that this relies on some implicit assumptions about how object
properties are merged when creating an object graph from references.
An implementation would have to be careful not to overwrite the
referenced schema object with optional: true, but instead shadow it
somehow with the extra merged property, meaning $ref no longer points
to the same instance.
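
The overwrite-versus-shadow concern can be made concrete with a small sketch. The `registry` dict and both resolver functions are assumptions for illustration; the draft does not prescribe either strategy.

```python
import copy

# Hypothetical registry of already-loaded schemas, keyed by URI.
registry = {"/some/schema": {"type": "integer"}}

def resolve_overwrite(node):
    """Naive: merge sibling attributes INTO the shared target (mutates it)."""
    target = registry[node["$ref"]]
    target.update({k: v for k, v in node.items() if k != "$ref"})
    return target

def resolve_shadow(node):
    """Safer: copy the target, then layer the sibling attributes on top."""
    merged = copy.deepcopy(registry[node["$ref"]])
    merged.update({k: v for k, v in node.items() if k != "$ref"})
    return merged

foo = {"$ref": "/some/schema", "optional": True}
shadowed = resolve_shadow(foo)  # the registry entry is left untouched
```

`resolve_shadow` leaves "/some/schema" unchanged but returns a different object from the one the reference names, which is the loss of identity described above. `resolve_overwrite` keeps identity but makes the shared schema optional for every other user.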

(yes, sorry, /another/ reservation of mine about the current
referencing mechanism which I forgot to mention before).

Anyway note that separating the schema from the property flags makes
this problem go away:

{
  properties: [
    {name: "foo", optional: true, schema: {"$ref": "/some/schema"}}
  ]
}

Gary Court

Jun 7, 2010, 1:19:09 PM
to JSON Schema
On Jun 7, 10:24 am, Matthew W <matthew.will...@gmail.com> wrote:
> Personally my feeling is that 'optional' is an attribute only of a
> particular property of an object schema, and not an attribute of the
> schema which that property refers to.

Yes, I too have noticed that both "optional" and "requires" schema
attributes only apply when a schema is applied against a property of
an object instance. It would be nice if these were not a special case.

> I think the way object-property flags are conflated with schema flags
> are in part responsible for the confusion here. It would be good to
> separate object-property attributes from schema attributes, eg like
> so:
>
> {
>   properties: [
>     {name: "foo", optional: true, readonly: true, schema: {type:
> 'integer', ...}}
>   ]
>
> }

I hesitate to back this schema form for a few reasons:
1. It is more verbose and breaks existing schemas. However, it is
easily detectable that a schema is using the new format.
2. This method allows for multiple definitions of the same "name".
3. It makes referencing the subschema within it far less stable.
For example, {$ref : "#.properties.0.schema"} is bad because it's not
clear from this reference what 0 is, and this could change just by
reordering the properties. Not to mention what happens when
extensions are used (which is not well defined to begin with).
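
Point 3 can be illustrated with a short sketch. `resolve` is a hypothetical helper that walks a dotted path (the leading "#." of the reference is omitted for simplicity); it is not the draft's reference resolution algorithm.

```python
def resolve(schema, path):
    """Walk a dotted path such as "properties.0.schema" (hypothetical syntax)."""
    node = schema
    for part in path.split("."):
        # Numeric parts index into arrays, everything else is an object key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node

before = {"properties": [
    {"name": "foo", "schema": {"type": "integer"}},
    {"name": "bar", "schema": {"type": "string"}},
]}
# Same properties, different order.
after = {"properties": list(reversed(before["properties"]))}

print(resolve(before, "properties.0.schema"))  # {'type': 'integer'}
print(resolve(after, "properties.0.schema"))   # {'type': 'string'}
```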

I would rather see a schema like this:

{
  properties : {
    foo : { type : "integer" }
  },
  required : ["foo"]
}

This way we don't need a second type of parser for the properties; as
well, this moves the "required" ("optional") attribute to the schema
where the restriction checks are actually going to take place. This
layout also results in smaller schemas.
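
With this layout the presence check becomes a single pass over the object schema. A minimal sketch, assuming "required" is a list of property names on the parent schema; `missing_required` is a made-up helper name.

```python
def missing_required(instance, schema):
    """Return the names listed in "required" that `instance` lacks."""
    return [name for name in schema.get("required", []) if name not in instance]

schema = {
    "properties": {"foo": {"type": "integer"}},
    "required": ["foo"],
}
print(missing_required({}, schema))          # ['foo']
print(missing_required({"foo": 1}, schema))  # []
```

Because the default is an empty list, a schema without "required" imposes no presence check at all.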

> Note that this also avoids an awkward problem when you want to refer
> to a schema by reference, but need to make it optional. At present you
> need to do something like:
>
> {
>   properties: {
>     foo: {"$ref": "/some/schema", optional: true}
>   }
>
> }

As I understand it, the revision 2 way is:

{
  properties : {
    foo : {
      optional : true,
      extends : {$ref : "/some/schema"}
    }
  }
}

> Note that this relies on some implicit assumptions about how object
> properties are merged when creating an object graph from references.
> An implementation would have to be careful not to overwrite the
> referenced schema object with optional: true, but instead shadow it
> somehow with the extra merged property, meaning $ref no longer points
> to the same instance.
>
> (yes, sorry, /another/ reservation of mine about the current
> referencing mechanism which I forgot to mention before).

When you start messing with shadow objects, you're overcomplicating
it.

Matthew W

Jun 7, 2010, 1:53:57 PM
to JSON Schema
> > Personally my feeling is that 'optional' is an attribute only of a
> > particular property of an object schema, and not an attribute of the
> > schema which that property refers to.
>
> Yes, I too have noticed that both "optional" and "requires" schema
> attributes only apply when a schema is applied against a property of
> an object instance. It would be nice if these were not a special case.

Agreed.

> I hesitate to back this schema form for a few reasons:
> 1. More verbose, and breaks currently existing schemas. However, is
> easily detectable that it's using a new schema format.
> 2. This method allows for multiples definitions of the same "name".
> 3. Makes referencing the subschema within it way more unstable.
> Example, {$ref : "#.properties.0.schema"} is bad as it's not clear
> from this reference what 0 is, and this could change by just the
> reordering of properties. And not to mention what happens when
> extensions are used (which is not well defined to begin with).

Fair points.

{
  properties: {
    foo: {optional: true, schema: {....}}
  }
}

fixes (2) and (3) but not (1).

You could add the convention that e.g.

{optional: true, type: "integer"}

normalises to

{optional: true, schema: {type: "integer"}}

which ensures backwards compatibility, although yes there is the issue
you mention then with how path-based relative references would then
work. (see the identityProperties thread for some debate over the pros
and cons of path-based referencing in general)
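
The normalisation convention could be sketched like this. `PROPERTY_FLAGS` and `normalise` are hypothetical names, and the flag set is limited to the two flags mentioned in this thread.

```python
PROPERTY_FLAGS = {"optional", "readonly"}  # flags named in this thread

def normalise(prop):
    """Rewrite the flat form into the {flags..., schema: {...}} form."""
    if "schema" in prop:
        return prop  # already in the nested form
    flags = {k: v for k, v in prop.items() if k in PROPERTY_FLAGS}
    schema = {k: v for k, v in prop.items() if k not in PROPERTY_FLAGS}
    return {**flags, "schema": schema}

print(normalise({"optional": True, "type": "integer"}))
# {'optional': True, 'schema': {'type': 'integer'}}
```

The sketch treats any entry with a "schema" key as already normalised, so it relies on "schema" never being used as an ordinary schema attribute.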

> I would rather see a schema like this:
>
> {
>   properties : {
>     foo : { type : "integer" }
>   }
>   required : ["foo"]
>
> }

That works, yeah. (presumably you could have optional: ['foo', 'bar']
as well?)

> > Note that this relies on some implicit assumptions about how object
> > properties are merged when creating an object graph from references.
> > An implementation would have to be careful not to overwrite the
> > referenced schema object with optional: true, but instead shadow it
> > somehow with the extra merged property, meaning $ref no longer points
> > to the same instance.
>
> > (yes, sorry, /another/ reservation of mine about the current
> > referencing mechanism which I forgot to mention before).
>
> When you start messing with shadow objects, you're over complicating
> it.

For sure. I'd prefer to avoid the need for any kind of property
shadowing when referencing a schema in order to re-use that schema in
(say) an optional context. In fact I'd prefer to avoid it in general.

-Matt

George Sakkis

Jul 25, 2010, 11:33:04 AM
to JSON Schema
On Jun 5, 1:00 am, Gary Court <gary.co...@gmail.com> wrote:

> ***Solutions***
>
> There are three solutions to correcting this behavior:
>
> 1. Change the default empty schema to {"optional":true}
> 2. Make the default value of "optional" true
> 3. Rename "optional" to "required"

How about:

4. Drop "optional" altogether and determine that a property is
optional if and only if it has a "default".

With the current behavior, "default" is basically ignored unless
"optional" is true. A default value implies optional, so it is
redundant to have to say so explicitly. Replacing "optional=true" with
"default=null" should probably cover the majority of use cases. In
this way we also avoid the "null vs undefined" JavaScript blunder; if
an application needs to differentiate them, it can just set a
domain-specific default that denotes undefined without burdening the
JSON Schema spec.
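
Option 4 could be sketched as below, assuming a property is optional exactly when its schema has a "default" key. `apply_defaults` is a hypothetical helper, not something the draft defines.

```python
def apply_defaults(instance, schema):
    """Fill in defaults; report properties that are absent and have no default."""
    result = dict(instance)
    missing = []
    for name, sub in schema.get("properties", {}).items():
        if name in result:
            continue
        if "default" in sub:
            result[name] = sub["default"]  # has a default => optional
        else:
            missing.append(name)           # no default => required
    return result, missing

schema = {"properties": {"foo": {}, "bar": {"default": None}}}
print(apply_defaults({"foo": 1}, schema))  # ({'foo': 1, 'bar': None}, [])
print(apply_defaults({}, schema))          # ({'bar': None}, ['foo'])
```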

Regards,
George

Gary Court

Jul 25, 2010, 12:20:32 PM
to json-...@googlegroups.com
Good observation. However, I'm not sure I would agree with this
approach, as I don't think we should tie the default value of a
property to whether the property is required for the instance to be
valid. They have very different semantic meanings, which likely means
there's a use case where you would want one but not the other. Sadly
I can't think of one at the moment.

George Sakkis

Jul 25, 2010, 1:51:08 PM
to json-...@googlegroups.com

We won't be binding a default for required properties; in fact the
definition of a required property will be "a property without a
default". If you meant that binding for optional properties might not
always be possible or desirable (e.g. if null has different
semantics), I think the reference mechanism should make it possible to
define an "undefined" value that is bound as default. I mean something
like this (hope I get the $ref syntax right):

{
  "properties": {
    "undefined": {"default": {}},
    "foo": {},
    "bar": {"default": null},
    "baz": {"default": {"$ref": "#undefined.default"}}
  }
}

Here "foo" is required while "bar", "baz" and "undefined" are optional
(actually "undefined" should not be passed, but I'm not sure if JSON
Schema allows "local variables"). "bar"'s value will be null if
omitted. For "baz" though, its default is the object referenced by
undefined.default. The client code can check if "baz" was omitted
simply by:

if (data.baz === data.undefined) { ...}

The value for "undefined" doesn't have to be disposable like this, it
may be a persistent entity referenced externally. This is a known
pattern in Python that has a None (null) object but not undefined. If
None happens to be a semantically valid value, a fresh object can be
created and used as a sentinel:

_undefined = object()  # unique sentinel, distinct from None

def foo(x=_undefined):
    if x is _undefined:
        ...  # x was omitted by the caller

Regards,
George

Kris Zyp

Aug 27, 2010, 6:57:45 PM
to json-...@googlegroups.com, Gary Court

On 6/4/2010 5:00 PM, Gary Court wrote:

> [snip]


> ***Solutions***
>
> There are three solutions to correcting this behavior:
>
> 1. Change the default empty schema to {"optional":true}
> 2. Make the default value of "optional" true
> 3. Rename "optional" to "required"
>
> First off, #1 is out of the picture for obvious reasons. Solution #2
> would work, but is against the idea that all boolean properties
> default to false.
>
> Solution #3 makes the most sense as it defaults to false and places no
> restrictions on the empty schema, allowing the empty set to contain
> every possible JSON value. Optionally, validators can examine a JSON
> schema for this attribute to see if it is using a revision 3 schema.

I had suggested option #4 of adding "undefined" as a type, but I tend to
think option #3 might actually be more sensible. Any objections to
changing to option #3 for rev 3 of the draft?

--
Thanks,
Kris

Dean Landolt

Aug 27, 2010, 7:07:16 PM
to json-...@googlegroups.com
+1 for making properties optional by default (I think this is more in line philosophically anyway).

Gary Court

Aug 28, 2010, 2:54:39 AM
to JSON Schema
+1 for option #3