Proposal for restricting property names


Gary Court

Oct 4, 2010, 8:24:42 PM
to JSON Schema
In the second JSON Schema draft, the only way to place a restriction
on the name of a property is to know the name ahead of time, which
also means the number of properties must be finite. There is no way to
place restrictions on an unbounded set of property names that are not
known in advance.

For example, assume the following object:

{
"a0" : "text",
"a1" : "text",
"a2" : "text",
//etc...
}

Now, currently, we can validate this object using the following
schema:

{
"type" : "object",
"additionalProperties" : { "type" : "string" }
}

However, if all properties of this object are required to have a name
matching a certain pattern/format, there is no way to enforce this. So
the following object is allowed, even though it should be illegal:

{
//...
"a11" : "text",
"invalid" : "text",
"a13" : "text",
//...
}

Therefore, I am proposing for the next draft of JSON Schema a new
attribute that can enforce a pattern on property names. I see there
being two solutions to this:

***SOLUTION 1: The "names" Attribute***

This schema attribute applies to instance objects and enforces that
all property names of the instance object match the provided regular
expression. The exception to this rule is that property names defined
by the "properties" attribute are not checked. For example, the above
schema would be rewritten as:

{
  "type" : "object",
  "names" : "a\\d+",
  "additionalProperties" : { "type" : "string" }
}
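
A minimal Python sketch of how a validator might apply the proposed "names" attribute. It assumes the pattern has to match the whole property name and that names declared under "properties" are exempt, as described above. The function name check_names and the full-match behaviour are assumptions for illustration, not part of any draft.

```python
import re

def check_names(instance, schema):
    """Return the property names that violate the hypothetical "names" attribute."""
    pattern = schema.get("names")
    if pattern is None:
        return []
    regex = re.compile(pattern)
    declared = schema.get("properties", {})  # names declared here are exempt
    return [name for name in instance
            if name not in declared and not regex.fullmatch(name)]

schema = {
    "type": "object",
    "names": r"a\d+",
    "additionalProperties": {"type": "string"},
}
instance = {"a11": "text", "invalid": "text", "a13": "text"}
print(check_names(instance, schema))  # ['invalid']
```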

***SOLUTION 2: The "name" Attribute***

This schema attribute applies to any instance and enforces that the
property name of the instance matches the given regular expression.
For example, the above schema would be rewritten as:

{
  "type" : "object",
  "additionalProperties" : {
    "type" : "string",
    "name" : "a\\d+"
  }
}

***COMPARISON***

Both solutions achieve the required goal. However, the second solution
has some disadvantages that may not be immediately apparent:

By placing the rule on the properties of the object themselves, you
create a condition where the child instances place restrictions on
the parent instance. No other attribute (except for "requires") does
this, and it should be avoided because it has unintended consequences
when referencing other schemas that have this rule. For example, the
following schema is invalid because the referenced schema requires a
particular name:

{
"properties" : {
"exampleText" : { "$ref" : "schemaAbove#/additionalProperties" }
}
}

This is not the behavior most schema writers are trying to achieve.
Even if the schema writer was trying to ensure that anything
referencing it has a particular name, this can still be achieved using
the first solution with a schema in the "requires" attribute.
Therefore, the first solution solves all use cases with no
disadvantages compared to the second.
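
To make the referencing problem concrete, here is a small Python sketch of Solution 2, assuming a validator checks the hypothetical "name" attribute against the name of the property being validated. The helper check_property is invented for illustration, and "$ref" resolution is replaced by a literal copy of the referenced subschema.

```python
import re

def check_property(name, value, subschema):
    """Validate one property against a subschema carrying the hypothetical "name" attribute."""
    if subschema.get("type") == "string" and not isinstance(value, str):
        return False
    pattern = subschema.get("name")
    return pattern is None or re.fullmatch(pattern, name) is not None

# What "$ref": "schemaAbove#/additionalProperties" would resolve to.
referenced = {"type": "string", "name": r"a\d+"}

print(check_property("a1", "text", referenced))           # True
print(check_property("exampleText", "text", referenced))  # False
```

The second call fails only because of the property's name, which is the inherited restriction the referencing schema never asked for.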

In conclusion, I propose adding a "names" attribute to fill the gap of
enforcing that the property names of an object match a provided
pattern.

Kris Zyp

Oct 4, 2010, 10:37:30 PM
to json-...@googlegroups.com, Gary Court
One limitation with these approaches is that you can only have one
regex-based set of property names. A couple of other possibilities:

#3: Allow the "properties" attribute to take an array (or an object like
normal), and the array would be an array of schemas that specify the
property name (as regex) to apply to. Each property in the instance
would have to match and validate against at least one property definition.
"properties": [
  {"name":"s\\d+","type":"string"},
  {"name":"n\\d+","type":"number"}
]
One possibility in conjunction with #3 is that we could move union
capabilities to the properties array, since this effectively provides
unioning (when the same name is used multiple times), which could
eliminate the issue with primitive types mixing with referenced schemas
in "type".
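
A rough Python sketch of how option #3 might be evaluated, assuming each array entry pairs a full-match name regex with a simple type and that a property passes if at least one entry accepts both its name and its value. The function validate_option3 and the JSON_TYPES table are hypothetical helpers, and only "string" and "number" are handled.

```python
import re

# Simplified mapping of JSON Schema type names to Python types (assumption;
# booleans are not treated specially here).
JSON_TYPES = {"string": str, "number": (int, float)}

def validate_option3(instance, definitions):
    """Return names of properties that match no definition in the array."""
    failures = []
    for name, value in instance.items():
        ok = any(
            re.fullmatch(d["name"], name) and isinstance(value, JSON_TYPES[d["type"]])
            for d in definitions
        )
        if not ok:
            failures.append(name)
    return failures

definitions = [
    {"name": r"s\d+", "type": "string"},
    {"name": r"n\d+", "type": "number"},
]
print(validate_option3({"s1": "text", "n1": 3, "n2": "oops", "x": 1}, definitions))
# ['n2', 'x']
```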

#3b: We could vary this a little and the array could contain "name" and
"type" that refers to the schema, and possibly also "requires" (since it
doesn't apply directly to the value):
"properties": [
  {"name":"s\\d+","type":{"type":"string"}},
  {"name":"n\\d+","type":{"type":"number"}, "requires":"s1"}
]

#4: Use regular expressions as the property names in "properties".
"properties": {
  "s\\d+": {"type":"string"},
  "n\\d+": {"type":"number"}
}

This has the advantage of being more succinct, but would have some
backwards compatibility issues (although the problematic property names
would probably be relatively rare).

Both of these approaches would allow us to eliminate
additionalProperties altogether (a nice simplification).

Kris

--
Thanks,
Kris

Gary Court

Oct 5, 2010, 12:06:53 AM
to JSON Schema
With the approach I proposed, you can have multiple regexes for the
same type by using the OR operator in the regex, like:

{
  "names" : "s\\d+|n\\d+"
}

If you wanted to use different "names" patterns for different schemas,
you can use union types:

{
  "type" : [
    {"names" : "s\\d+"},
    {"names" : "n\\d+"}
  ]
}

Both approach #3 and #4 would break older JSON Schema validators,
while approach #1 would just be ignored. I was trying to remain as
backwards compatible as possible.

As an aside, moving union types to "properties" would just be bad,
causing too much collision with the way "properties" is supposed to
work and with the now-simple "type"/"disallow" attributes.

-Gary

Gary Court

Oct 5, 2010, 12:53:15 AM
to JSON Schema
Actually, I should not have written that while I was distracted.

In my last message, the second example of union types would not work.
So, it's true that options 1 & 2 would not allow different schemas for
multiple name patterns. Kris's options 3, 3b, & 4 would support this at
the cost of backwards compatibility with older validators. Also,
option 3 (not 3b) still has the problem that option 2 has, where
referencing other schemas inherits their name restrictions. Option 3b
nicely moves "requires" out of the schema, at the cost of being more
wordy. Option 4 gives us the most flexibility, but makes writing
schemas harder as you need to be aware of special characters (like
".") in property names. Finally, all 3 of these options mean that
validators have to search through all properties instead of doing a
hash lookup when locating the appropriate schema for a property,
slowing down validation.

Also, I understand now about what you meant by moving union types.
You're referring to having union types defined outside of the schema,
where an instance's schema can be an array of schemas instead of a
single schema that has multiple schemas as types (as it is now). Can't
say I'd like that approach as it would have to be handled as a special
case by a validator.

-Gary

Gary Court

Oct 5, 2010, 1:30:34 AM
to JSON Schema
Actually, in options 3, 3b, & 4 - all are trying to accomplish the
same thing: to associate a property name restriction to an
"additionalProperties" schema. Yet they would all break older
validators. If we are looking to maintain this backwards
compatibility, we could:

Option #5 - Introduce a new attribute that provides a property name
pattern to schema association like:

"otherProperties": {
  "s\\d+": {"type":"string"},
  "n\\d+": {"type":"number"}
}

This is essentially the same as option #4, except it does not change
the signature of "properties". Properties that don't match those
defined in the "properties" attribute would then be compared to this
new attribute. This keeps simple properties easy to write, multiple
properties condensed, backward compatible, prevents referencing
problems, and allows validators to do hash lookups first before moving
to slower search and compare.
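
The lookup order described here can be sketched in Python as follows. This is only an illustration under stated assumptions: "otherProperties" is the attribute name proposed in this message, patterns are treated as full matches, properties that match nothing are rejected, and find_schema, validate and JSON_TYPES are hypothetical helpers.

```python
import re

# Simplified mapping of JSON Schema type names to Python types (assumption).
JSON_TYPES = {"string": str, "number": (int, float)}

def find_schema(name, schema):
    """Pick the subschema for a property: exact lookup first, then pattern search."""
    exact = schema.get("properties", {})
    if name in exact:                      # constant-time hash lookup
        return exact[name]
    for pattern, sub in schema.get("otherProperties", {}).items():
        if re.fullmatch(pattern, name):    # slower linear search over patterns
            return sub
    return None

def validate(instance, schema):
    """Return names that have no matching subschema or a value of the wrong type."""
    bad = []
    for name, value in instance.items():
        sub = find_schema(name, schema)
        if sub is None or not isinstance(value, JSON_TYPES[sub["type"]]):
            bad.append(name)
    return bad

schema = {
    "properties": {"id": {"type": "number"}},
    "otherProperties": {
        r"s\d+": {"type": "string"},
        r"n\d+": {"type": "number"},
    },
}
print(validate({"id": 7, "s1": "text", "n1": 2.5, "n2": "oops", "other": 1}, schema))
# ['n2', 'other']
```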

This new attribute could also be merged with option #1, providing an
easy and more complex way of validating property names.

-Gary

BigBlueHat

Oct 5, 2010, 10:55:28 AM
to JSON Schema
On Oct 4, 10:37 pm, Kris Zyp <kris...@gmail.com> wrote:
> One limitation with these approaches is that you can only have one
> regex-based set of property names. A couple of other possibilities:
>
> #3: Allow the "properties" attribute to take an array (or an object like
> normal), and the array would be an array of schemas that specify the
> property name (as regex) to apply to. Each property in the instance
> would have to match and validate against at least one property definition.
> "properties": [
>   {"name":"s\\d+","type":"string"},
>   {"name":"n\\d+","type":"number"}
> ]

I've been getting into JSON Schema over the last few weeks and have
been wondering about "properties" being an object vs. an array of
properties. I know Gary's mentioned the speed of validation, which I'm
sure the looping requirement would slow a bit.

For my use case (a form builder), I'm having a frustrating time
working with (unmodified) JSON Schemas as source documents for a
template engine (Mustache/Handlebars.js). Neither of them supports key
output (which is an edge-case need for most JSON formats), so getting
at the names of the properties (for the names of input fields) is going
to require changing how those template engines work. That's not all
bad (key output may have other use cases), but it does show an area
where "properties" as an object may have some shortcomings--as
opposed to it being an array.

I'm new here, so take this all with a grain of salt. :) My plan is to
either a) hack in key output into Mustache/Handlebars.js or b) (which
is more likely, atm) modify the JSON Schema on its way into the Form
builder to make properties an array and pass the keys into a "name"
field (as described above).

Again, nothing major, just me learning JSON Schema. :) Thoughts on
this potential change would certainly help me grok the rationale behind
"properties" as an object.

Thanks, all,
Benjamin

Kris Zyp

Oct 9, 2010, 11:38:48 PM
to JSON Schema

On Oct 4, 11:30 pm, Gary Court <gary.co...@gmail.com> wrote:
> Actually, in options 3, 3b, & 4 - all are trying to accomplish the
> same thing: to associate a property name restriction to an
> "additionalProperties" schema. Yet they would all break older
> validators. If we are looking to maintain this backwards
> compatibility, we could:
>
> Option #5 - Introduce a new attribute that provides a property name
> pattern to schema association like:
>
> "otherProperties": {
>  "s\\d+": {"type":"string"},
>  "n\\d+": {"type":"number"}
> }
>
> This is essentially the same as option #4, except it does not change
> the signature of "properties". Properties that don't match those
> defined in the "properties" attribute would then be compared to this
> new attribute. This keeps simple properties easy to write, multiple
> properties condensed, backward compatible, prevents referencing
> problems, and allows validators to do hash lookups first before moving
> to slower search and compare.
>
> This new attribute could also be merged with option #1, providing an
> easy and more complex way of validating property names.
>

Do you have a preference on this? I guess I would prefer to just go
with #1 unless we really need the power of the more complex
capability.
Thanks,
Kris

Gary Court

Oct 12, 2010, 12:58:14 PM
to json-...@googlegroups.com
I don't think we should sell ourselves short. JSON Schema should be able
to represent any schema in as little description as possible. I think
being able to represent a schema for multiple subsets of properties
is an important (albeit rare) case to handle. I like option #5 the best
(over the close #4) as it keeps the most common use case simple to
write.

Also, after some thinking, merging with option #1 doesn't really
simplify things, so I would ignore that suggestion.

Kris Zyp

Oct 27, 2010, 6:25:44 PM
to JSON Schema
I'm fine with option #5. Do you want to add it to the spec?
Kris

Gary Court

Oct 27, 2010, 8:52:35 PM
to json-...@googlegroups.com
Sure, I can add it. But can you pull in my last pull request
beforehand? Thanks!

-Gary

Andi

Oct 30, 2010, 3:13:51 PM
to JSON Schema
I think this is interesting - yet too complex for most cases.

Andi

Oct 30, 2010, 3:11:26 PM
to JSON Schema
Hmm, why not define key (this is the official name, I think) schemas
like any other schema, but with implicit type="string"?
Every key is a string, and you could define it like that. I propose
the property "additionalKeys", so that the keys defined in properties
will be valid as well. Example:

type: 'object'
additionalKeys: {
// type: 'string' <- not necessary, it's implicit
minLength: 2,
maxLength: 10,
pattern: "a-z"
}

This is the easiest solution and we reuse all the stuff schema.js
already has.

Ideas:

- Probably it is necessary to allow "type" for languages like Python,
where you can use any data type as key.
- Of course, using an array of several schemas could be allowed.
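
A short Python sketch of how the "additionalKeys" idea could work, assuming the key schema reuses the existing string constraints (minLength, maxLength, pattern) and that "pattern" is applied as an unanchored search. The helpers check_key and check_additional_keys are hypothetical, and the pattern is written as "[a-z]" on the assumption that a character class was intended in the example above.

```python
import re

def check_key(key, key_schema):
    """Validate a property name against string constraints from a key schema."""
    if len(key) < key_schema.get("minLength", 0):
        return False
    if "maxLength" in key_schema and len(key) > key_schema["maxLength"]:
        return False
    pattern = key_schema.get("pattern")
    # Unanchored search (assumption about how "pattern" is applied).
    return pattern is None or re.search(pattern, key) is not None

def check_additional_keys(instance, schema):
    """Return keys not declared in "properties" that fail the key schema."""
    declared = schema.get("properties", {})
    key_schema = schema.get("additionalKeys", {})
    return [k for k in instance if k not in declared and not check_key(k, key_schema)]

schema = {
    "type": "object",
    "additionalKeys": {"minLength": 2, "maxLength": 10, "pattern": "[a-z]"},
}
print(check_additional_keys({"ab": 1, "x": 2, "averyveryverylongkey": 3, "OK": 4}, schema))
# ['x', 'averyveryverylongkey', 'OK']
```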

Gary Court

Oct 30, 2010, 4:05:39 PM
to json-...@googlegroups.com
Yea, I had thought about doing that. The thing is, you can already
represent every restriction using regular expressions. For example,
your example could be written as:

patternProperties : {
"^[a-z]{2,10}$" : {}
}
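
As a quick check of that pattern, here is a small Python sketch showing which keys it accepts. The function key_allowed is a hypothetical helper; Python's re module stands in for whatever regex engine a validator would use.

```python
import re

# The regex from the example: 2 to 10 lowercase ASCII letters.
KEY_PATTERN = re.compile(r"^[a-z]{2,10}$")

def key_allowed(key):
    """Return True if the key satisfies the length and character constraints."""
    return KEY_PATTERN.search(key) is not None

for key in ["ab", "abcdefghij", "a", "abcdefghijk", "Ab"]:
    print(key, key_allowed(key))
# ab True
# abcdefghij True
# a False
# abcdefghijk False
# Ab False
```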

-Gary

Andi

Oct 31, 2010, 7:36:28 AM
to JSON Schema
But it is also about performance. I want my validator to be fast. I
have seen some implementations that do not deal very well with that. I
think in JSV I have seen a RegExp to check the decimal places of a
number. Bad practice. It is good practice to reuse the schema
properties that are already there and to use a pattern only when it's
necessary. What you propose is hacky and not very elegant. (OK, my
example is surely not the best)
If we use your proposal, it means that for every additional key, a
RegExp match has to be made, which is not very performant.

Sky Sanders

Oct 31, 2010, 12:37:46 PM
to json-...@googlegroups.com
Andi, validity of claims and assertions aside, a more cooperative approach will probably achieve better results.

I could be alone in my perception, but I have seen a pattern that could be likened to you kicking the saloon doors open with both guns drawn, shooting down what you characterize as 'hacks' without restraint (or viable alternatives).

Sit down, have a drink, join the conversation.  Honey, vinegar and bees.

Again, this is just my observation and you may have a valid perspective but I would submit that your current approach will generate more friction and resistance than results.

Cheers,
Sky  

Tatu Saloranta

Oct 31, 2010, 12:47:09 PM
to json-...@googlegroups.com
On Sun, Oct 31, 2010 at 4:36 AM, Andi <andrea...@gmx.de> wrote:
> But it is also about performance. I want my validator to be fast. I
> have seen some implementations that do not deal very well with that. I
> think in JSV I have seen a RegExp to check the decimal places of a
> number. Bad practice. It is good practice to reuse the schema
> properties that are already there and to use a pattern only when it's
> necessary. What you propose is hacky and not very elegant. (OK, my
> example is surely not the best)
> If we use your proposal, it means that for every additional key, a
> RegExp match has to be made, which is not very performant.

You assume that regular expressions are slow to use, which is not the
case for good implementations (available for most common platforms).
Compilation of an expression only needs to be done once per schema
instance, and comparisons are typically very fast; commonly faster
than naive hand-written checks.
Further, a regex compiler can obviously optimize for simple cases where
the value is a constant string.

So trying to avoid use of regexps seems a lot like premature
optimization; especially if it adds complexity.
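
The "compile once per schema" point can be sketched in Python like this. It is an illustration only: PatternSchema and schemas_for are hypothetical names, and Python's re module stands in for the platform's regex engine.

```python
import re

class PatternSchema:
    """Holds property-name patterns compiled once when the schema is loaded."""

    def __init__(self, pattern_properties):
        # Compile each pattern a single time, not once per validated instance.
        self._compiled = [(re.compile(p), sub) for p, sub in pattern_properties.items()]

    def schemas_for(self, name):
        """Return every subschema whose pattern matches the property name."""
        return [sub for regex, sub in self._compiled if regex.search(name)]

schema = PatternSchema({r"^s\d+$": {"type": "string"}, r"^n\d+$": {"type": "number"}})
print(schema.schemas_for("s1"))   # [{'type': 'string'}]
print(schema.schemas_for("zzz"))  # []
```

Each validation then only pays for the match itself, since the compiled patterns are reused across instances.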

-+ Tatu +-

Gary Court

Oct 31, 2010, 1:45:55 PM
to json-...@googlegroups.com
> You assume that regular expressions are slow to use, which is not the
> case for good implementations (available for most common platforms).
> Compilation of an expression only needs to be done once per schema
> instance, and comparisons are typically very fast; commonly faster
> than naive hand-written checks.
> Further, a regex compiler can obviously optimize for simple cases where
> the value is a constant string.
>
> So trying to avoid use of regexps seems a lot like premature
> optimization; especially if it adds complexity.

I agree. It has been my experience as well that regular expressions
are faster at multi-step text comparison than any hand-written
JavaScript.

-Gary

Andi

Oct 31, 2010, 2:05:09 PM
to JSON Schema
I respect your and the others' view as I respect my own.
It seems that I have expressed my opinion a little bit too harshly.
It's just that I like to make a clear point about my view, and that's
it.
If the vote is for the regular expression syntax, I will accept it,
even if I know that it's probably not fitting very well in the current
draft.