No, the second one is not a valid schema. Only a string or an array of
strings is valid for the "type" property.
Kris
No, the only valid string options are "string", "number", "null", "boolean",
"object", "array", "integer", and "any".
Kris
Yes, exactly.
> I guess that's the implication of one of the examples. The placement
> of all attribute properties in one place seems to me like it will be
> prohibitive of future expansion.
Why? The primary motivation for this is to keep things as simple as
possible. We had considered doing unions in the type property with full
schemas, but that was rejected.
Kris
* Kris Zyp <kri...@gmail.com> [2008-09-04T08:58:37]
> Why? The primary motivation for this is to keep things as simple as
> possible. We had considered doing unions in the type property with full
> schemas, but that was rejected.
Which is simpler (please forgive my laziness in not quoting property names):
{
  type: [
    { type: "number", minimum: 10 },
    { type: "array", minItems: 1 },
    { type: "string", minLength: 100 },
    { type: "null" },
  ]
}
...or:
{
  type: [ "number", "string", "array", "null" ],
  minimum: 10,
  minItems: 1,
  minLength: 100
}
To me, the first is simpler to read, extend, validate, and learn. Is this
illegal, required to be rejected as a schema?
{ type: "array", minLength: 100 }
Some poor sucker is going to write that (thinking, perhaps, about Array.length)
and if he's lucky it will be an exception. If not, it just won't do what he
wants.
You wouldn't need multiple property names, though, if they were grouped by
type:
{
  type: [
    { type: "number", minimum: 10 },
    { type: "array", minItems: 1 },
    { type: "string", minLength: 100 },
    { type: "null" },
  ]
}
Each 'type' can be validated independently. If {type:X,...} has X=array, you
know which keys are valid. If X=string, you also know. If it's a union, it's
a union of individually checkable types. You don't need to union the set of
all permissible parameters first, and you reduce the number of parameter names
needed in general.
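Here's a rough sketch in Python of what that buys you; the schema layout is
the one from the first example above, and everything else (function names,
the dispatch table) is made up for illustration:

  def check_number(schema, value):
      return (isinstance(value, (int, float))
              and value >= schema.get('minimum', float('-inf')))

  def check_string(schema, value):
      return isinstance(value, str) and len(value) >= schema.get('minLength', 0)

  def check_array(schema, value):
      return isinstance(value, list) and len(value) >= schema.get('minItems', 0)

  def check_null(schema, value):
      return value is None

  CHECKERS = { 'number': check_number, 'string': check_string,
               'array': check_array, 'null': check_null }

  def check(schema, value):
      # a union is just a list of standalone schemata; try each in turn
      if isinstance(schema['type'], list):
          return any(check(alt, value) for alt in schema['type'])
      return CHECKERS[schema['type']](schema, value)

Each per-type checker only ever sees the keys that are legal for its own
type; nothing has to compute the union of all permissible parameter names.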
--
rjbs
* Ian Lewis <ianm...@gmail.com> [2008-09-04T12:21:54]
> > Which is simpler (please forgive my laziness in not quoting property
> > names):
> >
> > [ EXAMPLE 1 ]
> >
> > ...or:
> >
> > [ EXAMPLE 2 ]
>
> Which one is simpler is a matter of debate but the first is not valid schema
> afaik.
Well, sure, but I couldn't very well give an example of what I thought might be
better than the current layout while still giving a valid schema.
> > To me, the first is simpler to read, extend, validate, and learn. Is this
> > illegal, required to be rejected as a schema?
> >
> > { type: "array", minLength: 100 }
> >
> > Some poor sucker is going to write that (thinking, perhaps, about
> > Array.length) and if he's lucky it will be an exception. If not, it just
> > won't do what he wants.
>
> That's potentially true of a lot of systems/formats.
The fact that other software sucks doesn't mean that there is no reason to try
to make software that sucks as little as is practical.
A: I'm worried that our product might be terrible.
B: Lots of products are terrible.
A: Oh, no problem, then!
Who wants to work with that team?
> This essentially means that a type can be a list of schemas itself.
Right, that's what I'm suggesting.
> This seems more flexible but would add a bit more complexity and would be a
> burden on folks like me to implement it in a json schema validator when you
> can do the equivalent in jsonschema already.
I have implemented a schema system that works this way in several languages,
and found it pretty painless. I think that it's probably no more or less
complex to implement. In JSON Schema as it stands, the complexity is unioning
valid attributes for the permitted types and then ensuring that only those were
given. (That is, if type is [string,number], knowing that maxItems is not
allowed, or at least not useful.) In Rx, the complexity is recursing down
schemata. I'm not sure there's even any complexity there, just difference.
Anyway, if the response is "This might be better sometimes but it's too big of
a change, so we're not going to do it," that's fine. It just seems, to me,
like a better plan for future extension.
--
rjbs
Right. I don't think that this is possible currently. (Prove me wrong!)
If you accept an object, you get one and only one chance to accept the
parameters for validating object-type data, and they all apply and have one set
of values.
Rx expresses the above as:
{
  type: //any,
  of: [
    { type: //rec, required: { name: //str, id: //int } },
    { type: //rec, required: { brand: //str, id: //int } },
  ]
}
--
rjbs
This is a valid schema. Once again, sorry for the confusion caused by my
earlier message.
Kris
That being the case, it seems like always requiring:
{
  type: [
    { type: array, minItems: 10 },
    { type: string, pattern: "^0+$" }
  ]
}
...and never allowing:
{
  type: [ array, string ],
  minItems: 10,
  pattern: "^0+$"
}
...could simplify things significantly.
--
rjbs
Incorrect if I was given the second schema? I don't understand.
Roughly:
  # dispatch on the form of the 'type' entry
  if type( schema['type'] ) is str:
      validator_class = validator_class_registry[ schema['type'] ]
      return validator_class(schema)
  if type( schema['type'] ) is list:
      alternatives = [ make_schema(a_schema) for a_schema in schema['type'] ]
Basically:
  http://git.codesimply.com/?p=Rx;a=blob;f=python/Rx.py;h=da9a85a1114e26623d85da3fb83b87e723502dc5;hb=HEAD
> for each type:
>   // check for disallowed attributes
>   case (type):
>     string:
>       check for attrs not allowed by string
>     array:
>       check for attrs not allowed by array
>     ...
Don't use conditionals, use classes. Then it's dead simple.
From ArrType's __init__:
  if not set(schema.keys()).issubset(set(('type', 'contents', 'length'))):
      raise Error('unknown parameter for //arr')
Anybody can then write his own type for validation without needing to worry
about conflicting with existing parameters.
> This seems like a pain and a lot of verbose code that provides little value.
> What if you want to extend json schema and add a custom validation to be
> used internally in your application? What if you want to change the behavior
> of a particular attribute? These were the kinds of things I wanted to
> support with my json-schema validator. Adding this kind of code makes it
> hard.
No, this makes it *easy* because you write a new validator class for each new
type.
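Filled out a bit, as a rough sketch (validator_class_registry and make_schema
follow the rough code above; StrType and its details are made up here, not
taken from Rx's actual source):

  class StrType:
      ALLOWED = set(('type', 'minLength'))

      def __init__(self, schema):
          # reject parameters this type doesn't know about, the same
          # idea as the ArrType check above
          if not set(schema.keys()).issubset(self.ALLOWED):
              raise ValueError('unknown parameter for string type')
          self.min_length = schema.get('minLength', 0)

      def check(self, value):
          return isinstance(value, str) and len(value) >= self.min_length

  validator_class_registry = { 'string': StrType }

  def make_schema(schema):
      return validator_class_registry[schema['type']](schema)

Adding a new type means registering one new class; nothing in the core
dispatcher has to change.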
--
rjbs
* Ian Lewis <ianm...@gmail.com> [2008-09-04T23:53:51]
> You would have to check the values of the type attribute and the other
> provided attributes to make sure there aren't any invalid attributes given
> in the schema. The way you implemented this in Rx, this is easy; it's not
> so easy the way I implemented jsonschema. It wasn't a requirement and I
> didn't place much value on validating the schema itself in this way. Given
> this, I thought I could make a more easily used/extended validator without
> creating classes for each type.
The best way to let people extend the validator for their own purposes is to
let them do so without altering (and thereby screwing up) the validator's code.
So, they'll need to provide something that lists its valid arguments (because
even if you want to allow all valid arguments all the time, even when
meaningless, you want to never allow always-invalid arguments), has a name so
that the type can be recognized in a list of strings, and validates a value
based on those arguments.
So, you could do this without classes:
# python
  {
    'name': 'palindrome',
    'arguments': set(('length', 'ignore_spaces')),
    'checker': some_function
  }
...but that looks a lot like a poor man's class.
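The class version of the same thing, sketched under the same made-up
conventions (a 'check' method, an exception on unknown parameters):

  class PalindromeType:
      ARGUMENTS = set(('length', 'ignore_spaces'))

      def __init__(self, schema):
          if not set(schema.keys()).issubset(self.ARGUMENTS):
              raise ValueError('unknown parameter for palindrome')
          self.length = schema.get('length')
          self.ignore_spaces = schema.get('ignore_spaces', False)

      def check(self, value):
          s = value.replace(' ', '') if self.ignore_spaces else value
          if self.length is not None and len(s) != self.length:
              return False
          return s == s[::-1]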
What you really don't want to end up with is a system where everybody who wants
a custom type has to either (a) alter JsonSchema.py or (b) go through the
committee to add the type to the core.
> Yah, I suppose you could do something like this and put validation for
> attributes common to all or multiple types in parent classes. How would you
> envision altering the behavior for, say, the "optional" keyword across all
> types?
Well, that's why I made required/optional a function of the //rec type, rather
than of each object (//rec) property. It meant that the per-object-entry type
could stand alone and be validated alone. It also means it could be re-used.
You could say:
/d20/attr = { type: //int, range: { min: 3, max: 18 } }

{
  type: //rec,
  required: { charisma: /d20/attr },
  optional: { comeliness: /d20/attr },
  rest: { type: //map, values: /d20/attr },
}
Now we have a real data type, with the optional/required bits bumped up to the
//rec type, where they are relevant. The //map isn't burdened at all by an
'optional' value for its values-type, since that would make no sense.
Now, that said, you could definitely say that every data type has an isOptional
property. It just means that you'll end up having to define and reuse
'd20-attr-optional' *and* 'd20-attr-required'.
Also, it means you'll probably want d20-attr (alone) both as the type that's
extended for opt/req and for use in places where optional-ness is
meaningless.
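A rough sketch of that shape, with made-up names and only an integer value
type, just to show where optionality lives:

  # the value schema stands alone; it knows nothing about optionality
  d20_attr = { 'type': 'int', 'min': 3, 'max': 18 }

  def check_int(schema, value):
      return (isinstance(value, int)
              and schema.get('min', value) <= value <= schema.get('max', value))

  # required/optional are parameters of the record type, so the same
  # value schema can be reused in both slots (and in a map's values)
  def check_rec(schema, record):
      for key, sub in schema.get('required', {}).items():
          if key not in record or not check_int(sub, record[key]):
              return False
      for key, sub in schema.get('optional', {}).items():
          if key in record and not check_int(sub, record[key]):
              return False
      return True

  rec = { 'required': { 'charisma': d20_attr },
          'optional': { 'comeliness': d20_attr } }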
> Taking a type-centric view of things is ok, but I think it might make
> dealing with attributes that don't have much to do with type, like
> "required" or "identity", kind of a pain as you would need to alter the
> behavior of the base class' check method or extend every type class.
> Perhaps a hybrid approach with validation methods for each schema attribute
> that could be overridden at different levels would be best if
> allowing/disallowing attributes based on the type is a requirement.
Sure, see above. Frankly, though, I find that it isn't a pain at all. Don't
extend every type class, because that's not going to scale. Instead, give each
type all the data it needs to validate its contents at the right scope.
{
  type: "object",
  properties: { foo: "object", bar: "integer" },
  identity: "bar"
}
It is sometimes less elegant to read than if you shove top-level properties
down into contained schemata, but it is easier to implement and extend.
> I simply figured that extending one class, the validator itself, would be
> easier.
Obviously I don't have a million users, so I can't say with certainty what they
would want if they existed. My prediction, though, is that the most likely
thing people will extend is *what* can be validated. "I want to be able to
validate that something is the name of a state capital." That, as opposed to
altering *how* validation occurs. "I want to be able to say that two errors
are okay."
If that is true, then the place to optimize for easy extension is in the type
catalog.
--
rjbs