On Sep 3, 7:44 pm, "Kris Zyp" <kris...@gmail.com> wrote:
> If the second one were encoded strings would it be valid?
> {"type":["{\"type\":\"string\"}","{\"type\":\"number\"}"]}
> No, the only valid string options are "string", "number", "null", "boolean",
> "object", "array", "integer", and "any".
> Kris
If type is always a type name or list of type names, does that mean
that to say "an integer greater than zero or a double/single-quoted
string" you say:
I guess that's the implication of one of the examples. The placement
of all attribute properties in one place seems to me like it will be
prohibitive of future expansion. Also, validation of schemata
themselves seems a bit more troublesome, as you'll have to validate
the given properties against all allowed types if you want to prohibit
someone supplying "minimum" for type: [ "string", "object" ].
> If type is always a type name or list of type names, does that mean that to say "an integer greater than zero or a double/single-quoted string" you say:
> I guess that's the implication of one of the examples. The placement of all attribute properties in one place seems to me like it will be prohibitive of future expansion.
Why? The primary motivation for this is to keep things as simple as possible. We had considered doing unions in the type property with full schemas, but that was rejected. Kris
> Why? The primary motivation for this is to keep things as simple as possible. We had considered doing unions in the type property with full schemas, but that was rejected.
Which is simpler (please forgive my laziness in not quoting property names):
To me, the first is simpler to read, extend, validate, and learn. Is this illegal, required to be rejected as a schema?
{ type: "array", minLength: 100 }
Some poor sucker is going to write that (thinking, perhaps, about Array.length) and if he's lucky it will be an exception. If not, it just won't do what he wants.
You wouldn't need multiple property names, though, if they were grouped by type:
Each 'type' can be validated independently. If {type:X,...} has X=array, you know which keys are valid. If X=string, you also know. If it's a union, it's a union of individually checkable types. You don't need to union the set of all permissible parameters first, and you reduce the number of parameter names needed in general.
> * Kris Zyp <kris...@gmail.com> [2008-09-04T08:58:37]
> > Why? The primary motivation for this is to keep things as simple as possible. We had considered doing unions in the type property with full schemas, but that was rejected.
> Which is simpler (please forgive my laziness in not quoting property names):
Which one is simpler is a matter of debate but the first is not valid schema afaik.
> To me, the first is simpler to read, extend, validate, and learn. Is this illegal, required to be rejected as a schema?
> { type: "array", minLength: 100 }
That is valid schema as the minLength property is only evaluated when the type is a string. It's effectively the same as
{type: "array"}
Currently the python validator doesn't give any warnings about this but it could.
> Some poor sucker is going to write that (thinking, perhaps, about Array.length) and if he's lucky it will be an exception. If not, it just won't do what he wants.
That's potentially true of a lot of systems/formats.
> Each 'type' can be validated independently. If {type:X,...} has X=array, you know which keys are valid. If X=string, you also know. If it's a union, it's a union of individually checkable types. You don't need to union the set of all permissible parameters first, and you reduce the number of parameter names needed in general.
This essentially means that a type can itself be a list of schemas. This seems more flexible, but it would add a bit more complexity and would put a burden on folks like me to implement it in a JSON Schema validator, when you can do the equivalent in jsonschema already.
> > Which is simpler (please forgive my laziness in not quoting property names):
> > [ EXAMPLE 1 ]
> > ...or:
> > [ EXAMPLE 2 ]
> Which one is simpler is a matter of debate but the first is not valid schema afaik.
Well, sure, but I couldn't very well give an example of what I thought might be better than the current layout while still giving a valid schema.
> > To me, the first is simpler to read, extend, validate, and learn. Is this illegal, required to be rejected as a schema?
> > { type: "array", minLength: 100 }
> > Some poor sucker is going to write that (thinking, perhaps, about Array.length) and if he's lucky it will be an exception. If not, it just won't do what he wants.
> That's potentially true of a lot of systems/formats.
The fact that other software sucks doesn't mean that there is no reason to try to make software that sucks as little as is practical.
A: I'm worried that our product might be terrible.
B: Lots of products are terrible.
A: Oh, no problem, then!
Who wants to work with that team?
> This essentially means that a type can be list of schema itself.
Right, that's what I'm suggesting.
> This seems more flexible but would add a bit more complexity and would burden on folks like me to implement it in a json schema validator when you can do the equivalent in jsonschema already.
I have implemented a schema system that works this way in several languages, and found it pretty painless. I think that it's probably no more or less complex to implement. In JSON Schema as it stands, the complexity is unioning valid attributes for the permitted types and then ensuring that only those were given. (That is, if type is [string,number] know that maxItems is not allowed, or at least not useful.) In Rx, the complexity is recursing down schemata. I'm not sure there's even any complexity there, just difference.
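The "unioning valid attributes" step described above can be sketched roughly as follows. This is only an illustration: the attribute names are the ones mentioned in this thread, but the per-type grouping in `ATTRS_BY_TYPE` and the helper name `misplaced_attributes` are assumptions made for the example, not part of the JSON Schema proposal or of any validator.

```python
# Illustrative table only: which attributes make sense for which type.
# The attribute names come from this thread; the grouping is an assumption.
ATTRS_BY_TYPE = {
    "string": {"minLength"},
    "number": {"minimum"},
    "integer": {"minimum"},
    "array": {"maxItems"},
    "object": {"properties", "additionalProperties"},
}


def misplaced_attributes(schema):
    """Return the attributes that apply to none of the schema's declared types."""
    types = schema["type"]
    if isinstance(types, str):
        types = [types]  # a single type name is treated as a one-element union

    # Union the attribute sets of every permitted type.
    allowed = {"type"}
    for name in types:
        allowed |= ATTRS_BY_TYPE.get(name, set())

    # Anything left over is either misplaced or unknown.
    return sorted(set(schema) - allowed)


print(misplaced_attributes({"type": "array", "minLength": 100}))
print(misplaced_attributes({"type": ["string", "number"], "maxItems": 5}))
```

With per-type grouping, the union step disappears: each alternative carries its own parameters and can be checked against its own set.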
Anyway, if the response is "This might be better sometimes but it's too big of a change, so we're not going to do it," that's fine. It just seems, to me, like a better plan for future extension.
My original motivation for asking was about mixing types of complex
objects.
I wanted something like
{type:[{type:object, properties:{name:string, id:integer},
additionalProperties:false}, {type:object, properties:{brand:string,
id:integer}, additionalProperties:false}]}
This is a simple contrived example, but I think it demonstrates my
points well. I am interested to see how this can be defined given the
current schema.
Jacob
On Sep 4, 11:32 am, Ricardo SIGNES <rsig...@gmail.com> wrote:
> > > Which is simpler (please forgive my laziness in not quoting property
> > > names):
> > > [ EXAMPLE 1 ]
> > > ...or:
> > > [ EXAMPLE 2 ]
> > Which one is simpler is a matter of debate but the first is not valid schema
> > afaik.
> Well, sure, but I couldn't very well give an example of what I thought might be
> better than the current layout while still giving a valid schema.
> > > To me, the first is simpler to read, extend, validate, and learn. Is this
> > > illegal, required to be rejected as a schema?
> > > { type: "array", minLength: 100 }
> > > Some poor sucker is going to write that (thinking, perhaps, about
> > > Array.length) and if he's lucky it will be an exception. If not, it just
> > > won't do what he wants.
> > That's potentially true of a lot of systems/formats.
> The fact that other software sucks doesn't mean that there is no reason to try
> to make software that sucks as little as is practical.
> A: I'm worried that our product might be terrible.
> B: Lots of products are terrible.
> A: Oh, no problem, then!
> Who wants to work with that team?
> > This essentially means that a type can be list of schema itself.
> Right, that's what I'm suggesting.
> > This seems more flexible but would add a bit more complexity and would burden
> > on folks like me to implement it in a json schema validator when you can do
> > the equivalent in jsonschema already.
> I have implemented a schema system that works this way in several languages,
> and found it pretty painless. I think that it's probably no more or less
> complex to implement. In JSON Schema as it stands, the complexity is unioning
> valid attributes for the permitted types and then ensuring that only those were
> given. (That is, if type is [string,number] know that maxItems is not
> allowed, or at least not useful.) In Rx, the complexity is recursing down
> schemata. I'm not sure there's even any complexity there, just difference.
> Anyway, if the response is "This might be better sometimes but it's too big of
> a change, so we're not going to do it," that's fine. It just seems, to me,
> like a better plan for future extension.
* Jacob <jacob.to...@gmail.com> [2008-09-04T13:53:31]
> My original motivation for asking was about mixing types of complex objects.
> I wanted something like
> {type:[{type:object, properties:{name:string, id:integer}, additionalProperties:false}, {type:object, properties:{brand:string, id:integer}, additionalProperties:false}]}
> This is a simple contrived example, but I think it demonstrates my points well. I am interested to see how this can be defined given the current schema.
Right. I don't think that this is possible currently. (Prove me wrong!)
If you accept an object, you get one and only one chance to accept the parameters for validating object-type data, and they all apply and have one set of values.
> * Ian Lewis <ianmle...@gmail.com> [2008-09-04T12:21:54]
> > 2008/9/4 Ricardo SIGNES <rsig...@gmail.com>
> > > Which is simpler (please forgive my laziness in not quoting property names):
> > > [ EXAMPLE 1 ]
> > > ...or:
> > > [ EXAMPLE 2 ]
> > Which one is simpler is a matter of debate but the first is not valid schema afaik.
> Well, sure, but I couldn't very well give an example of what I thought might be better than the current layout while still giving a valid schema.
> > > To me, the first is simpler to read, extend, validate, and learn. Is this illegal, required to be rejected as a schema?
> > > { type: "array", minLength: 100 }
> > > Some poor sucker is going to write that (thinking, perhaps, about Array.length) and if he's lucky it will be an exception. If not, it just won't do what he wants.
> > That's potentially true of a lot of systems/formats.
> The fact that other software sucks doesn't mean that there is no reason to try to make software that sucks as little as is practical.
> A: I'm worried that our product might be terrible.
> B: Lots of products are terrible.
> A: Oh, no problem, then!
> Who wants to work with that team?
Yes, but making products better requires work. Work that may not be particularly necessary.
> > This essentially means that a type can be list of schema itself.
> Right, that's what I'm suggesting.
Apparently it is valid schema as you have written it. I didn't remember reading that the value of the type property could be a schema. However, a validator doesn't have to do any checks at this point to make sure that the attributes match the type given.
> > This seems more flexible but would add a bit more complexity and would burden on folks like me to implement it in a json schema validator when you can do the equivalent in jsonschema already.
> I have implemented a schema system that works this way in several languages, and found it pretty painless. I think that it's probably no more or less complex to implement. In JSON Schema as it stands, the complexity is unioning valid attributes for the permitted types and then ensuring that only those were given. (That is, if type is [string,number] know that maxItems is not allowed, or at least not useful.) In Rx, the complexity is recursing down schemata. I'm not sure there's even any complexity there, just difference.
> Anyway, if the response is "This might be better sometimes but it's too big of a change, so we're not going to do it," that's fine. It just seems, to me, like a better plan for future extension.
That's partly why I wouldn't want to change the schema in a way that made it hard to implement. It's kind of a moot point since it is valid anyway.
As for throwing errors when something like {"type":"string", maxItems: 5} is given as a schema, I kind of feel like it would add a decent amount of complexity to start validating more of the schema itself. Part of the reason XML Schema is such a pain is that writing a conforming validator is pretty hard, the Schema spec itself is pretty complex, and it's slow. JSON Schema is pretty interesting but I fear that most developers will steer clear of it if writing schemas or implementing a usable validator gets much more complex than it is currently.
Maybe. But part of the reason it was easy for me to implement a python validator was because I didn't have to check to make sure a particular attribute was valid for the given types. Each attribute could be evaluated separately. How would you validate that the schema is incorrect if you were given the second schema? You would need to do something like (in pseudo-code)
    for each type:
        // check for disallowed attributes
        case (type):
            string:
                check for attrs not allowed by string
            array:
                check for attrs not allowed by array
            ...
This seems like a pain and a lot of verbose code that provides little value. What if you want to extend json schema and add a custom validation to be used internally in your application? What if you want to change the behavior of a particular attribute? These were the kinds of things I wanted to support with my json-schema validator. Adding this kind of code makes it hard.
* Ian Lewis <ianmle...@gmail.com> [2008-09-04T21:54:57]
> > That being the case, it seems like requiring always:
> > (per-type parameterization)
> > ...and never allowing:
> > (ball-of-mud parameterization)
> > Could simplify things significantly.
> Maybe. But part of the reason it was easy for me to implement a python validator was because I didn't have to check to make sure a particular attribute was valid for the given types. Each attribute could be evaluated separately. How would you validate that the schema is incorrect if you were given the second schema? You would need to do something like (in pseudo-code)
Incorrect if I was given the second schema? I don't understand.
Roughly:
    if type( schema['type'] ) is string:
        validator_class = validator_class_registry[ schema['type'] ]
        return validator_class(schema)

    if type( schema['type'] ) is list:
        alternatives = [ make_schema(a_schema) for a_schema in schema['type'] ]

> for each type:
>     // check for disallowed attributes
>     case (type):
>         string:
>             check for attrs not allowed by string
>         array:
>             check for attrs not allowed by array
>         ...
Don't use conditionals, use classes. Then it's dead simple.
From ArrType's __init__:
    if not set(schema.keys()).issubset(set(('type', 'contents', 'length'))):
        raise Error('unknown parameter for //arr')
Anybody can then write his own type for validation without needing to worry about conflicting with existing parameters.
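Put together, the class-per-type approach sketched above might look something like this. It is a minimal, hypothetical illustration: the class names, the `REGISTRY` dict, the `AnyOf` wrapper, and the parameter sets are assumptions made for the example and are not taken from the Rx or jsonschema implementations.

```python
class SchemaError(Exception):
    """Raised when a schema uses a parameter its type does not know about."""


class BaseType:
    allowed = {"type"}  # parameters every type accepts

    def __init__(self, schema):
        unknown = set(schema) - self.allowed
        if unknown:
            raise SchemaError(
                "unknown parameter(s) for %s: %s" % (schema["type"], sorted(unknown))
            )
        self.schema = schema


class StrType(BaseType):
    allowed = {"type", "length"}


class ArrType(BaseType):
    allowed = {"type", "contents", "length"}


class AnyOf:
    """A union: each alternative has already been checked on its own."""

    def __init__(self, alternatives):
        self.alternatives = alternatives


# Hypothetical registry mapping type names to their classes.
REGISTRY = {"string": StrType, "array": ArrType}


def make_schema(schema):
    kind = schema["type"]
    if isinstance(kind, list):
        # A list of full schemas: recurse, no unioning of parameter sets needed.
        return AnyOf([make_schema(alternative) for alternative in kind])
    return REGISTRY[kind](schema)
```

Under these assumptions, `make_schema({"type": "array", "minLength": 100})` raises `SchemaError` at schema-construction time rather than silently ignoring the misplaced parameter.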
> This seems like a pain and a lot of verbose code that provides little value. What if you want to extend json schema and add a custom validation to be used internally in your application? What if you want to change the behavior of a particular attribute? These were the kinds of things I wanted to support with my json-schema validator. Adding this kind of code makes it hard.
No, this makes it *easy* because you write a new validator class for each new type.
You would have to check the values of the type attribute and the other provided attributes to make sure there aren't any invalid attributes given in the schema. The way you implemented this in Rx, this is easy; it's not so easy the way I implemented jsonschema. It wasn't a requirement and I didn't place much value on validating the schema itself in this way. Given this, I thought I could make a more easily used/extended validator without creating classes for each type.
> Roughly:
> if type( schema['type'] ) is string:
>     validator_class = validator_class_registry[ schema['type'] ]
>     return validator_class(schema)
>
> if type( schema['type'] ) is list:
>     alternatives = [ make_schema(a_schema) for a_schema in schema['type'] ]
Yah, I suppose you could do something like this and put validation for attributes common to all or multiple types in parent classes. How would you envision altering the behavior for say the "optional" keyword across all types?
> > for each type:
> >     // check for disallowed attributes
> >     case (type):
> >         string:
> >             check for attrs not allowed by string
> >         array:
> >             check for attrs not allowed by array
> >         ...
> Don't use conditionals, use classes. Then it's dead simple.
> From ArrType's __init__:
> if not set(schema.keys()).issubset(set(('type', 'contents', 'length'))):
>     raise Error('unknown parameter for //arr')
> Anybody can then write his own type for validation without needing to worry about conflicting with existing parameters.
> > This seems like a pain and a lot of verbose code that provides little value. What if you want to extend json schema and add a custom validation to be used internally in your application? What if you want to change the behavior of a particular attribute? These were the kinds of things I wanted to support with my json-schema validator. Adding this kind of code makes it hard.
> No, this makes it *easy* because you write a new validator class for each new type.
Taking a type-centric view of things is ok, but I think it might make dealing with attributes that don't have much to do with type, like "required" or "identity", kind of a pain, as you would need to alter the behavior of the base class's check method or extend every type class. Perhaps a hybrid approach, with validation methods for each schema attribute that could be overridden at different levels, would be best if allowing/disallowing attributes based on the type is a requirement. I simply figured that extending one class, the validator itself, would be easier.
* Ian Lewis <ianmle...@gmail.com> [2008-09-04T23:53:51]
> You would have to check the values of the type attribute and the other provided attributes to make sure there aren't any invalid attributes given in the schema. The way you implemented this in Rx, this is easy; it's not so easy the way I implemented jsonschema. It wasn't a requirement and I didn't place much value on validating the schema itself in this way. Given this, I thought I could make a more easily used/extended validator without creating classes for each type.
The best way to let people extend the validator for their own purposes is to let them do so without altering (and thereby screwing up) the validator's code. So, they'll need to provide something that lists its valid arguments (because even if you want to allow all valid arguments all the time, even when meaningless, you want to never allow always-invalid arguments), has a name so that the type can be recognized in a list of strings, and validates a value based on those arguments.
What you really don't want to end up with is a system where everybody who wants a custom type has to either (a) alter JsonSchema.py or (b) go through the committee to add the type to the core.
> Yah, I suppose you could do something like this and put validation for attributes common to all or multiple types in parent classes. How would you envision altering the behavior for say the "optional" keyword across all types?
Well, that's why I made required/optional a function of the //rec type, rather than of each object (//rec) property. It meant that the per-object-entry type could stand alone and be validated alone. It also means it could be re-used. You could say:
Now we have a real data type, with the optional/required bits bumped up to the //rec type, where they are relevant. The //map isn't burdened at all by an 'optional' value for its values-type, since that would make no sense.
Now, that said, you could definitely say that every data type has an isOptional property. It just means that you'll end up having to reuse 'd20-attr-optional' *and* 'd20-attr-required'.
Also, it means you'll probably want d20-attr (alone) for use either as the type that's extended for opt/req and for use in places where optional-ness is meaningless.
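The idea of hanging required/optional on the record rather than on each entry's type can be sketched as below. This is a hedged illustration only, not Rx's implementation: `check_record` and the two predicate helpers are hypothetical names, and entry types are modelled as plain predicate functions so that each one stands alone and can be reused.

```python
def is_str(value):
    return isinstance(value, str)


def is_int(value):
    return isinstance(value, int)


def check_record(value, required, optional):
    """Validate a dict against required/optional entries declared on the record.

    `required` and `optional` map entry names to standalone predicates;
    the predicates themselves know nothing about optional-ness.
    """
    if not isinstance(value, dict):
        return False
    if set(required) - set(value):
        return False  # a required entry is missing
    if set(value) - set(required) - set(optional):
        return False  # an entry that the record does not declare
    checks = {**required, **optional}
    return all(checks[name](entry) for name, entry in value.items())


# The same `is_int` predicate is reused as a required entry here
# and could be an optional entry in another record.
print(check_record({"name": "widget", "id": 7}, {"name": is_str}, {"id": is_int}))
```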
> Taking a type-centric view of things is ok, but I think it might make dealing with attributes that don't have much to do with type, like "required" or "identity", kind of a pain, as you would need to alter the behavior of the base class's check method or extend every type class. Perhaps a hybrid approach, with validation methods for each schema attribute that could be overridden at different levels, would be best if allowing/disallowing attributes based on the type is a requirement.
Sure, see above. Frankly, though, I find that it isn't a pain at all. Don't extend every type class, because that's not going to scale. Instead, give each type all the data it needs to validate its contents at the right scope.
It is sometimes less elegant to read than if you shove top-level properties down into contained schemata, but it is easier to implement and extend.
> I simply figured that extending one class, the validator itself, would be easier.
Obviously I don't have a million users, so I can't say with certainty what they would want if they existed. My prediction, though, is that the most likely thing people will extend is *what* can be validated. "I want to be able to validate that something is the name of a state capital." That, as opposed to altering *how* validation occurs. "I want to be able to say that two errors are okay."
If that is true, then the place to optimize for easy extension is in the type catalog.
> * Ian Lewis <ianmle...@gmail.com> [2008-09-04T23:53:51]
> > You would have to check the values of the type attribute and the other provided attributes to make sure there aren't any invalid attributes given in the schema. The way you implemented this in Rx, this is easy; it's not so easy the way I implemented jsonschema. It wasn't a requirement and I didn't place much value on validating the schema itself in this way. Given this, I thought I could make a more easily used/extended validator without creating classes for each type.
> The best way to let people extend the validator for their own purposes is to let them do so without altering (and thereby screwing up) the validator's code. So, they'll need to provide something that lists its valid arguments (because even if you want to allow all valid arguments all the time, even when meaningless, you want to never allow always-invalid arguments), has a name so that the type can be recognized in a list of strings, and validates a value based on those arguments.
> ...but that looks a lot like a poor man's class.
> What you really don't want to end up with is a system where everybody who wants a custom type has to either (a) alter JsonSchema.py (b) go through the committee to add the type to the core.
I think we lost each other. I didn't mean not using classes at all, but rather that instead of making separate classes for each type, I made one class for the validator. If you want to extend the validator you, well, extend the validator. The difference is that you took a type-centric view and I took an attribute-centric view. If you wanted to add a type with my implementation you would extend the validator and override the validate_type() function. In yours, you would create a new class and register it with the validator.
I didn't mean that I thought it was preferable to do something crazy like what you are describing here.
> > Yah, I suppose you could do something like this and put validation for attributes common to all or multiple types in parent classes. How would you envision altering the behavior for say the "optional" keyword across all types?
> Well, that's why I made required/optional a function of the //rec type, rather than of each object (//rec) property. It meant that the per-object-entry type could stand alone and be validated alone. It also means it could be re-used. You could say:
> Now we have a real data type, with the optional/required bits bumped up to the //rec type, where they are relevant. The //map isn't burdened at all by an 'optional' value for its values-type, since that would make no sense.
> Now, that said, you could definitely say that every data type has an isOptional property. It just means that you'll end up having to reuse 'd20-attr-optional' *and* 'd20-attr-required'.
> Also, it means you'll probably want d20-attr (alone) for use either as the type that's extended for opt/req and for use in places where optional-ness is meaningless.
> > Taking a type centric view of things is ok, but I think it might make dealing with that don't have much to do with type, like "required", or "identity", kind of a pain as you would need to alter the behavior of the base class' check method or extend every type class. Perhaps a hybrid approach with validation methods for each schema attribute that could be overridden at different levels would be best if allowing/disallowing attributes based the type is a requirement.
> Sure, see above. Frankly, though, I find that it isn't a pain at all. Don't extend every type class, because that's not going to scale. Instead, give each type all the data it needs to validate its contents at the right scope.
> It is sometimes less elegant to read than if you shove top-level properties down into contained schemata, but it is easier to implement and extend.
> > I simply figured that extending one class, the validator itself, would be easier.
> Obviously I don't have a million users, so I can't say with certainty what they would want if they existed. My prediction, though, is that the most likely thing people will extend is *what* can be validated. "I want to be able to validate that something is the name of a state capital." That, as opposed to altering *how* validation occurs. "I want to be able to say that two errors are okay."
> If that is true, then the place to optimize for easy extension is in the type catalog.
Yah, but you can do this by extending the validator. Want to add a new attribute? Extend the validator class and add a validate_mynewattr() function. Adding a new type (state_capital?) means doing the same but overriding the validate_type() method.
    class myValidator(JSONSchemaValidator):
        def validate_type(self, x, fieldname, schema, fieldtype=None):
            if fieldtype == "state_capital":
                if x.get(fieldname) not in self.statecapitals:
                    raise ValueError("%s is not a state capital." % x.get(fieldname))
            else:
                JSONSchemaValidator.validate_type(self, x, fieldname, schema, fieldtype)

You can add your validate_mynewattr() method and an overridden validate_optional() to the same myValidator class. However, adding a type is probably the most popular use case for extension so there is some value in having it be in different classes but either way works.
In any case it's starting to feel like an implementation or style difference.
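For comparison, the attribute-centric style described above can be sketched as a self-contained toy. Everything here is hypothetical: `AttributeValidator` and its method signatures are made up for the illustration and are not the jsonschema validator's actual API. The point is only the dispatch pattern, one `validate_<attr>()` method per schema attribute, looked up by name.

```python
class AttributeValidator:
    """Toy attribute-centric validator: one method per schema attribute."""

    def validate(self, value, schema):
        for attr, expected in schema.items():
            method = getattr(self, "validate_" + attr, None)
            if method is None:
                continue  # attributes without a method are ignored
            method(value, expected)

    def validate_type(self, value, expected):
        known = {"string": str, "integer": int, "array": list}
        if expected in known and not isinstance(value, known[expected]):
            raise ValueError("%r is not of type %s" % (value, expected))

    def validate_minLength(self, value, expected):
        # Only evaluated for strings; for any other value it is a no-op.
        if isinstance(value, str) and len(value) < expected:
            raise ValueError("%r is shorter than %d" % (value, expected))


class CapitalValidator(AttributeValidator):
    capitals = {"Albany", "Austin", "Boston"}  # illustrative subset only

    def validate_type(self, value, expected):
        if expected == "state_capital":
            if value not in self.capitals:
                raise ValueError("%r is not a state capital." % (value,))
        else:
            super().validate_type(value, expected)
```

In this toy, `CapitalValidator().validate([1, 2], {"type": "array", "minLength": 100})` passes silently, which is the behaviour discussed earlier in the thread for a misplaced `minLength`.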
> Yah, but you can do this by extending the validator. Want to add a new
> attribute? Extend the validator class and add a validate_mynewattr()
> function. Adding a new type (state_capital?) means doing the same but
> overriding the validate_type() method.
> class myValidator(JSONSchemaValidator):
>     def validate_type(self, x, fieldname, schema, fieldtype=None):
>         if fieldtype == "state_capital":
>             if x.get(fieldname) not in self.statecapitals:
>                 raise ValueError("%s is not a state capital." % x.get(fieldname))
>         else:
>             JSONSchemaValidator.validate_type(self, x, fieldname, schema, fieldtype)
> You can add your validate_mynewattr() method and overridden
> validate_optional() to the same myValidator class. However, adding a type is
> probably the most popular use case for extension so there is some value in
> having it be in different classes but either way works.
> In any case it's starting to feel like an implementation or style
> difference.
Ok, so I will try to make this the last thing I say on the subject, in
deference to that.
If you have to subclass the validator to add types, you are going to
start suffering when you say, "I want to use JSON Schema, and I want
John Smith's extension for complex numbers and Jane Doe's extension
for shape types." You will either have multiple inheritance or
something uglier.
This seems like a problem in search of a compositional solution, not
an inheritance solution.
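A compositional arrangement of the kind argued for here might look roughly like the following. It is a sketch under stated assumptions: `TypeCatalog`, `install_complex`, and `install_shapes` are hypothetical names standing in for independently written extensions, and the checks are deliberately trivial.

```python
class TypeCatalog:
    """Holds named type checks; extensions add entries instead of subclassing."""

    def __init__(self):
        self._checks = {}

    def register(self, name, check):
        if name in self._checks:
            raise ValueError("type %r is already registered" % name)
        self._checks[name] = check

    def validate(self, type_name, value):
        return self._checks[type_name](value)


# Two independent extensions, each unaware of the other.
def install_complex(catalog):
    catalog.register("complex", lambda v: isinstance(v, complex))


def install_shapes(catalog):
    catalog.register("shape", lambda v: v in {"circle", "square", "triangle"})


catalog = TypeCatalog()
for install in (install_complex, install_shapes):
    install(catalog)  # composing extensions is just calling each installer

print(catalog.validate("complex", 1 + 2j))
print(catalog.validate("shape", "hexagon"))
```

Neither extension subclasses the catalog, so combining them needs no multiple inheritance; a name clash surfaces as an explicit error at registration time.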