Validation/structural features:
Hyper-schema features:
Semantic:
Right, your turn - let's hear it. :)
General features:
- "$data" - +1
- Multi-lingual meta-data - -0. Metadata keywords confuse me and I feel like they clutter the spec and confuse users sometimes.
- "enumNames", "formatMinimum", "formatMaximum" - -+-0, I find it really unfortunate how we have this proliferation of not-quite-namespaces rather than structured things, but I don't have a better suggestion and this seems to be the direction we're already going in so...
- Ban unknown properties mode - indifferent. I also wouldn't use this.
Validation/structural features:
- Alternative string values for type - unsurprisingly, +1
- "switch" - undecided. Need to look at this more carefully to see how much I like it. I'm cautiously wary.
- "patternGroups" - -0 agreed about this having limited use
- "contains" - +1
- Re-introduction of some v3 format values, Additional format values - +1
- "unordered" - -0, again due to my wariness about non-validating properties, but also because I would perhaps do this by writing "type": "set" along with the above
- "$data" - I really like this. It's flexible and powerful, and can be fairly straightforward to implement.
- "enumNames" - I like this, especially when combined with the multi-lingual meta-data.
So, in this case "prop2" would be valid if "prop1" is a number.
> - "enumNames" - I like this, especially when combined with the multi-lingual meta-data.
Still, I think this is weird and not very flexible. What if I need an additional description? I think I might even ditch usage of "enum" completely in favor of "anyOf". This addition does not make it much better imho.
If a schema says "prop2" is invalid, then that (to my mind) should be something that I can fix by changing "prop2".
However, in this example, changing the supposedly invalid data ("prop2") has no effect whatsoever. In order to make "prop2" valid, I have to change "prop1".
(If I think this is strange, you can imagine how I feel about the idea of referencing external URIs - you could end up with a document that was invalid, but which you could only 'fix' by changing a completely separate document which you may or may not even have control over!)
If I was actually explaining to a human what the problem was, I would say "If prop1 is defined, then prop2 must be a number". So this particular constraint could be re-phrased as:
> If a schema says "prop2" is invalid, then that (to my mind) should be something that I can fix by changing "prop2".
Why? How can you "fix" that by only looking at "prop2", if the validity of a property actually depends on something else?
> However, in this example, changing the supposedly invalid data ("prop2") has no effect whatsoever. In order to make "prop2" valid, I have to change "prop1".
Yes, of course, that is simply how it is defined in the data model. How is that weird?
So, that would mean that the user can select a T-Shirt when the outside temp is at least 12 degrees, and can also choose to carry a fur coat if the temp is below 15 degrees, so for a temp of 13 degrees both choices would actually be valid. (Of course you could also use "switch" for that; both approaches should work.)
So - since you are not God or doing actual weather manipulation - you can't change the outside temp, right? So, that is just the way it is. Of course, the user can still make his entry valid if he just chooses the appropriate clothing. And if wearing T-Shirts under 12 degrees is disallowed, he simply cannot choose a T-Shirt. It is that easy. Imho it all comes down to the actual requirements.
> If I was actually explaining to a human what the problem was, I would say "If prop1 is defined, then prop2 must be a number". So this particular constraint could be re-phrased as:
That might work for that particular example. But how do I access another property that is not a sibling, but rather some levels deeper, and for attributes other than "required"?
> An "enum" is a familiar concept, so there is an advantage in using it in these cases.
Yeah, I agree in general, but it is too inflexible when additional information is needed for individual enum values. And I think adding something like "enumNames" is only a halfway solution. Next we would need "enumDescription" and so on. I think "oneOf" might be more verbose, but it is also much more versatile.
Now, can you also explain your proposal in one sentence?
It seems there are two basic issues, (1) how to model "variable" conditions that involve instance data (e.g. comparisons between static values and instance data or between two values in the instance data);
and (2) whether or not to allow external references (URIs) or only refer to data within a single instance.
On the second question, it's an interesting proposal to open it up to pull data from anywhere, but it is fraught with complications. As I'm sure you realize, a URI is not enough information by itself to pull in data. Even if you expanded $data to specify all the REST metadata included in "links", and even if you limited resources to http and application/json and JSON Pointer style fragment resolution, etc., you still wouldn't adequately specify how to communicate with 99% of web services out there.
JSON Schema is one little corner of the world trying to change this (in my mind horrible) situation, but thinking about it realistically, as you say: JSON Schema can't really specify a generic mechanism for pulling data from external sources.
So we are left with (typically) the server doing the manual work of communicating with external services and processing that data to include it in responses to the client -- i.e., in the instance data.
Geraint's switch syntax above (which does not really deal with $data references, so maybe not a good example) makes it easier for a validator to generate the former kinds of messages.
Whereas to my mind, modelling the condition in terms of the property (either using absolute paths as you do or relative JSON pointers) makes the validator's work to generate useful error messages more difficult/specialized. That, plus quite a bit of special-casing about where $data can and can't be used, not all of the implications being fleshed out, plus how it interacts with the new formatMinimum/formatMaximum, make me very nervous about the whole proposal.
My own preference re. $data is to limit its use to templating links, for now, and continue discussing its use in and syntax for instance-data-dependent conditions. But I expect I'm in the minority, as that seems like a very hotly needed feature.
You can't fix it just by looking at "prop2" - the thing that actually needs changing is "prop1". And it's for this reason that I think that ending up with a validation error for "prop2" is incorrect. It is possible to re-structure the constraints such that the validation errors end up on "prop1", which I think is much more helpful.
There is more than one way to specify the same set of constraints, and the way you express the constraints affects the nature/helpfulness of the validation errors. I think that the syntax you are referencing is pretty much always going to give you validation errors that point at an unintuitive part of the data (e.g. "part2"), when the part that you ac
If the two concerns are within the same document, it can always be rephrased - as Eric G says, to "model comparisons not in terms of properties being valid or invalid but in terms of the object that has the propert(ies) being compared".
So your syntax is needed to say that "prop2" is valid only when "prop1" is a string. However, in terms of which documents are actually valid, that is completely equivalent to saying that if "prop2" is defined then "prop1" must be a string - which you can say with existing vocab (though it's even neater with "switch").
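To make that rephrasing concrete, here is a rough sketch in Python of "if prop2 is defined then prop1 must be a string" using the existing draft-04 "dependencies" keyword. The validate function is a hand-rolled, hypothetical checker that only understands the few keywords used here, and I am assuming that "must be a string" also means "prop1" has to be present. The point is just that the errors land on "prop1":

```python
# Draft-04 style schema using "dependencies" with a schema value.
# Assumption: "prop1 must be a string" also requires prop1 to be present.
SCHEMA = {
    "dependencies": {
        "prop2": {
            "required": ["prop1"],
            "properties": {"prop1": {"type": "string"}},
        }
    }
}


def validate(doc):
    """Hypothetical mini-checker for SCHEMA only.

    Returns a list of JSON Pointer paths to the data that needs changing.
    """
    errors = []
    for trigger, dep in SCHEMA["dependencies"].items():
        if trigger not in doc:
            continue  # the dependency only applies when the trigger is present
        for name in dep.get("required", []):
            if name not in doc:
                errors.append("/" + name)
        for name, sub in dep.get("properties", {}).items():
            if name in doc and sub.get("type") == "string":
                if not isinstance(doc[name], str):
                    errors.append("/" + name)
    return errors


print(validate({"prop1": 5, "prop2": 1}))
```

With this phrasing, a document where "prop1" is a number and "prop2" is present is reported with an error path of "/prop1", which is the part the user actually has to change.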
The idea of data that is valid only in winter doesn't really fit into that principle - validation status should not change over time.
I mean, what if you update your data format, and you need to re-run using the new schema to make sure it's all correct? If it's relying on external data, then it fails half your clothing choices because what you chose last winter is no longer suitable for summer weather. But this data was valid at the time it was submitted - so it should be valid now.
JSON data should not go off like food in the fridge if you leave it for too long.
In this case, I'd say you need summer and winter schemas, expressing the differing clothing requirements. I mean, if your API suddenly changes its requirements, then a schema change is not out of the question.
I'm afraid I don't see the problem - I mean, I can go arbitrarily deep in either my "if" or "then", and I can use any keywords I like in either of them.
Yeah, I'm agreeing with this more now. I might go and dig up an old thread to get the other side of this one, but "oneOf" is a reasonable solution that is both complete and doesn't require new keywords.
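For what it's worth, here is a small sketch of what the "oneOf" style could look like, written as a Python dict plus a helper that pulls out a label and description per value. The colour values, COLOUR_SCHEMA and describe_choices are all made up for illustration; only "oneOf", "enum", "title" and "description" are existing keywords:

```python
# Each allowed value gets its own branch, so it can carry its own metadata.
# The values and labels below are invented for this example.
COLOUR_SCHEMA = {
    "oneOf": [
        {"enum": ["r"], "title": "Red", "description": "Warm colour"},
        {"enum": ["g"], "title": "Green"},
    ]
}


def describe_choices(schema):
    """Return (value, title, description) for every value in every branch."""
    choices = []
    for branch in schema["oneOf"]:
        for value in branch["enum"]:
            choices.append(
                (value, branch.get("title"), branch.get("description"))
            )
    return choices


for choice in describe_choices(COLOUR_SCHEMA):
    print(choice)
```

It is more verbose than "enum" plus "enumNames", but any further per-value metadata fits into the same branch without a new keyword.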
Well, I already have shown how it could work with "minimum", thus not needing something like "minimum": {"$data": "1/smaller"}
"If certain schema keywords contain a "$data" property, evaluate the given path and use the value retrieved from that instead of the schema value that would be normally used"
I think it might be good to clear up - I actually think our two "$data" proposals are separate and incompatible.
Your proposal doesn't cover the link-templating case, and if your constraints are all within the same document, I'm still unclear what feature it actually adds - it seems like an alternative syntax to express things which can already be expressed.
yes, I meant the example I posted just above, this one:
And no, no additional keywords required, also applies for _all_ schema attributes, not only "minimum". That's why I like it.
I am not sure. At least with your "minimum" example, it also seems to work with my proposal. So, if I haven't overlooked something, this should include _all_ of your own proposals.
Ah! So, does the behaviour of your "$data" vary depending on where you use it?
So if you use it inside "minimum", then it extracts the value and then uses it as the minimum-value-constraint for the actual data (in this case, "larger"?), but if you use it inside a schema, then it extracts the value and uses it as the instance for validation?
> Ah! So, does the behaviour of your "$data" vary depending on where you use it?
All that $data does with my proposal is to replace the implicitly derived value with the one specified with "$data".
> So if you use it inside "minimum", then it extracts the value and then uses it as the minimum-value-constraint for the actual data (in this case, "larger"?), but if you use it inside a schema, then it extracts the value and uses it as the instance for validation?
Well - if I get it right - the difference is this:
You say: "The minimum of value B should be greater than the _minimum_ (defined in its schema) for value A"
while I say: "The minimum of value B should be greater than the (current!) _value_ of A"
And I think the latter one is really what I would care about in such a case. The advantage I see is that the latter approach can use _all_ of the schema features, while your proposal is limited to just _some_. Plus, I do not have to care about the schema of value A, since it is actually the _value_ I care about.
And in addition, I still think both of these uses for "$data" could be possible. But after all, what at least _I_ need is to express relationships between data, not its schemas. Of course the latter might still be useful _in addition_. So for example, like this:
"allOf": [
    {
        "$data": "path/to/value1",
        "value": true
    },
    {
        "minimum": { "$data": "path/to/value2/smaller" }
    }
]
Yeah, so why not? In this example, the property would be valid if a) "value1" is "true", and b) the minimum is smaller than the minimum defined in the schema for "value2".
Hope the difference is more clear now, but as you see, the meaning of "$data" would not change in essence, just the context where it is used.
The derived value of the data, or from the schema/keyword?
OK, we're evidently at cross-purposes here, because I'm saying: "The minimum of B should be {the value of A}".
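In case it helps, here is roughly how I picture that substitution, as a Python sketch. It assumes the "$data" path is an absolute JSON Pointer into the instance (the relative form like "1/smaller" is not handled), and resolve_pointer, substitute_data and minimum_ok are hypothetical names, not anything from an existing validator:

```python
def resolve_pointer(document, pointer):
    """Resolve an absolute JSON Pointer (RFC 6901 style) against a document."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("this sketch only handles absolute pointers")
    current = document
    for token in pointer[1:].split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def substitute_data(keyword_value, instance_root):
    """Swap {"$data": path} for the instance value found at that path."""
    if isinstance(keyword_value, dict) and set(keyword_value) == {"$data"}:
        return resolve_pointer(instance_root, keyword_value["$data"])
    return keyword_value


def minimum_ok(schema, value, instance_root):
    """Check "minimum" after substituting any "$data" reference."""
    if "minimum" not in schema:
        return True
    return value >= substitute_data(schema["minimum"], instance_root)


instance = {"smaller": 3, "larger": 5}
schema_for_larger = {"minimum": {"$data": "/smaller"}}
print(minimum_ok(schema_for_larger, instance["larger"], instance))
```

So the schema for B stays the same apart from one keyword value, which is read from A at validation time.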
We are obviously misunderstanding each other. I thought that you were saying that "$data" altered which instance was being validated, but that the rest of the schema was the same
- I'm completely unclear how that can even work when used inside the "minimum" property.
Yeah, I'm not trying to take values from the schema - I don't know how that got mixed-up. I am extracting the value of A, and I am substituting that value into the schema for B. So:
The reason this has to be limited to some keywords is schemas like:
{
    "properties": {
        "$data": ...
    }
}
This is specifying a schema for a data property called "$data" - and we don't want the validator to end up substituting the value of "properties" in the schema. Does that make sense?
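A rough Python sketch of that restriction, under some loud assumptions: DATA_KEYWORDS is a made-up whitelist, expand is a hypothetical helper, and for brevity the "$data" path is treated as a plain top-level property name in the instance rather than a real pointer. Substitution only happens directly under a whitelisted keyword, so a data property that happens to be called "$data" is left alone:

```python
# Hypothetical whitelist of keywords whose value may be a {"$data": ...} reference.
DATA_KEYWORDS = {"minimum", "maximum"}


def expand(schema, instance_root):
    """Return a copy of schema with "$data" substituted under whitelisted keywords.

    Simplification: the "$data" path is looked up as a top-level key of the
    instance. Only "properties" is recursed into in this sketch.
    """
    expanded = {}
    for key, value in schema.items():
        is_reference = isinstance(value, dict) and set(value) == {"$data"}
        if key in DATA_KEYWORDS and is_reference:
            expanded[key] = instance_root[value["$data"]]
        elif key == "properties":
            # The keys here are data property names, never keywords, so a
            # property literally named "$data" is just another subschema.
            expanded[key] = {
                name: expand(sub, instance_root) for name, sub in value.items()
            }
        else:
            expanded[key] = value
    return expanded


schema = {
    "properties": {
        "$data": {"type": "string"},
        "larger": {"minimum": {"$data": "smaller"}},
    }
}
print(expand(schema, {"smaller": 3}))
```

Here "properties" itself is never replaced, the subschema for the property named "$data" survives untouched, and only the "minimum" under "larger" picks up the value 3.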