I think that this discussion is about the same problem that I'm having but it seems to be framed differently. So I'm not sure that the proposed "array of strings" + "allOf" syntax actually deals with my situation.
I'm creating a JSON data structure to model genealogical data. Part of this involves modelling the content of documents such as a birth certificate. There are two elements of this of interest here. One lists the subjects in the document and the other describes the relationships among the subjects.
At first it seems plausible that you can describe these two elements as follows:

    "subject" : {
        "type" : "array",
        "items" : {
            "type" : "object",
            "properties" : {
                "name" : { "type" : "string" },
                "role" : { "type" : "string" }, ...
            }}},
    "relationship" : {
        "type" : "array",
        "items" : {
            "type" : "object",
            "properties" : {
                "source_subject" : { "type" : "string" },
                "relationship" : { "type" : "string" },
                "target_subject" : { "type" : "string" }
            }}}

where the values of "source_subject" and "target_subject" are JSON pointers to items in the subject array.
The problem with this is that the pointers identify subjects by their position in the array (for example, "/subject/1") rather than by a stable key. If the order of the elements in the array changes, say when an item is dropped, the pointers end up referring to different subjects or to nothing at all, and the referential integrity of the data is destroyed.
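To make the failure concrete, here is a minimal Python sketch. The `resolve` helper is a hand-written, simplified JSON Pointer resolver, and the sample data (including the extra "Jane Doe" subject) is hypothetical, made up only for this illustration.

```python
def resolve(document, pointer):
    """Resolve a JSON Pointer such as "/subject/1" against parsed JSON data.

    Simplified sketch: handles object keys and array indexes only.
    """
    if pointer == "":
        return document
    node = document
    for token in pointer[1:].split("/"):
        # Undo the two escape sequences defined for JSON Pointer tokens.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical instance data using position-based pointers.
doc = {
    "subject": [
        {"name": "Jane Doe", "role": "parent"},
        {"name": "Fred Doe", "role": "parent"},
        {"name": "John Doe", "role": "self"},
    ],
    "relationship": [
        {
            "source_subject": "/subject/1",
            "relationship": "father",
            "target_subject": "/subject/2",
        }
    ],
}

source = doc["relationship"][0]["source_subject"]
before = resolve(doc, source)["name"]  # the father, as intended

del doc["subject"][0]  # drop the first subject; later items shift down

after = resolve(doc, source)["name"]  # same pointer, different person
```

After the deletion the unchanged pointer "/subject/1" resolves to John Doe instead of Fred Doe, and "/subject/2" no longer resolves at all.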
As suggested by silvio, a solution to this is to use keyed arrays or hash tables instead of the relative arrays described by JSON Schema. Using keyed arrays, the subject and relationship data might look like this:

    "subject" : {
        "key1" : {
            "name" : "John Doe",
            "role" : "self" },
        "key2" : {
            "name" : "Fred Doe",
            "role" : "parent" }
    },
    "relationship" : [ {
        "source_subject" : "/subject/key2",
        "relationship" : "father",
        "target_subject" : "/subject/key1" } ]

This works well, but now the subject array cannot be described by a JSON Schema. JSON Schema can only describe the values in a key/value pair and not the keys. It also assumes that the name of the key is known, and in situations like the subject array the keys, in general, cannot be known ahead of time.
(You will notice that the absolute key problem does not appear in the relationship array only because we are not trying to reference the array items from outside the array.)
This problem occurs over and over in complex data structures. Right now I maintain my data using keyed arrays. This helps ensure the referential integrity of JSON pointers across the entire data structure. To make use of Jeremy Dorn's excellent json-editor, I map keyed arrays into relative arrays that can be described by JSON Schema and then map them back once the edits are done.
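The round trip I describe can be sketched roughly as follows. This is only an illustration of the idea, not my actual code: the function names and the choice to carry the key in a field called "key" are assumptions made for the example.

```python
def keyed_to_list(keyed, key_field="key"):
    """Turn {"key1": {...}, ...} into [{"key": "key1", ...}, ...].

    Assumes the values do not already use `key_field` as a property name.
    """
    return [{key_field: key, **value} for key, value in keyed.items()]


def list_to_keyed(items, key_field="key"):
    """Inverse mapping; rejects duplicate keys so pointers stay unambiguous."""
    keyed = {}
    for item in items:
        value = dict(item)  # copy so the caller's item is not modified
        key = value.pop(key_field)
        if key in keyed:
            raise ValueError(f"duplicate key: {key!r}")
        keyed[key] = value
    return keyed


subjects = {
    "key1": {"name": "John Doe", "role": "self"},
    "key2": {"name": "Fred Doe", "role": "parent"},
}

as_list = keyed_to_list(subjects)      # form an array schema can describe
round_trip = list_to_keyed(as_list)    # back to the keyed form after editing
```

Because each item carries its key through the edit, pointers such as "/subject/key2" still resolve to the same subject after the round trip, whatever order the editor leaves the items in.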
It would be nice if JSON Schema could be amended to describe a collection of objects where each instance is identified by a unique key. It would be like the existing JSON Schema array structure except that the "array" items would have explicit keys instead of implicit keys based on item order. Constraints could then be placed on the key (uniqueness being a critical one) as well as on the instance values.
Do your proposals actually cover this use case?