The id attribute sets up a self link to the schema. But what if
'id' is not set in a sub-schema? Is the self link then that of the
parent schema?
For $ref, it says: "this URI MAY be relative or absolute, and relative
URIs SHOULD be resolved against the URI of the current schema"
Would this resolve against the id of a schema if it is set?
Regards,
Martijn
If the "id" of a schema is not defined, then the URI of the schema
follows the default rule of slash-based instance referencing.
"parent"
"parent#/child"
"parent#/child/subchild"
> For $ref, it says: "this URI MAY be relative or absolute, and relative
> URIs SHOULD be resolved against the URI of the current schema"
>
> Would this resolve against the id of a schema if it is set?
An instance can have multiple URIs assigned to it. Take the following schema:
example.json
{
  "id" : "test",
  "properties" : {
    "a" : {
      "id" : "aaa"
    }
  }
}
The schema that has id "aaa" can be referenced using the following URIs:
"example.json#/properties/a"
"test#/properties/a"
"aaa"
Relative URIs in "id" are resolved against the default URI of itself,
which is created using slash-delimited referencing rules of the
container instance's URI.
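A minimal sketch of how those three URIs could be derived, not taken from the spec. It assumes the file was retrieved from the hypothetical absolute URI http://example.com/example.json, and the helper name uris_for is made up for illustration. It only looks at an "id" on the root and on the target sub-schema.

```python
from urllib.parse import urljoin

# Assumption: an absolute retrieval URI; the thread only says "example.json".
RETRIEVED = "http://example.com/example.json"

schema = {"id": "test", "properties": {"a": {"id": "aaa"}}}


def uris_for(schema, path, retrieved):
    """Collect the URIs under which the sub-schema at `path` can be referenced.

    Hypothetical helper: ids on intermediate schemas are ignored.
    """
    # The root is known by its retrieval URI and, if present, by its id.
    bases = [retrieved]
    if "id" in schema:
        bases.append(urljoin(retrieved, schema["id"]))

    # Walk down to the target while building the slash-delimited fragment.
    node, pointer = schema, ""
    for key in path:
        node = node[key]
        pointer += "/" + key

    uris = [base + "#" + pointer for base in bases]
    if "id" in node:
        # A relative id is resolved against the default (fragment) URI.
        uris.append(urljoin(uris[0], node["id"]))
    return uris


for uri in uris_for(schema, ["properties", "a"], RETRIEVED):
    print(uri)
```

With the assumed host stripped, the three printed URIs correspond to "example.json#/properties/a", "test#/properties/a" and "aaa".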
-Gary
This is going to be lengthy, as I want to make sure the spec describes
a rule for this, and that the rule is good.
> If the "id" of a schema is not defined, then the URI of the schema
> follows the default rule of slash-based instance referencing.
>
> "parent"
> "parent#/child"
> "parent#/child/subchild"
Really? Where in the spec does it say that?
I understand it should be possible to address parts of a schema using
fragment resolution, but I'm not talking about this scenario here; I'm
talking about the whole schema being addressed, and this containing a
sub-schema that has no 'id'. (as an aside: I'm not sure the spec
describes I can address parts of schemas using the fragmentResolution
story. The whole concept of fragmentResolution is actually only
introduced in the context of hyper schemas and talks about "instance
representations", not schemas. I realize that it'd be convenient to
address schemas that way too, but what in the spec makes this
possible?).
The $ref spec says:
"This URI MAY be relative or absolute, and relative URIs SHOULD be
resolved against the URI of the current schema."
Okay, so now for resolving a relative URI, we're looking at the "URI
of the current schema".
Take this scenario:
example/ (schema retrieved under this URI)
{
  "id" : "test/",
  "properties" : {
    "a" : {
      "$ref" : "foo"
    }
  }
}
What is the URI of the outer schema? (note the slashes in the URIs in
the example, otherwise the example becomes confusing as url joining
would throw away a step that doesn't end in a /)
One interpretation would be that it's the one it was retrieved under, so:
example/
But the spec says for ids, "this attribute defines the URI of this
schema" and "If the URI is relative, it SHOULD be resolved against the
URI used to retrieve this schema".
This would imply that URI of the outer schema is:
"example/test/"
I think we can simply say that the relative 'id' on the outer schema
makes this a buggy schema. Is a relative "id" *ever* useful on an
outer schema? Or should we forbid such relative ids from appearing at
all for outer schemas?
I guess since example/ and example/test are effectively described to
be the same schema, it doesn't matter?
Let's throw it out and try again:
example/ (schema retrieved under this URI)
{
  "properties" : {
    "a" : {
      "$ref" : "foo"
    }
  }
}
So how do we resolve $ref, which contains a relative URI? The spec
says for $ref that "relative URIs SHOULD be resolved against the URI
of the current schema.". Earlier evidence in the $ref section
indicates that "the current schema" is the inner schema, as the spec
says "it SHOULD replace the current schema with the schema
referenced..".
So, we need to know the URI of the schema under the "a" property, the
"current schema".
Now the spec doesn't give us any clues. Your interpretation (I think?)
is that it should construct a fragmentResolution path to determine the
URI of the current schema:
example/#/properties/a
So how do we resolve $ref now? I would suggest we discard fragment
resolution in this case, which might make the whole fragment
resolution discussion moot, and we simply resolve like this:
example/foo
This would suggest a much simpler rule for default "current URI", that
wouldn't talk about fragment resolution but simply say: "if the id is
missing, the current URI of the schema is that under which this schema
(or a container schema) was retrieved".
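A quick way to check this is with a standard URI resolver. The sketch below uses Python's urllib.parse and assumes a hypothetical host, http://example.com/, in front of "example/". Resolving "foo" against the fragment URI and against the bare retrieval URI gives the same result, because the fragment of the base is not carried over.

```python
from urllib.parse import urljoin, urldefrag

# Assumption: hypothetical absolute form of "example/#/properties/a".
default_uri = "http://example.com/example/#/properties/a"

# Resolve the relative $ref against the fragment-based URI.
resolved = urljoin(default_uri, "foo")
print(resolved)

# Strip the fragment first and resolve again: the result is the same.
base, fragment = urldefrag(default_uri)
print(urljoin(base, "foo"))
```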
Let's examine that rule in some more detail, specify a nested scenario now:
example/ (schema retrieved under this URI)
{
  "properties" : {
    "a" : {
      "id" : "bar/",
      "properties" : {
        "b" : { "$ref" : "foo" }
      }
    }
  }
}
What would $ref resolve to now? The rule I specified (based on
fragmentResolution behavior) would indicate: example/foo
I think it is illustrative to make the id URIs absolute:
/example/ (schema retrieved under this URI)
{
  "properties" : {
    "a" : {
      "id" : "/example/bar/",
      "properties" : {
        "b" : { "$ref" : "foo" }
      }
    }
  }
}
But now let's refactor the schema (with relative URIs) into two. I
think this would be a fairly obvious refactoring given the id 'bar/'
in the above (even more obvious when the URIs are absolute):
example/
{
  "properties" : {
    "a" : { "$ref" : "bar/" }
  }
}
example/bar/
{
  "properties" : {
    "b" : { "$ref" : "foo" }
  }
}
But now $ref "foo" resolves to something else!
from example/bar, it would refer to:
example/bar/foo
...and it'd be that even when we're dealing with example/, as example/
retrieves example/bar/, which then causes the $ref URIs inside to use
example/bar/ as its base. (interpreting the loading of a $ref to be
the loading of a schema and thus defining the schema's base URI).
This would indicate our rule is insufficient. Instead, I think the
rule should do the following if there's a missing "id": use the
"current URI" of the parent schema as the "current URI" to resolve
relative refs and construct relative ids, or, if there is no parent
schema, the URI under which the outer schema was retrieved.
That would suggest this rule for "id":
"This attribute defines the current URI of this schema (this attribute is
effectively a "self" link). This URI MAY be relative or absolute. If
the URI is relative it is resolved against the URI of its parent
schema, if the schema is contained in a larger schema. If this is an
outer schema without parent, the URI of the parent schema is held
to be the URI under which this schema was addressed. If id is missing
the current URI of a schema is defined to be that of the parent
schema."
This defines a simple rule for establishing "URI of the current
schema": it's the id if absolute, or if the id is relative, the URI of
the parent schema with the id resolved against it, or if the id is
missing, the URI of the parent schema.
Then relative $ref can be defined as always resolving against the "URI
of the current schema".
If this rule was in effect, the original non-refactored scenario
already would have example/bar/foo as the URI for $ref "foo", and the
refactoring would leave this unaffected.
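The proposed rule is small enough to sketch in code. This is only an illustration under assumptions: the host http://example.com/ is hypothetical, the function name resolve_refs is made up, and only "properties" is descended into.

```python
from urllib.parse import urljoin


def resolve_refs(schema, current_uri, found=None):
    """Return the absolute target of every $ref under the proposed rule.

    `current_uri` is the parent's current URI, or the retrieval URI for
    the outermost schema. Hypothetical helper, not part of any spec.
    """
    if found is None:
        found = []
    if "id" in schema:
        # Explicit id: absolute ids win, relative ids resolve against the parent.
        current_uri = urljoin(current_uri, schema["id"])
    # Missing id: current_uri is simply inherited from the parent.
    if "$ref" in schema:
        found.append(urljoin(current_uri, schema["$ref"]))
    for sub in schema.get("properties", {}).values():
        resolve_refs(sub, current_uri, found)
    return found


nested = {
    "properties": {
        "a": {"id": "bar/", "properties": {"b": {"$ref": "foo"}}}
    }
}
refactored_bar = {"properties": {"b": {"$ref": "foo"}}}

print(resolve_refs(nested, "http://example.com/example/"))
print(resolve_refs(refactored_bar, "http://example.com/example/bar/"))
```

Both calls yield the same target for "foo", which is the property the refactoring argument asks for.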
Regards,
Martijn
I'll be the first to admit that the spec is lacking when it comes to
hyper schema and referencing. Any suggestions for improving it are
more than welcome.
>> If the "id" of a schema is not defined, then the URI of the schema
>> follows the default rule of slash-based instance referencing.
>>
>> "parent"
>> "parent#/child"
>> "parent#/child/subchild"
>
> Really? Where in the spec does it say that?
Nowhere, although it's vaguely implied under the fragment resolution
section. This should be defined better.
> I understand it should be possible to address parts of a schema using
> fragment resolution, but I'm not talking about this scenario here; I'm
> talking about the whole schema being addressed, and this containing a
> sub-schema that has no 'id'. (as an aside: I'm not sure the spec
> describes I can address parts of schemas using the fragmentResolution
> story. The whole concept of fragmentResolution is actually only
> introduced in the context of hyper schemas and talks about "instance
> representations", not schemas. I realize that it'd be convenient to
> address schemas that way too, but what in the spec makes this
> possible?).
Schemas are a subset of instances; put another way, a schema is an
instance used to validate another instance. Therefore you can use the
same referencing on schemas as you would on instances.
> The $ref spec says:
>
> "This URI MAY be relative or absolute, and relative URIs SHOULD be
> resolved against the URI of the current schema."
>
> Okay, so now for resolving a relative URI, we're looking at the "URI
> of the current schema".
>
> Take this scenario:
>
> example/ (schema retrieved under this URI)
> {
>   "id" : "test/",
>   "properties" : {
>     "a" : {
>       "$ref" : "foo"
>     }
>   }
> }
>
> What is the URI of the outer schema? (note the slashes in the URIs in
> the example, otherwise the example becomes confusing as url joining
> would throw away a step that doesn't end in a /)
>
> One interpretation would be that it's the one it was retrieved under, so:
>
> example/
>
> But the spec says for ids, "this attribute defines the URI of this
> schema" and "If the URI is relative, it SHOULD be resolved against the
> URI used to retrieve this schema".
>
> This would imply that URI of the outer schema is:
>
> "example/test/"
Both are correct in this example. If you wanted, you could
conceptually think of "example/test/" as a symlink to
"example/", making both URIs identify the same resource. If you use
relative URIs in your IDs, you forgo any authoritative location.
> I think we can simply say that the relative 'id' on the outer schema
> makes this a buggy schema. Is a relative "id" *ever* useful on an
> outer schema? Or should we forbid such relative ids from appearing at
> all for outer schemas?
> I guess since example/ and example/test are effectively described to
> be the same schema, it doesn't matter?
Yes, relative "id"s are useful on any schema. As an example, you could
think of a handful of schemas in a directory and, no matter where the
directory is located, the schemas can always reference each other.
Even if the files were named differently or they were all embedded
into one file.
Remember that there is no concept of an outer or inner schema to a
validator. All schemas are just instances with URIs, with pointers to
each other. This allows you to embed an entire schema into another
schema without changing its meaning.
> Let's throw it out and try again:
>
> example/ (schema retrieved under this URI)
> {
>   "properties" : {
>     "a" : {
>       "$ref" : "foo"
>     }
>   }
> }
>
> So how do we resolve $ref, which contains a relative URI? The spec
> says for $ref that "relative URIs SHOULD be resolved against the URI
> of the current schema.". Earlier evidence in the $ref section
> indicates that "the current schema" is the inner schema, as the spec
> says "it SHOULD replace the current schema with the schema
> referenced..".
>
> So, we need to know the URI of the schema under the "a" property, the
> "current schema".
>
> Now the spec doesn't give us any clues. Your interpretation (I think?)
> is that it should construct a fragmentResolution path to determine the
> URI of the current schema:
>
> example/#/properties/a
Correct.
> So how do we resolve $ref now? I would suggest we discard fragment
> resolution in this case, which might make the whole fragment
> resolution discussion moot, and we simply resolve like this:
>
> example/foo
Correct. URI resolution states that you drop the fragment if the path
has changed.
> This would suggest a much simpler rule for default "current URI", that
> wouldn't talk about fragment resolution but simply say: "if the id is
> missing, the current URI of the schema is that under which this schema
> (or a container schema) was retrieved".
Not sure I'm understanding, but two different schemas can't have the
same URI. That's why we have fragment resolution because "example/foo"
and "example/foo#/properties" are different URIs.
> Let's examine that rule in some more detail, specify a nested scenario now:
>
> example/ (schema retrieved under this URI)
> {
>   "properties" : {
>     "a" : {
>       "id" : "bar/",
>       "properties" : {
>         "b" : { "$ref" : "foo" }
>       }
>     }
>   }
> }
>
> What would $ref resolve to now? The rule I specified (based on
> fragmentResolution behavior) would indicate: example/foo
It should reference "example/bar/foo":
1. When the root schema is instantiated, it has a URI of "example/" or
"example/#".
2. When the sub-root schema is instantiated, its initial URI is
"example/#/properties/a", and since it has a relative "id" attribute,
it also has the (more primary) "example/bar/" URI.
3. When the sub-sub-root schema is instantiated, its initial URI is
"example/bar/#/properties/b". Since it is a reference, and the
reference is relative, the reference URI is resolved to
"example/bar/foo". Then the schema is replaced with the schema
referenced by that URI (if found).
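The three steps above can be traced with a short sketch. It is an illustration only, not any validator's actual code: the host http://example.com/ and the function name walk are assumptions, and only "properties" is followed.

```python
from urllib.parse import urljoin, urldefrag


def walk(schema, uri, trace):
    """Record (initial URI, primary URI) for each schema, then any $ref target."""
    # A relative "id" is resolved against the initial (fragment-based) URI.
    primary = urljoin(uri, schema["id"]) if "id" in schema else uri
    trace.append((uri, primary))
    if "$ref" in schema:
        # The reference is resolved against the schema's own URI.
        trace.append(("$ref", urljoin(primary, schema["$ref"])))
        return
    # Children extend the primary URI with a slash-delimited fragment.
    base, fragment = urldefrag(primary)
    for name, sub in schema.get("properties", {}).items():
        walk(sub, f"{base}#{fragment}/properties/{name}", trace)


schema = {
    "properties": {
        "a": {"id": "bar/", "properties": {"b": {"$ref": "foo"}}}
    }
}

trace = []
walk(schema, "http://example.com/example/", trace)
for initial, primary in trace:
    print(initial, "->", primary)
```

The last line of the trace is the resolved reference, which under the assumed host ends in "example/bar/foo".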
> I think it is illustrative to make the id URIs absolute:
>
> /example/ (schema retrieved under this URI)
> {
>   "properties" : {
>     "a" : {
>       "id" : "/example/bar/",
>       "properties" : {
>         "b" : { "$ref" : "foo" }
>       }
>     }
>   }
> }
Making your URIs absolute removes ambiguity when resolving. You should
note that your example is still using relative URIs (as it does not
include a protocol).
> But now let's refactor the schema (with relative URIs) into two. I
> think this would be a fairly obvious refactoring given the id 'bar/'
> in the above (even more obvious when the URIs are absolute):
>
> example/
> {
>   "properties" : {
>     "a" : { "$ref" : "bar/" }
>   }
> }
>
> example/bar/
> {
>   "properties" : {
>     "b" : { "$ref" : "foo" }
>   }
> }
>
> But now $ref "foo" resolves to something else!
>
> from example/bar, it would refer to:
>
> example/bar/foo
As I stated above, the previous example should resolve to
"example/bar/foo". Therefore, they still both reference the same
schema.
-Gary
On Tue, Nov 16, 2010 at 5:39 AM, Gary Court <gary....@gmail.com> wrote:
> Remember that there is no concept of an outer or inner schema to a
> validator. All schemas are just instances with URIs, with pointers to
> each other. This allows you to embed an entire schema into another
> schema without changing its meaning.
Remember that the behavior of 'current URI' as you state it implicitly
relies on a concept of outer and inner schemas. See the bottom of my
mail for my attempt to state that rule. Only if you resolve ids
(including missing ones) to absolute URIs before validation can an
implementation of a validator ignore this relationship entirely, and
to resolve them you need information about parent/child relations. I
want to be able to forget about this stuff during validation, and
that's what my implementation does, but I don't think my
implementation is compliant with the spec as it stands.
Let's look at an example again:
http://example.com/schema/
{
  "properties": {
    "a": {
      "id": "a/",
      "properties": {
        "b": {
          "id": "b/",
          "properties": {
            "c": {
              "type": "number"
            }
          }
        }
      }
    }
  }
}
I won't follow any tenuously deduced rules, just the spec. I want to
know the URIs of the following schemas:
* the main schema
* the schema indicated by property a
* the schema indicated by property b
* the schema indicated by property c
The main schema:
The spec says the uri under which the schema was retrieved, so:
The schema indicated by property a:
The spec says "If the URI is relative, it SHOULD be resolved against
the URI used to retrieve this schema."
Well, this schema was retrieved using http://example.com/schema/. So,
it should be:
The schema indicated by property b:
The spec says "If the URI is relative, it SHOULD be resolved against
the URI used to retrieve this schema."
Okay, this schema was retrieved using http://example.com/schema/. I
don't think you can argue it was retrieved using
http://example.com/schema/a, as it wasn't, though of course it could
be. There is nothing that says we should prefer "could be retrieved
even though we haven't actually done it" over anything else. So, it
should be:
This is *wrong*. We want it to be:
http://example.com/schema/a/b/
But the spec doesn't say that. The spec says, use the URI used to
retrieve this schema, and we didn't use http://example.com/schema/a to
retrieve this schema at all.
If the rule had said: "If the URI is relative, it SHOULD be resolved
against the URI of the schema it is contained in." (with a special
rule if there is no parent), we would have done this correctly.
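To make the difference concrete, here is a rough sketch that applies both readings to the example schema. The function names literal_rule and parent_rule are made up for this illustration, and only "properties" is descended into. Schemas without an "id" (such as the one under "c") get no entry under either reading, since neither rule says anything about them.

```python
from urllib.parse import urljoin

RETRIEVED = "http://example.com/schema/"

schema = {
    "properties": {
        "a": {
            "id": "a/",
            "properties": {
                "b": {"id": "b/", "properties": {"c": {"type": "number"}}}
            },
        }
    }
}


def literal_rule(schema, retrieved, out=None):
    """Resolve every relative id against the retrieval URI (spec as written)."""
    out = {} if out is None else out
    for name, sub in schema.get("properties", {}).items():
        if "id" in sub:
            out[name] = urljoin(retrieved, sub["id"])
        literal_rule(sub, retrieved, out)
    return out


def parent_rule(schema, parent_uri, out=None):
    """Resolve every relative id against the URI of the containing schema."""
    out = {} if out is None else out
    for name, sub in schema.get("properties", {}).items():
        uri = parent_uri
        if "id" in sub:
            uri = urljoin(parent_uri, sub["id"])
            out[name] = uri
        parent_rule(sub, uri, out)
    return out


print(literal_rule(schema, RETRIEVED))
print(parent_rule(schema, RETRIEVED))
```

Only the parent-based reading gives http://example.com/schema/a/b/ for the schema under "b".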
The schema indicated by property c:
The spec says nothing. There is no rule in the spec to figure out a
URI for this schema. There isn't a rule stating use
fragmentResolution. There certainly isn't a rule that says: "use
fragmentResolution but pay attention to the current URIs of all the
containing schemas a and b first and use that information in your
fragment URI".
Before, it wouldn't be bad if the spec said nothing, but now we have a
rule for $ref, which needs this information: "and relative URIs SHOULD
be resolved against the URI of the current schema."
For the last schema, we don't have a way to deduce the "URI of the
current schema", and therefore cannot resolve $ref.
Don't you think we should firm up the spec in this department?
You could try spelling out the whole fragmentResolution set of rules in
the spec. That might complicate matters, though. Instead, you can
improve the definition of "id" and "URI of the current schema":
"This attribute defines the current URI of this schema (this attribute is
effectively a "self" link). This URI MAY be relative or absolute. If
the URI is relative it is resolved against the current URI of the parent
schema it is contained in. If this schema is not contained in any
parent schema, the current URI of the parent schema is held to be the
URI under which this schema was addressed. If id is missing, the
current URI of a schema is defined to be that of the parent schema.
The current URI of the schema is also used to construct relative
references such as for $ref."
There is a problem with this interpretation that might argue in favor
of using fragmentResolution: there is no way to address nested schemas
that have no URI of their own. We define "current URI" to be the same
as a parent schema, and this might be undesirable.
If you do want to use fragmentResolution, you should say something like this:
"This attribute defines the current URI of this schema (this attribute is
effectively a "self" link). This URI MAY be relative or absolute. If
the URI is relative it is resolved against the current URI of the parent
schema it is contained in. If this schema is not contained in any
parent schema, the current URI of the parent schema is held to be the
URI under which this schema was addressed. If id is missing, the
default id (and current URI) is the slash-delimited fragmentResolution
from the current URI of the nearest ancestor schema that does have an
explicit id, or if no such ancestor exists, the URI under which this
schema was addressed."
Note that you can't construct the fragmentResolution URI against the
current URI of a parent if that parent doesn't have an explicit
relative or absolute 'id' attribute, as that would break if you have
several nested schemas without an id (unless you defined a rule for
resolving a "relative fragment resolution"). Those constructed URIs
can always safely be used for further relative id resolution though,
because the fragmentResolution part is thrown away in that case.
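A possible sketch of this fragmentResolution variant, under assumptions: the function name current_uris is made up, the retrieval URI is the hypothetical http://example.com/schema/, and only "properties" is followed. The fragment is always built from the nearest ancestor with an explicit id, never from another constructed URI.

```python
from urllib.parse import urljoin


def current_uris(schema, anchor_uri, pointer="", path="", out=None):
    """Map each schema's path from the root ("" for the root) to its current URI.

    anchor_uri: current URI of the nearest ancestor with an explicit id,
                or the URI under which the outer schema was addressed.
    pointer:    slash-delimited path from that ancestor to this schema.
    """
    out = {} if out is None else out
    if "id" in schema:
        # Resolving against the anchor equals resolving against the parent's
        # constructed URI, because the fragment part is thrown away.
        anchor_uri, pointer = urljoin(anchor_uri, schema["id"]), ""
    out[path] = anchor_uri + ("#" + pointer if pointer else "")
    for name, sub in schema.get("properties", {}).items():
        step = "/properties/" + name
        current_uris(sub, anchor_uri, pointer + step, path + step, out)
    return out


schema = {
    "properties": {
        "a": {
            "properties": {
                "b": {"id": "b/", "properties": {"c": {"type": "number"}}}
            }
        }
    }
}

for path, uri in current_uris(schema, "http://example.com/schema/").items():
    print(repr(path), "->", uri)
```

Here "a" has no id and gets a fragment URI off the retrieval URI, while "c" gets a fragment URI off the explicit id of "b".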
Regards,
Martijn
I wrote a slightly different wording later, just for your consideration:
"This attribute defines the current URI of this schema (this attribute is
effectively a "self" link). This URI MAY be relative or absolute. If
the URI is relative it is resolved against the current URI of the parent
schema it is contained in. If this schema is not contained in any
parent schema, the current URI of the parent schema is held to be the
URI under which this schema was addressed. If id is missing, the
current URI of a schema is defined to be that of the parent schema.
The current URI of the schema is also used to construct relative
references such as for $ref."
I don't think the wording matters much, but I think for clarity it
helps to put in the last line:
"The current URI of the schema is also used to construct relative
references such as for $ref."
Regards,
Martijn
+1 from me as well. This does a much better job at describing URI
generation than what exists.
Your comment seems to imply that there is a practical difference
between the implicit id being the same as that of the parent schema,
and the implicit id being constructed using fragmentResolution. Could
you give an example where this makes a difference? I haven't found one.
As far as I understand, fragmentResolution won't affect the $ref URL
at all, because everything behind # in the URL will be ignored as soon
as you construct a relative URL. Therefore we can just as well use the
simpler rule that makes the current URI the same as the parent
schema's and avoid referencing the whole fragmentResolution story
altogether.
As to whether there should always be an implicit 'id': in my own
codebase I transform the schema and construct such an id if the id is
missing, and use that as the value of current URI. But such
schema transformation is an implementation detail and I don't think
the current spec prevents that implementation. Similarly resolving
$ref can also be performed by including the sub-schemas in the main
schema (with special circular references in case of recursive
references), but again, this is an implementation detail.
Regards,
Martijn
On Sat, Dec 4, 2010 at 12:42 AM, Piers <piers....@gmail.com> wrote:
> But then the spec says:
>
> If id is missing, the current URI of a schema is defined to be that
> of the parent schema.
>
> Which to my mind was confusing,
I see your point now, that doesn't match the 'self link' description.
I think we have two use cases for "current URI":
* self link
* base URL to base relative URLs on
For self link, the whole fragment resolution story is important. For
the base URL calculation, it is not, as the bit behind # gets thrown
away. The spec doesn't talk about an id if the explicit id is missing,
so there's no self-link calculation necessary to implement the spec,
only a good current URI calculation method.
So the spec right now talks about both issues and mixes them up a bit,
which is indeed confusing. self-link and "current URI" are related but
not entirely identical concepts.
I think we need a piece of text in the spec that talks about both use
cases in some balanced way. Care to give it a try?
Regards,
Martijn
-Gary