Re: JSON Schema, Partial Updates, Subschema Validation


Geraint (David)

Jan 4, 2013, 9:32:30 AM
to json-...@googlegroups.com
Could you clarify exactly what you mean by "sub-schema validation" and "partial updates"?

If you're completely replacing a smaller portion of the data, then you can simply address the relevant part of the schema using a JSON Pointer fragment, and validate against that.

Example data:
{
    "eggs": [...],
    "bacon": [...]
}

Example schema:  /schemas/breakfast
{
    "type": "object",
    "properties": {
        "eggs": {...},
        "bacon": {...}
    }
}

So in this case, if the user submits some new data for only part of the document (e.g. the "bacon" property), then you can simply address the relevant sub-schema as "/schemas/breakfast#/properties/bacon".
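That fragment-addressing step can be sketched in plain Python. The `resolve_pointer` helper below is purely illustrative (it is not from any particular library), and the concrete "eggs"/"bacon" subschemas are assumptions, since the originals are elided above:

```python
def resolve_pointer(document, pointer):
    """Walk a JSON Pointer (RFC 6901), e.g. '#/properties/bacon', through a document."""
    target = document
    pointer = pointer.lstrip("#")
    if pointer == "":
        return target
    for token in pointer.lstrip("/").split("/"):
        # Un-escape the two sequences RFC 6901 defines: ~1 -> '/', ~0 -> '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        target = target[int(token)] if isinstance(target, list) else target[token]
    return target

# The breakfast schema from above, with assumed subschemas for illustration.
breakfast_schema = {
    "type": "object",
    "properties": {
        "eggs": {"type": "array"},
        "bacon": {"type": "array"},
    },
}

bacon_schema = resolve_pointer(breakfast_schema, "#/properties/bacon")
print(bacon_schema)  # {'type': 'array'}
```

The submitted fragment of data can then be validated against `bacon_schema` using whatever validator you already have.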

On Thursday, 3 January 2013 19:49:51 UTC, Jan Drake wrote:
Hello,

I'm considering embarking on a JSON Schema approach in my web services framework; however, I am curious if anybody has a well-developed approach to sub-schema validation along with partial updates from the client?

Ideally, I would like clients to be able to submit partial updates and validate against the schema for the resource being updated without having to write separate validation logic.

I can dream, can't I?  :)

Anybody done this yet?


Jan

Francis Galiegue

Jan 6, 2013, 6:07:21 AM
to json-...@googlegroups.com
On Sun, Jan 6, 2013 at 3:13 AM, <paulp...@gmail.com> wrote:
> Hi,
>
> I'm new to this JSON schema stuff, and I too have been looking for examples
> of how to validate against only a subset of the original schema. For
> example, say we have a User schema defined with several properties, and a
> user wants to change his/her username. How would you go about verifying the
> username conforms to its schema definition while ignoring the rest of the
> schema? Could you point me towards an implementation of such a partial
> validation using a schema validation library such as JSV?
>
> Much appreciated,
> Paul
>

What you want here is a Link Description Object; the core schema
itself can validate data, but it's the hyper-schema that can help
with editing.

I'll let Geraint (David) give you a more complete answer - I'm only
just starting to get the hang of hyper-schema.

--
Francis Galiegue, fgal...@gmail.com
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)

Geraint (David)

Jan 6, 2013, 6:24:02 AM
to json-...@googlegroups.com
Err, I'm afraid a Link Description Object is not actually useful here - this is a question of validation.

The question, as I understand it, is how to validate against a smaller corner of a larger schema.  For example, say you have this schema, at http://example.com/mySchema:
{
    "title": "User",
    "type": "object",
    "properties": {
        "username": {
            "title": "User name",
            "type": "string",
            "pattern": "[a-zA-Z][a-zA-Z0-9]*",
            "maxLength": 20,
            ...
        },
        ...
    },
    "required": ["username", "groups", "hobbies"]
}

Now, a user wants to change their username - however, they haven't completely re-submitted all the rest of the data (such as "groups" or "hobbies"), so the big schema is not useful.  What we really want is to validate just against the schema for "username".

So basically, your only challenge is accessing the correct schema.  If you already have the schema object loaded this is pretty simple - instead of having:
        validate(submittedData, userSchema);
you can just use:
        validate(submittedData.username, userSchema.properties.username);
Because the schema titled "User name" is a completely valid schema, you can use it just like any other schema.

If you aren't dealing with the schema object directly, but you are instead passing your validator a URL, like so:
        validate(submittedData, "http://example.com/mySchema");
then you can use a JSON Pointer fragment to address that specific part of the schema document:
        validate(submittedData.username, "http://example.com/mySchema#/properties/username");

It's worth noting that JSON Pointer fragments are not supported by all tools, and that version 3 of the draft used an alternative syntax.
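To make this concrete, here is a toy sketch in Python that checks just the three keywords used by the "User name" schema above (type, pattern, maxLength). The validate_subset function is purely illustrative - a real validator covers far more of the spec:

```python
import re

def validate_subset(value, schema):
    """Check a value against a tiny subset of core JSON Schema keywords."""
    errors = []
    if schema.get("type") == "string" and not isinstance(value, str):
        errors.append("expected a string")
    elif isinstance(value, str):
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append("longer than maxLength")
        # JSON Schema's 'pattern' keyword is an unanchored regex search.
        if "pattern" in schema and not re.search(schema["pattern"], value):
            errors.append("does not match pattern")
    return errors

# The "User name" subschema from the example above.
username_schema = {
    "title": "User name",
    "type": "string",
    "pattern": "[a-zA-Z][a-zA-Z0-9]*",
    "maxLength": 20,
}

print(validate_subset("alice42", username_schema))  # []
print(validate_subset("x" * 30, username_schema))   # ['longer than maxLength']
```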

I have to say, I haven't used JSV, so I don't know how it works in particular, but I hope that's helpful?

Geraint

Francis Galiegue

Jan 6, 2013, 6:32:33 AM
to json-...@googlegroups.com
On Sun, Jan 6, 2013 at 12:24 PM, Geraint (David) <gerai...@gmail.com> wrote:
> Err, actually I'm afraid a Link Description Object is not useful here - this
> is actually a question of validation.
>

Well, my reasoning here is that it was about modifying existing data
interactively... I may have misread the question then.

I definitely don't understand hyper schema :p

Geraint (David)

Jan 6, 2013, 6:38:43 AM
to json-...@googlegroups.com
No problem. :)

To clarify: the time when you want to use a Link Description Object is when you are defining the link itself using JSON Schema.  In that case, there is a property of the LDO called "schema", which defines the schema for the submitted data.  For example:
{
    "rel": "edit",
    "href": "/me/username",
    "method": "PUT",
    "schema": {"$ref": "..."}
}
That is a Link Description Object declaring that the data to be submitted to /me/username should follow whatever schema you have defined.  It addresses that schema by its URI, using the $ref keyword.

The constraints on the data itself come from the core JSON Schema spec - the hyper-schema uses the core spec to specify constraints on submission data.  But the nature of the constraints (and the language used to specify them) are all entirely from the core spec.

As an example, here's something I've been working on - I'm trying to make an interactive guide to JSON Schema.  The keywords there are all from the core specification - it's not actually using any of the hyper-schema spec, it's simply using the core spec to inform an editable interface.

Geraint

Jan Drake

Jan 6, 2013, 10:56:54 PM
to json-...@googlegroups.com

Being an architect, I am compelled to define more precisely the scenarios being discussed.  The scenario of interest to me is as follows:

- I have a RESTful API with JSON payloads
- I rely on JSON Schema validation for full payloads (in other words, I am always passing the resource back and forth)
- I have the realities of a mobile environment which is sensitive to battery and bandwidth consumption
- Thus, I have a partial update scenario (as do we all)

So, how do I validate the partial update against the schema itself without having to do something stupid like suck in the resource to be modified, apply the JSON Patch, and run it through the schema as a whole object?  How do I do that without partitioning my schema into client-specific sub-schemas?

I love the idea of using JSON Pointer within the schema to validate the particular changes coming from the client AND it seems like that approach actually syncs pretty well with the partial update strategy of JSON Patch.

At this point I don't know what hyper-schema is but it doesn't sound good.

So, what would be awesome is to understand whether the requirements I've laid out above are supported by the schema and toolsets available right now.  If not, that's also good to know.



Jan

Geraint (David)

Jan 7, 2013, 4:49:09 AM
to json-...@googlegroups.com
If you are only interested in validation, then the hyper-schema is not relevant to you at all.  Don't worry about it.

As for your question: it depends on what form your partial update is in.  If it's a JSON Patch, it could get rather complicated.

For example, look at this schema:
{
    "type": "object",
    "required": ["keyPair1", "keyPair2"],
    "oneOf": [
        {
            "description": "keyPair1/keyPair2 are strings",
            "properties": {
                "keyPair1": {
                    "type": "string"
                },
                "keyPair2": {
                    "type": "string"
                }
            }
        },
        {
            "description": "keyPair1/keyPair2 are integers",
            "properties": {
                "keyPair1": {
                    "type": "integer"
                },
                "keyPair2": {
                    "type": "integer"
                }
            }
        }
    ]
}

In the above schema, there are two properties: keyPair1 and keyPair2.  They must either both be strings or both be integers.

Now, suppose you get a JSON Patch update as follows:
[
    {
        "op": "replace",
        "path": "/keyPair1",
        "value": 15
    }
]

There is no way to validate that patch's data on its own.  If keyPair1/keyPair2 are currently both strings, then it's invalid.  But if they're integers, it's valid - so validating the patch requires knowledge of the full data.  Full automatic validation of updates based entirely on the patch is not possible in the general case.
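This ambiguity is easy to demonstrate with a short Python sketch. The valid and apply_patch helpers here are illustrative toys (they handle only the oneOf/type checks and the single "replace" op from this example), not a real validator or patch engine:

```python
import copy

def matches_branch(doc, branch):
    """Check that each listed property has the type the branch demands."""
    expected = {"string": str, "integer": int}
    for prop, subschema in branch["properties"].items():
        if not isinstance(doc.get(prop), expected[subschema["type"]]):
            return False
    return True

def valid(doc, schema):
    # 'oneOf' succeeds when exactly one branch validates.
    return sum(matches_branch(doc, b) for b in schema["oneOf"]) == 1

schema = {
    "oneOf": [
        {"properties": {"keyPair1": {"type": "string"},  "keyPair2": {"type": "string"}}},
        {"properties": {"keyPair1": {"type": "integer"}, "keyPair2": {"type": "integer"}}},
    ]
}

def apply_patch(doc, patch):
    result = copy.deepcopy(doc)
    for op in patch:  # only handles the 'replace' op from the example
        result[op["path"].lstrip("/")] = op["value"]
    return result

patch = [{"op": "replace", "path": "/keyPair1", "value": 15}]

strings  = {"keyPair1": "a", "keyPair2": "b"}
integers = {"keyPair1": 1,   "keyPair2": 2}
print(valid(apply_patch(strings,  patch), schema))  # False
print(valid(apply_patch(integers, patch), schema))  # True
```

The same patch is invalid against the all-strings document but valid against the all-integers one, so the patch cannot be judged in isolation.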

However, as the API creator, you might know that certain properties are completely independent of any surrounding data.  In that case, it should be fairly simple to validate those specific properties when they come in as a patch.

Does that make sense?

Geraint (David)

Jan 7, 2013, 4:54:24 AM
to json-...@googlegroups.com
So basically my answer is: if you're looking for an existing tool that you can plug a JSON Patch into, then I don't think one exists at the moment.

However, if you know what properties might be updated, and you know they are independent of the surrounding data, then you should be able to address the appropriate sub-schemas using fragments and match them up to the incoming data.  I don't know what language/environment you're working in, but there's a good chance that there's a suitable validator that can do that.

Francis Galiegue

Jan 7, 2013, 7:04:32 AM
to json-...@googlegroups.com
On Mon, Jan 7, 2013 at 4:56 AM, Jan Drake <jan.s...@gmail.com> wrote:
[...]
>
> - I have a RESTful API with JSON payloads
> - I rely on JSON Schema validation for full payloads (in other words, I am
> always passing the resource back and forth)
> - I have the realities of a mobile environment which is sensitive to battery
> and bandwidth consumption
> - Thus, I have a partial update scenario (as do we all)
>
> So, how do I validate the partial update against the schema itself without
> having to do something stupid like suck in the resource to be modified,
> apply the JSON Patcha and run it through the schema as a whole object? How
> do I do that without partitioning my schema into client-specific
> sub-schemas?
>
> I love the idea of using JSON Pointer within the schema to validate the
> particular changes coming from the client AND it seems like that approach
> actually syncs pretty well with the partial update strategy of JSON Patch.
>

This is not as simple as it sounds. A pointer into a document does not
necessarily lead to one and only one schema against which to validate.

Take this schema, for instance:

{
    "properties": { "p9": { "schema1": "here" } },
    "patternProperties": {
        "^p": { "schema2": "here" },
        "\\d+$": { "schema3": "here" }
    }
}

If you were to validate against pointer /p9, your partial update would
have to validate against _all three schemas_.
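The rule Francis describes - that the `properties` entry and every matching `patternProperties` entry all apply - can be sketched as follows. The schemas_for_property helper is illustrative, not from any library:

```python
import re

schema = {
    "properties":        {"p9": {"schema1": "here"}},
    "patternProperties": {"^p": {"schema2": "here"},
                          r"\d+$": {"schema3": "here"}},
}

def schemas_for_property(schema, name):
    """Gather every subschema a named property must validate against."""
    applicable = []
    if name in schema.get("properties", {}):
        applicable.append(schema["properties"][name])
    for pattern, subschema in schema.get("patternProperties", {}).items():
        if re.search(pattern, name):  # patternProperties regexes are unanchored
            applicable.append(subschema)
    return applicable

print(schemas_for_property(schema, "p9"))
# [{'schema1': 'here'}, {'schema2': 'here'}, {'schema3': 'here'}]
```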

Jan Drake

Jan 7, 2013, 10:44:09 PM
to json-...@googlegroups.com
Uh, actually, there is.... but it involves hydrating the entire object for comparison, knowing when to do that, and having to write custom code (most likely).

Thank you.  I get the potential complexity and gaps.  My goal is to maximize automated validation and minimize hard-coded validation in order to reduce code complexity.  I realize there will be a balance.  I'm not sure I'm convinced yet that the community understands the need for finding that balance.  Naturally, 80-20 works... 50-50 is meh... and less is unacceptable.

For instance, I could possibly accept telling my clients via HTTP response codes and data that their request cannot be validated due to its structure... but then I need to give them a model that enables them to conceptualize the categories of things to feed the APIs during partial update.  It has to be a predictable definition of what can and can't be validated... and, ideally, my schema tools tell me via pleasantly worded WTF messages when they can't validate. 

It's quite a neat problem for someone to solve... and feasible against a reasonable target definition of behavior.


Jan

Francis Galiegue

Jan 7, 2013, 10:53:20 PM
to json-...@googlegroups.com
On Tue, Jan 8, 2013 at 4:44 AM, Jan Drake <jan.s...@gmail.com> wrote:
> Uh, actually, there is.... but it involves hydrating the entire object for
> comparison, knowing when to do that, and having to write custom code (most
> likely).
>
> Thank you. I get the potential complexity and gaps. My goals is to
> maximize automated validation and minimize hard-coded validation in order to
> reduce code complexity. I realize there will be a balance. I'm not sure
> I'm convinced yet that the community understands the need for finding that
> balance. Naturally, 80-20 works... 50-50 is meh... and less is
> unacceptable.
>

I don't think the community is involved in this specific matter:
rules are clearly defined for what schema(s) a "child instance" has to
obey, depending on whether the "parent" is an object or an array. And,
of course, it is recursive.

This is not undoable -- in fact I know how to do it, personally. It is
just that if I had a JSON Patch as an argument instead of a target
value, this would be easier ;) But that is an implementation detail.

A personal plug, but I'll allow it for once - see here:

https://groups.google.com/forum/?fromgroups=#!topic/json-schema-validator/2Qn69f0NRv4

There are more details that I'll post as I tear the problem apart --
but it is definitely doable.

Cheers,

Francis Galiegue

Jan 7, 2013, 11:20:57 PM
to json-...@googlegroups.com
On Mon, Jan 7, 2013 at 10:49 AM, Geraint (David) <gerai...@gmail.com> wrote:
[...]
Very good point! I did not see this one... At best, you'd have to have
access to the parent data to check both of these constraints.