Marking deprecated properties within the schema.

Josef Karthauser

Aug 31, 2012, 5:11:46 AM
to json-...@googlegroups.com
(Sorry if this is a duplicate; I just posted a similar query but it doesn't appear to have shown up! Presuming it's lost!)

We have an application with a json response which has been evolving over time. As a consequence we have a number of properties in the response which have been deprecated and are just being carried around until we are sure that the clients have integrated against the more recent API changes.

What I'd like to do is to formally describe, in the json-schema for the response, which fields are deprecated and why. We are currently documenting this externally, but it makes sense to include it in the schema instead.

Here's an example. We used to have a property called 'anrUid', but due to business domain changes it is now called 'identityUID'. We're carrying both properties at the moment, but I'd like to make it clear to new clients that they should not be using anrUid. I anticipate being able to do this in the associated schema, perhaps like this:

{
    "type" : "object",
    "required" : true,
    "additionalProperties" : false,
    "properties" : {
        "identityUID" : {
            "$ref": "../shared/non-empty-string.json",
            "required" : true
        },
        "anrUid" : {
            "$ref": "../shared/non-empty-string.json",
            "required" : true,
            "deprecated" : {
                "deprecatedOn" : "20120831",
                "description" : "Please use 'identityUID' instead."
            } 
        },
        ...
    }
}

It would be nice to formally support something similar to this in the official spec.

Does anyone have any thoughts on this?

Cheers,
Joe
 

Francis Galiegue

Aug 31, 2012, 5:20:03 AM
to json-...@googlegroups.com
Hello,

On Fri, Aug 31, 2012 at 11:11 AM, Josef Karthauser <j...@karthauser.co.uk> wrote:
> (Sorry if this is a duplicate; I just posted a similar query but it doesn't
> appear to have shown up! Presuming it's lost!)
>

It is just that you tried and posted with a mail address which wasn't
registered to the group ;)
(you could use an ISO 8601 date format instead -- also, why require
the obsolete property? The new one is required anyway)

I don't really have a thought on this. The problem is that you are
opening up another possible use of JSON Schema and the specification
is currently big enough already ;)

This keyword is not about validation proper, it is not about
hyperlinking either...

--
Francis Galiegue, fgal...@gmail.com
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)

Josef Karthauser

Aug 31, 2012, 5:32:14 AM
to json-...@googlegroups.com
On Friday, August 31, 2012 10:20:04 AM UTC+1, fge wrote:
> (Sorry if this is a duplicate; I just posted a similar query but it doesn't
> appear to have shown up! Presuming it's lost!)

> It is just that you tried and posted with a mail address which wasn't
> registered to the group ;)

Ah, that explains it :).
 
> We have an application with a json response which has been evolving over
> time. As a consequence we have a number of properties in the response which
> have been deprecated and are just being carried around until we are sure
> that the clients have integrated against the more recent API changes.
 
> This is what I'm anticipating:
>
> {
>     "type" : "object",
>     "required" : true,
>     "additionalProperties" : false,
>     "properties" : {
>         "identityUID" : {
>             "$ref": "../shared/non-empty-string.json",
>             "required" : true
>         },
>         "anrUid" : {
>             "$ref": "../shared/non-empty-string.json",
>             "required" : true,
>             "deprecated" : {
>                 "deprecatedOn" : "20120831",
>                 "description" : "Please use 'identityUID' instead."
>             }
>         },
>         ...
>     }
> }

> (you could use an ISO 8601 date format instead -- also, why require
> the obsolete property? The new one is required anyway)

Sure with the date. What I described was only illustrative, not formal.

The field itself is actually required. It's not an optional part of the response, it will always be there. 
 
> I don't really have a thought on this. The problem is that you are
> opening up another possible use of JSON Schema and the specification
> is currently big enough already ;)
>
> This keyword is not about validation proper, it is not about
> hyperlinking either...

I wonder, has there been much discussion about changing responses over time, and how to describe that? It is rare that an API comes into existence and stays that way forever with no changes. Given the vogue of agile practices it is more likely that APIs are under constant refinement. It seems essential to have some way of encapsulating those changes from the point of view of an anonymous API consumer. Technologies such as REST require this kind of flexibility to be future-proof, and to avoid the unnecessary duplication of end points to support just single property changes.

Joe 

Francis Galiegue

Aug 31, 2012, 5:42:51 AM
to json-...@googlegroups.com
On Fri, Aug 31, 2012 at 11:32 AM, Josef Karthauser <j...@karthauser.co.uk> wrote:
[...]
>
> I wonder, has there been much discussion about changing responses over time,
> and how to describe that. It is rare that an API comes into existence and
> stays that way forever with no changes. Given the vogue of agile practices
> it is more likely that APIs are under constant refinement. It seems
> essential to have some way of encapsulating those changes from the point of
> view of an anonymous API consumer. Technologies such as REST require this
> kind of flexibility to be future proof, and to avoid the unnecessary
> duplication of end points to support just single property changes.
>

The point I was really trying to make is that when it comes to
validation, JSON Schema is currently completely agnostic as to _where_
the data it validates comes from: user input, a MongoDB database, a
ZooKeeper database, an API call or whatever. The keyword you propose
would add metadata information, I am not sure this is a good thing.

That said, there is the "links" keyword already...

--
Francis Galiegue, fgal...@gmail.com
JSON Schema: https://github.com/json-schema

Dr Josef Karthauser

Aug 31, 2012, 5:52:17 AM
to json-...@googlegroups.com
> On Fri, Aug 31, 2012 at 11:32 AM, Josef Karthauser <j...@karthauser.co.uk> wrote:
>>
>> I wonder, has there been much discussion about changing responses over time,
>> and how to describe that. It is rare that an API comes into existence and
>> stays that way forever with no changes. Given the vogue of agile practices
>> it is more likely that APIs are under constant refinement. It seems
>> essential to have some way of encapsulating those changes from the point of
>> view of an anonymous API consumer. Technologies such as REST require this
>> kind of flexibility to be future proof, and to avoid the unnecessary
>> duplication of end points to support just single property changes.
>>
>
> The point I was really trying to make is that when it comes to
> validation, JSON Schema is currently completely agnostic as to _where_
> the data it validates comes from: user input, a MongoDB database, a
> ZooKeeper database, an API call or whatever. The keyword you propose
> would add metadata information, I am not sure this is a good thing.
>
> That said, there is the "links" keyword already…

That makes sense. I guess what I'm describing, then, are extensions that would be part of a rest-json-schema which extends json-schema. Are there any such formal json-schema extensions documented already, do you know? I've not seen anything in my searching around. Clearly it would then be proposed to move "links" to the REST extension and out of the json-schema spec.

On the other hand, what I'm proposing is a general documentation facility for many kinds of json-schema. We have backwards compatibility within our mongo json schemas too. Given a collection with millions of records which have been written over a long period of time, we quite often find ourselves with many different records with slightly different schemas. We make the effort to "uplift" these to the latest schema upon read, and we don't formally have a mongo-json-schema written down for each of our collections. If we were to write them down, however, we would definitely want to call out the deprecated fields.
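As a rough illustration of the "uplift on read" idea described above, here is a minimal Python sketch. The rename table and the `uplift` helper are hypothetical names invented for this example; the only detail taken from the thread is the 'anrUid' to 'identityUID' rename.

```python
# Hypothetical rename table: deprecated field name -> current field name.
RENAMED_FIELDS = {"anrUid": "identityUID"}


def uplift(record, renamed=RENAMED_FIELDS):
    """Return a copy of `record` with deprecated field names mapped to current ones."""
    uplifted = dict(record)
    for old_name, new_name in renamed.items():
        if old_name in uplifted:
            old_value = uplifted.pop(old_name)
            # If the record already has the new name, keep that value.
            uplifted.setdefault(new_name, old_value)
    return uplifted


old_record = {"anrUid": "abc-123", "name": "example"}
new_record = uplift(old_record)
print(new_record)
```

The input record is copied rather than modified, so the stored document stays untouched until it is explicitly written back.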

Joe

David

Aug 31, 2012, 10:37:22 AM
to json-...@googlegroups.com
I think the biggest problem with this for me is not the fact that it's meta-data (forget "links" - we already have the "title" and "description" properties), it's the fact that it references information outside the schema or the JSON data (in particular, "what is today's date").

Currently JSON Schema either matches or it doesn't, on the basis of nothing but the schema and the data.  What you are describing sounds like "matches, but with warnings", or "matches when run today, but won't match with the same schema/data in a year's time".

Have you thought about having two schemas - a "permissive" which describes the data your server is outputting, and a "strict" schema for developers to be looking at?

Dr Josef Karthauser

Aug 31, 2012, 11:30:53 AM
to json-...@googlegroups.com
On 31 Aug 2012, at 15:37, David <gerai...@gmail.com> wrote:

> I think the biggest problem with this for me is not the fact that it's meta-data (forget "links" - we already have the "title" and "description" properties), it's the fact that it references information outside the schema or the JSON data (in particular, "what is today's date").

Forget the date. If you're worried about self referentiality, then I'd do something like this instead:

{
    "version" : 3,
    "type" : "object",
    "required" : true,
    "additionalProperties" : false,
    "properties" : {
        "identityUID" : {
            "$ref": "../shared/non-empty-string.json",
            "required" : true
        },
        "anrUid" : {
            "$ref": "../shared/non-empty-string.json",
            "required" : true,
            "deprecated" : {
                "version" : "2",
                "description" : "Please use 'identityUID' instead."
            } 
        },
        ...
    }
}

Indicating that the schema is current at revision 3, and that the 'anrUid' field was deprecated at revision 2. 
Does this alleviate your external references worry?
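To make the "matches, but with warnings" reading concrete, here is a minimal Python sketch that reports deprecated properties found in a response. It assumes the "deprecated" keyword proposed in this thread (not part of the spec being discussed); the `deprecation_warnings` helper is a hypothetical name, and the "$ref" entries are replaced by plain string types so the example is self-contained.

```python
# Illustrative schema using the proposed "deprecated" keyword.
SCHEMA = {
    "version": 3,
    "type": "object",
    "properties": {
        "identityUID": {"type": "string"},
        "anrUid": {
            "type": "string",
            "deprecated": {
                "version": "2",
                "description": "Please use 'identityUID' instead.",
            },
        },
    },
}


def deprecation_warnings(schema, instance):
    """List a warning for each deprecated property that appears in `instance`."""
    warnings = []
    for name, subschema in schema.get("properties", {}).items():
        notice = subschema.get("deprecated")
        if notice is not None and name in instance:
            warnings.append(
                "'%s' was deprecated at revision %s: %s"
                % (name, notice.get("version"), notice.get("description"))
            )
    return warnings


response = {"identityUID": "abc-123", "anrUid": "abc-123"}
for message in deprecation_warnings(SCHEMA, response):
    print(message)
```

The result depends only on the schema and the data, so the same inputs give the same warnings regardless of when the check is run.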

> Currently JSON Schema either matches or it doesn't, on the basis of nothing but the schema and the data.  What you are describing sounds like "matches, but with warnings", or "matches when run today, but won't match with the same schema/data in a year's time".

Surely the JSON schema serves two purposes. One is for validation of data, the other is to allow users of data to anticipate the data that they are to expect. Given data changes shape over time the schema ought to reflect that.

> Have you thought about having two schemas - a "permissive" which describes the data your server is outputting, and a "strict" schema for developers to be looking at?

The problem with that approach is that there are actually many steps between those two, given a constantly evolving data resource, and many consumers in different groups.

Joe


David

Aug 31, 2012, 12:51:28 PM
to json-...@googlegroups.com
> Surely the JSON schema serves two purposes. One is for validation of data, the other is to allow users of data to anticipate the data that they are to expect. Given data changes shape over time the schema ought to reflect that.

True - I agree that it's not just for validation, and that documentation is a goal as well.

Do you need this documentation to be machine-readable?  If it's intended for humans, could it just go in the "description" property?  ({"description": "DEPRECATED IN v2", ...}).  Or even, just remove the "anrUid" property from the schema altogether, and put something in the description for "identityUID", saying that it's been replaced?

> Have you thought about having two schemas - a "permissive" which describes the data your server is outputting, and a "strict" schema for developers to be looking at?
> The problem with that approach is that there are actually many steps between those two, given a constantly evolving data resource, and many consumers in different groups.

Yeah, I can see how that's a problem.  Hmm.

I still feel like a schema should describe a single data format, though, and not try and keep the entire history of the format in one document.  As you haven't set "additionalProperties" to be false, could you just omit the "anrUid" field from your v3 documentation, but include it in the data?  Old clients would still work, and if anyone's curious, they could go and find the v2 documentation to see what it used to do.

I also wonder whether that would help you test backwards-compatibility.  It sounds like your server isn't outputting just the v3 format - it's outputting v3, plus some deprecated fields so that v2 clients can still work.  If you had a history of schemas going back (instead of the entire history in one file), then you could validate your current server output against them individually, instead of having one cluttered master schema representing the amalgamated output format.
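A minimal Python sketch of checking one server response against a history of schemas, as suggested above. The two schemas, the `missing_required` and `check_against_history` helpers, and the sample output are all hypothetical. The check only looks at the boolean "required" flag used in this thread's examples and is not a full JSON Schema validator.

```python
# Hypothetical schema history, keyed by revision number.
SCHEMA_HISTORY = {
    2: {"properties": {"anrUid": {"required": True}}},
    3: {"properties": {"identityUID": {"required": True}}},
}


def missing_required(schema, instance):
    """Names of properties flagged "required": true that `instance` lacks."""
    return sorted(
        name
        for name, subschema in schema.get("properties", {}).items()
        if subschema.get("required") and name not in instance
    )


def check_against_history(history, instance):
    """Map each schema revision to the required properties missing from `instance`."""
    return {
        revision: missing_required(schema, instance)
        for revision, schema in history.items()
    }


server_output = {"identityUID": "abc-123", "anrUid": "abc-123"}
report = check_against_history(SCHEMA_HISTORY, server_output)
print(report)
```

An empty list for a revision means clients written against that revision still find every property they require.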

Dr Josef Karthauser

Sep 4, 2012, 3:09:22 AM
to json-...@googlegroups.com
On 31 Aug 2012, at 17:51, David <gerai...@gmail.com> wrote:

> Surely the JSON schema serves two purposes. One is for validation of data, the other is to allow users of data to anticipate the data that they are to expect. Given data changes shape over time the schema ought to reflect that.
>
> True - I agree that it's not just for validation, and that documentation is a goal as well.
>
> Do you need this documentation to be machine-readable?  If it's intended for humans, could it just go in the "description" property?  ({"description": "DEPRECATED IN v2", ...}).  Or even, just remove the "anrUid" property from the schema altogether, and put something in the description for "identityUID", saying that it's been replaced?

Yes, ideally. We want to be able to filter the result based upon the schema that the caller is expecting it to conform to. Our consumers have several branches of development, and we anticipate that they will want to run our API in "bleeding edge" mode, whereby we would automatically strip the deprecated properties in a Jackson output filter. That way they can see which of their unit tests fail and fix them up as necessary.
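A minimal sketch of such a "bleeding edge" filter, written in Python for brevity rather than as a Jackson filter. The `strip_deprecated` helper is a hypothetical name, the schema relies on the "deprecated" keyword proposed in this thread, and the "$ref" entries are replaced by plain string types to keep the example self-contained.

```python
def strip_deprecated(schema, instance):
    """Return a copy of `instance` without properties the schema marks as deprecated."""
    properties = schema.get("properties", {})
    result = {}
    for name, value in instance.items():
        subschema = properties.get(name, {})
        if "deprecated" in subschema:
            continue  # drop the deprecated property entirely
        if isinstance(value, dict):
            # Recurse so nested objects are filtered against their own subschema.
            value = strip_deprecated(subschema, value)
        result[name] = value
    return result


# Illustrative schema using the proposed "deprecated" keyword.
schema = {
    "type": "object",
    "properties": {
        "identityUID": {"type": "string"},
        "anrUid": {
            "type": "string",
            "deprecated": {
                "version": "2",
                "description": "Please use 'identityUID' instead.",
            },
        },
    },
}

response = {"identityUID": "abc-123", "anrUid": "abc-123"}
bleeding_edge = strip_deprecated(schema, response)
print(bleeding_edge)  # {'identityUID': 'abc-123'}
```

Properties that the schema does not mention are passed through unchanged, so the filter only ever removes fields that carry an explicit deprecation notice.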

> Have you thought about having two schemas - a "permissive" which describes the data your server is outputting, and a "strict" schema for developers to be looking at?
>
> The problem with that approach is that there are actually many steps between those two, given a constantly evolving data resource, and many consumers in different groups.
>
> Yeah, I can see how that's a problem.  Hmm.
>
> I still feel like a schema should describe a single data format, though, and not try and keep the entire history of the format in one document.  As you haven't set "additionalProperties" to be false, could you just omit the "anrUid" field from your v3 documentation, but include it in the data?  Old clients would still work, and if anyone's curious, they could go and find the v2 documentation to see what it used to do.

That might make sense, but when you consider that all of our clients, as well as us, are doing incremental development and have different release cycles, you can see that it's not as simple as supporting just a single version of a schema. As a provider we need to provide compatibility for a range of schema versions, and we want to know which old schemas we can drop support for as we track that our touch points have moved to a later schema format. We don't want to maintain multiple schema files in the code base, as that makes for a manageability issue. Better for us to have a hybrid schema that declares what the output conforms to, and which parts of the output are there just for backward compatibility. This enables us both to document clearly for the consumer what the output API does and what they should use, and also to transform the output to conform to the schema when the client specifies particular constraints on the response.

> I also wonder whether that would help you test backwards-compatibility.  It sounds like your server isn't outputting just the v3 format - it's outputting v3, plus some deprecated fields so that v2 clients can still work.  If you had a history of schemas going back (instead of the entire history in one file), then you could validate your current server output against them individually, instead of having one cluttered master schema representing the amalgamated output format.

That's true. And, given that we are incrementally developing the services, it can be as bad as one new schema required per commit! Clearly maintaining a single schema file per release wins outright.

Joe

Andrei Neculau

Sep 4, 2012, 4:40:43 AM
to json-...@googlegroups.com
Josef,


With that in mind, consider a release cycle where you bump the version when you remove properties from a schema or change their validation in a non-backward-compatible way (loosening validation, for example, does not count).
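The rule above could be sketched in Python roughly as follows. The helper names and the two sample schemas are hypothetical, and this version only detects removed properties; spotting validation that has been tightened would need a more careful comparison.

```python
def removed_properties(old_schema, new_schema):
    """Property names present in the old schema but absent from the new one."""
    old_names = set(old_schema.get("properties", {}))
    new_names = set(new_schema.get("properties", {}))
    return sorted(old_names - new_names)


def needs_version_bump(old_schema, new_schema):
    """True when the new schema drops at least one property from the old one."""
    return bool(removed_properties(old_schema, new_schema))


# Hypothetical before/after schemas for illustration.
before = {"properties": {"identityUID": {}, "anrUid": {}}}
after = {"properties": {"identityUID": {}}}

print(removed_properties(before, after))
print(needs_version_bump(before, after))
```

Adding a property, or keeping a deprecated one around, leaves the version unchanged under this rule.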

I do not know your specifics, but you can run a long time without bumping the version in this way. Keep in mind that deprecating stuff at an insane pace basically means that you give very little thought to what can happen 3 moves from now (in chess terms). You need to create a small safety net. You cannot predict the future, but you can fuck it up less or more.

Reply triggered by this tweet 
mikekelly85
"Design is more the art of preserving changeability than it is the act of achieving perfection." -- @sandimetz
9/3/12 10:19 PM