In my work developing JSON-stat visualizations and viewers like the Table Browser (https://json-stat.org/format/browser/), I came to appreciate how convenient it would be to provide more hints on how to first display the data. For instance, it seems useful to allow producers to tag a dimension category as the "default", meaning that, if an interface needs to pre-select one and only one value (for example, in a pull-down menu), the "default" category is the one to pick. With no indication of a "default" category, the interface is forced to arbitrarily select the first (or the last) category, or one with a particular id (like "0") or label (like "total").
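To make the idea concrete, here is a minimal sketch of how a viewer might pre-select a category. It assumes the proposed "default" property lives inside the dimension's "category" object (it is not part of the standard today) and falls back to the first category when it is absent. The function names and the "Copenhagen" category are hypothetical, for illustration only.

```python
def category_ids(dimension):
    """Return the category ids of a JSON-stat dimension in display order."""
    category = dimension["category"]
    index = category.get("index")
    if isinstance(index, list):
        return list(index)
    if isinstance(index, dict):
        # Object form: each id maps to its position.
        return sorted(index, key=index.get)
    # No "index" (single-category dimension): use the "label" keys.
    return list(category.get("label", {}))


def initial_category(dimension):
    """Pick the category to pre-select in a single-choice control."""
    ids = category_ids(dimension)
    # "default" is the PROPOSED property, assumed to sit inside "category".
    default = dimension["category"].get("default")
    if default in ids:
        return default
    # Without a hint, arbitrarily fall back to the first category.
    return ids[0] if ids else None


region = {
    "label": "region",
    "category": {
        "index": ["101", "000"],
        "label": {"101": "Copenhagen", "000": "All Denmark"},
        "default": "000",
    },
}
print(initial_category(region))  # prints 000
```

Without the "default" entry the same call would return "101", which is exactly the arbitrary choice the proposal tries to avoid.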
A long time ago I also discussed with Trygve Falch (Statistics Norway) the use of JSON-stat as a metadata format, replacing the PC-Axis metadata format. Trygve warned me that this is currently not 100% possible because JSON-stat has no way to express the PC-Axis concept of "elimination". Lars Knudsen (Statistics Denmark) expressed a similar concern more recently.
Considering that JSON-stat adoption has been partly tied to PX-Web, it makes a lot of sense to be very attentive to the needs of PC-Axis-based producers, as they make up most of the JSON-stat providers. That said, I try to keep the standard as simple as possible and only add features that can be considered general needs. That's why I do not plan to propose an "elimination" property, even though it would match PC-Axis needs perfectly: a property whose meaning is too narrow would not address as general a problem as possible (but of course I might be wrong in trying to solve too many cases at the same time).
"elimination" is used in PC-Axis metadata responses to provide information on how a dataset can be customized (what dimensions can be dropped). It can take two forms:
ELIMINATION("sex")=YES;
ELIMINATION("region")="All Denmark";
My standardization proposal is based on the general properties "default" (already mentioned) and "required", as a simple way to provide more metadata on dimensions. As I said earlier, "default" only means that a certain category in a dimension is the one to pick if the context requires picking a single one. "required" (true by default) only means that, when it is false, the dimension may be removed from the dataset ("how" is up to the provider).
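As a rough sketch of the "required" semantics from a consumer's point of view, the snippet below treats a missing "required" as true and lists the dimensions a client could offer to drop. The helper names are hypothetical, and `dimensions` is assumed to be a mapping from dimension id to dimension object.

```python
def is_required(dimension):
    """A dimension is required unless it explicitly says "required": false."""
    # "required" is the PROPOSED property; absent means true.
    return dimension.get("required", True) is not False


def removable_dimensions(dimensions):
    """Return the ids of the dimensions that may be removed from a dataset."""
    return [dim_id for dim_id, dim in dimensions.items() if not is_required(dim)]


dimensions = {
    "sex": {"label": "sex", "required": False},
    "region": {"label": "region", "required": True},
    "time": {"label": "time"},  # no "required": treated as true
}
print(removable_dimensions(dimensions))  # prints ['sex']
```

How a removed dimension is actually handled (aggregation, a total category, etc.) stays with the provider; the sketch only reads the flag.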
To express ELIMINATION("sex")=YES, PC-Axis providers would be able to use:
"sex" : {
  "label" : ...,
  "category" : {
    "index": ...,
    "label": ...
  },
  "required": false
}
To express ELIMINATION("region")="All Denmark", PC-Axis providers would be able to use:
"region" : {
  "label" : ...,
  "category" : {
    "index": ...,
    "label": ...,
    "default": "000" // Category id for "All Denmark"
  },
  "required": false
}
("default" applies to categories; "required" to dimensions. Both are non-mandatory.)
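The two mappings above could be automated along these lines. This is only a sketch under stated assumptions: the function name `apply_elimination` is hypothetical, the ELIMINATION value is assumed to be already parsed into either `True` (for YES) or a category label string, and the label is resolved to an id through the dimension's own "label" object. The "Copenhagen" category is made up for the example.

```python
import copy


def apply_elimination(dimension, elimination):
    """Return a copy of `dimension` carrying the proposed properties.

    `elimination` is True for ELIMINATION(...)=YES, or a category label
    (e.g. "All Denmark") for ELIMINATION(...)="label".
    """
    result = copy.deepcopy(dimension)
    result["required"] = False  # proposed property, set on the dimension
    if elimination is True:
        return result
    labels = result["category"].get("label", {})
    matches = [cid for cid, text in labels.items() if text == elimination]
    if len(matches) != 1:
        raise ValueError(f"cannot resolve category label {elimination!r}")
    result["category"]["default"] = matches[0]  # proposed property, set on the category
    return result


region = {
    "label": "region",
    "category": {
        "index": ["000", "101"],
        "label": {"000": "All Denmark", "101": "Copenhagen"},
    },
}
converted = apply_elimination(region, "All Denmark")
print(converted["required"], converted["category"]["default"])  # prints False 000
```

The input dimension is left untouched, so the same metadata can still be served without the extra properties.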
In a non-PC-Axis context, a provider that does not offer a mechanism to remove dimensions would probably not use "required" (it would always be true) but would still benefit from "default" to signal the existence of a main category in a dimension. (Or maybe it could use "required": false in data responses to suggest a simplified view of a dataset.)
Should "required" and "default" be added to the standard? Is there a general need for them? (Otherwise they are better kept inside "extension".) Is their meaning too open? Are they trying to solve too many (different) cases at the same time?
If added, do you plan to use any of these new properties? How?
All feedback is welcome and appreciated.