Abstract
JSON Schema is a specification for a JSON-based format for defining
the structure of JSON data. JSON Schema provides a contract for what
JSON data is required for a given application and how it can be
modified, much like what XML Schema provides for XML. JSON Schema is
intended to provide validation, documentation, and interaction control
of JSON data. JSON Schema is based on concepts from XML Schema, RelaxNG, and Kwalify,
but is itself JSON-based, so that JSON data in the form of a
schema can be used to validate JSON data, the same
serialization/deserialization tools can be used for the schema and
the data, and it can be self-descriptive.
Terminology
For this specification, a schema will be used to denote a JSON Schema definition, and an instance refers to the JSON object that the schema will be describing and validating.
Schema Definition
A JSON Schema is a JSON Object that defines various attributes of
the instance, including its usage and valid values. Each property of
that object is a schema attribute. The following is the
grammar of a JSON Schema:
schema = "{" *(<schema-attribute>,) "}" ; a JSON Object with schema attribute properties
schema-attribute = (<schema attribute name> ":" <schema attribute value>) |
("type:" <type-definition>) |
("properties:" <object-type-definition>)
type-definition = union-type-definition | simple-type-definition
simple-type-definition = "string" | "number" | "integer" | "boolean" | "object" | "array" | "null" | "any"
union-type-definition = "[" 2+(<simple-type-definition> | schema) "]"
object-type-definition = "{" *(property-definition,) "}"
property-definition = <the instance property name> ":" schema
And an example JSON Schema definition could look like:
{"description":"A person",
"type":"object",
"properties":
{"name": {"type":"string"},
"age" : {"type":"integer",
"maximum":125}}
}
A schema can have the following properties which are schema attributes (all attributes are optional):
- type
- Union type definition - An array with two or more items which
indicates a union of type definitions. Each item in the array may be a
simple type definition or a schema. The instance value is valid if it
is of the same type as one of the type definitions in the array or if it
is valid by one of the schemas in the array. For example, to indicate
that a string or a number is valid:
{"type":["string","number"]}
- Simple type definition - A string indicating a primitive or simple type. The following are acceptable strings:
- string - Value must be a string.
- number - Value must be a number, floating point numbers are allowed.
- integer - Value must be an integer, no floating point numbers are allowed. This is a subset of the number type.
- boolean - Value must be a boolean.
- object - Value must be an object.
- array - Value must be an array.
- null - Value must be null. Note that this is mainly for the purpose of being able to use union types to define nullability.
- any - Value may be of any type including null.
If the type
attribute is not defined or its value is not in this list, then any type of value
is acceptable. Other type values may be used for custom purposes, but
minimal validator implementations of the specification may allow any
instance value when the type value is unknown.
- properties
This should be an object type definition,
which is an object with property definitions that correspond to
instance object properties. When the instance value is an object, the
property values of the instance object must conform to the property
definitions in this object. In this object, each property definition's
value should be a schema, and the property's name should be the name of
the instance property that it defines.
- items
This should be a schema or an array of
schemas. When this is an object/schema and the instance value is an
array, all the items in the array must conform to this schema. When
this is an array of schemas and the instance value is an array, each
position in the instance array must conform to the schema in the
corresponding position for this array. This is called tuple typing. When
tuple typing is used, additional items are allowed, disallowed, or
constrained by the additionalProperties attribute using the same rules
as extra properties for objects.
- optional
This indicates that the instance property in the instance object is optional. This is false by default.
- additionalProperties
This provides a default
property definition for all properties that are not explicitly defined
in an object type definition. The value must be a schema or false. If false is
provided, no additional properties are allowed, and the schema cannot
be extended. The default value is an empty schema which allows any
value for additional properties.
- requires
This indicates that if this property is present, the property named by the requires attribute must also be present. For example, if an object type definition is defined as:
{"state":{"optional":true},"town":{"requires":"state","optional":true}}
then an instance must include a state property if a town property is included. If a town property is not included, the state property is optional. This constraint cannot be validated by event-based parsers.
- identity
This indicates that the instance property can be used to uniquely identify
the instance. It is false by default. A property or a set of properties that have their identity property set to true should be unique for all instances of this schema. It is
not necessary to validate the identity property. Most validators may only
validate a single instance object at a time and therefore cannot determine
whether other instances have unique values. This property is informative,
indicating which property can be used to identify an object from a data
store. An identity property may correspond to a database key column.
- minimum
This indicates the minimum value for the instance property when the type of the instance value is a number.
- maximum
This indicates the maximum value for the instance property when the type of the instance value is a number.
- minItems
This indicates the minimum number of values in an array when an array is the instance value.
- maxItems
This indicates the maximum number of values in an array when an array is the instance value.
- pattern
When the instance value is a string, the "pattern" property must be interpreted as a regular expression as defined by ECMA 262. The instance value must match the regular expression defined in this property.
- maxLength
When the instance value is a string, this indicates the maximum length of the string.
- minLength
When the instance value is a string, this indicates the minimum length of the string.
- enum
This provides an enumeration of possible values
that are valid for the instance property. This should be an array, and
each item in the array represents a possible value for the instance
value. If "enum" is included, the instance value must be one of the
values in the enum array in order for the instance to be valid.
- options
This
provides a list of choices for the instance property. This should be an
array, and each item in the array should be an object with two possible
properties. Each object must have a value property that indicates the
value for the given selection. Each object may also have a label
property that defines a label for the corresponding value. For example:
{"options":
[{"value":1,"label":"Small"},
{"value":2,"label":"Medium"},
{"value":3,"label":"Large"}]}
The
"options" attribute does not affect validation, rather it provides
suggested values. The "options" attribute can be used for user
interfaces in conjunction with "enum" to define the labels for the
possible values, or it may be used on its own to provide suggested
values without constraining the instance value.
- readonly
This indicates that the instance property
should not be changed (this is only for interaction, it has no effect
for standalone validation).
- title
This provides a short description of the instance property. The value must be a string.
- description
This provides a full description of the purpose of the instance property. The value must be a string.
- format
This indicates what format the data is among predefined formats which are defined here: http://groups.google.com/group/json-schema/web/json-schema-possible-formats.
This property does not need to be validated by validators. The format
is intended to be informative (although validators may optionally choose
to validate that the instance value corresponds to the defined format).
- default
This indicates the default for the instance property.
- transient
This indicates that the property will be
used for transient/volatile values that should not be persisted. This
is false by default.
- maxDecimal
This indicates the maximum number of decimal places in a floating point number. By default there is no maximum.
- hidden
This specifies that an
instance property is an internal property that should not be made
visible to users. This should be ignored by validators, but form
generation or object editing tools can utilize this attribute. - disallow
This
attribute may take the same values as the "type" attribute, however if
the instance matches the type or if this value is an array and the
instance matches any type or schema in the array, than this instance is
not valid.
- extends
The value of this property should be another
schema which will provide a base schema which the current schema will
inherit from. The inheritance rules are such that any instance that is
valid according to the current schema must be valid according to the
referenced schema. This may also be an array, in which case, the
instance must be valid by all the schemas in the array.
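The attribute rules above can be sketched as a small validator. This is only an illustrative sketch in Python, not a reference implementation: the names validate, matches_type, and TYPE_CHECKS are hypothetical, Python's re module stands in for ECMA 262 regular expressions (the two dialects differ in places), "must match" is assumed to mean an unanchored search, and tuple typing, format, maxDecimal, and extends are left out.

```python
import re

# Simple type names from the list above, mapped to Python-side checks.
# bool is excluded from the numeric checks because Python's bool is an int.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "null": lambda v: v is None,
    "any": lambda v: True,
}


def matches_type(value, type_def):
    """Check a value against a simple type name, a union array, or a schema."""
    if isinstance(type_def, list):  # union: any member may match
        return any(matches_type(value, member) for member in type_def)
    if isinstance(type_def, dict):  # a schema used as a union member
        return not validate(value, type_def)
    # Unknown type names accept any value, as the text permits.
    return TYPE_CHECKS.get(type_def, lambda v: True)(value)


def validate(value, schema, path="instance"):
    """Return a list of error strings; an empty list means the value is valid."""
    if "type" in schema and not matches_type(value, schema["type"]):
        return [f"{path}: does not match type {schema['type']!r}"]
    errors = []
    if "disallow" in schema and matches_type(value, schema["disallow"]):
        errors.append(f"{path}: matches a disallowed type")
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of the enum values")
    if TYPE_CHECKS["number"](value):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, str):
        if len(value) < schema.get("minLength", 0):
            errors.append(f"{path}: shorter than minLength")
        if len(value) > schema.get("maxLength", len(value)):
            errors.append(f"{path}: longer than maxLength")
        if "pattern" in schema and not re.search(schema["pattern"], value):
            errors.append(f"{path}: does not match pattern")
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer items than minItems")
        if len(value) > schema.get("maxItems", len(value)):
            errors.append(f"{path}: more items than maxItems")
        if isinstance(schema.get("items"), dict):  # one schema for all items
            for index, item in enumerate(value):
                errors += validate(item, schema["items"], f"{path}[{index}]")
    if isinstance(value, dict):
        properties = schema.get("properties", {})
        for name, prop in properties.items():
            if name not in value:
                if not prop.get("optional", False):  # optional is false by default
                    errors.append(f"{path}.{name}: required property is missing")
                continue
            if "requires" in prop and prop["requires"] not in value:
                errors.append(f"{path}.{name}: requires {prop['requires']}")
            errors += validate(value[name], prop, f"{path}.{name}")
        extra = schema.get("additionalProperties", {})  # default: empty schema
        for name in value:
            if name in properties:
                continue
            if extra is False:
                errors.append(f"{path}.{name}: additional property not allowed")
            else:
                errors += validate(value[name], extra, f"{path}.{name}")
    return errors


person = {"description": "A person", "type": "object",
          "properties": {"name": {"type": "string"},
                         "age": {"type": "integer", "maximum": 125}}}
print(validate({"name": "John Doe", "age": 30}, person))   # prints []
print(validate({"name": "John Doe", "age": 130}, person))
```

Returning a list of messages rather than a boolean is a design choice of this sketch; it lets a caller report every violated attribute with the path of the offending property.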
Below is a more sophisticated example. First, the instance object:
{
"name" : "John Doe",
"born" : "",
"gender" : "male",
"address" :
{"street":"123 S Main St",
"city":"Springfield",
"state":"CA"}
}
And here is a schema to validate it. The born property allows either a numeric year or a full date string; minimum and maximum apply when a numeric value is used, and format applies when a string value is used:
{"description":"A person",
"type":"object",
"properties": {
"name": {"type":"string"},
"born" : {"type":["integer","string"],
"minimum":1900,
"maximum":2010,
"format":"date-time",
"optional":true},
"gender" : {"type":"string",
"enum":["male","female"],
"options":[
{"value":"male","label":"Guy"},
{"value":"female","label":"Gal"}]},
"address" : {"type":"object",
"properties":{
"street":{"type":"string"},
"city":{"type":"string"},
"state":{"type":"string"}
}
}
}
}
And for an example using an array, here is an instance:
{
"name" : "John Doe",
"hobbies" : ["Skiing", "Stamp collecting"]
}
We can validate that the hobbies property must be an array of strings:
{"description":"A person",
"type":"object",
"properties": {
"name": {"type":"string"},
"hobbies" : {
"type":"array",
"items": {"type":"string"}
}
}
}
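The example above uses a single schema for "items". The other form described for the items attribute, tuple typing, might be checked roughly as follows. This is a sketch under assumptions: check_tuple and has_type are hypothetical names, only the "type" attribute with simple type names is checked, and an instance array shorter than the list of schemas is treated as acceptable because the text does not say otherwise.

```python
# Python classes for the simple type names handled by this sketch.
SIMPLE_TYPES = {
    "string": (str,),
    "integer": (int,),
    "number": (int, float),
    "boolean": (bool,),
}


def has_type(value, schema):
    """Check only the "type" attribute of a schema, for simple type names."""
    type_name = schema.get("type")
    if not isinstance(type_name, str) or type_name not in SIMPLE_TYPES:
        return True  # no type, a union, or a name this sketch does not know
    expected = SIMPLE_TYPES[type_name]
    if isinstance(value, bool) and bool not in expected:
        return False  # Python's bool is an int, so exclude it explicitly
    return isinstance(value, expected)


def check_tuple(values, schema):
    """Return True if an instance array conforms to tuple-typed "items"."""
    item_schemas = schema["items"]  # an array of schemas, one per position
    extra = schema.get("additionalProperties", {})  # governs additional items
    for index, value in enumerate(values):
        if index < len(item_schemas):
            if not has_type(value, item_schemas[index]):
                return False
        elif extra is False:
            return False  # additional items are disallowed
        elif not has_type(value, extra):
            return False  # additional items are constrained by a schema
    return True


pair = {"type": "array",
        "items": [{"type": "string"}, {"type": "integer"}],
        "additionalProperties": False}
print(check_tuple(["Skiing", 3], pair))   # prints True
print(check_tuple([3, "Skiing"], pair))   # prints False
```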
MIME Type
The MIME type for JSON Schema entities is application/x-schema+json.
Extending and Referencing
Reusing existing JSON Schemas is strongly encouraged. JSON Schemas
can be reused through referencing and inheritance. Referencing
uses an identity property for identification and a "$ref" property
to reference other objects, as defined here: http://www.json.com/2007/10/19/json-referencing-proposal-and-library/. The schema for schemas
defines the identity property for schemas to be "id", and therefore the
"id" property can be used to identify a schema. Referencing by path is
also allowed with this convention. This allows for circular references
within schemas. Schemas from other domains may also be referenced
using this convention, and it is recommended to use other common
schemas whenever possible. A set of common schemas can
be found at: http://groups.google.com/group/json-schema/web/common-json-schema-definitions (and in the future at http://json-schema.org).
Extending a schema can be done by referring to the base schema with the
"extends" property. If schema A has an "extends" property that refers
to schema B, that means A extends the base schema, B, which requires
that any value that is a valid instance of schema A MUST also be a valid
instance of schema B. An extended schema may redefine object
properties, as long as the new definition is equally or more
constraining than the same property in the base schema. If the base schema
has set additionalProperties to false, the extended schema may not
define additional object properties. If the additionalProperties
attribute is not defined, the extended schema may define additional
object properties. If the additionalProperties attribute is defined,
additional object properties must be at least as constraining as the
definition in additionalProperties. For example, the following
base schema may be defined:
{
"id":"person",
"type":"object",
"properties":{
"name":{"type":"string"},
"age":{"type":"integer"}
}
}
Which can be extended:
{"id":"marriedperson",
"extends":{"$ref":"person"},
"properties":{
"age":{"type":"integer",
"minimum":17},
"spouse":{"$ref":"marriedperson"}
}
}
Here the second schema, marriedperson, inherits from the first schema, person. The marriedperson schema inherits the name property from person and redefines the age property to have a minimum value of 17 (which is more restrictive than the person definition). It also adds an additional property, spouse, whose values must conform to the marriedperson schema.
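One way to read the inheritance rule is that an instance is checked against the extending schema and against every schema it extends. The sketch below illustrates that reading with the person and marriedperson schemas. It makes several assumptions: the registry dictionary keyed by "id" and the function names are hypothetical, "$ref" values are looked up by id only (no path or URL references), and the property check covers just "type" and "minimum".

```python
def resolve(node, registry):
    """Follow a {"$ref": id} reference into the registry; other nodes pass through."""
    if isinstance(node, dict) and "$ref" in node:
        return registry[node["$ref"]]
    return node


def schema_chain(schema, registry):
    """Return the schema plus every schema it extends, directly or indirectly."""
    chain, pending, seen = [], [schema], set()
    while pending:
        current = resolve(pending.pop(), registry)
        if id(current) in seen:  # guard against circular "extends"
            continue
        seen.add(id(current))
        chain.append(current)
        bases = current.get("extends", [])  # a schema, a reference, or an array
        pending.extend(bases if isinstance(bases, list) else [bases])
    return chain


PY_TYPES = {"string": str, "integer": int}


def check_properties(instance, schema, registry):
    """Check "type" and "minimum" for each defined property that is present."""
    for name, prop in schema.get("properties", {}).items():
        if name not in instance:
            continue  # presence rules are outside the scope of this sketch
        prop = resolve(prop, registry)
        type_name = prop.get("type")
        expected = PY_TYPES.get(type_name) if isinstance(type_name, str) else None
        if expected and not isinstance(instance[name], expected):
            return False
        if "minimum" in prop and instance[name] < prop["minimum"]:
            return False
    return True


def is_valid(instance, schema, registry):
    """An instance must satisfy the schema and all of its base schemas."""
    return all(check_properties(instance, s, registry)
               for s in schema_chain(schema, registry))


person = {"id": "person", "type": "object",
          "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}
married = {"id": "marriedperson", "extends": {"$ref": "person"},
           "properties": {"age": {"type": "integer", "minimum": 17},
                          "spouse": {"$ref": "marriedperson"}}}
registry = {s["id"]: s for s in (person, married)}

print(is_valid({"name": "John Doe", "age": 30}, married, registry))  # prints True
print(is_valid({"name": "John Doe", "age": 16}, married, registry))  # prints False
```

Checking each schema in the chain separately means the sketch never has to merge property definitions; the "equally or more constraining" rule then only affects whether any instance can satisfy both schemas.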
We can also reuse the person definition when defining properties and items in other schemas. For example:
{"id":"group",
"properties":{
"leader":{"$ref":"person"},
"members":{
"type":"array",
"items":{"$ref":"person"}
} }
}
Here a group schema defines that groups (group instances) should have a leader property whose value should be an instance of the person schema, and a members property whose value is an array where every item in the array should be an instance of the person schema.
Self-Defined Schema Convention
JSON instance objects can also have a self-defined object type
definition. The recommended convention is that an instance object use a
$schema property to refer to an object type definition which defines
the type of the referring object. When defining a schema from an
instance, the $schema property SHOULD be the first property (for linear
parsers). Note that this is not limited to the root object, but any
JSON object can refer to a schema to provide self-definition. For
example:
{
"$schema":
{"properties":{
"name": {"type":"string"},
"age" : {"type":"integer",
"maximum":125,
"optional":true}
}
},
"name" : "John Doe",
"age" : 30,
"type" : "object"
}
Note that self-defined object type definitions are optional, due to the possibility of name clashes.
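A consumer of a self-defined instance might separate the "$schema" property from the remaining instance data before validating. The following is a minimal sketch of that step; split_self_described and schema_comes_first are hypothetical helper names, and the sample text is a syntactically valid rendering of the example above.

```python
import json


def split_self_described(document):
    """Separate the "$schema" property from the remaining instance properties."""
    schema = document.get("$schema")
    data = {name: value for name, value in document.items() if name != "$schema"}
    return schema, data


def schema_comes_first(document):
    """Report whether "$schema" is the first property, as recommended for linear parsers."""
    return next(iter(document), None) == "$schema"


text = """
{
  "$schema": {"properties": {
      "name": {"type": "string"},
      "age": {"type": "integer", "maximum": 125, "optional": true}}},
  "name": "John Doe",
  "age": 30,
  "type": "object"
}
"""
document = json.loads(text)  # the resulting dict keeps the property order of the text
schema, data = split_self_described(document)
print(schema_comes_first(document))  # prints True
print(sorted(data))                  # prints ['age', 'name', 'type']
```

Note that "type" ends up in the instance data, not in the schema, which illustrates the kind of name clash mentioned above: property names that look like schema attributes are ordinary data once they appear outside "$schema".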
Schema Definition Location Conventions
There are a couple of recommended ways to correlate schemas with JSON
data without actually including the schema in the object itself.
By using id referencing (per JSPON), ids provide an
implicit URL through the web’s relative URL scheme. For example, if an
object is requested from http://mydomain.com/jsonData and returns:
{ "$schema":{"$ref":"mySchema"},
"foo":"bar"
}
The reference to the schema here is an id reference, and using
relative URL rules, it indicates to a client that a schema for this
object can be retrieved from http://mydomain.com/mySchema. We can also use
an absolute URL reference:
{ "$schema":{"$ref":"http://mydomain.com/myObjectTypeDefinition"},
"foo":"bar"}
Tools
A JavaScript implementation of a JSON Schema validator is available here.