Abstract
JSON Schema is a specification for a JSON-based format for defining the structure of JSON data. JSON Schema provides a contract for what JSON data is required for a given application and how it can be modified, much like what XML Schema provides for XML. JSON Schema is intended to provide validation, documentation, and interaction control of JSON data. JSON Schema is based on concepts from XML Schema, RelaxNG, and Kwalify, but is intended to be JSON-based, so that JSON data in the form of a schema can be used to validate JSON data, the same serialization/deserialization tools can be used for the schema and data, and the schema can be self-descriptive.
Terminology
For this specification, a schema will be used to denote a JSON Schema definition, and an instance refers to the JSON object that the schema will be describing and validating.
Schema Definition
A JSON Schema is a JSON Object that defines various attributes of the instance and defines its usage and valid values. Each property of this object is a schema attribute. The following is the grammar of a JSON Schema:
schema = "{" *(<schema-attribute>,) "}" ; a JSON Object with schema attribute properties
schema-attribute = (<schema attribute name> ":" <schema attribute value>) | ("type:" <type-definition>) | ("properties:" <object-type-definition>)
type-definition = union-type-definition | simple-type-definition
simple-type-definition = "string" | "number" | "integer" | "boolean" | "object" | "array" | "null" | "any"
union-type-definition = "[" 2+(<simple-type-definition> | schema) "]"
object-type-definition = { *(property-definition,) }
property-definition = <the instance property name> ":" schema
And an example JSON Schema definition could look like:
{"description":"A person",
"type":"object",
"properties":
{"name": {"type":"string"},
"age" : {"type":"integer",
"maximum":125}}
}
A schema can have the following properties which are schema attributes (all attributes are optional):
- type
- Union type definition - An array with two or more items which indicates a union of type definitions. Each item in the array may be a simple type definition or a schema. The instance value is valid if it is of the same type as one of the type definitions in the array or if it is valid by one of the schemas in the array. For example, to indicate that a string or a number is valid:
{"type":["string","number"]}
- Simple type definition - A string indicating a primitive or simple type. The following are acceptable strings:
- string - Value must be a string.
- number - Value must be a number, floating point numbers are allowed.
- integer - Value must be an integer, no floating point numbers are allowed. This is a subset of the number type.
- boolean - Value must be a boolean.
- object - Value must be an object. Note this is equivalent to "type":{}, which is a type referring to an empty schema which allows any object.
- array - Value must be an array. Note this is equivalent to "type":[] or ["any"], which is a type referring to a schema which allows arrays of any values.
- null - Value must be null. Note this is mainly for the purpose of being able to use union types to define nullability.
- any - Value may be of any type including null.
If the property is not defined or is not in this list, then any type of value is acceptable. Other type values may be used for custom purposes, but minimal validators of the specification implementation can allow any instance value on unknown type values.
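The type rules above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, schemas nested inside a union are deliberately left out, and treating a float such as 3.0 as a non-integer is an assumption the text does not settle.

```python
def matches_simple_type(value, type_name):
    """Return True if value satisfies one simple type name."""
    # In Python, bool is a subclass of int, so booleans are excluded from numbers.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    checks = {
        "string": isinstance(value, str),
        "number": is_number,
        # Assumption: a float with no fractional part (3.0) is not an integer here.
        "integer": isinstance(value, int) and not isinstance(value, bool),
        "boolean": isinstance(value, bool),
        "object": isinstance(value, dict),
        "array": isinstance(value, list),
        "null": value is None,
        "any": True,
    }
    # Unknown type names accept any value, as permitted for minimal validators.
    return checks.get(type_name, True)


def matches_type(value, type_def):
    """Check a simple type name or a union (list) of type definitions."""
    if type_def is None:  # "type" attribute not defined: anything is acceptable
        return True
    if isinstance(type_def, list):
        return any(matches_type(value, item) for item in type_def)
    if isinstance(type_def, str):
        return matches_simple_type(value, type_def)
    # A schema used as a type needs a full validator, which this sketch omits.
    raise NotImplementedError("schema-valued type definitions are not handled")
```

For example, matches_type("abc", ["string", "number"]) accepts the value, while matches_type(True, "integer") rejects it.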
- properties
This should be an object type definition, which is an object with property definitions that correspond to instance object properties. When the instance value is an object, the property values of the instance object must conform to the property definitions in this object. In this object, each property definition's value should be a schema, and the property's name should be the name of the instance property that it defines.
- items
This should be a schema or an array of schemas. When this is an object/schema and the instance value is an array, all the items in the array must conform to this schema. When this is an array of schemas and the instance value is an array, each position in the instance array must conform to the schema in the corresponding position for this array. This is called tuple typing. When tuple typing is used, additional items are allowed, disallowed, or constrained by the additionalProperties attribute using the same rules as extra properties for objects.
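The difference between a single item schema and tuple typing can be sketched as follows. The names check_items and type_only are hypothetical, and is_valid stands in for whatever full validator is available. Handling of additional items is left out, and leaving unchecked any tuple positions that the instance does not supply is an assumption.

```python
def check_items(instance, items, is_valid):
    """Apply the "items" attribute to an instance value.

    is_valid(value, schema) -> bool is a hypothetical callback that
    validates one value against one schema.
    """
    if not isinstance(instance, list):
        return True  # "items" only constrains arrays
    if isinstance(items, dict):
        # Single schema: every item must conform to it.
        return all(is_valid(value, items) for value in instance)
    # Tuple typing: position i of the instance is checked against schema i.
    # zip() stops at the shorter sequence, so extra items are not checked here.
    return all(is_valid(value, schema) for value, schema in zip(instance, items))


def type_only(value, schema):
    """Toy validator for the usage below: understands two type names only."""
    expected = {"string": str, "integer": int}[schema["type"]]
    return isinstance(value, expected)


all_strings = check_items(["a", "b"], {"type": "string"}, type_only)
pair = check_items(["a", 1], [{"type": "string"}, {"type": "integer"}], type_only)
```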
- optional
This indicates that the instance property in the instance object is optional. This is false by default.
- additionalProperties
This provides a default property definition for all properties that are not explicitly defined in an object type definition. The value must be a schema or false. If false is provided, no additional properties are allowed, and the schema cannot be extended. The default value is an empty schema which allows any value for additional properties.
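A rough Python sketch of this rule follows. The function name and the is_valid callback are hypothetical; the callback stands in for a full validator and is not part of this specification.

```python
def check_additional_properties(instance, schema, is_valid):
    """Check the properties of instance that "properties" does not define.

    is_valid(value, schema) -> bool is a hypothetical callback that
    validates one value against one schema.
    """
    defined = schema.get("properties", {})
    extras = [name for name in instance if name not in defined]
    # The default is an empty schema, which allows any value.
    additional = schema.get("additionalProperties", {})
    if additional is False:
        return not extras  # no additional properties allowed at all
    return all(is_valid(instance[name], additional) for name in extras)


def strings_only(value, schema):
    """Toy validator for the usage below: enforces "type":"string" only."""
    return schema.get("type") != "string" or isinstance(value, str)


closed = {"properties": {"name": {}}, "additionalProperties": False}
rejected = check_additional_properties({"name": "a", "x": 1}, closed, strings_only)
```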
- requires
This indicates that if this property is present, the property given by the requires attribute must also be present. For example, if an object type definition is defined:
{"state":{"optional":true},"town":{"requires":"state","optional":true}}
An instance must include a state property if a town property is included. If a town property is not included, the state property is optional. This constraint can not be validated by event based parsers.
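The town/state rule can be checked with a few lines of Python. This is a sketch only; check_requires is a hypothetical name, and the whole instance object is assumed to be available in memory, which is exactly what an event-based parser lacks.

```python
def check_requires(instance, properties):
    """Return the (property, required_property) pairs that are violated."""
    violations = []
    for name, prop_schema in properties.items():
        required = prop_schema.get("requires")
        # The rule applies only when the requiring property is present.
        if required is not None and name in instance and required not in instance:
            violations.append((name, required))
    return violations


definition = {
    "state": {"optional": True},
    "town": {"requires": "state", "optional": True},
}
missing_state = check_requires({"town": "Springfield"}, definition)
both_present = check_requires({"town": "Springfield", "state": "CA"}, definition)
```

Here missing_state reports the town/state pair, while both_present is empty.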
- identity
This indicates that the instance property can be used to uniquely identify the instance. The value for this property should be unique for all instances of this schema. It is not necessary to validate this property. Most validators validate only a single instance object at a time, and therefore cannot determine whether other instances have unique values. This property is informative: it indicates which property can be used to identify an object in a data store. An identity property may correspond to a database key column.
- minimum
This indicates the minimum value for the instance property when the type of the instance value is a number.
- maximum
This indicates the maximum value for the instance property when the type of the instance value is a number.
- minItems
This indicates the minimum number of values in an array when an array is the instance value.
- maxItems
This indicates the maximum number of values in an array when an array is the instance value.
- pattern
When the instance value is a string, the "pattern" property must be interpreted as a regular expression as defined by ECMA 262. The instance value must match the regular expression defined in this property.
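As a rough illustration in Python: note that Python's re module is close to, but not identical with, the ECMA 262 regular expression syntax, so this is only an approximation. Treating "match" as an unanchored search is also an assumption; check_pattern is a hypothetical name.

```python
import re


def check_pattern(value, pattern):
    """Apply the "pattern" attribute; non-string values are not constrained."""
    if not isinstance(value, str):
        return True
    # Assumption: the pattern may match anywhere unless it is anchored with ^ and $.
    return re.search(pattern, value) is not None


zip_ok = check_pattern("90210", r"^\d{5}$")
zip_bad = check_pattern("9021", r"^\d{5}$")
```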
- maxLength
When the instance value is a string, this indicates maximum length of the string.
- minLength
When the instance value is a string, this indicates minimum length of the string.
- enum
This provides an enumeration of possible values that are valid for the instance property. This should be an array, and each item in the array represents a possible value for the instance value. If "enum" is included, the instance value must be one of the values in the enum array in order for the instance to be valid.
- options
This provides a list of choices for the instance property. This should be an array, and each item in the array should be an object with two possible properties. Each object must have a value property that indicates the value for the given selection. Each object may also have a label property that defines a label for the corresponding value. For example:
{"options":
[{"value":1,"label":"Small"},
{"value":2,"label":"Medium"},
{"value":3,"label":"Large"}]}
The "options" attribute does not affect validation; rather, it provides suggested values. The "options" attribute can be used for user interfaces in conjunction with "enum" to define the labels for the possible values, or it may be used on its own to provide suggested values without constraining the instance value.
- readonly
This indicates that the instance property should not be changed (this is only for interaction, it has no effect for standalone validation).
- title
This provides a short description of the instance property. The value must be a string.
- description
This provides a full description of the purpose of the instance property. The value must be a string.
- format
This indicates which of the predefined formats the data uses. The formats are defined here: http://groups.google.com/group/json-schema/web/json-schema-possible-formats. This property does not need to be validated by validators. The format is intended to be informative (although validators may optionally choose to validate that the instance value corresponds to the defined format).
- default
This indicates the default for the instance property.
- transient
This indicates that the property will be used for transient/volatile values that should not be persisted. This is false by default.
- maxDecimal
This indicates the maximum number of decimal places in a floating point number. By default there is no maximum.
- hidden
This specifies that an instance property is an internal property that should not be made visible to users. This should be ignored by validators, but form generation or object editing tools can utilize this attribute.
- disallow
This attribute may take the same values as the "type" attribute. However, if the instance matches the type, or if this value is an array and the instance matches any type or schema in the array, then this instance is not valid.
- extends
The value of this property should be another schema which will provide a base schema which the current schema will inherit from. The inheritance rules are such that any instance that is valid according to the current schema must be valid according to the referenced schema. This may also be an array, in which case, the instance must be valid by all the schemas in the array.
Below is a more sophisticated example. First, the instance object:
{
"name" : "John Doe",
"born" : "",
"gender" : "male",
"address" :
{"street":"123 S Main St",
"city":"Springfield",
"state":"CA"}
}
And here is a schema to validate it. The born property accepts either a numeric year or a full date: minimum and maximum apply when a numeric value is used, and format applies when a string value is used.
{"description":"A person",
"type":"object",
"properties": {
"name": {"type":"string"},
"born" : {"type":["integer","string"],
"minimum":1900,
"maximum":2010,
"format":"date-time",
"optional":true},
"gender" : {"type":"string",
"enum":["male","female"],
"options":[
{"value":"male","label":"Guy"},
{"value":"female","label":"Gal"}]},
"address" : {"type":"object",
"properties":{
"street":{"type":"string"},
"city":{"type":"string"},
"state":{"type":"string"}
}
}
}
}
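To show how several attributes combine, the following Python sketch validates the example instance against the example schema. It is an illustration under stated assumptions, not a complete validator: it covers only type, properties, optional, minimum, maximum, and enum; the names type_ok and validate are hypothetical; and format and options are ignored, as the text allows.

```python
PY_TYPES = {"string": str, "number": (int, float), "integer": int,
            "boolean": bool, "object": dict, "array": list, "null": type(None)}


def type_ok(value, type_def):
    """Check a simple type name, a union, or a schema used as a type."""
    if type_def is None or type_def == "any":
        return True
    if isinstance(type_def, list):
        return any(type_ok(value, item) for item in type_def)
    if isinstance(type_def, dict):
        return not validate(value, type_def)
    expected = PY_TYPES.get(type_def)
    if expected is None:
        return True  # unknown type names accept any value
    if isinstance(value, bool) and type_def in ("number", "integer"):
        return False  # Python booleans are ints; keep them out of numbers
    return isinstance(value, expected)


def validate(instance, schema):
    """Return a list of error strings; an empty list means valid."""
    if not type_ok(instance, schema.get("type")):
        return ["wrong type, expected %r" % (schema["type"],)]
    errors = []
    if isinstance(instance, (int, float)) and not isinstance(instance, bool):
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append("below minimum %r" % schema["minimum"])
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append("above maximum %r" % schema["maximum"])
    if "enum" in schema and instance not in schema["enum"]:
        errors.append("not one of %r" % (schema["enum"],))
    if isinstance(instance, dict):
        for name, prop_schema in schema.get("properties", {}).items():
            if name not in instance:
                if not prop_schema.get("optional", False):
                    errors.append("missing required property %r" % name)
                continue
            errors.extend("%s: %s" % (name, err)
                          for err in validate(instance[name], prop_schema))
    return errors


person_schema = {
    "description": "A person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "born": {"type": ["integer", "string"], "minimum": 1900,
                 "maximum": 2010, "format": "date-time", "optional": True},
        "gender": {"type": "string", "enum": ["male", "female"]},
        "address": {"type": "object", "properties": {
            "street": {"type": "string"},
            "city": {"type": "string"},
            "state": {"type": "string"}}},
    },
}
person = {"name": "John Doe", "born": "", "gender": "male",
          "address": {"street": "123 S Main St", "city": "Springfield",
                      "state": "CA"}}
errors = validate(person, person_schema)
```

Returning a list of messages rather than a single boolean is a design choice of this sketch; it makes it easier to see which property failed.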
Extending and Referencing
Reusing existing JSON Schemas is strongly encouraged. JSON Schemas can be reused through referencing and inheritance. It is recommended that an "id" property be used for identification and a "$ref" property be used to reference other objects, as defined here: http://www.json.com/2007/10/19/json-referencing-proposal-and-library/. Referencing by path is also allowed with this convention. This allows for circular references within schemas. Schemas from other domains may also be referenced using this convention, and it is recommended to use other common schemas whenever possible. A set of common schemas can be found at: http://groups.google.com/group/json-schema/web/common-json-schema-definitions (and in the future at http://json-schema.org).
Extending a schema can be done by referring to the base schema with the "extends" property. If schema A has an "extends" property that refers to schema B, then A extends the base schema B, which requires that any value that is a valid instance of schema A MUST be a valid instance of schema B. An extended schema may redefine object properties, as long as the definition is equally or more constraining than the same property in the base schema. If the base schema has set additionalProperties to false, the extended schema may not define additional object properties. If the additionalProperties attribute is not defined in the base schema, the extended schema may define additional object properties. If the additionalProperties attribute is defined as a schema, additional object properties must be at least as constraining as the definition in additionalProperties. For example, the following schema can be defined:
{
"id":"person",
"type":"object",
"properties":{
"name":{"type":"string"},
"age":{"type":"integer"}
}
}
Which can be extended:
{"id":"marriedperson",
"extends":{"$ref":"person"},
"properties":{
"age":{"type":"integer",
"minimum":17},
"spouse":{"$ref":"marriedperson"}
}
}
Here the second schema, marriedperson, inherits from the first schema, person. The marriedperson schema inherits the name property from person and redefines the age property to have a minimum value of 17 (which is more restrictive than the person's definition). It also adds an additional property, spouse. The spouse is then defined to have values that conform to the marriedperson schema.
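One way to picture the inheritance rule is to collect every schema an instance must satisfy. The Python sketch below does this under simplifying assumptions: ids are looked up in a plain in-memory dictionary (the hypothetical registry argument) rather than being resolved as URLs, and the names resolve and schema_chain are hypothetical.

```python
def resolve(node, registry):
    """Replace a {"$ref": id} object with the schema registered under that id."""
    if isinstance(node, dict) and "$ref" in node:
        return registry[node["$ref"]]
    return node


def schema_chain(schema, registry):
    """Return schema followed by every base schema reachable via "extends"."""
    chain, pending, seen = [], [schema], set()
    while pending:
        current = resolve(pending.pop(0), registry)
        if id(current) in seen:  # guard against circular "extends" references
            continue
        seen.add(id(current))
        chain.append(current)
        bases = current.get("extends", [])
        # "extends" may hold a single schema or an array of schemas.
        pending.extend(bases if isinstance(bases, list) else [bases])
    return chain


person = {"id": "person", "type": "object", "properties": {
    "name": {"type": "string"}, "age": {"type": "integer"}}}
marriedperson = {"id": "marriedperson", "extends": {"$ref": "person"},
                 "properties": {"age": {"type": "integer", "minimum": 17},
                                "spouse": {"$ref": "marriedperson"}}}
registry = {"person": person, "marriedperson": marriedperson}

chain_ids = [schema["id"] for schema in schema_chain(marriedperson, registry)]
```

An instance is then valid for marriedperson only if it is valid for every schema in the chain.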
Self-Defined Schema Convention
JSON instance objects can also have a self-defined object type definition. The recommended convention is that an instance object uses a $schema property to refer to an object type definition which defines the type of the referring object. When defining a schema from an instance, the $schema property SHOULD be the first property (for linear parsers). Note that this is not limited to the root object; any JSON object can refer to a schema to provide self-definition. For example:
{
"$schema":
{"properties":{
"name": {"type":"string"},
"age" : {"type":"integer",
"maximum":125,
"optional":true}
}
},
"name" : "John Doe",
"age" : 30,
"type" : "object"
}
Note that self-defined object type definitions are optional, due to the possibility of name clashes.
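A consumer of such an object might separate the embedded schema from the data before validating. The Python sketch below shows one way to do that; split_self_defined is a hypothetical name, and dropping $schema from the data before validation is an assumption that the convention does not spell out.

```python
import json


def split_self_defined(instance):
    """Return (schema, data), where data is the instance without "$schema"."""
    schema = instance.get("$schema")  # None when the object is not self-defined
    data = {key: value for key, value in instance.items() if key != "$schema"}
    return schema, data


text = """
{
  "$schema": {"properties": {
      "name": {"type": "string"},
      "age": {"type": "integer", "maximum": 125, "optional": true}}},
  "name": "John Doe",
  "age": 30
}
"""
schema, data = split_self_defined(json.loads(text))
```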
Schema Definition Location Conventions
There are a couple of recommended ways to correlate schemas with JSON data without actually including the schema in the object. By using id referencing (per JSPON), ids provide an implicit URL through the web’s relative URL scheme. For example, if an object is requested from http://mydomain.com/jsonData and returns:
{ "$schema":{"$ref":"mySchema"},
"foo":"bar"
}
The reference to the schema here is an id reference, and using relative URL rules, it indicates to a client that a schema for this object can be retrieved from http://mydomain.com/mySchema. We can also use an absolute URL reference:
{ "$schema":{"$ref":"http://mydomain.com/myObjectTypeDefinition"},
"foo":"bar"}
Tools
A JavaScript implementation of a JSON Schema validator is available here.