Proposal: An ergonomic micro-language that round-trips to JSONSchema

Lloyd Hilaiel

Oct 2, 2009, 12:35:17 PM
to JSON Schema
Hi,

Here's a little informal proposal I wrote: http://trickyco.de/2009/10/02/orderly-jsonschema

The idea is to add a layer that could make JSONSchema easier to read
and write.

curiously,
lloyd

P.S. Seems like this group needs some moderation. The level of spam
on goog's groups lately is horrible :(

Hatem Nassrat

Oct 2, 2009, 1:56:44 PM
to json-...@googlegroups.com
On Fri, Oct 2, 2009 at 10:35 AM, Lloyd Hilaiel <llo...@gmail.com> wrote:
> Here's a little informal proposal I wrote:  http://trickyco.de/2009/10/02/orderly-jsonschema
>
> The idea is to add a layer that could make JSONSchema easier to read
> and write.

So would the idea be to create a compiler from Orderly to JSONSchema?
If so, then the features of Orderly must be a strict subset of those of
JSONSchema (I am not sure whether they are; I am not well versed in
JSONSchema). I really like the Diet, and I am a supporter of easier to
read and write schemas versus larger, harder to maintain documents. I
joined this list specifically to mention this issue. I really hope
this picks up, as I would love to write, use, and read schemas using
Orderly.

--
Hatem Nassrat

Toby Inkster

Oct 2, 2009, 6:34:43 PM
to json-...@googlegroups.com
On Fri, 2009-10-02 at 09:35 -0700, Lloyd Hilaiel wrote:
> Here's a little informal proposal I wrote:
> http://trickyco.de/2009/10/02/orderly-jsonschema

I would suggest placing the maxima and minima for values after the type
rather than the name. That is, not:

integer age[,125];

But rather:

integer[,125] age;

This makes the syntax more consistent with your suggested syntax for
unions:

union {
    integer[1900,2010];
    string;
} born;

And require, or at least allow, the name to be quoted:

integer[,125] "age";

Because JSON keys may contain spaces.

Not sure about the square brackets to represent optional bits. How about
the question mark?

object {
    string "name";
    integer[,125] "age" ?;
    array {
        object {
            string "street-address" ?;
            string "location";
            string "region" ?;
        };
    } "address" ?;
} "person";
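
As a rough illustration of what a declaration in this suggested syntax could translate to, here is a minimal Python sketch. It is not the Orderly grammar or any real tool: the `translate` function, the `DECL` pattern, and the choice to map a range to `minimum`/`maximum` for numbers and to `minLength`/`maxLength` for strings are assumptions made for the example.

```python
import re

# Hypothetical mini-translator for ONE declaration in the syntax suggested
# above, e.g.  integer[,125] "age" ?;   It is not the real Orderly grammar.
DECL = re.compile(
    r'^\s*(?P<type>integer|number|string)'        # base type
    r'(?:\[(?P<min>[^,\]]*),(?P<max>[^\]]*)\])?'  # optional [min,max] after the type
    r'\s+(?P<name>"[^"]+"|\w+)'                   # quoted or bare property name
    r'\s*(?P<opt>\?)?\s*;\s*$'                    # optional marker, then ';'
)

def translate(line):
    """Return (name, schema_dict, required) for a single declaration."""
    m = DECL.match(line)
    if m is None:
        raise ValueError(f"cannot parse: {line!r}")
    schema = {"type": m.group("type")}
    # Assumption: ranges bound the length of strings and the value of numbers.
    if m.group("type") == "string":
        low_key, high_key = "minLength", "maxLength"
    else:
        low_key, high_key = "minimum", "maximum"
    if m.group("min"):
        schema[low_key] = int(m.group("min"))
    if m.group("max"):
        schema[high_key] = int(m.group("max"))
    name = m.group("name").strip('"')
    return name, schema, m.group("opt") is None

print(translate('integer[,125] "age" ?;'))
```

Quoting the name costs nothing in this sketch: the pattern accepts either a bare word or a quoted string, which is what allows a key such as "street-address".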

--
Toby A Inkster
<mailto:ma...@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Alexandre Morgaut

Oct 2, 2009, 7:14:53 PM
to json-...@googlegroups.com

you can omit brackets from object (based on indentation)
you can use pipes for unions

union {
    integer[1900,2010];
    string;
} born;

could become

integer[1900,2010] | string born;

you can use min and max numbers for repetitions (arrays)


object {
    string "name";
    integer[,125] "age" ?;
    array {
        object {
            string "street-address" ?;
            string "location";
            string "region" ?;
        };
    } "address" ?;
} "person";


might be written even without semicolons as


person
    string name
    integer[,125] age ?
    address[1,] ?
        string "street-address" ?
        string location
        string region ?


YAML is also an interesting format

Tatu Saloranta

Oct 3, 2009, 3:23:19 AM
to json-...@googlegroups.com
On Fri, Oct 2, 2009 at 4:14 PM, Alexandre Morgaut
<alexandr...@4d.fr> wrote:
>
> you can omit brackets from object (based on indentation)

I would recommend against using indentation or linefeeds for semantic
information, or making separators optional.

-+ Tatu +-

Alexandre Morgaut

Oct 3, 2009, 4:38:30 AM
to json-...@googlegroups.com

That's just a method used in YAML.
I think JSON is more powerful as a communication format.
On the other hand, one thing I have seen is that JSON introduces a lot of "noise" around the meaningful words.
The result is not really human friendly; some formats like YAML introduce ways to kill this noise.
I don't actually use it, but I find the approach interesting.
The main problem to me is not in linefeeds, but in a property of indentation:
-> Editors behave differently about indenting with tabs or spaces
So YAML or other similar formats should only be edited with YAML-specific editors or a specific configuration.

Note: my previous example wasn't YAML, but I find the approaches similar.

-----Original Message-----
From: json-...@googlegroups.com on behalf of Tatu Saloranta
Sent: Sat 03/10/2009 09:23
To: json-...@googlegroups.com
Subject: Re: Proposal: An ergonomic micro-language that round-trips to JSONSchema



Ganesh and Sashi Prasad

Oct 3, 2009, 9:58:25 PM
to json-...@googlegroups.com
Indentation works (very well) for Python. It deserves some consideration, I think.

Ganesh

2009/10/3 Tatu Saloranta <tsalo...@gmail.com>

Ganesh and Sashi Prasad

Oct 3, 2009, 10:06:39 PM
to json-...@googlegroups.com
This is very interesting, but should we not use the qualifier "object" for consistency?

object person

       string name
       integer[,125] age ?
       object address[1,] ?

               string "street-address" ?
               string location
               string region ?

I can see this carrying the same semantic information with much less syntactic noise. I may have missed this if it exists, but we would need a way to specify types in addition to instances (e.g., person, address) with this syntax so that the relevant structures can be reused.

Regards,
Ganesh

2009/10/3 Alexandre Morgaut <alexandr...@4d.fr>

Ganesh and Sashi Prasad

Oct 3, 2009, 10:07:36 PM
to json-...@googlegroups.com
Oops, the blank lines in my example were unintentional. Blame Gmail.

Ganesh

2009/10/4 Ganesh and Sashi Prasad <g.c.p...@gmail.com>

Lloyd Hilaiel

Oct 5, 2009, 1:22:53 PM
to JSON Schema
Hi Hatem,

Yes, I would hope orderly would remain a strict subset of JSONSchema
and there would be a tool that could go bi-directionally between
orderly and jsonschema. The goal would be to not further fragment the
effort toward a schema language for JSON, and to keep orderly totally
optional. I'm very pleased you like the idea :)

lloyd


> Hatem Nassrat

Lloyd Hilaiel

Oct 5, 2009, 1:28:50 PM
to JSON Schema


On Oct 2, 4:34 pm, Toby Inkster <m...@tobyinkster.co.uk> wrote:
> On Fri, 2009-10-02 at 09:35 -0700, Lloyd Hilaiel wrote:
> > Here's a little informal proposal I wrote:
> >http://trickyco.de/2009/10/02/orderly-jsonschema
>
> I would suggest placing the maxima and minima for values after the type
> rather than the name. That is, not:
>
>         integer age[,125];
>
> But rather:
>
>         integer[,125] age;
>
> This makes the syntax more consistent with you suggested syntax for
> unions:
>
>         union {
>                 integer[1900,2010];
>                 string;
>         } born;

I like this idea, and I think it will improve elegance of the orderly
grammar.

> And require, or at least allow, the name to be quoted:
>
>         integer[,125] "age";

I like the idea of allowing, but not requiring. This gives orderly a
clean way of escaping type names (just use a subset of JSON's
grammar), but also preserves the quote free shorthand.

>
> Because JSON keys may contain spaces.
>
> Not sure about the square brackets to represent optional bits. How about
> the question mark?

Hmm. The more I stare at the question mark, the more it's growing on
me. Other ideas?

>         object {
>                 string "name";
>                 integer[,125] "age" ?;
>                 array {
>                         object {
>                                 string "street-address" ?;
>                                 string "location";
>                                 string "region" ?;
>                         };
>                 } "address" ?;
>         } "person";
>

Thanks for the feedback, toby!

lloyd

Lloyd Hilaiel

Oct 5, 2009, 1:39:19 PM
to JSON Schema
On Oct 3, 1:23 am, Tatu Saloranta <tsalora...@gmail.com> wrote:
> On Fri, Oct 2, 2009 at 4:14 PM, Alexandre Morgaut
>
> <alexandre.MORG...@4d.fr> wrote:
>
> > you can omit brackets from object (based on indentation)
>
> I would recommend against using indentation or linefeeds for semantic
> information, or making separators optional.
>
> -+ Tatu +-

This is clearly a judgement call on aesthetics. My personal
preference is to ignore whitespaces as much as possible from a
syntactic standpoint. This also suggests that there may be slightly
more "syntactic noise" required, but that's a tradeoff that feels
right to me.

The nice thing about keeping orderly as a decoupled optional layer on
top of JSONSchema is that if any really "bad" decisions are made,
there's a simple sociological fix (fork it and run :P)

lloyd

Toby Inkster

Oct 5, 2009, 5:44:30 PM
to json-...@googlegroups.com
On Mon, 2009-10-05 at 10:28 -0700, Lloyd Hilaiel wrote:
> Hmm. The more I stare at the question mark, the more it's growing on
> me. Other ideas?

The question mark I take from regular expressions, where it makes the
previous subpattern match optional. Metacharacters from regular
expressions include:

(foo|bar) = alternatives
foo? = optional
foo+ = one or more
foo* = zero or more
foo{2,6} = between two and six
foo{2,} = at least two
foo{,6} = at most six

Some of those concepts might be useful in schemas. Some of those
concepts you already cover.
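
For readers who want to check these quantifiers, here is a small sketch using Python's standard `re` module. One assumption to flag: in the list above `foo` stands for a whole subpattern, so the sketch wraps it in parentheses; written literally, `foo?` would make only the final `o` optional.

```python
import re

# Each pattern is tested against whole strings with re.fullmatch.
# "foo" is parenthesised so the quantifier applies to the whole subpattern.
examples = {
    r"(foo|bar)":  ["foo", "bar"],          # alternatives
    r"(foo)?":     ["", "foo"],             # optional
    r"(foo)+":     ["foo", "foofoo"],       # one or more
    r"(foo)*":     ["", "foofoofoo"],       # zero or more
    r"(foo){2,6}": ["foofoo", "foo" * 6],   # between two and six
    r"(foo){2,}":  ["foofoo", "foo" * 9],   # at least two
    r"(foo){,6}":  ["", "foo" * 6],         # at most six
}

for pattern, accepted in examples.items():
    for text in accepted:
        assert re.fullmatch(pattern, text), (pattern, text)

# Counter-examples: too few and too many repetitions are rejected.
assert re.fullmatch(r"(foo){2,6}", "foo") is None
assert re.fullmatch(r"(foo){,6}", "foo" * 7) is None
```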

Hatem Nassrat

Oct 5, 2009, 6:33:44 PM
to json-...@googlegroups.com
On Mon, Oct 5, 2009 at 6:44 PM, Toby Inkster <ma...@tobyinkster.co.uk> wrote:
>
> On Mon, 2009-10-05 at 10:28 -0700, Lloyd Hilaiel wrote:
>> Hmm.  The more I stare at the question mark, the more it's growing on
>> me.  Other ideas?
>
> The question mark I take from regular expressions, where it make the
> previous subpattern match optional. Meta characters from regular
> expressions include:
>
[...]

>   foo{,6}     = at most six

If foo was an int, how would you say that they had acceptable values of 1-10?

int foo[1,10]{,6}

Seems complex.

Also does Orderly cover non inclusive ranges like (1,10] or [1, 10)?

--
Hatem Nassrat

Lloyd Hilaiel

Oct 6, 2009, 1:56:04 PM
to JSON Schema
Hi all,

I've incorporated a bunch of feedback into v0, set up a site, and
pushed all my work so far to github, summarized here: http://bit.ly/bbNaX

Appreciate all the feedback and ideas so far, and much of it has
influenced v0.

best,
lloyd

Ganesh and Sashi Prasad

Oct 6, 2009, 5:47:31 PM
to json-...@googlegroups.com
Hi Lloyd,

I had a look at your site. Thanks, that has captured quite nicely the development of the syntax so far.

You have also shown an example that I reproduce below:

# A schema describing the data returned from the BrowserPlus services
# API at http://browserplus.yahoo.com/api/v3/corelets/osx
array {
  object {
    string name;
    string versionString;
    string os [ "ind", "osx", "win32" ];
    integer size;
    string documentation;
    string CoreletType [ "standalone", "dependent", "provider" ];
    # if CoreletType is "standalone" or "provider", then
    # CoreletAPIVersion must be present
    integer CoreletAPIVersion ?;
    # if CoreletType is "dependent", then CoreletRequires must be present
    object {
      string Name;
      string Version;
      string Minversion;
    } CoreletRequires ?;
  };
};

Although succinct, this syntax still has a drawback in that we need comments to explain the exact conditions under which CoreletAPIVersion and CoreletRequires are optional elements. Can we not formalise this further?

I suggest

integer CoreletAPIVersion ? <= ( ( CoreletType == "standalone" ) || ( CoreletType == "provider" ) );

and

object {
  string Name;
  string Version;
  string Minversion;
} CoreletRequires ? <= ( CoreletType == "dependent" );

The symbol '<=' is just the mirror-image of the "implies" symbol in logic ('=>'). In other words, the presence of the optional element "is implied by" a condition specified by a boolean expression.

Alternatively, we could use a Java Generics/C++ template style expression to capture the optionality and condition more succinctly:

integer CoreletAPIVersion ?<( CoreletType == "standalone" ) || ( CoreletType == "provider" )>

where '?<>' denotes a "conditionally typed" optionality.
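
To make the intended semantics concrete, the two conditions can be written as an ordinary procedural check. This is only a sketch in plain Python, outside any schema language; the function name `check_corelet` and the message texts are made up for the illustration, and nothing here is a feature of Orderly or JSON Schema.

```python
def check_corelet(entry):
    """Return a list of problems for one entry of the example above."""
    problems = []
    kind = entry.get("CoreletType")
    # CoreletAPIVersion is implied by "standalone" or "provider".
    if kind in ("standalone", "provider") and "CoreletAPIVersion" not in entry:
        problems.append("CoreletAPIVersion is required for " + kind)
    # CoreletRequires is implied by "dependent".
    if kind == "dependent" and "CoreletRequires" not in entry:
        problems.append("CoreletRequires is required for dependent")
    return problems

print(check_corelet({"CoreletType": "provider"}))
print(check_corelet({"CoreletType": "dependent", "CoreletRequires": {}}))
```

The first call reports one problem and the second reports none. Either notation proposed here would have to express exactly this "present if the condition holds" rule.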

What do you think?

Regards,
Ganesh

2009/10/7 Lloyd Hilaiel <llo...@gmail.com>

Ganesh and Sashi Prasad

Oct 6, 2009, 6:36:12 PM
to json-...@googlegroups.com
I've blogged about Orderly :-). I think it will give JSON (and JSON Schema) the impetus they need to challenge XML.

Regards,
Ganesh

2009/10/7 Ganesh and Sashi Prasad <g.c.p...@gmail.com>

Lloyd Hilaiel

Oct 7, 2009, 5:13:38 PM
to JSON Schema


On Oct 6, 3:47 pm, Ganesh and Sashi Prasad <g.c.pra...@gmail.com> wrote:
> Although succinct, this syntax still has a drawback in that we need comments
> to explain the exact conditions under which CoreletAPIVersion and
> CoreletRequires are optional elements. Can we not formalise this further?
>
> I suggest
>
> integer CoreletAPIVersion ? <= ( ( CoreletType == "standalone" ) || (
> CoreletType == "provider" ) );
>
> and
>
> object {
>   string Name;
>   string Version;
>   string Minversion;
>
> } CoreletRequires ? <= ( CoreletType == "dependent" );
>
> The symbol '<=' is just the mirror-image of the "implies" symbol in logic
> ('=>'). In other words, the presence of the optional element "is implied by"
> a condition specified by a boolean expression.

Hey Ganesh,

I too wondered if there were a way to capture this idea, but the
question is really can you represent this constraint inside
*JSONSchema*. I cannot yet see a way to do so.

FWIW, here's the canonical home of such examples:
http://github.com/lloyd/orderly/tree/master/examples/

lloyd

Tatu Saloranta

Oct 7, 2009, 5:27:13 PM
to json-...@googlegroups.com
On Wed, Oct 7, 2009 at 2:13 PM, Lloyd Hilaiel <llo...@gmail.com> wrote:
>
...

> I too wondered if there were a way to capture this idea, but the
> question is really can you represent this constraint inside
> *JSONSchema*.  I cannot yet see a way to do so.

I think the idea of keeping this a representation of JSON schema is good.

But one aspect not yet covered on this thread (I think?) is creation
of named types (subtypes), within schema definition itself.
It was mentioned in the comments section; and I think it's an area
where improvements would be very important for data binding use cases
(which is a more common use case than validation for many XML users).

That is: ability to define reusable types; such that 'object' is only
used as the base class for reusable types, or maybe for one-off
"private" types.
So instead of repeating object (or array etc) definition, define type
once, use and reuse where needed.
This not only makes definition more compact (and I would argue
readable), but also allows code generation or binding tools to know
that instances are related types. This is not a problem for loosely
typed languages (like js, ruby, perl), but is for more statically typed
ones (like Java etc); without explicit types, new dummy classes would need
to be created for each 'anonymous' object being encountered.

I know that JSON schema can do this, although I am not sure if that
always requires creation of separate schemas and direct linkage via
URLs. But even if it does, perhaps Orderly definition could generate
multiple separate JSON schema instances, act as sort of uber-schema,
if necessary.

-+ Tatu +-

Pykler

Oct 19, 2009, 12:06:05 AM
to JSON Schema
Hi Lloyd,

I had a quick read over http://orderly-json.org/ and it looks great. I had
one comment: the numerical ranges do not explicitly mention that the
ranges are inclusive of the end points, although it is somewhat clear
from the examples that they are inclusive ranges. I guess this is not
really an issue since they map to the JSONSchema minimum and maximum
attributes.
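
A minimal sketch of what inclusive end points mean in practice, assuming the range is checked the way minimum and maximum are by default. The helper name `in_range` is hypothetical.

```python
def in_range(value, minimum=None, maximum=None):
    """Inclusive bounds check; a bound left as None is unconstrained."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True

# integer[,125] accepts 125 itself; integer[1900,2010] accepts both ends.
assert in_range(125, maximum=125)
assert not in_range(126, maximum=125)
assert in_range(1900, 1900, 2010) and in_range(2010, 1900, 2010)
```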

Thanks,

--
Hatem Nassrat

Atif Aziz

Oct 21, 2009, 1:56:00 PM
to JSON Schema
Hi Lloyd,

Orderly is an excellent proposal making JSON Schema a lot more
palatable and human.

A suggestion I'd like to start with is to prefer property names on the
left, prior to the type and schema information. This gives symbolic
information precedence, which is useful if someone wants to get an
idea of the overall structure at a quick glance. In other words,
instead of this:

object {
    string foo;
    integer bar;
    number baz;
} myobject;

prefer:

object {
    foo string;
    bar integer;
    baz number;
} myobject;

If someone wants to get a quick idea of the properties of an object,
their eyes can just move down on the left edge. The effect is more
pronounced when you consider complicated cases like:

array { integer; } {0,10} myArrayOfSmallInts;

Here, one has to scan all the way across just to read
myArrayOfSmallInts, the symbolic bit of information. On the other
hand, here it's right upfront:

myArrayOfSmallInts array { integer; } {0,10};

- Atif

> Seems like this group needs some moderation. The level of spam
> on goog's groups lately is horrible :(

Any plans to start a group specific to Orderly, one that is moderated
correctly? :)

- Atif

Baptiste Lepilleur

Oct 31, 2009, 3:25:51 PM
to json-...@googlegroups.com
I find the idea of a lighter syntax based on a reduced subset of json-schema excellent. I often find myself wanting to document a simple structure, but I find example/pseudo JSON more readable as documentation than JSON schema.

Here are a few changes I propose, with the goal of making the schema clearer to read and the language easier to learn.

I second the previous proposal of putting the property name first, followed by '?' to indicate optionality.

I would also change the symbols used for range, valid enumeration values and min/max string size:
  • Use [] for min/max string size
  • Use () for enumeration values and range specification
Rationale:
  • It does not make sense to specify both a range of possible values and a closed list of valid enumeration values (I can't think of a use case)
  • It is more readable to denote a value range using a special operator instead of ','
  • Currently a similar syntax expresses both string size and numeric value range. This is somewhat confusing (e.g. you have to look up the type to figure out how to parse {1,42})
  • This way {} is only used for aggregate types, which makes reading easier
I'm not sure what operator should be used to denote a value range. I find that Ada's '..' is fairly clear, but that is something to play with.

I find that introducing ':' to separate the property name from its type, and the keyword 'in' to introduce the valid values (set or range), makes it easier for a human to parse the schema (the type definition comes after ':'), and they can easily be ignored.

Also, I would remove the extra semicolon ';', which serves no purpose (the grammar is not ambiguous).

With those changes, you would have:

secretOfLife? : integer in (7,42)
or
secretOfLife? : integer in (7..42)
secretOfLife: array { integer in (7,42) }

and taking another example:

temps: object {
  beast: string
  normalTemperature: number
} in (
  { "beast": "canine", "normalTemperature": 101.2 },
  { "beast": "human", "normalTemperature": 98.6 }
)

login: string[4..] # login requires at least 4 chars
name: string[..32] # name may not be longer than 32 chars

I'm also pondering if it would be possible to distinguish between two types of regular expression, e.g. "match" and "contains". Having to parse the expression to see if it starts/ends with ^$ is a bother. It would be nice if we could have something like:

mood: string match /(happy)|(sad)|(meh)/
=> would automatically add ^( )$ around the regular expression

mood: string contains /(happy)|(sad)|(meh)/
=> keep the regular expression as is
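
The difference between the two modes can be sketched with Python's standard `re` module. The helper names `matches` and `contains` are hypothetical and simply mirror the two keywords proposed here.

```python
import re

MOOD = r"(happy)|(sad)|(meh)"

def matches(pattern, text):
    # "match": the whole string must match, as if wrapped in ^( )$
    return re.fullmatch(pattern, text) is not None

def contains(pattern, text):
    # "contains": the pattern may match anywhere in the string
    return re.search(pattern, text) is not None

assert matches(MOOD, "happy")
assert not matches(MOOD, "unhappy")
assert contains(MOOD, "unhappy")
assert not contains(MOOD, "grumpy")

# Why the parentheses in ^( )$ matter: without them the anchors bind only
# to the first and last alternatives, so the middle one matches anywhere.
assert re.search(r"^(happy)|(sad)|(meh)$", "sadness") is not None
assert re.search(r"^(" + MOOD + r")$", "sadness") is None
```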

I've tried to apply the above to one of my JSON types:

array {
    union {
        object {
            project: object {
                name: string[1..]
                project_dir: string[1..]
                public_headers: array {
                    include_dir: string[1..]
                    headers: array{ string[1..] }
                }
                compiled_sources: array{ string[1..] }
                private_headers: array{ string[1..] }
            }
        }
        object {
            secondary_project: object {
                name: string[1..]
                public_headers: array {
                    include_dir: string[1..]
                    headers: array{ string[1..] }
                }
            }
        }
    }
}

Clearer than it would be in jsonschema, but being able to declare type 'project', 'secondary_project' and 'public_headers' separately would greatly clarify the schema. Here is an attempt at that:

type public_headers_type: array {
    include_dir: string[1..]
    headers: array{ string[1..] }
}

type project_type: object {
    project: object {
        name: string[1..]
        project_dir: string[1..]
        public_headers: public_headers_type
        compiled_sources: array{ string[1..] }
        private_headers: array{ string[1..] }
    }
}

type secondary_project_type: object {
    secondary_project: object {
        name: string[1..]
        project_dir: string[1..]
        public_headers: public_headers_type
    }
}

array {
    union {
        project_type
        secondary_project_type
    }
}
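
One way a tool could handle such named types is to expand every reference in place before emitting a single schema. The sketch below is hypothetical throughout: the in-memory layout, the `{"ref": name}` marker, the `resolve` function, and the simplified `nonempty` and `public_headers_type` definitions are assumptions for the example, not part of Orderly or of the types declared above.

```python
def resolve(node, types, seen=()):
    """Recursively replace {"ref": name} markers with the named definition."""
    if isinstance(node, dict):
        if set(node) == {"ref"} and isinstance(node["ref"], str):
            name = node["ref"]
            if name in seen:
                # Inline expansion cannot represent a self-referencing type.
                raise ValueError("recursive type: " + name)
            return resolve(types[name], types, seen + (name,))
        return {key: resolve(value, types, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, types, seen) for item in node]
    return node

# Simplified, made-up type table in the spirit of the declarations above.
types = {
    "nonempty": {"type": "string", "minLength": 1},
    "public_headers_type": {
        "type": "object",
        "properties": {
            "include_dir": {"ref": "nonempty"},
            "headers": {"type": "array", "items": {"ref": "nonempty"}},
        },
    },
}

root = {"type": "array", "items": {"ref": "public_headers_type"}}
expanded = resolve(root, types)
print(expanded["items"]["properties"]["include_dir"])
```

Expanding in place loses the type names, which is the information a code generator or data binding tool would want to keep. A real tool would more likely preserve the references.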

Looking back on the above example, the advantage of putting the property name followed by ':' before the type description is that you have something that reads mostly like the corresponding JSON object, except that each property value is replaced by its type description.

Just my two cents (way longer than I expected :) ),
Baptiste.