JSON support for the C++ standard library


Niels Lohmann

unread,
Mar 1, 2017, 4:25:33 PM3/1/17
to std-pr...@isocpp.org
Hi there,

I am wondering whether JSON [RFC7159] support would be a helpful extension to the C++ standard library (pure library extension), including, but not limited to, the following aspects:

1. A variant-like container type (for this mail, let's call it "std::json") that combines C++ types for the JSON value types [RFC7159, chapter 3]:
- string (e.g. std::string),
- number (e.g., double or int64_t),
- boolean (e.g., bool),
- array (e.g., std::vector), and
- object (e.g., std::map).

This type should have an intuitive API (i.e., all expected container methods), but also use as much syntactic sugar as possible (e.g., using initializer lists to express arrays like "std::json my_array = {"a string", 17, 42.12};").

2. A serialization function to create a textual representation (called "JSON text" in [RFC7159]) from a std::json value that conforms to the JSON grammar [RFC7159, chapter 2-7].

3. A deserialization function (i.e., a parser) [RFC7159, chapter 9] to create a std::json value from a JSON text.

There are currently dozens of libraries [json.org] written in C or C++ solving these aspects. However, it would be of great convenience to have JSON be part of the C++ standard library. In particular, the wide use of JSON as an exchange format for structured data, as well as to express simple configuration data, means it could cover a lot of use cases within the C++ standard library.

I would be willing to draft a proposal based on the experience I gained with my C++ JSON library [nlohmann/json]. Of course, I would be interested in your thoughts on this.

All the best,
Niels

References

[RFC7159] https://tools.ietf.org/html/rfc7159.html
[json.org] http://json.org
[nlohmann/json] https://github.com/nlohmann/json

Nicol Bolas

unread,
Mar 1, 2017, 4:41:59 PM3/1/17
to ISO C++ Standard - Future Proposals

If you're going to do this, then you need to take into account the following:

1) Allocator support. Whatever containers you use, the user must be able to decide how allocation works.
2) Visitation support. Or just use `std::variant`.
3) For strings, you need to make a determination if the JSON object should own the string or not. For JSON objects parsed from in-memory strings, users might want to have it store `string_view`s that reference the string directly rather than `string`.

Also, I wouldn't call it "serialization".

Zhihao Yuan

unread,
Mar 1, 2017, 5:13:22 PM3/1/17
to std-pr...@isocpp.org
On Wed, Mar 1, 2017 at 3:25 PM, Niels Lohmann <ma...@nlohmann.me> wrote:
>
> I would be willing to draft a proposal based on the experience I gained with my C++ JSON library [nlohmann/json]. Of course, I would be interested in your thoughts on this.

You must have been aware of rapidjson for quite
a long time, so it won't be too surprising to you
if I say "no SAX, no consideration", I guess :/

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://blog.miator.net/

Daniel Frey

unread,
Mar 1, 2017, 5:32:20 PM3/1/17
to std-pr...@isocpp.org, Niels Lohmann
Hi Niels, all,

> On 1 Mar 2017, at 22:25, Niels Lohmann <ma...@nlohmann.me> wrote:
>
> I am wondering whether JSON [RFC7159] support would be a helpful extension to the C++ standard library (pure library extension), including, but not limited to, the following aspects:

I believe it would certainly be helpful to a large number of users.

Disclaimer: I'm an author of a different JSON library, <https://github.com/taocpp/json>. Documentation is still a weak point, but the library is quite usable already. More information about the library, including the SAX approach mentioned below, can be found in the slides of a talk I gave: <http://www.wilkening-online.de/programmieren/c++-user-treffen-aachen/2016_11_10/taocpp-json_DanielFrey.pdf>

> 1. A variant-like container type (for this mail, let's call it "std::json") that combines C++ types for the JSON value types [RFC7159, chapter 3]:

Or std::json::value to have std::json::* available for a lot of other types/functions/... that fall into the JSON realm.

> This type should have an intuitive API (i.e., all expected container methods), but also use as much syntactic sugar as possible (e.g., using initializer lists to express arrays like "std::json my_array = {"a string", 17, 42.12};").

I think that having "all the expected container methods" is going too far; it seems "forced" to me and not a natural fit. A JSON value *can* be a container, but it is not generally a container. And all JSON values form a tree, not a flat container.

Concerning the initializer list support, which I really like in practice:

1) No magic, please. In your library you scan the initializer list to check if it *could* be an object, otherwise you turn it into an array. I'd avoid that and use a clear and unambiguous syntax. In our library we use initializer lists only to create JSON objects and json::value::array(std::initializer_list<...>) to create JSON arrays.

2) Initializer lists usually require *copies* of all elements. Hopefully there is some independent proposal to allow *moving* the values of an initializer list. This is especially important for the JSON use case, as JSON values can be (deeply) nested.

3) Both of our libraries already have that feature, but I would think it is also important for standardization that specialization for UDTs is possible (i.e., how they are converted from and to JSON).

> 2. A serialization function to create a textual representation (called "JSON text" in [RFC7159]) from a std::json value that conforms to the JSON grammar [RFC7159, chapter 2-7].
>
> 3. A deserialization function (i.e., a parser) [RFC7159, chapter 9] to create a std::json value from a JSON text.
>
> There are currently dozens of libraries [json.org] written in C or C++ solving these aspects. However, it would be of great convenience to have JSON be part of the C++ standard library. In particular, the wide use of JSON as an exchange format for structured data, as well as to express simple configuration data, means it could cover a lot of use cases within the C++ standard library.
>
> I would be willing to draft a proposal based on the experience I gained with my C++ JSON library [nlohmann/json]. Of course, I would be interested in your thoughts on this.

I think that the SAX-like approach we use in our library would be a very good fit for a standardization proposal, as it allows the different parts to be decoupled. Even if only a small subset of the possible event producers and consumers is standardized, it lets the user add and mix those parts with other things, like a non-JSON-conformant parser that accepts "relaxed JSON", or binary formats like CBOR, etc.

It would even allow for different value classes, depending on the use case. Some people might need the fastest value class possible and hence might want something with special allocators to minimize the number of allocations (basically what RapidJSON does to achieve speed); some would need value classes which always sort the keys or allow duplicate keys (which I personally don't like at all, but if it should be standardized, I am not the one to make the call). A standardized value class would cover "most" users, while all the other parts (parsers, to_string, ...) could still be re-used when the user decides to use another value class.

Thinking about the scope of what this proposal has to cover, I think this is a *lot* of work. If you'd like help/cooperation on writing a proposal, I'm certainly interested.

Best regards,
Daniel

Tony V E

unread,
Mar 1, 2017, 5:46:25 PM3/1/17
to Daniel Frey, Niels Lohmann
> 1) No magic, please. In your library you scan the initializer list to check if it *could* be an object, otherwise you turn it into an array.

Objects need names for each member, right? So the magic would check for string, value, string, value,... pattern? Yeah, that's a bit scary. 

I once did a DSL that would allow something like

Json json = { "x"k = 17, "y"k = { 1,2,3 }, "z"k = "hello"} ;


You can make it quite json-like (replace : with =, etc).



Sent from my BlackBerry portable Babbage Device

Daniel Frey

unread,
Mar 1, 2017, 5:59:34 PM3/1/17
to std-pr...@isocpp.org, Niels Lohmann
> On 1 Mar 2017, at 23:46, Tony V E <tvan...@gmail.com> wrote:
>
> > 1) No magic, please. In your library you scan the initializer list to check if it *could* be an object, otherwise you turn it into an array.
>
> Objects need names for each member, right? So the magic would check for string, value, string, value,... pattern? Yeah, that's a bit scary.

Our solution is nesting. Think: std::initializer_list<std::pair<std::string, std::json::value>>. Example:

std::json::value j =
{
  { "foo", 42 },
  { "bar", "Hello, world!" },
  { "baz",
    {
      { "nested", "object" },
      { "key", true }
    }
  }
};

This worked quite well for us in practice.

> I once did a DSL that would allow something like
>
> Json json = { "x"k = 17, "y"k = { 1,2,3 }, "z"k = "hello"} ;
>
>
> You can make it quite json-like (replace : with =, etc).

Your method sounds more complicated, and I'd be surprised if it is as efficient as the way we have chosen. But it is interesting, as it is closer to the original JSON syntax, and it potentially allows mixing objects and arrays by providing overloads for different initializer lists? Not sure if the latter would actually work.

darrell...@gmail.com

unread,
Mar 1, 2017, 6:39:47 PM3/1/17
to ISO C++ Standard - Future Proposals, ma...@nlohmann.me, d.f...@gmx.de
I think I would be in favor of keeping it in a 3rd-party library. There are so many trade-offs for different problem types and requirements.

Do you require an input iterator, a forward iterator, or random access? Each has a cost/benefit. With input iterators you can pull straight from a network stream, but you need to store the current string if the value is a string or a number (the majority of the time). Forward iterators let you avoid buffering the string internally or owning more than a few iterators, but preclude many data sources that are common for JSON. Random access similarly allows about the same as forward, plus using the JSON file as storage for many of the types or decoding values (like numbers) on demand, but is the most restrictive on file types.

Is decoding to a variant/map-like structure the most cost-effective? It does allow for nice queries, as some libraries have shown. But it will definitely consume more memory than parsing directly into a previously known C++ structure, with a little bit of boilerplate to join the JSON member names/types to the data structure.

Will there be more than one type? Like a parse-to-variant/map and a SAX-like version? But what restrictions on source types? Or, with the variety of needs/methods, would it be best not to put the std stamp on one of the nine valid ways I listed?

Brittany Friedman

unread,
Mar 1, 2017, 7:48:17 PM3/1/17
to std-pr...@isocpp.org
On Wed, Mar 1, 2017 at 5:39 PM, <darrell...@gmail.com> wrote:
I think I would be in favor of keeping it in a 3rd-party library. There are so many trade-offs for different problem types and requirements.

Do you require an input iterator, a forward iterator, or random access? Each has a cost/benefit. With input iterators you can pull straight from a network stream, but you need to store the current string if the value is a string or a number (the majority of the time). Forward iterators let you avoid buffering the string internally or owning more than a few iterators, but preclude many data sources that are common for JSON. Random access similarly allows about the same as forward, plus using the JSON file as storage for many of the types or decoding values (like numbers) on demand, but is the most restrictive on file types.

Is decoding to a variant/map-like structure the most cost-effective? It does allow for nice queries, as some libraries have shown. But it will definitely consume more memory than parsing directly into a previously known C++ structure, with a little bit of boilerplate to join the JSON member names/types to the data structure.

Will there be more than one type? Like a parse-to-variant/map and a SAX-like version? But what restrictions on source types? Or, with the variety of needs/methods, would it be best not to put the std stamp on one of the nine valid ways I listed?

As someone who uses their own vector, string, span, and other classes I do really sympathize with the desire to have a fully configurable library solution with infinite flexibility. But--

I don't think that we have to have a perfect solution in the standard library in order to have a useful version that we can point to and say, hey, here's the easy built-in solution for json parsing. If you don't like the std version you can still use an external library. Nothing we propose is ever perfect. What it needs to be is useful for a large enough audience.

Robert Ramey

unread,
Mar 1, 2017, 7:53:23 PM3/1/17
to std-pr...@isocpp.org
On 3/1/17 4:48 PM, Brittany Friedman wrote:

> As someone who uses their own vector, string, span, and other classes I
> do really sympathize with the desire to have a fully configurable
> library solution with infinite flexibility. But--
>
> I don't think that we have to have a perfect solution in the standard
> library in order to have a useful version that we can point to and say,
> hey, here's the easy built-in solution for json parsing. If you don't
> like the std version you can still use an external library. Nothing we
> propose is ever perfect. What it needs to be is useful for a large
> enough audience.

I really think that the standard contains too many "big" libraries. This
would be a perfect example. I know it's not big now, but by the time it
got massaged by the/a committee to try to placate differing trade-offs,
use cases, etc., it would be. I think the committee would spend its time
more wisely by focusing more on "core" type libraries.

Robert Ramey

darrell...@gmail.com

unread,
Mar 1, 2017, 8:31:01 PM3/1/17
to ISO C++ Standard - Future Proposals
I am too naive about the ecosystem to know what the balance would be.  I suspect a callback-based (SAX-like) parser would use the least resources (CPU/memory), and requiring only input iterators would be the most flexible in data sources, but at the cost of a buffer to hold the current value if it is a string or a number.  That would be a cost in a non-callback parser too, but not with a forward or random access iterator, as one could store pointers (but then the source has to remain open).  The penalty could be that you might run out of memory on some strings.

Daniel Frey

unread,
Mar 2, 2017, 12:44:54 AM3/2/17
to std-pr...@isocpp.org
> On Wed, Mar 1, 2017 at 5:39 PM, <darrell...@gmail.com> wrote:
> I think I would be in favor of keeping it in a 3rd-party library. There are so many trade-offs for different problem types and requirements.
>
> Do you require an input iterator, a forward iterator, or random access? Each has a cost/benefit. With input iterators you can pull straight from a network stream, but you need to store the current string if the value is a string or a number (the majority of the time). Forward iterators let you avoid buffering the string internally or owning more than a few iterators, but preclude many data sources that are common for JSON. Random access similarly allows about the same as forward, plus using the JSON file as storage for many of the types or decoding values (like numbers) on demand, but is the most restrictive on file types.
>
> Is decoding to a variant/map-like structure the most cost-effective? It does allow for nice queries, as some libraries have shown. But it will definitely consume more memory than parsing directly into a previously known C++ structure, with a little bit of boilerplate to join the JSON member names/types to the data structure.
>
> Will there be more than one type? Like a parse-to-variant/map and a SAX-like version? But what restrictions on source types? Or, with the variety of needs/methods, would it be best not to put the std stamp on one of the nine valid ways I listed?

With a SAX-like approach, you could have independent building blocks. Write a parser that takes input iterators as a SAX producer. Write another parser that takes random access iterators, and another that reads a file with mmap. Or write a SAX producer for your own binary format. Or for an nlohmann::json instance.

And you don't have to put a "std" stamp on all of these, but the really generic ones that benefit the majority of users. But with the right SAX interface, you are still open for extensions. From the slides I linked, this is our SAX interface:

struct ...
{
    void null();

    void boolean( const bool );

    void number( const std::int64_t );
    void number( const std::uint64_t );
    void number( const double );

    void string( std::string && );
    void string( const std::string & );

    // array
    void begin_array();
    void element();
    void end_array();

    // object
    void begin_object();
    void key( std::string && );
    void key( const std::string & );
    void member();
    void end_object();
};

In the slides, I also show an example of a SAX producer (e.g., a parser) and a SAX consumer (e.g., stringify, pretty_print). Obviously, the above still has a lot of potential for discussion, mainly because of the question of which overloads should exist for numeric types, and even for string/key (think std::string_view, ...).

<http://www.wilkening-online.de/programmieren/c++-user-treffen-aachen/2016_11_10/taocpp-json_DanielFrey.pdf>

> On 2 Mar 2017, at 01:48, Brittany Friedman <fourt...@gmail.com> wrote:
>
> As someone who uses their own vector, string, span, and other classes I do really sympathize with the desire to have a fully configurable library solution with infinite flexibility. But--

> I don't think that we have to have a perfect solution in the standard library in order to have a useful version that we can point to and say, hey, here's the easy built-in solution for json parsing. If you don't like the std version you can still use an external library. Nothing we propose is ever perfect. What it needs to be is useful for a large enough audience.

Also here, the SAX interface could benefit you greatly. Have your own value type to store the JSON values, but re-use a JSON Schema validator, parsers, stringify/prettify, support for binary formats, etc. For each value type to be usable with the rest of the ecosystem, you only need two SAX adapters: one SAX producer, one SAX consumer. In our library, those are

SAX producer: <https://github.com/taocpp/json/blob/master/include/tao/json/sax/from_value.hh> (137 lines for two versions, b/c move semantics)
SAX consumer: <https://github.com/taocpp/json/blob/master/include/tao/json/sax/to_value.hh> (118 lines)

For Niels' library, those adapters look like:

SAX producer: <https://github.com/taocpp/json/blob/master/contrib/nlohmann/from_value.hh> (70 lines)
SAX consumer: <https://github.com/taocpp/json/blob/master/contrib/nlohmann/to_value.hh> (117 lines)

The stringify/prettify SAX consumers are also a good example of how they can be both simple and powerful, while also being quite efficient.

Stringify: <https://github.com/taocpp/json/blob/master/include/tao/json/sax/to_stream.hh> (137 lines)
Prettify: <https://github.com/taocpp/json/blob/master/include/tao/json/sax/to_pretty_stream.hh> (152 lines)

IMHO the above shows the power of the SAX-like approach. You could still have 3rd-party libraries which implement better/different parsers, other libraries implement binary formats, and yet other libraries might concentrate on providing a better/different value class (e.g. RapidJSON-style for those that need pure speed above all else). Each of those libraries could concentrate on exactly *one* problem/solution, and you can easily combine all of them through a standardized SAX interface. RapidJSON, as an example, also uses a SAX approach, but has a slightly different interface than we have. Even if we could *only* standardize the SAX interface for JSON, that would be an extremely helpful step for everyone.

Nicol Bolas

unread,
Mar 2, 2017, 12:52:47 AM3/2/17
to ISO C++ Standard - Future Proposals, d.f...@gmx.de
Personally, I've always disliked the SAX approach, where reads are forced on you. I much prefer the reader approach, where you read at whatever pace you like. It's rather more like an iterator in that respect.

Such an approach makes it very easy on the code doing the reading, since it can do things like pass the reader to other code to interpret those parameters. Doing this with a SAX interface requires basically writing a state machine into a type. And why would you want to encourage that kind of coding?

You can get the same effects of a SAX interface with a Reader-like approach. Rather than standardizing the struct that interprets the JSON, you standardize the Reader as a Concept. A Reader has functions to read the next datum, tell you what that datum is, etc. And you can standardize a similar Writer Concept, which has interfaces for writing the data.

So you get all of the benefits you outline, but without the huge detriment of using SAX to process stuff.

Daniel Frey

unread,
Mar 2, 2017, 1:07:29 AM3/2/17
to ISO C++ Standard - Future Proposals, Nicol Bolas
I wonder if the SAX approach, when combined with coroutines, could provide the kind of decoupling you are looking for. We (the authors of taocpp/json and the PEGTL, which is the underlying parser library) are already waiting for coroutines to be able to play with pull-like interfaces instead of push-like interfaces. I do wonder about the efficiency, though. Anyway, for now we are using the SAX-like interface, and I'd like to see some real code of an alternative, as your statements above are too theoretical for me. Do you have a link to a library which uses such an approach?

Thiago Macieira

unread,
Mar 2, 2017, 2:11:41 AM3/2/17
to std-pr...@isocpp.org
On Wednesday, March 1, 2017, at 13:41:59 PST, Nicol Bolas wrote:
> 3) For strings, you need to make a determination if the JSON object should
> own the string or not. For JSON objects parsed from in-memory strings,
> users might want to have it store `string_view`s that reference the string
> directly rather than `string`.

Zero-copy is not possible with JSON strings, since they can be escaped and
the parser needs to unescape them. There are certain characters that must
always be escaped in JSON text.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Jeffrey Yasskin

unread,
Mar 2, 2017, 2:27:43 AM3/2/17
to std-pr...@isocpp.org
On Wed, Mar 1, 2017 at 11:25 AM, Niels Lohmann <ma...@nlohmann.me> wrote:
Hi there,

I am wondering whether JSON [RFC7159] support would be a helpful extension to the C++ standard library

Yes please. I'll leave it to y'all to figure out which existing library or libraries should go into the exact proposal. Please do try to invent as little as possible when writing the proposal: stick to behavior that's been vetted by existing users or behavior that's been requested in bug reports for existing libraries.

Thanks,
Jeffrey (LEWG chair)

Magnus Fromreide

unread,
Mar 2, 2017, 2:51:07 AM3/2/17
to std-pr...@isocpp.org
Are there any limits on JSON strings?

If there ain't, wouldn't a method like

void string(json_string_iterator& begin, const json_string_iterator& end);

allow the processing of strings larger than the available memory (which might
be quite limited)?

I think the same question is valid for numbers as well.

/MF

Thiago Macieira

unread,
Mar 2, 2017, 2:55:51 AM3/2/17
to std-pr...@isocpp.org
On Wednesday, March 1, 2017, at 21:52:46 PST, Nicol Bolas wrote:
> Personally, I've always disliked the SAX approach, where reads are forced
> on you. I much prefer the reader approach, where you read at whatever pace
> you like.

That's called the StAX approach. https://en.wikipedia.org/wiki/StAX

ol...@join.cc

unread,
Mar 2, 2017, 4:16:01 AM3/2/17
to ISO C++ Standard - Future Proposals
On Wednesday, March 1, 2017 at 22:25:33 UTC+1, Niels Lohmann wrote:
There are currently dozens of libraries [json.org] written in C or C++ solving these aspects. However, it would be of great convenience to have JSON be part of the C++ standard library. In particular, the wide use of JSON as an exchange format for structured data, as well as to express simple configuration data, means it could cover a lot of use cases within the C++ standard library.

What's the advantage of a standard library over a third-party library (in this case), apart from not being a dependency?

Daniel Frey

unread,
Mar 2, 2017, 10:48:35 AM3/2/17
to std-pr...@isocpp.org
> On 2 Mar 2017, at 08:11, Thiago Macieira <thi...@macieira.org> wrote:
>
> On Wednesday, March 1, 2017, at 13:41:59 PST, Nicol Bolas wrote:
>> 3) For strings, you need to make a determination if the JSON object should
>> own the string or not. For JSON objects parsed from in-memory strings,
>> users might want to have it store `string_view`s that reference the string
>> directly rather than `string`.
>
> Zero-copy is not possible with JSON strings, since they can be escaped and
> the parser needs to unescape them. There are certain characters that must
> always be escaped in JSON text.

That is not necessarily true. If the source is a binary protocol or you create a JSON value programmatically, the string may very well exist in raw form already. Only the JSON string representation requires escaping/unescaping, but not everyone will need that.

Daniel Frey

unread,
Mar 2, 2017, 10:51:10 AM3/2/17
to std-pr...@isocpp.org
As I wrote in more detail in another mail already, a standard SAX-like interface would allow different libraries or parts to be mixed by the user as needed. With existing libraries it is often the choice between either/or.

Robert Ramey

unread,
Mar 2, 2017, 11:28:23 AM3/2/17
to std-pr...@isocpp.org
On 3/2/17 7:51 AM, Daniel Frey wrote:
>> On 2 Mar 2017, at 10:16, ol...@join.cc wrote:
>>
>> On Wednesday, March 1, 2017 at 22:25:33 UTC+1, Niels Lohmann wrote:
>> There are currently dozens of libraries [json.org] written in C or C++ solving these aspects. However, it would be of great convenience to have JSON be part of the C++ standard library. In particular, the wide use of JSON as an exchange format for structured data, as well as to express simple configuration data, means it could cover a lot of use cases within the C++ standard library.
>>
>> What's the advantage of a standard-library over a third-party library (in this case), apart from not being a dependency?
>
> As I wrote in more detail in another mail already, a standard SAX-like
> interface would allow different libraries or parts to be mixed by the
> user as needed. With existing libraries it is often the choice between
> either/or.
>

That's not the question. The questions are -

a) what's the advantage of having something like this in the standard vs
the current situation where one imports code from some other place?

b) adding it to the standard would require adding features, reconciling
differing points of view, a ton of work to code the spec, and time taken
from other (more important/urgent) tasks. How would this benefit the
C++ community?

c) Why do we all have to agree on what a JSON library (or any non-core
library) should look like? What's the point of that?

d) what value are we adding here - as opposed to just the current situation?

> As I wrote in more detail in another mail already, a standard SAX-like
> interface would allow different libraries or parts to be mixed by the
> user as needed. With existing libraries it is often the choice between
> either/or.

So you want to design/create a better version than the "dozens" of JSON
libraries "out there". Why not just do it? Why do you need to involve
the standards committee in this?

Robert Ramey

Daniel Frey

unread,
Mar 2, 2017, 11:38:32 AM3/2/17
to std-pr...@isocpp.org
> On 2 Mar 2017, at 17:24, Robert Ramey <ra...@rrsd.com> wrote:
>
> So you want to design/create a better version than the "dozens" of JSON libraries "out there". Why not just do it? Why do you need to involve the standards committee in this?

a) I'm not the one who started the thread here, I just state my opinion and offer to help *if* some people want to work on standardizing something. For the record, Niels started the thread after being encouraged by STL on Reddit (and I think STL is not the only one who thinks that more large libraries should be standardized).

b) I already work on such a library. I will continue to do so and if nothing gets standardized, that is also fine by me.

Nicol Bolas

unread,
Mar 2, 2017, 11:46:15 AM3/2/17
to ISO C++ Standard - Future Proposals, jmck...@gmail.com, d.f...@gmx.de

I haven't investigated JSON parsers, but Reader/StAX-style APIs are significant in the XML parsing world. C# lives by them; XMLReader and XMLWriter are the foundation of their XML processing systems. Even LibXML2 has a reader interface.

Robert Ramey

unread,
Mar 2, 2017, 11:55:56 AM3/2/17
to std-pr...@isocpp.org
On 3/2/17 8:38 AM, Daniel Frey wrote:
>> On 2 Mar 2017, at 17:24, Robert Ramey <ra...@rrsd.com> wrote:
>>
>> So you want to design/create a better version than the "dozens" of JSON libraries "out there". Why not just do it? Why do you need to involve the standards committee in this?
>
> a) I'm not the one who started the thread here, I just state my opinion
> and offer to help *if* some people want to work on standardizing
> something. For the record, Niels started the thread after being
> encouraged by STL on Reddit (and I think STL is not the only one who
> thinks that more large libraries should be standardized).

Hmmm - I'll have to talk to him about that!

>
> b) I already work on such a library. I will continue to do so and if nothing gets standardized, that is also fine by me.
>

I certainly don't want to discourage anyone from working on a library.
I just question that the goal of getting such a library in the standard
is the best one. The standards process is a long distraction for
something like this. It adds years to the effort. I think it's more
productive for authors of such libraries to have a goal of getting their
library used by the largest number of users. If the standards committee
wants to adopt some particular library as "standardization of existing
practice", fine. But to aim at standardization as a means of making the
library widely used, I think, puts the cart before the horse.

Of course I'm not really talking about the JSON library here - it could
have been any one of a number of them. It just turns out that this is a
perfect example of where I think the standards process goes wrong.

Robert Ramey

Nicol Bolas

unread,
Mar 2, 2017, 12:02:25 PM3/2/17
to ISO C++ Standard - Future Proposals
On Thursday, March 2, 2017 at 11:28:23 AM UTC-5, Robert Ramey wrote:
On 3/2/17 7:51 AM, Daniel Frey wrote:
>> On 2 Mar 2017, at 10:16, olaf wrote:
>>
>> On Wednesday, March 1, 2017 at 22:25:33 UTC+1, Niels Lohmann wrote:
>> There are currently dozens of libraries [json.org] written in C or C++ solving these aspects. However, it would be of great convenience to have JSON be part of the C++ standard library. In particular, the wide use of JSON as an exchange format for structured data, as well as to express simple configuration data, means it could cover a lot of use cases within the C++ standard library.
>>
>> What's the advantage of a standard-library over a third-party library (in this case), apart from not being a dependency?
>
> As I wrote in more detail in another mail already, a standard SAX-like interface would allow different libraries or parts to be mixed by the user as
needed. With existing libraries it is often the choice between either/or.
>

That's not the question. The questions are -

a) what's the advantage of having something like this in the standard vs
the current situation where one imports code from some other place?

What's the advantage of having anything in the standard library outside of "Table 19: C++ headers for freestanding implementations"? After all, the other stuff could just be a library you download. So why put it in the standard?

Because people need this stuff.

People need networking. People need containers. People need filesystem access. People need to spawn threads and deal with inter-thread communication. People need to match strings against regular expressions. And so on.

Having a standard way to do this stuff facilitates interoperability and allows people to use the language without having to rely on so many external libraries.

It is not unreasonable for the standard library to provide a tool that programmers commonly use to get things done.

b) adding it to the standard would require adding features, reconciling
differing points of view, a ton of work to code the spec, and time taken
from other (more important/urgent) tasks.  How would this benefit the
C++ community?

See above.
 
c) Why do we all have to agree on what a json library (or any non core)
library should look like?  What's the point of that?

What is "non core"? How do you define what is and is not "core"?

d) what value are we adding here - as opposed just the current situation?

That users can parse JSON without having to download a library. That users can have and rely on this functionality without having to fish around among dozens of different solutions of various quality, functionality, and behavior.

Daniel Frey

Mar 2, 2017, 12:41:10 PM
to std-pr...@isocpp.org, jmck...@gmail.com
For some reason my mail client (MacOS Mail) screws up the quoting levels - sorry.
After reading the examples you linked, I still feel that JSON is different from XML. And the SAX-like approach I am currently using is more light-weight and can be used to feed a coroutine-based reader approach. I would like to explore this option with real code in the future, but for now I'll just keep it in the back of my head unless you could come up with something concrete :)

Also, you might have misunderstood the SAX approach wrt struct vs. concept. The struct I posted earlier implements the methods that the SAX interface (concept) requires. Depending on whether the "SAX consumer" (an actual class) benefits from, for example, the move overloads, it can drop those. Several of my consumers don't have them, which also makes them simpler. I linked examples of real-world SAX producers and consumers earlier; I hope they can clear up any possible confusion in this area.

Daniel

Robert Ramey

Mar 2, 2017, 12:48:49 PM
to std-pr...@isocpp.org
On 3/2/17 9:02 AM, Nicol Bolas wrote:
> On Thursday, March 2, 2017 at 11:28:23 AM UTC-5, Robert Ramey wrote:
>
> On 3/2/17 7:51 AM, Daniel Frey wrote:
> >> On 2 Mar 2017, at 10:16, olaf wrote:
> >>
> >> Op woensdag 1 maart 2017 22:25:33 UTC+1 schreef Niels Lohmann:
> >> There are currently dozens of libraries [json.org
> <http://json.org>] written in C or C++ solving these aspects.
> However, it would be of great convenience to have JSON be part of
> the C++ standard library. In particular, the wide use of JSON as
> exchange format for structured data as well as to express simple
> configuration data would could solve a lot of use cases within the
> C++ standard library.
> >>
> >> What's the advantage of a standard-library over a third-party
> library (in this case), apart from not being a dependency?
> >
> > As I wrote in more detail in another mail already, a standard
> SAX-like interfacewould allow different libraries or parts to be
> mixed by the user as
> needed. With existing libraries it is often the choice between
> either/or.
> >
>
> That's not the question. The questions are -
>
> a) what's the advantage of having something like this in the
> standard vs
> the current situation where one imports code from some other place?
>
>
> What's the advantage of having /anything/ in the standard library
> outside of "Table 19: C++ headers for freestanding implementations"?
> After all, the other stuff could just be a library you download. So why
> put it in the standard?
>
> Because people need this stuff.

Right. People need lots of stuff. It doesn't have to be in the standard.
>
> People need networking. People need containers.

> People need filesystem access.
> People need to spawn threads and deal with inter-thread
> communication.

Here I see the value and importance. The standard specifies a common
interface over varying implementations. It's necessary to have a
standard here for application portability. Note that I'm not against
the standards effort itself. I'm advocating for narrowing its focus to
those important issues where the standards effort can really make a
difference.

> People need to match strings against regular expressions.
> And so on.

>
> Having a standard way to do this stuff facilitates interoperability and
> allows people to use the language without having to rely on so many
> external libraries.

Even if one imports <vector> he is relying on an "external" library.
There's no real guarantee that it's better than some other version, or
that it doesn't have a bug. It does have the guarantee that it meets the
standard interface, so that's something.

Of course this raises the issue that a compiler isn't considered
"conforming" if it doesn't include an implementation of the standard
library (leaving aside that it's doubtful that any compiler actually
implements the standard correctly). To me we're coupling things where
the coupling doesn't add anything and has detrimental effects.

>
> It is not unreasonable for the standard library to provide a tool that
> programmers commonly use to get things done.
>
> b) adding it to the standard would require adding features, reconciling
> differing points of view, a ton of work to code the spec, and time
> taken
> from other (more important/urgent) tasks. How would this benefit the
> C++ community?
>
>
> See above.
>
>
> c) Why do we all have to agree on what a json library (or any non core)
> library should look like? What's the point of that?
>
>
> What is "non core"? How do you define what is and is not "core"?

Non-core would be something like JSON, for which adding it to the standard
doesn't add value.

>
> d) what value are we adding here - as opposed just the current
> situation?
>
>
> That users can parse JSON without having to download a library. That
> users can have and rely on this functionality without having to fish
> around among dozens of different solutions of various quality,
> functionality, and behavior.

Right. So now you download the library as part of your compiler.

The issues of quality, documentation, tests etc. still remain. Some
standard libraries are better than others. It's harder to mix/match
components between standard libraries when they're distributed as
something monolithic.

There are a number of issues with third party libraries. But adding
every thing to the standard library doesn't really address them - rather
it hides them.

Robert Ramey


Michał Dominiak

Mar 2, 2017, 1:28:25 PM
to std-pr...@isocpp.org
One particular advantage of having a library in the standard library that seems to be frequently overlooked is that having a thing in the standard guarantees at least a minimal amount of maintenance for both its interface and its implementations. A third party library's author can one day disappear and you might end up having to either find someone else to maintain that particular library you're using, or maintaining it yourself, or migrating to a different one. This is not an issue with the standard library (well, unless you heavily depend on <codecvt>, but that is not a thing, right?), since the interface will be maintained and updated by the committee itself for as long as the committee functions, and multiple implementations will exist in the wild for you to choose from.

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

Robert Ramey

Mar 2, 2017, 1:41:53 PM
to std-pr...@isocpp.org
On 3/2/17 10:20 AM, Michał Dominiak wrote:
> One particular advantage of having a library in the standard library
> that seems to be frequently overlooked is that having a thing in the
> standard guarantees at least a minimal amount of maintenance for both
> its interface /and/ its implementations. A third party library's author
> can one day disappear and you might end up having to either find someone
> else to maintain that particular library you're using, or maintaining it
> yourself, or migrating to a different one. This is not an issue with the
> standard library (well, unless you heavily depend on <codecvt>, but that
> is not a thing, right?), since the interface will be maintained and
> updated for as long as the committee functions by the committee itself,
> and multiple implementations will exist in the wild for you to choose from.

Damn - I do depend on codecvt (more accurately codecvt facets) and it's
been a PITA for the last 15 years.

I appreciate the point. And generally maintenance by compiler providers
has been pretty good. On the other hand, I'm really wondering how they
are going to sustain that with the incorporation of very large and
elaborate packages such as Ranges, ASIO and ?. Then what?
Are we going to fall back on the original ASIO package? If I do that,
why not just depend on that in the first place?

Basically, I'm worried about

a) where are libraries going to come from.
b) how long is it going to take to "get them into the standard"
c) what happens when they need to be deprecated or fixed in some
fundamental way. For example, we've got problems with complexity of the
io streams. How is that going to evolve in the future?
d) who is going to keep maintenance of the giant things up to date?
e) and who is going to pay for it?

When something is successful, there is a natural tendency to expand its
scope. Oftentimes this ends up stifling the progress that one is
actually hoping to promote. Oftentimes, less is more.

Robert Ramey

Nicol Bolas

Mar 2, 2017, 2:41:12 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com, d.f...@gmx.de
On Thursday, March 2, 2017 at 12:41:10 PM UTC-5, Daniel Frey wrote:
For some reason my mail client (MacOS Mail) screws up the quoting levels - sorry.

> On 2 Mar 2017, at 17:46, Nicol Bolas <jmck...@gmail.com> wrote:
>
> On Thursday, March 2, 2017 at 1:07:29 AM UTC-5, Daniel Frey wrote:
> > On 2 Mar 2017, at 06:52, Nicol Bolas <jmck...@gmail.com> wrote:
> >
> > Personally, I've always disliked the SAX approach, where reads are forced on you. I much prefer the reader approach, where you read at whatever pace you like. It's rather more like an iterator in that respect.
> >
> > Such an approach makes it very easy on the code doing the reading, since it can do things like pass the reader to other code to interpret those parameters. Doing this with a SAX interface requires basically writing a state machine into a type. And why would you want to encourage that kind of coding?
> >
> > You can get the same effects of a SAX interface with a Reader-like approach. Rather than standardizing the struct that interprets the JSON, you standardize the Reader as a Concept. A Reader has functions to read the next datum, tell you want that datum is, etc. And you can standardize a similar Writer Concept, which has interfaces for writing the data.
> >
> > So you get all of the benefits you outline, but without the huge detriment of using SAX to process stuff.
>
> I wonder if the SAX approach when combined with coroutines could provide the kind of decoupling you are looking for. We (the authors of taocpp/json and the PEGTL, which is the underlying parser library) are already waiting for coroutines to be able to play with pull-like interfaces instead of push-like interfaces. I do wonder about the efficiency, though. Anyways, for now, we are using the SAX-like interface and I'd like to see some real code of an alternative, as your above statement are too theoretical for me. Do you have a link to a library which uses such an approach?
>
> I haven't investigated JSON parsers, but Reader/StAX-style APIs are significant in the XML parsing world. C# lives by them; XMLReader and XMLWriter are the foundation of their XML processing systems. Even LibXML2 has a reader interface.

After reading the examples you linked, I still feel that JSON is different from XML.

SAX stands for "Simple API for XML". If one XML-based interface was good enough for JSON, why not the other? StAX-style interfaces would work just as well as SAX.

And the SAX-like approach I am currently using is more light-weight

How is StAX more heavy-weight? Indeed, I find that StAX is lower-level.

You can implement SAX APIs on top of an StAX API; doing the opposite is far more complex. Just look at what you said: "We [...] are already waiting for coroutines to be able to play with pull-like interfaces instead of push-like interfaces." Why would you wait for a complex feature like coroutines, instead of just making your parser work in a pull fashion?

And implementing SAX on top of StAX doesn't require coroutines or anything of the sort.

and can be used to feed a coroutine-based reader approach. I would like to explore this option with real code in the future, but for now I'll just keep it in the back of my head unless you could come up with something concrete :)

Also, you might have misunderstood the SAX approach wrt struct vs. concept. The struct I posted earlier implements the methods that the SAX interface (concept) requires.

I have a working understanding of SAX-style interfaces. My problem with SAX-style interfaces is that it forces you to implement a state machine, because all data is routed through a single object.

Pull APIs allow you to structure your code in a way that is structured. You get in a value. You test what it is. You then call a function to process that kind of data, passing it the StAX reader. That function reads more values, recursively descending through the JSON in a manner reminiscent of a recursive descent parser.

Push APIs require you to route all of this processing work through a single object. And therefore, it has to store which state it's currently on, so that it can send that state the data in question. It's just a big pain to deal with, one which doesn't use the C++ stack effectively.

If you're parsing tiny JSON fragments, then such an approach is probably OK. But if you're parsing anything of particular size and/or complexity, it becomes increasingly unwieldy.
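The recursive-descent style of consumption described here can be sketched in rough C++. The `event`, `token`, and `reader` types below are hypothetical illustrations for this thread, not any existing library's API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event kinds a pull (StAX-style) JSON reader might expose.
enum class event { begin_object, end_object, begin_array, end_array,
                   key, string, number, boolean, null };

struct token { event kind; double num = 0; };

// Toy pull reader over a pre-tokenized stream; a real reader would lex
// JSON text incrementally instead of taking tokens up front.
class reader {
    std::vector<token> toks_;
    std::size_t pos_ = 0;
public:
    explicit reader(std::vector<token> t) : toks_(std::move(t)) {}
    event peek() const { return toks_[pos_].kind; }
    token next() { return toks_[pos_++]; }
    bool done() const { return pos_ == toks_.size(); }
};

// Recursive descent: each call pulls exactly the events it understands
// and recurses for nested arrays -- no external state machine needed.
double sum_array(reader& r) {
    token t = r.next();
    assert(t.kind == event::begin_array);
    double total = 0;
    while (r.peek() != event::end_array) {
        if (r.peek() == event::begin_array)
            total += sum_array(r);          // descend into nested array
        else
            total += r.next().num;
    }
    r.next();                               // consume end_array
    return total;
}
```

Processing `[1, [2, 3], 4]` is then a single call to `sum_array`, with the nesting handled by the C++ call stack rather than by state stored in a handler object.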

Daniel Frey

Mar 2, 2017, 3:01:12 PM
to std-pr...@isocpp.org, jmck...@gmail.com
> On 2 Mar 2017, at 20:41, Nicol Bolas <jmck...@gmail.com> wrote:
>
> I have a working understanding of SAX-style interfaces. My problem with SAX-style interfaces is that it forces you to implement a state machine, because all data is routed through a single object.

No.

> Pull APIs allow you to structure your code in a way that is structured. You get in a value. You test what it is. You then pass call a function to process that kind of data, passing it the StAX reader. That function reads more values, recursively descending through the JSON in a manner reminiscent of a recursive descent parser.

A pull interface like the one you linked as an example needs to track state of the current path, etc. and thus has a lot more overhead than the code I have right now.

> Push APIs require you to route all of this processing work through a single object. And therefore, it has to store which state its currently on, so that it can send that state the data in question. It's just a big pain to deal with, one which doesn't use the C++ stack effectively.

Wrong. Have you even looked at the examples I linked?

Nicol Bolas

Mar 2, 2017, 3:37:45 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com, d.f...@gmx.de
On Thursday, March 2, 2017 at 3:01:12 PM UTC-5, Daniel Frey wrote:
> On 2 Mar 2017, at 20:41, Nicol Bolas <jmck...@gmail.com> wrote:
>
> I have a working understanding of SAX-style interfaces. My problem with SAX-style interfaces is that it forces you to implement a state machine, because all data is routed through a single object.

No.

Um... yes? Your examples show precisely that interface. Every time the parser reaches a new JSON value, you get a function call in that object. And therefore, it must forward that information to the place where it will actually be processed.
 

> Pull APIs allow you to structure your code in a way that is structured. You get in a value. You test what it is. You then pass call a function to process that kind of data, passing it the StAX reader. That function reads more values, recursively descending through the JSON in a manner reminiscent of a recursive descent parser.

A pull interface like the one you linked as an example needs to track state of the current path, etc. and thus has a lot more overhead than the code I have right now.

Current path to what? Pull APIs don't need to "track" anything except where they currently are in the file/string/data stream/etc.

I've actually written an XML Writer (in Lua), and the only state it had was a stack of element names. And that's only because XML requires you to put element names in the closing tag. For JSON, it wouldn't even have that; it'd be virtually stateless, aside from any of the pretty-printing stuff.
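As a rough illustration of how little state such a writer needs, here is a toy StAX-style JSON writer. It is a hypothetical sketch (no string escaping, no pretty-printing); its only state is one "need a comma?" flag per open container:

```cpp
#include <string>
#include <vector>

// Toy StAX-style JSON writer: the only state is a stack of
// "does the next value need a preceding comma?" flags.
class json_writer {
    std::string out_;
    std::vector<bool> comma_;
    void sep() {
        if (!comma_.empty()) {
            if (comma_.back()) out_ += ',';
            comma_.back() = true;
        }
    }
public:
    void begin_array() { sep(); out_ += '['; comma_.push_back(false); }
    void end_array()   { comma_.pop_back(); out_ += ']'; }
    void number(long long v) { sep(); out_ += std::to_string(v); }
    void string(const std::string& s) { sep(); out_ += '"' + s + '"'; } // no escaping in this sketch
    const std::string& str() const { return out_; }
};
```

Calling `begin_array(); number(1); string("a"); begin_array(); number(2); end_array(); end_array();` produces `[1,"a",[2]]`.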

> Push APIs require you to route all of this processing work through a single object. And therefore, it has to store which state its currently on, so that it can send that state the data in question. It's just a big pain to deal with, one which doesn't use the C++ stack effectively.

Wrong. Have you even looked at the examples I linked?

If you're referring to this post, those are examples of trivial things that don't need state, or whose state is trivial. "stringify" and "prettify" do the same operation on the same things, no matter where they appear in the JSON hierarchy. Yes, SAX shines in such circumstances.

But these are not useful circumstances for parsing JSON files. Generally speaking, the way you interpret data depends on how that data was interpreted before you.

Show me what your code would have to look like to be able to parse a JSON language of decent complexity, like glTF. I guarantee you that you're going to have to build some form of state machine.

Tony V E

Mar 2, 2017, 4:26:47 PM
to Robert Ramey
What is 'core':

- vocabulary types. That is, types that are likely to be used as building blocks for other APIs. If we didn't all use the same unique_ptr, then my API would need to invent one way to express "you own this now", and your API another. Vocabulary types include: unique/shared_ptr, variant, optional, any, expected, the base exceptions, iterators (for template APIs at least - an ABI-stable type-erased iterator could also be useful), string, string_view, etc. 

I would add vector, although most would put it in the next category. When you want to take or return a 'bunch' of items, we have no good vocabulary type - you either do iterators (forcing templates), or callbacks/visitors/lambdas, or coroutines...? or just use vector, since that's what your client is probably using anyhow.

- common/generic building blocks - containers like vector and (unordered_)map. And the generic algorithms. These are building blocks - they don't solve your problem, but they are pieces used to solve a wide variety of problems. They are not domain specific. 

- compiler specific / platform specific. Intrinsic type_traits such as is_union, etc. Platform things like threads. These are required to avoid writing compiler-specific or platform-specific code. 

I think that may be it. (I probably forgot something). That's core.

Having said that, the other question is:
Should the STL only do core?

Robert has some good reasons for 'yes'. I agree. 
There are also good reasons for 'no':
1. other languages ship with more in the box
2. we have no good or (de facto) standard way to manage library packages.
3. Picking the right non-standard library takes time - just try starting a new Web project - with a new UI paradigm each week, there are too many choices - literally too many to be able to make an informed choice.

If we could do 2, that might lessen the need for 1.
Boost helps with 3, and with 2 somewhat (and thus 1).

Sent from my BlackBerry portable Babbage Device
  Original Message  
From: Robert Ramey
Sent: Thursday, March 2, 2017 1:41 PM
To: std-pr...@isocpp.org
Reply To: std-pr...@isocpp.org
Subject: [std-proposals] Re: JSON support for the C++ standard library

Daniel Frey

Mar 2, 2017, 4:57:05 PM
to Nicol Bolas, ISO C++ Standard - Future Proposals

> On 2 Mar 2017, at 21:37, Nicol Bolas <jmck...@gmail.com> wrote:
>
> On Thursday, March 2, 2017 at 3:01:12 PM UTC-5, Daniel Frey wrote:
> > > On 2 Mar 2017, at 20:41, Nicol Bolas <jmck...@gmail.com> wrote:
> > >
> > > I have a working understanding of SAX-style interfaces. My problem with SAX-style interfaces is that it forces you to implement a state machine, because all data is routed through a single object.
>
> > No.
>
> Um... yes? Your examples show precisely that interface. Every time the parser reaches a new JSON value, you get a function call in that object. And therefore, it must forward that information to the place where it will actually be processed.

The "No" refers to "forces you to implement a state machine". You make a general claim that is simply not true. In some use-cases you might need a state machine, and in those use-cases the StAX approach might work better. But it's a trade-off, not a clear win. You are pessimizing other use-cases where the intermediate generic value (and a switch on the type) is not needed. An example would be a pretty printer. Reading a file generates SAX events, passed directly to the SAX pretty printer. No json::value instance (or whatever the reader in the StAX case stores as its state) is created/updated.
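A SAX-style consumer whose state really is trivial might look like this sketch (hypothetical event names; the parser pushes events straight into the callbacks, and no document tree is ever materialized):

```cpp
#include <algorithm>
#include <string>

// Toy SAX-style consumer: tracks the maximum nesting depth of the
// document as events are pushed into it, building nothing.
struct depth_counter {
    int depth = 0;
    int max_depth = 0;
    void begin_object() { max_depth = std::max(max_depth, ++depth); }
    void end_object()   { --depth; }
    void begin_array()  { max_depth = std::max(max_depth, ++depth); }
    void end_array()    { --depth; }
    void number(double) {}              // leaf events carry data...
    void string(const std::string&) {}  // ...but need not be stored
};
```

A parser reading `[[1], 2]` would fire begin_array, begin_array, number, end_array, number, end_array, leaving `max_depth == 2` and no intermediate value anywhere.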

> > > Pull APIs allow you to structure your code in a way that is structured. You get in a value. You test what it is. You then pass call a function to process that kind of data, passing it the StAX reader. That function reads more values, recursively descending through the JSON in a manner reminiscent of a recursive descent parser.

You are thinking of a use-case where you are not creating/using a generic JSON value, but a JSON value with a specific schema and you are skipping the JSON level, directly using the structure you are expecting, right? I can see how the StAX approach is helpful there, but...

> If you're referring to this post, those are examples of trivial things that don't need state, or whose state is trivial. "stringify" and "prettify" do the same operation on the same things, no matter where they appear in the JSON hierarhcy. Yes, SAX shines in such circumstances.
>
> But these are not useful circumstances for parsing JSON files. Generally speaking, the way you interpret data depends on how that data was interpreted before you.

...you are the one who defines for others what are "useful" circumstances?? You can't just declare everyone else's use-cases as "not useful" and only your own as "useful". For me, your use-case never came up. In fact the most important use-case to me is:

* Construct json::value instances with an std::initializer_list directly in the code. (More information was in the presentation I linked earlier)
* Stringify the value.
* Efficiency - it must be *very* fast.
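A minimal sketch of the first two points, assuming a hypothetical variant-like `value` type (far simpler than taocpp/json's real one, numbers/strings/arrays only):

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <variant>
#include <vector>

// Hypothetical variant-like JSON value, trimmed down for the sketch.
struct value {
    std::variant<long long, std::string, std::vector<value>> v;
    value(int n) : v(static_cast<long long>(n)) {}
    value(const char* s) : v(std::string(s)) {}
    value(std::initializer_list<value> l) : v(std::vector<value>(l)) {}
};

// Recursive stringification via std::visit.
std::string stringify(const value& val) {
    struct visitor {
        std::string operator()(long long n) const { return std::to_string(n); }
        std::string operator()(const std::string& s) const { return '"' + s + '"'; }
        std::string operator()(const std::vector<value>& a) const {
            std::string out = "[";
            for (std::size_t i = 0; i < a.size(); ++i) {
                if (i) out += ',';
                out += stringify(a[i]);
            }
            return out + "]";
        }
    };
    return std::visit(visitor{}, val.v);
}
```

With this, `value doc{"a", 17, value{1, 2}};` stringifies to `["a",17,[1,2]]`. Whether such a design can be made *very* fast is exactly the kind of question a proposal would have to settle with benchmarks.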

I don't think a StAX approach would benefit me.

> Show me what your code would have to look like to be able to parse a JSON langauge of decent complexity, like glTF. I guarantee you that you're going to have to build some form of state machine.

Yes, sure, but that is your use-case - not mine. Show me the StAX-style code (and benchmarks) of my use-case, will you?

Domen Vrankar

Mar 2, 2017, 5:29:09 PM
to std-pr...@isocpp.org
Unfortunately you need to have balance between larger libraries and primitives in the standard in order for a language to grow in use.

You produced a valid list, but unfortunately other languages come with larger standard libraries out of the box, and a lot of hype mixed with some truth regarding better-equipped languages for a task can produce an even larger list of worries about what will happen over time.
And yes, it's annoying that, from the looks of it, in the development world hype is far more influential than quality and truth, but that's the mess we're in, so I'm guessing that either the standard library will grow over time or the hype will distort the truth so much that C++ will really just become a language without support. I might be wrong, but my experience over the years tells me that I'm right...

Regarding external libraries - at the company where I work there was a veto on Boost libraries already before my time and the ban still stands (argument was that requiring to patch Boost before compiling it with xlC_r IBM compiler on AIX is not stable enough). In the end in my case it's easier to get a newer version of the compiler than to get a new library into a project (and IBM with poorly implemented C++11 support in their compiler on AIX surely doesn't help when arguing for the language). That's the state of things from where I stand.

In the end our company migrated most of the software to Java, mostly due to hype and the promise of a lot of extra standard library features which are mostly not used (and hasn't changed nearly as much as the hype promised, but people don't like to admit such things). Years later I was told that the hype over the years caused C++ people to be harder to find than Java people in our country, so that was possibly the main reason for the migration... All I know for certain is that most of the conversations when deciding whether to write something in C++ or Java boil down to java-has-more-standard-library-features-so-its-better (it's surprising how people like to agree with a more-is-better statement), so seeing C++ as a low-level language only doesn't help anyone.

Regarding who's going to pay... Well either compiler vendors will bother as they do with languages such as Java or they'll stop bothering no matter what as they'll be able to sell more compilers for other languages with larger libraries - standing still only makes things worse over time.

That being said, while I don't agree that Ranges and ASIO fall into the non-essential basket, I do admit that while my job and arguments would be a lot easier with an XML and JSON library integrated into the language, I would instead prefer a Boost Spirit Qi/X3 and Karma style of library in the standard (especially if it were implemented via pattern matching integrated into the language, so that I'd be able to write/generate my own grammar/schema-specific pattern matching/generating code).

But as a less generic alternative, a JSON library would still be a nice start nonetheless (it's still a basic building block for communication, configuration, data storage etc.).

Regards,
Domen



Nicol Bolas

Mar 3, 2017, 2:49:05 AM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com, d.f...@gmx.de
On Thursday, March 2, 2017 at 4:57:05 PM UTC-5, Daniel Frey wrote:
> On 2 Mar 2017, at 21:37, Nicol Bolas <jmck...@gmail.com> wrote:
>
> On Thursday, March 2, 2017 at 3:01:12 PM UTC-5, Daniel Frey wrote:
> > > On 2 Mar 2017, at 20:41, Nicol Bolas <jmck...@gmail.com> wrote:
> > >
> > > I have a working understanding of SAX-style interfaces. My problem with SAX-style interfaces is that it forces you to implement a state machine, because all data is routed through a single object.
>
> > No.
>
> Um... yes? Your examples show precisely that interface. Every time the parser reaches a new JSON value, you get a function call in that object. And therefore, it must forward that information to the place where it will actually be processed.

The "No" refers to "forces you to implement a state machine". You make a general claim that is simply not true. In some use-cases you might need a state machine, and in those use-cases the StAX approach might work better. But it's a trade-off, not a clear win. You are pessimizing other use-cases where the intermediate generic value (and a switch on the type) is not needed. An example would be a pretty printer.

The problem with tossing that example around is that it is trivial. It's so trivial that you could write it without even using a framework; just take the JSON-C++ objects and walk the hierarchy, outputting to the stream as you go. Seriously, your `to_stream` example is maybe an hour's worth of coding, assuming you have a JSON data structure in C++.

Your argument seems to be that SAX makes easy things easier. You're right; it does. But it also makes hard things harder. Why should we optimize the infrastructure for tasks so easy that they barely justify that infrastructure, rather than for the hard tasks where people actually need that JSON infrastructure?

Reading a file generates SAX events, passed directly to the SAX pretty printer. No json::value instance (or whatever the reader in the StAX case stores as its state) is created/updated.

But it does create that data. Just not in a visible or obvious way.

A SAX-style parser puts that data on the stack when it calls one of your functions, since the value is now a parameter. A StAX-reader puts that data in the object. Since readers will, more likely than not, be stack objects, their members will be on the C++ stack. Just like the function parameter.

The only difference is that StAX readers function as a variant of sorts. So they have to store a byte that tells what kind of data is in the variant. For SAX parsers, that's handled implicitly.

One byte is not much.

It should also be noted that SAX for parsing has other problems. Specifically, it's terrible at handling errors. The only thing it can do is throw; by contrast, a StAX reader is capable of using error codes/`expected` or various other mechanisms to represent malformed input.
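The difference can be sketched: a pull-style primitive can hand malformed input back as a value. `std::expected` would be the natural vocabulary type once available; the hypothetical `read_number` below uses a `variant` as a stand-in:

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <variant>

// Hypothetical: a reader-style primitive that reports bad input as a
// value instead of throwing into the caller's lap.
struct parse_error { std::string message; };
using number_result = std::variant<double, parse_error>;

number_result read_number(const std::string& tok) {
    try {
        std::size_t pos = 0;
        // Note: std::stod is looser than the JSON number grammar;
        // a real parser would lex per RFC 7159 instead.
        double d = std::stod(tok, &pos);
        if (pos != tok.size())
            return parse_error{"trailing characters in: " + tok};
        return d;
    } catch (const std::exception&) {
        return parse_error{"not a number: " + tok};
    }
}
```

The caller inspects the result and decides how to recover, rather than being forced into exception handling mid-callback.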

> > > Pull APIs allow you to structure your code in a way that is structured. You get in a value. You test what it is. You then pass call a function to process that kind of data, passing it the StAX reader. That function reads more values, recursively descending through the JSON in a manner reminiscent of a recursive descent parser.

You are thinking of a use-case where you are not creating/using a generic JSON value, but a JSON value with a specific schema and you are skipping the JSON level, directly using the structure you are expecting, right? I can see how the StAX approach is helpful there, but...

> If you're referring to this post, those are examples of trivial things that don't need state, or whose state is trivial. "stringify" and "prettify" do the same operation on the same things, no matter where they appear in the JSON hierarhcy. Yes, SAX shines in such circumstances.
>
> But these are not useful circumstances for parsing JSON files. Generally speaking, the way you interpret data depends on how that data was interpreted before you.

...you are the one who defines for others what "useful" circumstances are?? You can't just declare everyone else's use-cases as "not useful" and only your own as "useful".

My issue is not merely one of "useful" vs. "not useful". It's a question of "easy" vs. "hard." And you cannot deny that `stringify` and `prettify` are far easier than "process glTF", as far as tasks go.

For me, your use-case never came up. In fact, the most important use-cases to me are:

* Construct json::value instances with an std::initializer_list directly in the code. (More information was in the presentation I linked earlier)
* Stringify the value.
* Efficiency - it must be *very* fast.
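That use-case can be sketched with a hypothetical value type (illustrative, not any particular library's API): build an array from an initializer list, then stringify it with a recursive visitor.

```cpp
#include <initializer_list>
#include <string>
#include <variant>
#include <vector>

// Minimal recursive-variant value: number, string, or array.
struct value;
using array = std::vector<value>;
struct value {
    std::variant<double, std::string, array> v;
    value(double d) : v(d) {}
    value(const char* s) : v(std::string(s)) {}
    value(std::initializer_list<value> l) : v(array(l)) {}
};

std::string stringify(const value& val) {
    struct visitor {
        std::string operator()(double d) const {
            // trim trailing zeros for a stable textual form
            std::string s = std::to_string(d);
            s.erase(s.find_last_not_of('0') + 1);
            if (!s.empty() && s.back() == '.') s.pop_back();
            return s;
        }
        std::string operator()(const std::string& s) const { return '"' + s + '"'; }
        std::string operator()(const array& a) const {
            std::string out = "[";
            for (std::size_t i = 0; i < a.size(); ++i) {
                if (i) out += ',';
                out += stringify(a[i]);  // recurse into nested values
            }
            return out + "]";
        }
    };
    return std::visit(visitor{}, val.v);
}
```

Note the C++17 recursive-variant pattern: `std::vector` may hold an incomplete element type, which is what lets `array` appear inside `value`.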

I see no reason why the StAX approach would be any slower than the SAX approach.

I don't think a StAX approach would benefit me.

And I'm quite sure a SAX approach would not benefit me. The difference is that a StAX approach would not harm you, whereas a SAX approach very much would harm me. To you, StAX is a lateral move; to me, it's the difference between sane code and a lot of boilerplate.

It should also be noted that SAX, in its original XML definition, doesn't have a "writing" form. SAX is purely a parsing mechanism; generating textual XML, or JSON in this case, is not an operation that SAX handles.

What you're really talking about with your SAX "writer" is JSON visitation, a recursive-descent pass over a JSON C++ data structure which gets called appropriately. Now, given that JSON defines a recursive variant data structure, recursive visitation of that data structure is a perfectly legitimate operation. But this should not be considered the same sort of thing as parsing or writing. While obviously you can use it to do writing, that's merely one application of recursive visitation.

So I say we should have the following tools:

1: A JSON DOM (for want of a better term): A C++ data structure that represents the recursive variant concept defined by JSON.

2: The ability to apply a recursive visitor over a JSON DOM. Perhaps even with the ability to change the visitation object at different levels, but that's more of a wish-list feature than a requirement.

3: A StAX-style reader for processing text that is formatted in accord with JSON. The purpose of this object is to read data from a JSON file, typically into arbitrary data structures rather than the JSON DOM.

4: A StAX-style writer for generating text from JSON data. This is not something you can directly manipulate formatting issues like pretty printing and the like. But it should have options for that sort of thing. The purpose of this object is to write data to a JSON file, where the source data is not in the JSON DOM data structure.

5: A function that takes a reader and builds a JSON DOM of the data being read.
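A sketch of how tool 5 could compose tools 1 and 3 (all names hypothetical, and the token stream is pre-lexed to keep it short): building the DOM is just one client of the reader, not a privileged code path.

```cpp
#include <cstddef>
#include <utility>
#include <variant>
#include <vector>

enum class tok { begin_array, end_array, number };

struct reader {                       // stand-in for a real text reader
    std::vector<std::pair<tok, double>> toks;
    std::size_t i = 0;
    tok peek() const { return toks[i].first; }
    double number() { return toks[i++].second; }
    void consume() { ++i; }
};

struct dom;                           // recursive variant DOM (tool #1)
using dom_array = std::vector<dom>;
struct dom { std::variant<double, dom_array> v; };

// Tool #5: recursive-descent construction of the DOM from the reader.
dom to_dom(reader& r) {
    if (r.peek() == tok::begin_array) {
        r.consume();
        dom_array a;
        while (r.peek() != tok::end_array) a.push_back(to_dom(r));
        r.consume();
        return dom{a};
    }
    return dom{r.number()};
}
```

Any other consumer of the reader (a schema-directed loader, say) would use the same `peek`/`consume` surface.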

Olaf van der Spek

unread,
Mar 3, 2017, 3:45:40 AM3/3/17
to ISO C++ Standard - Future Proposals, d.f...@gmx.de
Op donderdag 2 maart 2017 16:51:10 UTC+1 schreef Daniel Frey:
Standard interfaces are nice to have, but that doesn't mean they have to be part of ISO C++.

Olaf van der Spek

unread,
Mar 3, 2017, 4:01:44 AM3/3/17
to ISO C++ Standard - Future Proposals
Op donderdag 2 maart 2017 23:29:09 UTC+1 schreef Domen Vrankar:
Regarding external libraries - at the company where I work there was a veto on Boost libraries already before my time and the ban still stands (argument was that requiring to patch Boost before compiling it with xlC_r IBM compiler on AIX is not stable enough). In the end in my case it's easier to get a newer version of the compiler than to get a new library into a project (and IBM with poorly implemented C++11 support in their compiler on AIX surely doesn't help when arguing for the language). That's the state of things from where I stand.

If IBM can't be bothered to implement C++11 properly what's the chance they'll bother implementing a much larger standard library properly?

olafv...@gmail.com

unread,
Mar 3, 2017, 4:27:57 AM3/3/17
to ISO C++ Standard - Future Proposals


Op donderdag 2 maart 2017 22:26:47 UTC+1 schreef Tony V E:
I think that may be it. (I probably forgot something). That's core.

Having said that, the other question is:
Should the STL only do core?

Robert has some good reasons for 'yes'. I agree. 
There are also good reasons for 'no':
1. other languages ship with more in the box
2. we have no good or (de facto) standard way to manage library packages.
3. Picking the right non-standard library takes time - just try starting a new Web project - with a new UI paradigm each week, there are too many choices - literally too many to be able to make an informed choice.

If we could do 2, that might lessen the need for 1.
Boost helps with 3. And 2 somewhat (and thus 1)

Including some libraries in the standard isn't solving #2.
Most (if not all) programs require third-party libraries, so #2 is a problem worth working on.

Other languages do ship with more in the box but that box is not an ISO standard. What box are we talking about anyway?

I think #2 is more of a problem of library development, maintenance, especially long term, and distribution.

Daniel Frey

unread,
Mar 3, 2017, 12:01:12 PM3/3/17
to std-pr...@isocpp.org
> On 3 Mar 2017, at 08:49, Nicol Bolas <jmck...@gmail.com> wrote:
>
> The problem with tossing that example around is that it is trivial. It's so trivial that you could write it without even using a framework; just take the JSON-C++ objects and walk the hierarchy, outputting to the stream as you go. Seriously, your `to_stream` example is maybe an hour's worth of coding, assuming you have a JSON data structure in C++.

Great, so what I do is "trivial" and "maybe an hour's worth of coding". You enjoy judging and disregarding other people's efforts?

It is far from trivial to come up with the right interface and the right implementation. Why do we have element() and member() methods in the SAX interface? Why key() when we already have string()? Those are counter-measures against state tracking. How many other libraries track the state with a single boolean for the stringify consumer? And get it right with arbitrary nesting? But that might just be me, for you it's trivial and maybe an hour's work.
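For illustration, a hedged sketch of the single-boolean state tracking described above (names are illustrative, not the actual SAX interface of any library): each typed callback needs no variant tag, and one `first` flag handles comma placement even with nesting, because `begin_array` resets it and `end_array` clears it.

```cpp
#include <cstdint>
#include <string>

// Sketch of a SAX consumer with typed callbacks: no variant, no switch,
// and a single boolean tracking "is a separator needed here?".
struct stringify_consumer {
    std::string out;
    bool first = true;

    void sep() { if (!first) out += ','; first = false; }
    void number(std::int64_t i) { sep(); out += std::to_string(i); }
    void boolean(bool b) { sep(); out += b ? "true" : "false"; }
    void begin_array() { sep(); out += '['; first = true; }   // reset for elements
    void end_array() { out += ']'; first = false; }           // array is one value
};
```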

Also, just because it's trivial does not mean it's not important. In fact, stringify is a very fundamental thing in the JSON world. Feeding log data to an ELK stack? Stringify. Writing a REST server? Stringify. My goal is to make the simple things simple and efficient, while making the hard things possible. I am *not* trying to make the hard things harder on purpose. If I can make them simpler, fine, but not when it compromises the simplicity and efficiency of the simple cases. You seem to be willing to trade, or maybe you don't see it. But:

> A SAX-style parser puts that data on the stack when it calls one of your functions, since the value is now a parameter.

The data is a double, a std::int64_t, a boolean or maybe a std::string.

> A StAX-reader puts that data in the object. Since readers will, more likely than not, be stack objects, their members will be on the C++ stack. Just like the function parameter.
>
> The only difference is that StAX readers function as a variant of sorts. So they have to store a byte that tells what kind of data is in the variant. For SAX parsers, that's handled implicitly.
>
> One byte is not much.

It is an enormous difference to me. You need that one byte, and you need to hold a value of different types (union, std::variant, ...), and you always have to switch on the type in the consumer. This is unneeded overhead for a lot of use-cases; this is where you are pessimizing the simple use-cases in order to make the harder use-cases simpler. Write a JSON library using your approach and benchmark it.

> My issue is not merely one of "useful" vs. "not useful". It's a question of "easy" vs. "hard." And you cannot deny that `stringify` and `prettify` are far easier than "process glTF", as far as tasks go.

I strongly object to the notion that a hard use-case justifies making "stringify/prettify" more complex/inefficient. See above.

> What you're really talking about with your SAX "writer" is JSON visitation, a recursive-descent pass over a JSON C++ data structure which gets called appropriately. Now, given that JSON defines a recursive variant data structure, recursive visitation of that data structure is a perfectly legitimate operation. But this should not be considered the same sort of thing as parsing or writing. While obviously you can use it to do writing, that's merely one application of recursive visitation.

I am well aware that there are a lot more things that qualify as SAX producers and SAX consumers. How do you even arrive at such "conclusions"??

> So I say we should have the following tools:
>
> 1: A JSON DOM (for want of a better term): A C++ data structure that represents the recursive variant concept defined by JSON.
>
> 2: The ability to apply a recursive visitor over a JSON DOM. Perhaps even with the ability to change the visitation object at different levels, but that's more of a wish-list feature than a requirement.

That wouldn't even clash with *also* having a SAX-visitor. If you already have a JSON DOM object, attach a StAX reader to it and use it for your hard use-cases. I *might* even add a reader to our library for that, but don't hold your breath.

> 3: A StAX-style reader for processing text that is formatted in accord with JSON. The purpose of this object is to read data from a JSON file, typically into arbitrary data structures rather than the JSON DOM.
>
> 4: A StAX-style writer for generating text from JSON data. This is not something you can directly manipulate formatting issues like pretty printing and the like. But it should have options for that sort of thing. The purpose of this object is to write data to a JSON file, where the source data is not in the JSON DOM data structure.
>
> 5: A function that takes a reader and builds a JSON DOM of the data being read.

Finally, I don't think this will lead to anything here and it is apparently way too early for standardization. I'll stop discussing JSON libraries on this list now; feel free to send pull requests for my library or, even better, create your own. Instead of claiming your approach is better, let code speak.

Thiago Macieira

unread,
Mar 3, 2017, 11:33:42 PM3/3/17
to std-pr...@isocpp.org
Em quinta-feira, 2 de março de 2017, às 08:46:15 PST, Nicol Bolas escreveu:
> I haven't investigated JSON parsers, but Reader/StAX-style APIs are
> significant in the XML parsing world. C# lives by them; XMLReader
> <https://msdn.microsoft.com/en-us/library/system.xml.xmlreader(v=vs.110).asp
> x> and XMLWriter
> <https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter(v=vs.110).asp
> x> are the foundation of their XML processing systems. Even LibXML2 has a
> reader interface <http://xmlsoft.org/examples/index.html#xmlReader>.

Not to mention QXmlStreamReader and QXmlStreamWriter.

Thiago Macieira

unread,
Mar 3, 2017, 11:45:18 PM3/3/17
to std-pr...@isocpp.org
Em quinta-feira, 2 de março de 2017, às 23:49:05 PST, Nicol Bolas escreveu:
> 1: A JSON DOM (for want of a better term): A C++ data structure that
> represents the recursive variant concept defined by JSON.

Prior art: QJsonDocument, QJsonObject, QJsonArray, QJsonValue:

http://doc.qt.io/qt-5/qjsondocument.html

Given that the document that QJsonDocument refers to is known to be well-
formed, there is very little need to do error checking. You can easily access
any sub-object in the tree, and if one of the parents doesn't exist, you
simply end up with an invalid-state (not the same as null) QJsonValue.

> 2: The ability to apply a recursive visitor over a JSON DOM. Perhaps even
> with the ability to change the visitation object at different levels, but
> that's more of a wish-list feature than a requirement.

Sorry, can't help. We don't use that idiom in Qt, so I have nothing to offer.

> 3: A StAX-style reader for processing text that is formatted in accord with
> JSON. The purpose of this object is to read data from a JSON file,
> typically into arbitrary data structures rather than the JSON DOM.
>
> 4: A StAX-style writer for generating text from JSON data. This is not
> something you can directly manipulate formatting issues like pretty
> printing and the like. But it should have options for that sort of thing.
> The purpose of this object is to write data to a JSON file, where the
> source data is not in the JSON DOM data structure.

Not available in QJsonDocument, but I have an equivalent one for CBOR. It's in
C, though, and extremely small. And I mean *extremely*: compiled with -Os, the
parser is 3kB and the writer is 1kB. See https://01org.github.io/tinycbor/0.4/

> 5: A function that takes a reader and builds a JSON DOM of the data being
> read.

QJsonDocument has all of that internal.

Thiago Macieira

unread,
Mar 3, 2017, 11:53:43 PM3/3/17
to std-pr...@isocpp.org
Em quinta-feira, 2 de março de 2017, às 10:20:17 PST, Michał Dominiak
escreveu:
> One particular advantage of having a library in the standard library that
> seems to be frequently overlooked is that having a thing in the standard
> guarantees at least a minimal amount of maintenance for both its interface
> *and* its implementations. A third party library's author can one day
> disappear and you might end up having to either find someone else to
> maintain that particular library you're using, or maintaining it yourself,
> or migrating to a different one. This is not an issue with the standard
> library (well, unless you heavily depend on <codecvt>, but that is not a
> thing, right?), since the interface will be maintained and updated for as
> long as the committee functions by the committee itself, and multiple
> implementations will exist in the wild for you to choose from.

I can turn the argument around: one of the reasons the Standard Library is
well-maintained is that it is small and limited. If we start adding things to
it, its maintenance will grow and some parts of it may become less maintained.

As you've noted, parts of the Standard Library are already like that.

And what if JSON becomes less important in 5 years' time? It wasn't that long
ago that XML was more important. Remember when SOAP was important?

Bjorn Reese

unread,
Mar 4, 2017, 7:14:01 AM3/4/17
to std-pr...@isocpp.org
On 03/01/2017 10:25 PM, Niels Lohmann wrote:

> I would be willing to draft a proposal based on the experience I made with my C++ JSON library [nlohmann/json]. Of course, I would be interested in your thoughts on this.

Your suggestion is going to introduce many concepts. The bulk of your
suggestion is to introduce a dynamic variable (given your emphasis on
JSON, I assume that you want to model the JavaScript dynamic variable.)

Dynamic variables introduce data structures such as heterogeneous
arrays, heterogeneous maps, and hierarchical containers (nested
heterogeneous arrays/maps).

The heterogenous nature of dynamic variables means that illegal
operations (e.g. appending a string to a bool) have to be handled at
run-time instead of at compile-time. Should traversal be done with
iterators or visitors? How does it integrate with std algorithms?

The hierarchical nature of dynamic variables means that they can act
as trees, which immediately raises the question why we have no tree
structures in C++ despite proposals like N3700?

Then there is the topic of JSON serialization/deserialization. You
suggest a simple DOM-like parser/generator, and others have been quick
to point out that a SAX-like parser is missing. I side with Nicol on
this. A better starting point is a pull parser (reader). Pull parsers
are like (GOF) iterators where the user explicitly requests the next
token.

* Pull parsers can use lazy conversion, meaning that values are only
converted into C++ basic types (e.g. integers or strings) when the
user explicitly requests it. That way we can create applications
like JSON prettifiers that never convert anything.

* Pull parsers can be used in parser combinators.

* Pull parsers are easy to integrate into serialization frameworks
such as Boost.Serialization or Cereal. This gives us the ability
to parse JSON directly into custom C++ data structures without
having to go through a dynamic variable.

* Push (SAX) parsers can be built from pull parsers quite easily.

* Full (DOM) parsers can be built from either pull or push parsers
fairly easily.
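The lazy-conversion point can be sketched as follows (whitespace-separated tokens instead of real JSON, and illustrative names, to keep it short): the reader exposes the raw token text, and conversion happens only when the user asks for it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Sketch of a pull parser with lazy conversion: next() only slices the
// input; as_long() converts on explicit request. A prettifier could use
// `token` directly and never pay for conversion.
struct pull_reader {
    std::string_view input;
    std::string_view token;     // raw characters of the current token

    bool next() {
        while (!input.empty() && std::isspace((unsigned char)input.front()))
            input.remove_prefix(1);
        if (input.empty()) return false;
        std::size_t n = 0;
        while (n < input.size() && !std::isspace((unsigned char)input[n])) ++n;
        token = input.substr(0, n);
        input.remove_prefix(n);
        return true;
    }
    // conversion happens only here, on explicit request
    long as_long() const { return std::stol(std::string(token)); }
};
```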

You can find such a JSON pull parser (with Boost.Serialization
integration) at

https://github.com/breese/trial.protocol
http://breese.github.io/trial/protocol/trial_protocol/json.html

The above project also contains a dynamic variable (still work in
progress, hence undocumented) at:


https://github.com/breese/trial.protocol/tree/feature/dynamic/include/trial/protocol/dynamic

It appears that you, Daniel, and I have arrived independently at many
of the same solutions for the dynamic variable (my work took its offset
in the C++03 dynamic::var by Ferruccio Barletta), but we deviate
substantially on the parser side.

In summary, I believe that your suggestion is going to be larger than
you may anticipate now, so I recommend that it should be split into two:

1. A proposal for dynamic variables, which should be completely
independent on JSON (albeit JSON is going to figure prominently in
the motivation section.)

2. A proposal for JSON parsing/generation.

Thiago Macieira

unread,
Mar 4, 2017, 11:26:59 AM3/4/17
to std-pr...@isocpp.org
Em sábado, 4 de março de 2017, às 04:23:05 PST, Bjorn Reese escreveu:
> * Pull parsers can use lazy conversion, meaning that values are only
> converted into C++ basic types (e.g. integers or strings) when the
> user explicity requests it. That way we can create applications
> like JSON prettifiers that never converts anything.

Pull parsers can also operate in zero-copy mode and most JSON strings can
benefit from that, though the API needs to offer an option to unescape a
possibly-encoded string. One just has to be careful in designing the API so
that users will default to unescaping the string unless they specifically
choose the zero-copy mode to get the escaped data.
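A sketch of that API-design point (hypothetical names): the short, obvious getter unescapes, and the zero-copy escaped view requires an explicit opt-in, so the safe behaviour is the default.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Sketch only: get() is the default, safe getter; raw() is the explicit
// zero-copy opt-in that hands back the still-escaped input slice.
struct string_token {
    std::string_view escaped_;           // slice of the input buffer

    std::string get() const {            // default: unescaped copy
        std::string out;
        for (std::size_t i = 0; i < escaped_.size(); ++i) {
            if (escaped_[i] == '\\' && i + 1 < escaped_.size())
                out += escaped_[++i];    // minimal unescape: handles \" and \\ only
            else
                out += escaped_[i];
        }
        return out;
    }
    std::string_view raw() const {       // explicit zero-copy opt-in
        return escaped_;
    }
};
```

A real implementation would also handle \n, \t, \uXXXX and friends; the point here is only the naming asymmetry between the two getters.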

Another big benefit is that the API paradigm can be reused for other, binary
formats, where zero-copy is even more interesting.

Bjorn Reese

unread,
Mar 4, 2017, 12:25:04 PM3/4/17
to std-pr...@isocpp.org
On 03/04/2017 05:26 PM, Thiago Macieira wrote:

> Pull parsers can also operate in zero-copy mode and most JSON strings can
> benefit from that, though the API needs to offer an option to unescape a
> possibly-encoded string. One just has to be careful in designing the API so
> that users will default to unescaping the string unless they specifically
> choose the zero-copy mode to get the escaped data.

I have no dereferencing operator* in my API for that reason. Instead
there are two differently named getters. The zero-copy getter is called
reader::literal() and returns a string_view. The converting getter,
called reader::value<R>(), converts the content to an object of type R
and returns that.
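The shape of that design might look like this (illustrative names and a stand-in struct; the real interface is in trial.protocol):

```cpp
#include <charconv>
#include <string_view>

// Two differently named getters instead of operator*: the zero-copy one
// returns the raw view, the converting one produces a typed value.
struct token_reader {
    std::string_view current;                 // raw token text

    std::string_view literal() const {        // zero-copy: no conversion
        return current;
    }
    template <typename R>
    R value() const {                         // converting getter
        R result{};
        std::from_chars(current.data(), current.data() + current.size(), result);
        return result;
    }
};
```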

> Another big benefit is that the API paradigm can be reused for other, binary
> formats, where zero-copy is even more interesting.

Absolutely agree.

Matthew Woehlke

unread,
Mar 6, 2017, 2:46:17 PM3/6/17
to std-pr...@isocpp.org
On 2017-03-03 12:01, Daniel Frey wrote:
> On 3 Mar 2017, at 08:49, Nicol Bolas <jmck...@gmail.com> wrote:
>> The problem with tossing that example around is that it is trivial.
>> It's so trivial that you could write it without even using a
>> framework; just take the JSON-C++ objects and walk the hierarchy,
>> outputting to the stream as you go. Seriously, your `to_stream`
>> example is maybe an hour's worth of coding, assuming you have a
>> JSON data structure in C++.
>
> Great, so what I do is "trivial" and "maybe an hour's worth of
> coding". You enjoy judging and disregarding other people's efforts?

Processing JSON from an arbitrary source to convert it to a
pretty-printed string, *without an intermediary representation*, is
useful to... jsonlint. I'll certainly challenge you to name another.

The majority of applications that consume data from JSON are going to
expect that data to conform to some sort of schema in order to map to
some internal data structure. A SAX-style parser for these uses — *which
are the majority* — is clunky at best. If they don't just read the
entire input data as a JSON object¹ and deal with it as a structured C++
object once it's in memory, a pull-style API is much more usable.

(¹ The library implementation is free to use SAX for that, but that's an
*implementation detail*, not the API that consumers are expected to use.)

And yes, writing a pretty-printer (on top of an existing API) is
trivial. It's the parsing equivalent of "hello world". (Note that that
comment was not in reference to writing the implementation of the
SAX-like API, it was in reference to writing the just-about-only program
for which said API is not a poor fit.)

> Also, just because it's trivial does not mean it's not important.

True, but that functionality is almost certainly part of the library,
not something that consumers have to write themselves. Even if that's
not the case, SAX-like parsing has nothing to do with traversing an
existing in-memory representation, which is what you are almost always
doing when you want to serialize JSON (as a string or otherwise).

I repeat: just about the only time you will ever want to convert from an
input stream directly to a JSON string, *without* using the data in some
application-specific internal representation in between those steps, is
when you are jsonlint.

> My goal is to make the simple things simple and efficient, while
> making the hard things possible. I am *not* trying to make the hard
> things harder on purpose. If I can make them simpler, fine, but not
> when it compromises the simplicity and efficiency of the simple
> cases. You seem to be willing to trade.

I am certainly willing to trade simplicity of the cases that I actually
care about for simplicity of cases I don't! You seem to be focused on
making the simplest case, which only 0.01% of users will actually use,
as fast as possible, even though it makes life much harder for 99.99% of
users.

That's nice for doing something "because you can", as a benchmark, as a
proof of concept, etc., but it's not useful for a library meant to be
used by many people.

>> My issue is not merely one of "useful" vs. "not useful". It's a
>> question of "easy" vs. "hard." And you cannot deny that `stringify`
>> and `prettify` are far easier than "process glTF", as far as tasks
>> go.
>
> I strongly object to the notion that a hard use-case justifies making
> "stringify/prettify" more complex/inefficient.

I strongly object to the notion that optimization of a rare use case
justifies making the common use case more complex. I'd rather have a
library that is a little slower but makes it easy to write readable code
than a library that is faster but makes it much harder to write useful
code. In much the same manner, I prefer to write programs in C++ (or
even Python), rather than assembly.

--
Matthew

Daniel Frey

unread,
Mar 6, 2017, 2:59:14 PM3/6/17
to std-pr...@isocpp.org

> On 6 Mar 2017, at 20:46, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>
> On 2017-03-03 12:01, Daniel Frey wrote:
>> On 3 Mar 2017, at 08:49, Nicol Bolas <jmck...@gmail.com> wrote:
>>> The problem with tossing that example around is that it is trivial.
>>> It's so trivial that you could write it without even using a
>>> framework; just take the JSON-C++ objects and walk the hierarchy,
>>> outputting to the stream as you go. Seriously, your `to_stream`
>>> example is maybe an hour's worth of coding, assuming you have a
>>> JSON data structure in C++.
>>
>> Great, so what I do is "trivial" and "maybe an hour's worth of
>> coding". You enjoy judging and disregarding other people's efforts?
>
> Processing JSON from an arbitrary source to convert it to a
> pretty-printed string, *without an intermediary representation*, is
> useful to... jsonlint. I'll certainly challenge you to name another.

Please at least read what we were talking about (you even quoted it above).

The example was not about taking an input stream and pretty-printing it without an intermediate representation, it was about walking a JSON-C++ object and stringify/prettify it to a stream. And as I wrote, the use-cases in my case are REST servers and feeding data to an ELK stack.

Matthew Woehlke

unread,
Mar 6, 2017, 3:05:36 PM3/6/17
to std-pr...@isocpp.org
On 2017-03-02 16:26, Tony V E wrote:
> Should the STL only do core?
>
> Robert has some good reasons for 'yes'. I agree.
> There are also good reasons for 'no':
> 1. other languages ship with more in the box
> 2. we have no good or (de facto) standard way to manage library packages.
> 3. Picking the right non-standard library takes time.
>
> If we could do 2, that might lessen the need for 1.

I agree with Olaf; #2 is a real issue. It's not entirely clear if this
is within WG21's domain to try to solve, but solving it would certainly
benefit the software ecosystem as a whole. Certainly it helps with #1,
by blurring the lines between what is "the box" and what is outside the
box. (I would guess a lot of what seems to be "in the box" in some other
languages, actually isn't, but only appears to be by virtue of decent
packaging and distribution tools and a tendency to slurp down a lot of
technically third party stuff in default installs of the core language
tools.)

The biggest reason to limit the scope of the C++ libraries is that being
in an ISO standard makes it harder to change things later. Imagine what
Qt would look like now if it had needed to maintain compatibility with
the API it had 10+ years ago.

As far as #2... there are two facets of the problem; distribution, and
consumption. On the consumption side, CPS¹ might help, if it can be
completed enough to be useful and if enough users adopt it.

(¹ https://github.com/mwoehlke/cps — note that this is still in its
infancy; there *will* be breaking changes before 1.0! That said, anyone
wanting to help is more than welcome!)

--
Matthew

Matthew Woehlke

unread,
Mar 6, 2017, 3:37:06 PM3/6/17
to std-pr...@isocpp.org
On 2017-03-06 14:59, Daniel Frey wrote:
> On 6 Mar 2017, at 20:46, Matthew Woehlke wrote:
>> On 2017-03-03 12:01, Daniel Frey wrote:
>>> On 3 Mar 2017, at 08:49, Nicol Bolas <jmck...@gmail.com> wrote:
>>>> The problem with tossing that example around is that it is trivial.
>>>> It's so trivial that you could write it without even using a
>>>> framework; just take the JSON-C++ objects and walk the hierarchy,
>>>> outputting to the stream as you go. Seriously, your `to_stream`
>>>> example is maybe an hour's worth of coding, assuming you have a
>>>> JSON data structure in C++.
>>>
>>> Great, so what I do is "trivial" and "maybe an hour's worth of
>>> coding". You enjoy judging and disregarding other people's efforts?
>>
>> Processing JSON from an arbitrary source to convert it to a
>> pretty-printed string, *without an intermediary representation*, is
>> useful to... jsonlint. I'll certainly challenge you to name another.
>
> Please at least read what we were talking about (you even quoted it above).
>
> The example was not about taking an input stream and pretty-printing
> it without an intermediate representation, it was about walking a
> JSON-C++ object and stringify/prettify it to a stream.

I think you are successfully making this thread confusing. The earlier
posts were mainly about the benefits and drawbacks of a SAX-like
approach to consuming an unstructured JSON stream¹ vs. other approaches
(e.g. DOM-like, StAX-like). SAX-like only applies to this sort of
processing.

Once you have an in-memory representation, you are not doing SAX-like
processing, you are doing visitation, which is a problem that merits a
more general solution than processing just JSON... and I'm absolutely
unconvinced that a *recursive* visitor is optimal there, for any case
besides outputting.

The question was whether SAX-like processing is a reasonable approach to
either immediate processing of JSON data or getting from a raw
stream/bag of bytes to a user defined in-memory representation (which is
a subset of immediate processing, but the most common case). The
assertion is that it is clunky for anything but direct conversion to
another output stream (i.e. jsonlint) or a generic representation (e.g.
constructing a DOM-like representation). Neither of these are features
useful to *most consumers* of a JSON library, although the latter case
may be a useful means of *implementing* a JSON library.

I have not yet seen an example of where SAX-like parsing of raw JSON
data would be superior approach *for a consumer*. The only plausible
examples I've seen are for library implementations.

(¹ Meaning, a bag or stream of bytes, presumed to contain valid JSON,
but not yet converted to any in-memory representation, such as from a
raw string, file, network stream, etc.)

--
Matthew

Zhihao Yuan

unread,
Mar 6, 2017, 4:18:15 PM3/6/17
to std-pr...@isocpp.org
On Mon, Mar 6, 2017 at 2:37 PM, Matthew Woehlke
<mwoehlk...@gmail.com> wrote:
>
> I have not yet seen an example of where SAX-like parsing of raw JSON
> data would be superior approach *for a consumer*. The only plausible
> examples I've seen are for library implementations.

Example, parsing RESTful payload which is a list of
sha1 with rapidjson:

https://github.com/lichray/cpp-deuceclient/blob/master/src/json_parsers.cc

We have other things parsing/producing msgpack on
the fly as well.

Seriously, all structured, API style JSON should be done
in SAX. And about your comment:

"You seem to be focused on making the simplest case,
which only 0.01% of users will actually use, as fast as
possible, even though it makes life much harder for
99.99% of users."

I don't think anybody argued against having a
DOM interface; we are just saying a SAX interface
is necessary. Otherwise I won't even bother --
why would I care when 99.99% of my throughput
is processed with SAX and the coming standard
JSON utility has no such support?

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://blog.miator.net/

Matthew Woehlke

unread,
Mar 6, 2017, 4:47:40 PM3/6/17
to std-pr...@isocpp.org
On 2017-03-06 16:18, Zhihao Yuan wrote:
> On Mon, Mar 6, 2017 at 2:37 PM, Matthew Woehlke wrote:
>> I have not yet seen an example of where SAX-like parsing of raw JSON
>> data would be superior approach *for a consumer*. The only plausible
>> examples I've seen are for library implementations.
>
> Example, parsing RESTful payload which is a list of
> sha1 with rapidjson:
>
> https://github.com/lichray/cpp-deuceclient/blob/master/src/json_parsers.cc
>
> Seriously, all structured, API style JSON should be done
> in SAX.

Wow... that's parsing a list of homogeneous values, and *even that* is
ugly? No, thanks.

> I don't think anybody argued against having a DOM interface, we are
> just saying an SAX interface is necessary,

...but *why*? Because DOM-like is too slow/heavy? But if that's the
case, why would you want a horribly clunky SAX-like interface when
StAX-like can also be fast but is MUCH more friendly to use?

Compare your above example to:

try {
    s.begin_array();
    while (!s.at_end()) {
        handle(s.get_string());
    }
    s.pop();
} catch (...) { ... }

...and I didn't have to write state management code or a template
conforming to the SAX handler. My StAX-style code is shorter than your
SAX-style code *by an order of magnitude*. If I need to parse
sub-objects, I just pass the parser to a function to handle the
sub-object. With SAX-style parsing, handling sub-objects is ugly and
painful.

See also https://github.com/mwoehlke/odf2txt/blob/master/odf2txt. I
manage to muddle through by having a "base" parser that implements the
SAX API that wraps a second "active" parser that is part of a stack that
allows me to push/pop parsers to handle context, but I would never call
this an "easy" approach. It's an ugly, awkward approach that is *forced
on me* by the SAX design.

State management code should live in *one* place: in the StAX-style
library. This way it only needs to be written once, not recreated by
every user, which leads to better testing and reliability, and less
overall work.

--
Matthew

Zhihao Yuan

Mar 6, 2017, 6:21:51 PM
to std-pr...@isocpp.org
On Mon, Mar 6, 2017 at 3:47 PM, Matthew Woehlke
<mwoehlk...@gmail.com> wrote:
> ...but *why*? Because DOM-like is too slow/heavy?

Yes. Also, SAX defines the language, so no
extra Schema validation is needed (you can add it
of course).

> StAX-like can also be fast [...]

Prove it.

>
> Compare your above example to:
>
> try {
>     s.begin_array();
>     while (!s.at_end()) {
>         handle(s.get_string());
>     }
>     s.pop();
> } catch (...) { ... }
>

What happens when the next element in the array
is not a string? Throw an exception? Please don't.
Compare element types like XMLStreamReader?
Then you can only break out of one loop at a time
if you avoid `goto`. So far I feel more comfortable
with SAX, as you just return `false` from your
handler -- done.

> I manage to muddle through by having a "base" parser that implements the
> SAX API that wraps a second "active" parser that is part of a stack that
> allows me to push/pop parsers to handle context, but I would never call
> this an "easy" approach. It's an ugly, awkward approach that is *forced
> on me* by the SAX design.

PEGTL supports switching parsers by itself; it may be
worth investigating whether to allow that when parsing JSON.

Thiago Macieira

Mar 6, 2017, 6:35:10 PM
to std-pr...@isocpp.org
On Monday, March 6, 2017, at 22:18:11 CET, Zhihao Yuan wrote:
> On Mon, Mar 6, 2017 at 2:37 PM, Matthew Woehlke
>
> <mwoehlk...@gmail.com> wrote:
> > I have not yet seen an example of where SAX-like parsing of raw JSON
> > data would be superior approach *for a consumer*. The only plausible
> > examples I've seen are for library implementations.
>
> Example, parsing RESTful payload which is a list of
> sha1 with rapidjson:
>
> https://github.com/lichray/cpp-deuceclient/blob/master/src/json_parsers.cc

And here's the equivalent using TinyCBOR, a StAX parser written in C.

struct SHA1Digest { uint8_t data[20]; };

vector<SHA1Digest> parse(vector<uint8_t> data)
{
    CborParser parser;
    CborValue array;
    cbor_parser_init(data.data(), data.size(), 0, &parser, &array);

    vector<SHA1Digest> v;
    v.resize(data.size() / sizeof(SHA1Digest));

    CborValue it;
    cbor_value_enter_container(&array, &it);
    for (int i = 0; !cbor_value_at_end(&it); ++i) {
        SHA1Digest &d = v[i];
        size_t len = sizeof(d.data);
        cbor_value_copy_byte_string(&it, d.data, &len, &it);
    }

    return v;
}

Let me repeat: this is a C API.

Sure, I ignored errors, because if this were a C++ API, it would either have
thrown or used std::expected. I also used CBOR byte strings instead of
Base64-encoded text strings, but that's because the decoding was not the point
you were trying to make.

> We have other things parsing/producing msgpack on
> the fly as well.
>
> Seriously, all structured, API style JSON should be done
> in SAX. And about your comment:

And you've seen other people disagree with you and think StAX is better. I'm
one of those.

Zhihao Yuan

Mar 6, 2017, 7:27:37 PM
to std-pr...@isocpp.org
On Mon, Mar 6, 2017 at 5:35 PM, Thiago Macieira <thi...@macieira.org> wrote:
>
> CborValue array;
> [...]
> CborValue it;
>

Quoting:

"You need that one byte and you need to hold a value of different
types (union, std::variant, ...). And always switch on the type in the
consumer. This is unneeded overhead for a lot of use-cases, this is
where you are pessimizing simple use-cases in order to make the harder
use-cases more simple. Write a JSON library using your approach and
benchmark it."

>
> And you've seen other people disagree with you and think StAX is better. I'm
> one of those.

People want unicorns in the standard. People who
are on the horses named rapidjson and taojson
already raised concerns -- that's the kindest thing
we can do.

Nevin Liber

Mar 6, 2017, 8:23:17 PM
to std-pr...@isocpp.org
On Mon, Mar 6, 2017 at 2:05 PM, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
> The biggest reason to limit the scope of the C++ libraries is that being
> in an ISO standard makes it harder to change things later. Imagine what
> Qt would look like now if it had needed to maintain compatibility with
> the API it had 10+ years ago.

Ultimately, do you want this list to decide it shouldn't be in the standard, or do you want the committee to decide whether or not it should be in the standard? If you want the committee to decide, you have to write a proposal.

I'm certainly interested in seeing a proposal for it, and I've seen at least one other committee member on this list mention interest as well. Of course, this is just encouragement and not an endorsement, and there is never any guarantee on what the committee will do.
-- 
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Thiago Macieira

Mar 7, 2017, 1:36:56 AM
to std-pr...@isocpp.org
On Tuesday, March 7, 2017, at 01:27:34 CET, Zhihao Yuan wrote:
> On Mon, Mar 6, 2017 at 5:35 PM, Thiago Macieira <thi...@macieira.org> wrote:
> > CborValue array;
> >
> > [...]
> >
> > CborValue it;
>
> Quoting:
>
> "You need that one byte and you need to hold a value of different
> types (union, std::variant, ...). And always switch on the type in the
> consumer. This is unneeded overhead for a lot of use-cases, this is
> where you are pessimizing simple use-cases in order to make the harder
> use-cases more simple. Write a JSON library using your approach and
> benchmark it."

First of all, the type information is there anyway. The only difference is
whether it is exposed to the user in the API or whether it's hidden. In the
SAX case, because of the push-style API, it's hidden and the parser does the
switching for you and calls your function.

Second, you must provide something like an iterator anyway if you want to
provide some type properties to the visited function. Properties like array
size, string length, etc. That is required if you want to implement string
zero-copy.

Third, I have written a library (albeit a CBOR one) and it's extremely fast. The
advantage of using CBOR instead of JSON in this case is the ability to
benchmark *only* the library, since there's no malloc required to create
indeterminism. I'd like someone to implement a CBOR SAX-parser so we can
compare and contrast -- I don't think I am qualified, since I'd write it on top
of my StAX parser and an abstraction API can't be faster than the thing it
abstracts.

Finally, did you see a switch in my code?

> > And you've seen other people disagree with you and think StAX is better.
> > I'm one of those.
>
> People want unicorns in the standard. People who
> are on the horses named rapidjson and taojson
> already raised concerns -- that's the kindest thing
> we can do.

Sure.

Anyway, it looks to me that the proposal will need to provide both solutions,
so everyone is pleased.

Mark my words: the SAX parser will be implemented on top of the StAX one.

Magnus Fromreide

Mar 7, 2017, 3:42:18 AM
to std-pr...@isocpp.org
On Tue, Mar 07, 2017 at 07:36:42AM +0100, Thiago Macieira wrote:
> On Tuesday, March 7, 2017, at 01:27:34 CET, Zhihao Yuan wrote:
> > On Mon, Mar 6, 2017 at 5:35 PM, Thiago Macieira <thi...@macieira.org> wrote:
> > > CborValue array;
> > >
>
> Anyway, it looks to me that the proposal will need to provide both solutions,
> so everyone is pleased.
>
> Mark my words: the SAX parser will be implemented on top of the StAX one.

One common use of JSON is for web services, where the usage pattern - as far as
I understand it - is that you send an initial state followed by an unbounded
number of updates when some server event occurs. This implies that the data
won't be a complete JSON document when you start parsing it, since that part
has yet to be sent to you.

How does the StAX parser handle that case?

Thiago Macieira

Mar 7, 2017, 4:11:46 PM
to std-pr...@isocpp.org
On Tuesday, March 7, 2017, at 09:42:13 CET, Magnus Fromreide wrote:
> One common use of JSON is for web services were the usage pattern - as far
> as I understand it - is that you send an initial state followed by an
> unbounded number of updates when some server event occurs. This implies
> that the data won't be a complete JSON struct when you start parsing it
> since that part have yet to be sent to you.
>
> How does the StAX parser handle that case?

It will just tell you "unexpected EOF" and allow you to supply more data
before retrying.

Sure, that means you need to save state to return to where you were. But I
don't think this is any more difficult than resuming the state in a SAX parser:
if I have to exit my function in order to obtain more data, then I need to
save the parser and its state somewhere.

Matthew Woehlke

Mar 8, 2017, 7:02:06 PM
to std-pr...@isocpp.org
On 2017-03-06 18:21, Zhihao Yuan wrote:
> On Mon, Mar 6, 2017 at 3:47 PM, Matthew Woehlke wrote:
>> ...but *why*? Because DOM-like is too slow/heavy?
>
> Yes. Also, SAX defines the language, so no extra Schema validation is
> needed (you can add it of course).

I'm not sure what you mean by "defines the language". (And you can do
DOM without schema validation also.)

>> StAX-like can also be fast [...]
>
> Prove it.

Prove it isn't. Trying to compare SAX to StAX is going to be heavily
dependent on the quality of implementation of either, so I'm not sure
how you'd set up a fair comparison.

Compared to DOM, however, it should be obvious that both are faster:
they are both streaming processors with limited memory overhead
(compared to DOM which parses an entire document at once and holds the
whole thing in memory). SAX still needs *some* state (internal to the
library, that is), unless it pushes the problem of grammar validation to
the consumer (yuck). StAX doesn't need much more. (Both need at least a
minimal stack.)

I could also implement StAX on top of SAX without much overhead, though
this almost certainly is not the most efficient implementation.

>> Compare your above example to:
>>
>> try {
>>     s.begin_array();
>>     while (!s.at_end()) {
>>         handle(s.get_string());
>>     }
>>     s.pop();
>> } catch (...) { ... }
>
> What happens when the next element in the array
> is not a string? Throw an exception? Please don't.

That depends on the implementation. Probably you just ignore stuff that
isn't as expected (that's how I envision the above working, anyway), but
you can have implementations that throw, or do speculative reads, or...

p.s. I liked this answer [edited for grammar] from
http://stackoverflow.com/questions/7521803:

"I guess the only time I think of preferring SAX over StAX is in cases
when you don't need to handle/process XML content; e.g. the only thing
you want to do is check for well-formedness of incoming XML and just
want to handle errors if it has. [...] basically StAX is definitely the
preferable choice in scenarios where you want to handle content because
SAX content handlers are too difficult to code..."

...which essentially is exactly what I said earlier; SAX is fine when
you don't care about the structure of the data (beyond maybe indentation
level), but StAX is much easier to use otherwise.

--
Matthew

Matthew Woehlke

Mar 8, 2017, 7:03:13 PM
to std-pr...@isocpp.org
On 2017-03-07 01:36, Thiago Macieira wrote:
> On Tuesday, March 7, 2017, at 01:27:34 CET, Zhihao Yuan wrote:
>> People want unicorns in the standard. People who are on the horses
>> named rapidjson and taojson already raised concerns -- that's the
>> kindest thing we can do.

People want something that is easy to use. SAX may be slightly faster,
but it is much, MUCH harder to use in most cases.

SAX is like a stock F1 car: crazy fast, but the controls are complicated
and esoteric (12-step start procedure, anyone?). StAX is like an F1 car
modified for use by normal people; maybe you lose a little (not a lot)
in the performance, but you get familiar controls and amenities.

> Mark my words: the SAX parser will be implemented on top of the StAX one.

I wouldn't put money on that ;-), although I believe it's a plausible
implementation.

The "good" (note the quotes) thing about SAX is that it is very close to
a likely parser implementation.

--
Matthew

Zhihao Yuan

Mar 8, 2017, 7:38:28 PM
to std-pr...@isocpp.org
On Wed, Mar 8, 2017 at 6:02 PM, Matthew Woehlke
<mwoehlk...@gmail.com> wrote:
>
>> Yes. Also, SAX defines the language, so no extra Schema validation is
>> needed (you can add it of course).
>
> I'm not sure what you mean by "defines the language". (And you can do
> DOM without schema validation also.)

"Language" as a term in the theory of computation: every
instance of a SAX parser defines a language that is
a subset of JSON. A JSON input that is not a member
of this language can't even be parsed, so parsing can
terminate early without schema validation. In contrast,
if no schema validation is performed before using a DOM
parser, then since every valid JSON representation has a
valid corresponding DOM, your service can waste lots
of resources building DOMs that are logically illegal for
your API.

Victor Dyachenko

Mar 9, 2017, 3:30:52 AM
to ISO C++ Standard - Future Proposals
Have plain sequential code - use stream API (StAX).
Have async non-blocking code where you from time to time lose the control (inversion of control) and process a chunk at once - use event-based API (SAX).
Need random access and have no schema - tree API (DOM).

There is no one-size-fits-all solution here. We need all three.

Matthew Woehlke

Mar 9, 2017, 10:33:07 AM
to std-pr...@isocpp.org
On 2017-03-08 19:38, Zhihao Yuan wrote:
> On Wed, Mar 8, 2017 at 6:02 PM, Matthew Woehlke wrote:
>> On 2017-03-06 18:21, Zhihao Yuan wrote:
>>> Yes. Also, SAX defines the language, so no extra Schema validation is
>>> needed (you can add it of course).
>>
>> I'm not sure what you mean by "defines the language". (And you can do
>> DOM without schema validation also.)
>
> "Language" as a term in theory of computation, every instance of an
> SAX parser defines a language that is a subset of JSON, a JSON input
> that is not a member of this language can't even be parsed; parsing
> can terminate early without schema validation. In contrast, if no
> schema validation is performed before using DOM parser, since every
> valid JSON representation has a valid corresponding DOM, your service
> can waste lots of resource building DOMs that are logically illegal
> to your API.

Ah, so it isn't "SAX", it's "a particular implementation of a particular
parser *using* SAX". Anyway, this is equally true of StAX...

--
Matthew

Matthew Woehlke

Mar 9, 2017, 10:40:54 AM
to std-pr...@isocpp.org
On 2017-03-09 03:30, Victor Dyachenko wrote:
> Have async non-blocking code where you from time to time lose the control
> (inversion of control) and process a chunk at once - use event-based API
> (SAX).

...or co-routines, or a blocking StAX API that can be fed from another
thread.

--
Matthew

Victor Dyachenko

Mar 9, 2017, 10:47:46 AM
to ISO C++ Standard - Future Proposals

Sure. But I hope you don't mean "SAX is unnecessary". SAX can still be the simpler solution - it doesn't require a stack of its own, which threads or co-routines or fibers allot us.

Thiago Macieira

Mar 9, 2017, 11:01:17 AM
to std-pr...@isocpp.org
On Thursday, March 9, 2017, 16:47:45 CET, Victor Dyachenko wrote:
> Sure. But I hope you don't mean "SAX is unnecessary". SAX still can be
> simpler solution - it doesn't require own stack which threads or
> co-routines or fibers allot us.

It requires memory allocation to keep the state while you unwind the parser
back to the function that can obtain more data.

Matthew Woehlke

Mar 9, 2017, 3:17:38 PM
to std-pr...@isocpp.org
On 2017-03-04 07:23, Bjorn Reese wrote:
> Your suggestion is going to introduce many concepts. The bulk of your
> suggestion is to introduce a dynamic variable (given your emphasis on
> JSON, I assume that you want to model the JavaScript dynamic variable.)
>
> Dynamic variables introduce data structures such as heterogeneous
> arrays, heterogeneous maps, and hierarchical containers (nested
> heterogeneous arrays/maps).
>
> The heterogeneous nature of dynamic variables means that illegal
> operations (e.g. appending a string to a bool) have to be handled at
> run-time instead of at compile-time. Should traversal be done with
> iterators or visitors? How does it integrate with std algorithms?

I implemented a (DOM-style) JSON library for Qt4. It had the following
type aliases:

JsonValue -> QVariant
JsonObject -> QVariantMap -> QMap<QString, QVariant>
JsonArray -> QVariantArray -> QList<QVariant>

The STL equivalent would be:

JsonValue -> variant<JsonObject, JsonArray, ...¹>
JsonObject -> map<string, JsonValue>
JsonArray -> vector<JsonValue>

Qt5 used "strong" types, but the functionality remains essentially the same.
(Personally, I am in favor of sticking to the basic container types
rather than implementing 'new' types with the same API, but either way
works...)

The only illegal operations handled at run-time are bad casts. If you
want to "add" two values, you write casts of the values to concrete
types that can be "added" (either mathematical addition or
concatenation, depending if they are numbers or strings), and the
application of the operator is still type safe (but the casts will fail
at run-time if the values aren't the expected type(s)). Traversal and
such is solved the same as for the generic types (map/vector/variant).

(¹ I believe the rest of the list is <string, double, bool, nullptr_t>,
but one can make an argument for additional numeric types. One of the
niceties of QVariant is it also supports conversion between numeric
types; you can store a value as whichever of double/int64_t/uint64_t
best preserves the input, and get it back, possibly via a lossy
conversion, as any numeric type. For this and other reasons², a
json_value class that is a true class and not just a variant<...> might
make sense.)

(² In particular, it makes a good place to hang I/O operations; the DOM
API can pretty well be just member functions on json_value.)

> The hierarchical nature of dynamic variables means that they can act
> as trees, which immediately raises the question why we have no tree
> structures in C++ despite proposals like N3700?

I doubt you want a "true" tree structure, because you can freely mix
numbers/strings, objects and arrays, which makes for a strange looking
"tree". I think the above approach that recursively references a 'value'
which may be an object or array makes more sense.

> In summary, I believe that your suggestion is going to be larger than
> you may anticipate now, so I recommend that it should be split into two:
>
> 1. A proposal for dynamic variables, which should be completely
> independent on JSON (albeit JSON is going to figure prominently in
> the motivation section.)

Please explain why you can't just use `variant`.

--
Matthew

vin...@gmail.com

May 21, 2017, 8:19:34 AM
to ISO C++ Standard - Future Proposals
It seems that this discussion has stalled, after diving into technical details...

Question for Niels Lohmann and Daniel Frey: what's your opinion after all these comments - are you still interested in producing a proposal for a standard C++ JSON library?


Niels Lohmann

May 21, 2017, 9:01:03 AM
to std-pr...@isocpp.org
Hi there,

> It seems that this discussion has stalled, after diving into technical details...
>
> Question for Niels Lohmann and Daniel Frey: what's your opinion after all these comments, are you yet interested in producing proposal for standard C++ JSON library?

there is a repository for this now: https://github.com/nlohmann/std_json. I haven't been able to spend much time on this so far, but I hope this will change. :-)

All the best,
Niels

zamaz...@gmail.com

Jul 3, 2017, 11:57:18 PM
to ISO C++ Standard - Future Proposals
Hello.

I want to work in the same direction. I think you know about Boost.Property_tree. It's a universal library for working with JSON, XML, INI and other formats, and I want to see something like it in C++. What do you think about the idea?


Reason: Property_tree is a universal representation for data, and a universal structure for creating parsers for JSON, XML, INI, etc. I am not sure that such a universal thing can work perfectly with each format (but why not?). And if we add to the Standard something really general, with parsers for the most widely used formats, it will be great!


Best regards,

Alexander Zaitsev


On Thursday, March 2, 2017, 0:25:33 UTC+3, Niels Lohmann wrote:
> [...]

Domen Vrankar

Jul 4, 2017, 5:08:43 AM
to std-pr...@isocpp.org
2017-07-04 5:57 GMT+02:00 <zamaz...@gmail.com>:
> Hello.
>
> I want to work in the same direction. I think you know about Boost.Property_tree. It's a universal library for working with JSON, XML, INI and other formats, and I want to see something like it in C++. What do you think about the idea?
>
> Reason: Property_tree is a universal representation for data, and a universal structure for creating parsers for JSON, XML, INI, etc.

In that case I'd expect something like std::tree (or the Boost Graph Library? I've never used it, but it sounds like something that would be useful for traversing the structure) to be standardized, and I'm guessing there's a reason it hasn't been done: std::filesystem was accepted, and even though a filesystem is a tree structure (it could be viewed like an XML DOM or a JSON hierarchy) and std::filesystem::path could be viewed as similar to XPath in XML, we still don't have anything like std::tree or tree iterators.

Regards,
Domen

Nicol Bolas

Jul 4, 2017, 10:04:37 AM
to ISO C++ Standard - Future Proposals


On Monday, July 3, 2017 at 11:57:18 PM UTC-4, Alexander Zaitsev wrote:
> Hello.
>
> I want to work in the same direction. I think you know about Boost.Property_tree. It's a universal library for working with JSON, XML, INI and other formats.


NO! No, it isn't!

Property Trees are for working with property trees. It is not some "universal library" for handling JSON, XML and such. Its JSON and XML reading abilities are tailored specifically to JSON and XML that it wrote itself.

That library is not meant to consume arbitrary JSON or XML.

pb.c...@gmail.com

Jul 16, 2017, 5:57:13 PM
to ISO C++ Standard - Future Proposals

Neal Meyer

Jun 25, 2018, 10:45:43 AM
to std-pr...@isocpp.org
Niels,

I've been looking into your library on GitHub that is in response to this discussion. JSON handling is something I feel we desperately need to move forward on in the next C++ revisions. Have you started the proposal process with your libraries yet? I could not find any papers in the isocpp mailings.

I'm just getting back to being involved with the committee after a few years off.

-Neal Meyer


On Wed, Mar 1, 2017 at 1:25 PM Niels Lohmann <ma...@nlohmann.me> wrote:
> [...]

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/3DF8137F-C90E-4D25-96AB-FF8B766418DF%40nlohmann.me.
--
-Neal

Jake Arkinstall

Jun 25, 2018, 11:38:15 AM
to std-pr...@isocpp.org
I have been using this library for quite some time now, as part of a server application that communicates in JSON with client processes. It works beautifully.

In terms of bringing it into the standard, I do have one reservation: due to the nature of the task at hand - mainly that the scope of "what do people want to use JSON for" is rather wide - it is under continuous improvement. It's averaging more than a commit per day over the last month. I can't see it as a project that will happily fit in with the much slower and more granular update process that it would be under if brought into the STL. I expect that many users will be using a more up to date version of the library from GitHub rather than stick to the version standardised e.g. 3 years prior.

If Niels thinks otherwise, then I withdraw that reservation; he knows what he plans to do with the project better than anyone.

I'm also not sure why standardizing it is desperately needed, unless there are plans to incorporate it into future library components.

Gašper Ažman

Jun 25, 2018, 11:39:34 AM
to std-pr...@isocpp.org
I'd like to add that keeping a "requirements" list is a necessary thing for this process, because I'd like to throw in some requirements:

- allocator aware
- need to specify concepts as well as the concrete library

The reason for the concepts is that I can totally see a json-overlay type implementation that uses string_view to access an in-memory version on the actual memory that's been read.




Nicol Bolas

Jun 25, 2018, 11:58:06 AM
to ISO C++ Standard - Future Proposals
On Monday, June 25, 2018 at 11:38:15 AM UTC-4, Jake Arkinstall wrote:
> I have been using this library for quite some time now, as part of a server application that communicates in JSON with client processes. It works beautifully.
>
> In terms of bringing it into the standard, I do have one reservation: due to the nature of the task at hand - mainly that the scope of "what do people want to use JSON for" is rather wide - it is under continuous improvement. It's averaging more than a commit per day over the last month. I can't see it as a project that will happily fit in with the much slower and more granular update process that it would be under if brought into the STL. I expect that many users will be using a more up-to-date version of the library from GitHub rather than stick to the version standardised e.g. 3 years prior.

JSON itself is quite stable. Are the daily updates to this library actually meaningful to its interface? And if so, why is the interface undergoing daily revisions?

j c

Jun 25, 2018, 12:30:06 PM
to std-pr...@isocpp.org
On Mon, Jun 25, 2018 at 4:38 PM, Jake Arkinstall <jake.ar...@gmail.com> wrote:
> I have been using this library for quite some time now, as part of a server application that communicates in JSON with client processes. It works beautifully.

Have you compared it to competing JSON libraries such as rapidjson?
 

Jake Arkinstall

Jun 25, 2018, 2:03:11 PM
to std-pr...@isocpp.org
On Mon, 25 Jun 2018, 16:58 Nicol Bolas, <jmck...@gmail.com> wrote:
JSON itself is quite stable. Are the daily updates to this library actually meaningful to its interface? And if so, why is the interface undergoing daily revisions?

It does appear that the majority of changes are more to do with the accessories, such as build configs.

However, every now and then there are upgrades that would not have been so easy to make if subject to the proposal process - from the removal of stringstreams and accessing keys as const references rather than copying values, all the way up to the introduction of a SAX parser a few months ago.

These kinds of changes are important for the development of the framework, but demonstrate that it is well suited to its current development process. Maybe a formal committee process would get everything ironed out in one fell swoop, but I'm not convinced that this is the ideal approach. I'd also love to see how the source would change with C++20 proposals applied; as far as I'm aware it is currently written against C++11, and I can't imagine that would remain the case if it were standardised.

There is also an ongoing discussion about casting ( https://github.com/nlohmann/json/issues/958 ) which needs resolving, and I'm under the impression that edge cases sometimes arise when people want to make the library work with their custom types (I have seen various discussions on this, but for the life of me I can't find any that I had in mind when I wrote that the nature of the project lends itself to continual changes. Perhaps I am imagining things.)


On Mon, 25 Jun 2018, 17:30 j c, <james.a...@gmail.com> wrote:
On Mon, Jun 25, 2018 at 4:38 PM, Jake Arkinstall <jake.ar...@gmail.com> wrote:
I have been using this library for quite some time now, as part of a server application that communicates in JSON with client processes. It works beautifully.

Have you compared it to competing json libraries such as rapidjson? 

I tried out quite a few, but nlohmann::json definitely stood out as the most comfortable to use, the most familiar (i.e. it wouldn't feel out of place in the standard library), and very flexible - in particular, it works great with custom types. It fit my needs perfectly.

There are implementations that are faster (RapidJSON claimed to be when I tried it out - it wasn't pretty to write with, though; it's so far removed from standard-library styles and approaches that it feels more like writing pre-generics Java). There are some that are lighter on memory, some that compile faster, and some that have smaller file sizes. I think it'd be quite hard to settle on a single implementation to suit the everyday user.

Dejan Milosavljevic

unread,
Jun 26, 2018, 5:54:16 AM6/26/18
to std-pr...@isocpp.org
Why not also: XML, INI, csv, ASN.1, ... ?
s.replace(s.find("JSON"), sizeof("JSON") - 1, "XML");


Gašper Ažman

unread,
Jun 26, 2018, 5:56:43 AM6/26/18
to std-pr...@isocpp.org
AFAIK those have been on the wishlist for a long time, but nobody's yet offered to spec them.

XML is hard because of the whole schema thing.

CSV is hard because it does not have a spec (even the Python library is a mess).

All of the above also get easier once we have actual Unicode support.

Richard Hodges

unread,
Jun 26, 2018, 7:07:57 AM6/26/18
to std-pr...@isocpp.org
> Why not also: XML, INI, csv, ASN.1, ... ?

XML - because we'll soon be celebrating its death and dancing on its grave.

INI - good idea, it's useful in all popular OSs, and really easy to read and use. It has a long future.

CSV - another great idea!

ASN.1 - a little more esoteric, but nonetheless useful.



Matthew Woehlke

unread,
Jun 26, 2018, 10:02:52 AM6/26/18
to std-pr...@isocpp.org, Dejan Milosavljevic
On 2018-06-26 05:54, Dejan Milosavljevic wrote:
> Why not also: XML, INI, csv, ASN.1, ... ?

XML might be a reasonable candidate, though it has even more problems than JSON (e.g. DOM or SAX?). As for INI and CSV, neither is a well-specified format; every application tends to parse these a little bit differently.

--
Matthew

Richard Hodges

unread,
Jun 26, 2018, 10:25:52 AM6/26/18
to std-pr...@isocpp.org
As for INI and CSV, neither is a well specified format; every application tends to parse these a little bit differently.

This is also an issue that affects std::regex, but it has been overcome (to a degree) with parameterisation at the point where the regex is compiled.

A file parser's idiosyncrasies could be configured with a traits type.




Thiago Macieira

unread,
Jun 26, 2018, 10:44:05 AM6/26/18
to std-pr...@isocpp.org
On Tuesday, 26 June 2018 04:07:43 PDT Richard Hodges wrote:
> INI - good idea, it's useful in in all popular OSs, and really easy to read
> and use. It has a long future.

Except for having a plethora of slightly incompatible extension formats when
it comes to more than one grouping level.

Thiago Macieira

unread,
Jun 26, 2018, 10:46:00 AM6/26/18
to std-pr...@isocpp.org
On Tuesday, 26 June 2018 07:25:38 PDT Richard Hodges wrote:
> A file parser's idiosyncrasies could be configured with a traits type.

That is not a good idea, as that would imply the parser for a complex file
format is entirely inline. No, if there's anything configurable, it ought to
be at runtime.

Richard Hodges

unread,
Jun 26, 2018, 10:55:40 AM6/26/18
to std-pr...@isocpp.org
On Tue, 26 Jun 2018 at 16:46, Thiago Macieira <thi...@macieira.org> wrote:
On Tuesday, 26 June 2018 07:25:38 PDT Richard Hodges wrote:
> A file parser's idiosyncrasies could be configured with a traits type.

That is not a good idea, as that would imply the parser for a complex file
format is entirely inline. No, if there's anything configurable, it ought to
be at runtime.

That's one way. Another might be that the parser is actually a 'view' attached to a stream, range, or polymorphic (possibly even coroutine-enabled) character source.
I think that this would allow customisation without being limited by enums/existing traits etc.

 

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center




Victor Dyachenko

unread,
Jun 27, 2018, 2:08:40 AM6/27/18
to ISO C++ Standard - Future Proposals, dmi...@gmail.com
On Tuesday, June 26, 2018 at 5:02:52 PM UTC+3, Matthew Woehlke wrote:
On 2018-06-26 05:54, Dejan Milosavljevic wrote:
> Why not also: XML, INI, csv, ASN.1, ... ?

XML might be a reasonable candidate, though it has even more problems
than JSON (e.g. DOM or SAX?).

Don't we have the same choice for JSON: document-tree vs stream processing? What is the difference in this respect?

j c

unread,
Jun 27, 2018, 4:57:33 AM6/27/18
to std-pr...@isocpp.org, dmi...@gmail.com
There is no choice: any parser would need to support both. Otherwise people will just revert to third-party parsers for anything other than trivial use cases.

Victor Dyachenko

unread,
Jun 27, 2018, 5:09:10 AM6/27/18
to ISO C++ Standard - Future Proposals, dmi...@gmail.com
So don't forget to add:

1) "pull" vs "push" parsing control
2) Destructive (AKA in situ) parsing support
3) JSONPath (???)
4) What else?

I think trying to cover everything is a crazy idea. Any complex task has more than one solution, so providing std:: facilities can't eliminate third-party solutions.

 

j c

unread,
Jun 27, 2018, 5:27:15 AM6/27/18
to std-pr...@isocpp.org, dmi...@gmail.com
Given the popularity of JSON in all things HTTP-related, it really would need to support stream processing if C++ is to try to break into these domains. In-situ parsing too, if C++ is serious about being a tool for writing performance-oriented applications.

Is JSONPath part of RFC 7159? If not, then there's no pressing need to support it, IMO.
