[boost] [review][JSON] Review of JSON starts today: Sept 14 - Sept 23


Pranam Lashkari via Boost

Sep 14, 2020, 3:30:15 AM
to boost, Pranam Lashkari
Boost formal review of Vinnie Falco and Krystian Stasiowski's library JSON
starts today and will run for 10 days, ending on 23 Sept 2020. Both authors
have already developed libraries that have been accepted into Boost
(Boost.Beast and Static String).

This library focuses on a common and popular use-case for JSON. It provides
a container to hold parsed and serialised JSON types, and offers more
flexibility and better benchmark performance than its competitors.

JSON highlights the following features in the documentation:

- Fast compilation
- Requires only C++11
- Fast streaming parser and serializer
- Easy and safe API with allocator support
- Constant-time key lookup for objects
- Options to allow non-standard JSON
- Compiles without Boost when BOOST_JSON_STANDALONE is defined
- Optionally header-only, without linking to a library

(a point I would like to add as a highlight: it has a cool Jason logo 😝)


To quickly understand and get the flavour of the library, take a look at
"Quick Look":
<http://master.json.cpp.al/json/usage/quick_look.html>
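
In essence (a minimal usage sketch based on that page; see the linked docs
for the exact interface):

#include <boost/json.hpp>
#include <iostream>

int main()
{
    // Parse text into the library's DOM-like value container.
    boost::json::value jv = boost::json::parse(R"({"pi": 3.141, "happy": true})");

    // Inspect it.
    std::cout << jv.as_object().at("pi").as_double() << '\n';

    // Serialize it back to text.
    std::cout << boost::json::serialize(jv) << '\n';
}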

You can find the source code to be reviewed here:
<https://github.com/CPPAlliance/json/tree/master>

You can find the latest documentation here:
<http://master.json.cpp.al/>

Benchmarks are also given in the documentation and can be found here:
<http://master.json.cpp.al/json/benchmarks.html>

Some people have already given early reviews; the thread can be found here:
<https://lists.boost.org/Archives/boost/2020/09/249745.php>

Please provide in your review any information you think is valuable for
understanding your choice to ACCEPT or REJECT the inclusion of JSON as a
Boost library. Please be explicit about your decision (ACCEPT or REJECT).

Some other questions you might want to consider answering:

- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With which compiler(s)? Did you have
any problems?
- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
- Are you knowledgeable about the problem domain?

More information about the Boost Formal Review Process can be found
here: <http://www.boost.org/community/reviews.html>

Thank you for your effort in the Boost community.

--
Thank you,
Pranam Lashkari, https://lpranam.github.io/


Richard Hodges via Boost

Sep 14, 2020, 6:32:27 AM
to boost@lists.boost.org List, Richard Hodges
Because of my direct involvement in the C++ Alliance I feel it would be
wrong for me to provide a review that leads to an accept/reject conclusion.

However, I have some experience of integrating this library into a private
project and I felt it might be valuable to share my experiences.

- What is your evaluation of the design?

My personal opinion is that the design is sane and well-reasoned. Any areas
with which I have previously taken issue were raised with the authors and my
concerns were addressed. Some effort was made to explore the effect of ideas
I presented, and the outcomes were measured. My opinion is that the final
design is largely data-driven.

- What is your evaluation of the implementation?

I have found no faults in the implementation during use. One slightly
off-putting point is that the default text representation of parsed integers
that are exact powers of 10 is scientific notation. Unusual as it seems,
however, this is strictly conformant with the JSON standard.

- What is your evaluation of the documentation?

The documentation is clear and succinct; the fact that it takes steps to
elucidate the rationale behind design decisions ought to head off a number
of "Wait! Why?" questions.

- What is your evaluation of the potential usefulness of the library?

The library has already proven useful to me.

For me personally, the ability to map the parser directly to C++ objects
without going through the intermediate json::value data structure would
offer a minor improvement in performance.

I have started exploring the building of such a parse handler, which I
intend to offer for the examples section at some future date, assuming I
have the time to finish it. Notwithstanding that, the fact that I can supply
a custom arena-style memory resource to the parser/value largely offsets
this concern in practice. Essentially, by avoiding the building of the DOM I
can avoid one memory allocation and some redundant copies. In practice,
neither the memory allocation nor the memory copies have proven measurably
expensive in my uses of the library.
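
For illustration, a minimal sketch of what that looks like (assuming the
monotonic_resource type described in the documentation; exact signatures per
the docs):

#include <boost/json.hpp>
#include <iostream>

int main()
{
    // Allocations made while building the parsed value come from this
    // arena and are released together when it goes out of scope.
    boost::json::monotonic_resource arena;

    boost::json::value jv =
        boost::json::parse(R"({"bids":[["9000.1","0.5"]]})", &arena);

    std::cout << jv.as_object().at("bids").as_array().size() << '\n';
}   // jv is destroyed before the arena, so nothing dangles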

Whether this ultimately belongs in the JSON library or should be a
dependent library is not for me to say.

It is worth noting that the separation of concerns between parser and
handler is helpful in that it makes this work possible without having to
rewrite any parsing code.

- Did you try to use the library? With which compiler(s)? Did you have
any problems?

I have used the library with GCC 9&10, and Clang 9&10. Standards selected
were C++17 and C++20. I chose the boost-dependent (default) option rather
than standalone because I was also using the boost libraries Asio, Beast,
Program Options and System.

- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

I have written an application that uses the library: a cryptocurrency
market-making bot that interfaces with the Deribit websocket/JSON API.

- Are you knowledgeable about the problem domain?

Yes. In a previous market data distribution engine I used Nlohmann JSON
(high level but slow), RapidJSON (low level but fast) and JSMN (super low
level and blindingly fast but no DOM representation, only provides indexes
into data).

Regards,

R


--
Richard Hodges
hodg...@gmail.com
office: +442032898513
home: +376841522
mobile: +376380212

Vinnie Falco via Boost

Sep 14, 2020, 10:31:59 AM
to boost@lists.boost.org List, Vinnie Falco
On Mon, Sep 14, 2020 at 12:30 AM Pranam Lashkari via Boost <
bo...@lists.boost.org> wrote:

> <https://github.com/CPPAlliance/json
> <https://github.com/CPPAlliance/json/tree/master>>
>

Don't forget to STAR the repository, and thanks!

Pranam Lashkari via Boost

Sep 14, 2020, 10:13:18 PM
to Vinnie Falco, Pranam Lashkari, boost@lists.boost.org List
On Mon, Sep 14, 2020 at 8:01 PM Vinnie Falco <vinnie...@gmail.com> wrote:

>
> Don't forget to STAR the repository, and thanks!
>

Yes, please star this amazing library; it may help the library reach more
people.


--
Thank you,
Pranam Lashkari, https://lpranam.github.io/

Rainer Deyke via Boost

Sep 15, 2020, 10:14:30 AM
to bo...@lists.boost.org, Rainer Deyke
On 14.09.20 09:30, Pranam Lashkari via Boost wrote:
> Please provide in your review information you think is valuable to
> understand your choice to ACCEPT or REJECT including JSON as a
> Boost library. Please be explicit about your decision (ACCEPT or REJECT).
>
> - What is your evaluation of the design?

Most of it seems fine, but I do have some issues:

- The choice of [u]int64_t seems arbitrary and restrictive. It means
that Boost.JSON will not use a 128 bit integer even where one is
available, and it means that it cannot compile at all on implementations
that don't provide int64_t. It's good enough for my purposes, but I
would like to see some discussion about this. The json spec allows
arbitrarily large integers.

- On a similar note, the use of double restricts floating point
accuracy even when a higher-precision type is available. The json spec
allows arbitrarily precise decimal values.

- boost::json::value_to provides a single, clean way to extract
values from json, but it also renders other parts of the library (e.g.
number_cast) redundant except as an implementation detail.

- boost::json::value_to provides a single, clean way to extract
values from json, but it's syntactically long. The same functionality
in a member function of boost::json::value would be nicer to use.

- The omission of binary serialization formats (CBOR et al) bothers
me. Not from a theoretical point of view, but because I have actual
code that uses CBOR, and I won't be able to convert this code to
Boost.JSON unless CBOR support is provided. (Or I could write my own,
but that rather defeats the point of using a library in the first place,
especially if Niels Lohmann's library already provides CBOR support.)


> - What is your evaluation of the implementation?

I didn't look at it.

> - What is your evaluation of the documentation?

boost::json::value_to provides a single, clean way to extract values
from json, but it's actually rather hard to find in the documentation.
I was looking for a way to extract a std::string from a
boost::json::value, so I looked at the documentation for
boost::json::value and found as_string. OK, that returns
boost::json::string, which is not implicitly convertible to std::string
(in C++14). But it has an implicit conversion to string_view, which is
an alias of boost::string_view, which doesn't appear to be documented
anywhere but which (looking at the source code) has a member function
to_string. So I ended up with this code:

boost::json::value v = "Hello world.";
std::string s =
    static_cast<boost::json::string_view>(v.as_string()).to_string();

...which I think we can all agree is an abomination. /Then/ I found out
about boost::json::value_to, and replaced my code with this:

std::string s = boost::json::value_to<std::string>(v);

Which is definitely nicer, though still not as nice as Niels Lohmann's code:

std::string s = v.get<std::string>();

boost::json::value_to or its member-function replacement should
definitely be front and center in the documentation.

> - What is your evaluation of the potential usefulness of the library?

A good JSON library is very useful for a large number of programs. The
question isn't whether Boost should have a JSON library, but whether this is
the JSON library that Boost should have.

> - Did you try to use the library? With which compiler(s)? Did you have
> any problems?

I converted a small but non-trivial program from Lohmann's json to the
proposed Boost.JSON, and compiled it with several different
cross-compilers. I did not encounter any problems compiling or running
the program, although I did have to add Boost.Container to the set of
linked libraries.

> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?

A few hours of work, most of which was spent converting code from
Lohmann's json library to the proposed Boost.JSON.

> - Are you knowledgeable about the problem domain?

Yes.

> Please be explicit about your decision (ACCEPT or REJECT).

For me, the ultimate question is whether I would actually use this library,
and my reluctant answer is "not in its current state". I'm basically
satisfied with Lohmann's json library, which requires less verbosity to
use and which provides CBOR support. I can see the attraction of
Boost.JSON's superior performance, and the attraction of incremental
parsing and serialization, but for my usage none of this matters. CBOR
support, on the other hand, does matter.

I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion
of code for parsing and serializing boost::json::value as CBOR. I can
live with the added verbosity of Boost.JSON (although I'd rather see it
reduced where possible), but not without CBOR.

(If CBOR support is not added, this vote should count as an abstain vote
and not as a reject. I don't think that Boost.JSON needs CBOR in order
to be useful to other people - I just don't want to vote to accept a
library that's not useful for me.)


--
Rainer Deyke (rai...@eldwood.com)

Vinnie Falco via Boost

Sep 15, 2020, 10:21:24 AM
to boost@lists.boost.org List, Vinnie Falco, Rainer Deyke
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
<bo...@lists.boost.org> wrote:
> I did have to add Boost.Container to the set of linked libraries.

Boost.Container develop branch has a fix for this, so linking with
that library will not be necessary if Boost.JSON is released with
Boost:
<https://github.com/boostorg/container/commit/0b297019ec43483f523a3270b632fecbc3ce5e63>

Thank you for your thoughtful review!

Regards

Vinnie Falco via Boost

Sep 15, 2020, 10:36:54 AM
to boost@lists.boost.org List, Vinnie Falco, Rainer Deyke
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
<bo...@lists.boost.org> wrote:
> - The choice of [u]int64_t seems arbitrary and restrictive. It means
> that Boost.JSON will not use a 128 bit integer even where one is
> available, and it means that it cannot compile at all on implementations
> that don't provide int64_t. It's good enough for my purposes, but I
> would like to see some discussion about this. The json spec allows
> arbitrarily large integers.
>
> - On a similar note, the use of double restricts floating point
> accuracy even when a higher-precision type is available. The json spec
> allows arbitrarily precise decimal values.

The spec gives implementations freedom to place arbitrary upper limits
on precision. To be useful as a vocabulary type, the library prefers
homogeneity of interface over min/maxing. In other words, more value is
placed on having the library use the same integer representation on
all platforms than on using the largest integer width available. Another
point is that the sizes of types in the library are very tightly
controlled. `sizeof(value)` is 16 bytes on 32-bit platforms and 24
bytes on 64-bit platforms, and this is for a reason: it keeps
performance high and memory consumption low. There is a direct, linear
falloff in general performance with increasing size of types.

> - boost::json::value_to provides a single, clean way to extract
> values from json, but it also renders other parts of the library (e.g.
> number_cast) redundant except as an implementation detail.

Well, number_cast isn't redundant, since it is the only interface which
offers the use of error codes rather than exceptions. We could have
gone with error codes in conversion to user-defined types, but then the
interface would need some kind of expected<> return type and things
get messy there. Exceptions are a natural error handling mechanism.
However, we recognize that in network programs there is a need to
convert numbers without using exceptions, thus number_cast is
available.

> - boost::json::value_to provides a single, clean way to extract
> values from json, but it's syntactically long. The same functionality
> in a member function of boost::json::value would be nicer to use.

Algorithms which can be implemented completely in terms of a class'
public interface are generally expressed as free functions in separate
header files. If we were to make `get` a member function of
`json::value`, then users who have no need to convert to and from
user-defined types would be unnecessarily including code they never
use.
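
For example (a rough sketch; see the documentation for the exact set of
supported conversions), the free-function form works with standard
containers:

#include <boost/json.hpp>
#include <map>
#include <string>

int main()
{
    boost::json::value jv = boost::json::parse(R"({"a":1,"b":2})");

    // Free-function conversion to a caller-chosen type.
    auto m = boost::json::value_to<std::map<std::string, int>>(jv);

    // And back again.
    boost::json::value jv2 = boost::json::value_from(m);

    (void)jv2;
}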

Thanks

Alexander Grund via Boost

Sep 15, 2020, 10:44:28 AM
to bo...@lists.boost.org, Alexander Grund

>> - boost::json::value_to provides a single, clean way to extract
>> values from json, but it's syntactically long. The same functionality
>> in a member function of boost::json::value would be nicer to use.
> Algorithms which can be implemented completely in terms of a class'
> public interface are generally expressed as free functions in separate
> header files. If we were to make `get` a member function of
> `json::value`, then users who have no need to convert to and from
> user-defined types would be unnecessarily including code they never
> use.

To add to that: you can use ADL to avoid naming the namespace:
`value_to<std::string>(json_val)`, which is not much longer than
`json_val.get<std::string>()`.
It could have been named `get_as` instead:
`get_as<std::string>(json_val)`, but one is as good as the other.


Vinnie Falco via Boost

Sep 15, 2020, 10:45:26 AM
to boost@lists.boost.org List, Vinnie Falco, Rainer Deyke
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
<bo...@lists.boost.org> wrote:
> I can see the attraction of
> Boost.JSON's superior performance, and the attraction of incremental
> parsing and serialization, but for my usage none of this matters. CBOR
> support, on the other hand does matter.

We are researching the topic. If you would like to weigh in, the issue
can be tracked here:
<https://github.com/CPPAlliance/json/issues/342>

One thing I will note, however: a Google search for "CBOR" produces
390,000 results, while a Google search for JSON produces 188,000,000
results. There is surely some margin of error in these numbers, but I
have to wonder how widespread the use of CBOR really is.

Thanks

Peter Dimov via Boost

Sep 15, 2020, 10:46:18 AM
to bo...@lists.boost.org, Peter Dimov
Rainer Deyke wrote:

> I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of
> code for parsing and serializing boost::json::value as CBOR.

I find this condition too strict. You are basically saying that you'd rather
not see the proposed Boost.JSON enter Boost until CBOR is implemented, which
may happen six months from now. So people who don't have a need for CBOR
will have to wait until Boost 1.77, which doesn't really help anyone.

I can understand requiring a firm commitment on the part of the authors to
add CBOR support, but postponing the acceptance until this is implemented...

Rainer Deyke via Boost

Sep 16, 2020, 8:54:15 AM
to bo...@lists.boost.org, Rainer Deyke
On 15.09.20 16:44, Vinnie Falco via Boost wrote:
> On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
> <bo...@lists.boost.org> wrote:
>> I can see the attraction of
>> Boost.JSON's superior performance, and the attraction of incremental
>> parsing and serialization, but for my usage none of this matters. CBOR
>> support, on the other hand does matter.
>
> We are researching the topic. If you would like to weigh in, the issue
> can be tracked here:
> <https://github.com/CPPAlliance/json/issues/342>
>
> One thing I will note, however. A Google search for "CBOR" produces
> 390,000 results while a Google search for JSON produces 188,000,000
> results. Now there is surely some margin of error in these numbers but
> I have to wonder how widespread is the use of CBOR.

That's a valid point, but I would counter that most applications that
handle large amounts of JSON would benefit from using CBOR, and it's
often lack of library support that's holding them back.


--
Rainer Deyke (rai...@eldwood.com)

Rainer Deyke via Boost

Sep 16, 2020, 8:54:29 AM
to bo...@lists.boost.org, Rainer Deyke
On 15.09.20 16:45, Peter Dimov via Boost wrote:
> Rainer Deyke wrote:
>
>> I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the
>> inclusion of code for parsing and serializing boost::json::value as CBOR.
>
> I find this condition is too strict. You basically say that you'd rather
> not see the proposed Boost.JSON enter Boost until CBOR is implemented,
> which may happen six months from now. So people who don't have a need
> for CBOR will have to wait until Boost 1.77, which doesn't really help
> anyone.

Actually I would like to see Boost.JSON in Boost, with or without CBOR.
However, I can't in good conscience vote to accept a library that I am
unwilling to use. Boost should not be a graveyard of well-designed but
unused libraries.

As I explained in my review, if my condition is not met, I would like my
vote to be counted as ABSTAIN, not REJECT. I am hoping that Boost.JSON
will get enough votes to accept to make it into Boost, with or without
CBOR support.

> I can understand requiring a firm commitment on the part of the authors
> to add CBOR support, but postponing the acceptance until this is
> implemented...

If such a commitment is made, I would also consider my condition for
acceptance met.


--
Rainer Deyke (rai...@eldwood.com)

Peter Dimov via Boost

Sep 16, 2020, 9:33:11 AM
to bo...@lists.boost.org, Peter Dimov
Rainer Deyke wrote:

> I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the inclusion of
> code for parsing and serializing boost::json::value as CBOR.

The one interesting decision that needs to be made here is how to handle
CBOR byte strings (major type 2), as they aren't representable in JSON or in
the current boost::json::value.

I see that Niels Lohmann has added a "binary" 'kind' to the json value for
this purpose. That would then invite the opposite question: what is a JSON
serializer supposed to do with kind_binary?

Niall Douglas via Boost

Sep 16, 2020, 9:33:32 AM
to bo...@lists.boost.org, Niall Douglas
On 16/09/2020 08:31, Rainer Deyke via Boost wrote:
> On 15.09.20 16:45, Peter Dimov via Boost wrote:
>> Rainer Deyke wrote:
>>
>>> I vote to CONDITIONALLY ACCEPT Boost.JSON, conditional on the
>>> inclusion of code for parsing and serializing boost::json::value as
>>> CBOR.
>>
>> I find this condition is too strict. You basically say that you'd
>> rather not see the proposed Boost.JSON enter Boost until CBOR is
>> implemented, which may happen six months from now. So people who don't
>> have a need for CBOR will have to wait until Boost 1.77, which doesn't
>> really help anyone.
>
> Actually I would like to see Boost.JSON in Boost, with or without CBOR.
> However, I can't in good conscience vote to accept a library that I am
> unwilling to use.  Boost should not be a graveyard of well-designed but
> unused libraries.
>
> As I explained in my review, if my condition is not met, I would like my
> vote to be counted as ABSTAIN, not REJECT.  I am hoping that Boost.JSON
> will get enough votes to accept to make it into Boost, with or without
> CBOR support.

If the proposed library were called Boost.Serialisation2 or something, I
would see your point.

But it's called Boost.JSON. It implements JSON. It does not implement
CBOR. I don't think it's reasonable to recommend rejection because a
library doesn't do something completely different from what it does.

Speaking more widely than this, if the format here were not specifically
JSON, I'd also be more sympathetic - binary as well as text
serialisation/deserialisation is important. But JSON is unique: most
users would not choose JSON except that they are forced to do so by
needing to talk to other stuff which mandates JSON.

At work we have this lovely very high performance custom DB based on
LLFIO. It has rocking stats. But it's exposed to clients via a REST API,
and that means everything goes via JSON. So the DB spends most of its
time fairly idle compared to what it is capable of, because JSON is so
very very slow in comparison.

If we could choose anything but JSON, we would, but the customer spec
requires an even nastier and slower text format than JSON. We expect to
win the argument to get them to "upgrade" to JSON, but anything better
than that is years away. Change is hard for big governmental orgs.

In any case, CBOR is actually a fairly lousy binary protocol. Very
inefficient compared to alternatives. But the alternatives would all
require you to design your software differently from what JSON's very
reference-count-centric design demands.

Niall

Peter Dimov via Boost

Sep 16, 2020, 9:53:50 AM
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:

> In any case, CBOR is actually a fairly lousy binary protocol. Very
> inefficient compared to alternatives.

I actually quite like CBOR. Of all the "binary JSON" protocols, it's the
best. Or the least worst, if you will. (Except for MessagePack, which I like
even more, but it's arguably not a binary JSON.)

Vinnie Falco via Boost

Sep 16, 2020, 10:11:34 AM
to boost@lists.boost.org List, Vinnie Falco, Peter Dimov
On Wed, Sep 16, 2020 at 6:33 AM Peter Dimov via Boost
<bo...@lists.boost.org> wrote:
> The one interesting decision that needs to be made here is how to handle
> CBOR byte strings (major type 3), as they aren't representable in JSON or in
> the current boost::json::value.
>
> I see that Niels Lohmann has added a "binary" 'kind' to the json value for
> this purpose. Which would then invite the opposite question, what's a JSON
> serializer supposed to do with kind_binary.

If I make a library that has a public function which accepts a
parameter of type json::value, I don't expect to see binary objects in
it nor would I want to have to write code to handle something that is
not part of JSON. And I expect that things round-trip correctly, i.e.
if I serialize a json::value and then parse it, I get back the same
result. This is clearly impossible if the json::value contains a
"binary" string.

If CBOR was just another serialization format, then I might lean
towards implementing it. But CBOR is sounding more and more like it is
Not JSON and thus out-of-scope for Boost.JSON.

Thanks

Peter Dimov via Boost

Sep 16, 2020, 10:26:18 AM
to Vinnie Falco, bo...@lists.boost.org, Peter Dimov
Vinnie Falco wrote:

> If I make a library that has a public function which accepts a parameter
> of type json::value, I don't expect to see binary objects in it nor would
> I want to have to write code to handle something that is not part of JSON.
> And I expect that things round-trip correctly, i.e. if I serialize a
> json::value and then parse it, I get back the same result. This is clearly
> impossible if the json::value contains a "binary" string.

"Binary string" is basically JSON string with UTF-8 validation turned off;
it's a common thing to want to send/receive and the binary formats are
arguably correct in offering it as a specific type. Of course this doesn't
change the fact that it's not representable in standard JSON.

Vinnie Falco via Boost

Sep 16, 2020, 10:29:00 AM
to boost@lists.boost.org List, Vinnie Falco, Rainer Deyke
On Tue, Sep 15, 2020 at 7:14 AM Rainer Deyke via Boost
<bo...@lists.boost.org> wrote:
> - The omission of binary serialization formats (CBOR et al) bothers
> me. Not from a theoretical point of view, but because I have actual
> code that uses CBOR, and I won't be able to convert this code to
> Boost.JSON unless CBOR support is provided.

I've looked at the CBOR specification and some implementations in the
wild and these points stick out:

1. CBOR supports extensions, which cannot be represented in boost::json::value
2. CBOR also supports "binary" strings, which also cannot be
represented in boost::json::value
3. If boost.json's value container could hold these things, then it
would no longer serialize to standard JSON

Therefore, it seems to me that CBOR is not just a "binary
serialization format for JSON." It is in fact a completely different
format that only strongly resembles JSON. Or perhaps you could say it
is a superset of JSON. I think the best way to support this is as
follows:

1. Fork the Boost.JSON repository, rename it to Boost.CBOR
2. Add support for binary strings to the cbor::value type
3. Add support for extensions to the cbor::value type
4. Replace the parse, parser, serialize, and serializer interfaces
with CBOR equivalents
5. Propose this library as a new Boost library, with a separate review process

Then, we would have a first-class CBOR library whose interface and
implementation are optimized specifically for CBOR. Questions such as
what happens when you serialize a cbor::value to JSON would be moot.

This could be something that Krystian might take on as author and maintainer.

Thanks

Niall Douglas via Boost

Sep 16, 2020, 10:50:00 AM
to bo...@lists.boost.org, Niall Douglas
On 16/09/2020 14:53, Peter Dimov via Boost wrote:
> Niall Douglas wrote:
>
>> In any case, CBOR is actually a fairly lousy binary protocol. Very
>> inefficient compared to alternatives.
>
> I actually quite like CBOR. Of all the "binary JSON" protocols, it's the
> best. Or the least worst, if you will. (Except for MessagePack, which I
> like even more, but it's arguably not a binary JSON.)

I know what you're saying. However, comparing **C++ implementations** of
CBOR to JSON ones does not yield much differential. For example,
simdjson will happily sustain 10 Gbit/sec of textual JSON parsing per
session, which is enough for most NICs. CBOR parsers, at least the ones
available to C++, are no better than this.

Our custom DB will push 20-25 Gb/sec, so we'd need a 250 Gbit NIC and a
zero-copy, all-binary network protocol to have the DB become the primary
bottleneck. I doubt any CBOR-like design could achieve that, ever,
because that design is fundamentally anti-performance.

CBOR's primary gain for me is exact bit-for-bit value transfer, so
floating point numbers come out exactly as you sent them. That's rarely
needed outside scientific niches though, and even then, just send the FP
number as a string encoded in hexadecimal in JSON, right?

In fact, for any format which looks better than JSON, encoding your
values as hexadecimal strings in JSON is an excellent workaround.
Hexadecimal string parsing is very, very fast on modern CPUs; you can
often add +20% to JSON-bottlenecked performance by using hexadecimal for
everything.
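
A minimal sketch of that hexadecimal round trip (plain C library hexfloat
support, nothing JSON-specific):

#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
    double original = 0.1 * 3.0;      // has no exact decimal representation

    // Encode as a hexfloat string, e.g. "0x1.3333333333334p-2"; this is
    // what would be placed inside a JSON string value.
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", original);

    // Decode on the other side; strtod accepts hexfloat input (C99/C++11).
    double restored = std::strtod(buf, nullptr);

    // Bit-for-bit identical, which is the property CBOR is praised for.
    assert(std::memcmp(&original, &restored, sizeof original) == 0);
}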

Niall

Rainer Deyke via Boost

Sep 16, 2020, 10:55:47 AM
to bo...@lists.boost.org, Rainer Deyke

I see CBOR not as a separate format, but as an encoding for JSON (with
some additional features that can safely be ignored). I use it to store
and transmit JSON data, and would not use it for anything else.

JSON data exists independently of the JSON serialization format. This
is in fact a core principle of Boost.JSON: the data representation
exists independently of the serialization functions.

> Speaking wider that this, if the format here were not specifically JSON,
> I'd also be more sympathetic - binary as well as text
> serialisation/deserialisation is important. But JSON is unique, most
> users would not choose JSON except that they are forced to do so by
> needing to talk to other stuff which mandates JSON.
>
> At work we have this lovely very high performance custom DB based on
> LLFIO. It has rocking stats. But it's exposed to clients via a REST API,
> and that means everything goes via JSON. So the DB spends most of its
> time fairly idle compared to what it is capable of, because JSON is so
> very very slow in comparison.

This is exactly the sort of problem that CBOR excels at. The server
produces JSON. The client consumes JSON. Flip a switch, and the server
produces CBOR instead. Ideally the client doesn't have to be changed at
all. One line of code changed in the server, and suddenly you have
twice the data throughput.

> In any case, CBOR is actually a fairly lousy binary protocol. Very
> inefficient compared to alternatives. But the alternatives all would
> require you to design your software differently to what JSON's very
> reference count centric design demands.

It may be lousy as a general-purpose binary protocol, but it's a fairly
good binary JSON representation. Which is why it belongs in a JSON
library if it belongs anywhere.


--
Rainer Deyke (rai...@eldwood.com)

Peter Dimov via Boost

Sep 16, 2020, 11:02:47 AM
to bo...@lists.boost.org, Peter Dimov
Vinnie Falco wrote:

> I think the best way to support this is as follows:
>
> 1. Fork the Boost.JSON repository, rename it to Boost.CBOR
> 2. Add support for binary strings to the cbor::value type
> 3. Add support for extensions to the cbor::value type

This misses the point quite thoroughly; it doesn't "support this" at all.

The point of "binary JSON" is that people already have a code base that uses
JSON for communication and - let's suppose - boost::json::value internally.
Now those people want to offer an optional, alternate wire format that is
not as wasteful as JSON, so that the other endpoint may choose to use it.
But they most definitely don't want to rewrite or duplicate all their code
to be cbor::value based, instead of, or in addition to, json::value based.

It doesn't matter that CBOR supports extensions, because they don't. And it
may not even matter that the hypothetical CBOR-to-json::value parser doesn't
support binary values, because their protocol, being originally JSON-based,
doesn't.

(There is actually a fully compatible way to support "binary values" in
json::value, but it will require some hammering out.)

Peter Dimov via Boost

Sep 16, 2020, 11:09:12 AM
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:

> I know what you're saying. However, comparing **C++ implementations** of
> CBOR to JSON ones does not yield much differential. For example, simdjson
> will happily sustain 10 Gbit of textual JSON parsing per session which is
> enough for most NICs. CBOR parsers, at least the ones available to C++,
> are no better than this.

That's not a fair comparison, because you need fewer Gb to encode the same
thing in CBOR.

Vinnie Falco via Boost

Sep 16, 2020, 11:09:13 AM
to boost@lists.boost.org List, Vinnie Falco, Peter Dimov
On Wed, Sep 16, 2020 at 8:02 AM Peter Dimov via Boost
<bo...@lists.boost.org> wrote:
> The point of "binary JSON" is that people already have a code base that uses
> JSON for communication and - let's suppose - boost::json::value internally.
> Now those people want to offer an optional, alternate wire format that is
> not as wasteful as JSON, so that the other endpoint may choose to use it.

So what is being discussed here is "partial CBOR support?" In other
words, only the subset of CBOR that perfectly overlaps with JSON?

> (There is actually a fully compatible way to support "binary values" in
> json::value, but it will require some hammering out.)

Well, let's hear it!

Thanks

Peter Dimov via Boost

Sep 16, 2020, 11:20:22 AM
to Vinnie Falco, bo...@lists.boost.org, Peter Dimov
Vinnie Falco wrote:

> So what is being discussed here is "partial CBOR support?" In other words,
> only the subset of CBOR that perfectly overlaps with JSON?

Well, it's self-evident that parsing a json::value from, and serializing a
json::value to, CBOR could only ever support the subset of CBOR that's
representable in json::value. Fortunately, that's almost all of CBOR,
because that's what CBOR is for.

> > (There is actually a fully compatible way to support "binary values" in
> > json::value, but it will require some hammering out.)
>
> Well, let's hear it!

You may remember my going on and on about arrays of scalars at one point.
This is what MessagePack gets right - arrays of scalars are an important
special case and stuffing them into the general value[] representation
wastes both memory and performance.

If json::array supported internally arrays of scalars, without changing its
interface at all so it still appeared to clients as a json::array of
json::values, we could represent a binary value as a json::array (which
internally uses a scalar type of unsigned char.)

This magically solves everything - value_to<vector<unsigned char>> works,
value_from likewise works, and it round-trips.

Vinnie Falco via Boost

Sep 16, 2020, 11:39:49 AM
to Peter Dimov, Vinnie Falco, boost@lists.boost.org List
On Wed, Sep 16, 2020 at 8:20 AM Peter Dimov <pdi...@gmail.com> wrote:
> If json::array supported internally arrays of scalars, without changing its
> interface at all so it still appeared to clients as a json::array of
> json::values, we could represent a binary value as a json::array (which
> internally uses a scalar type of unsigned char.)

That is easy to say but what do you do about this function which
returns a reference:

inline value& array::operator[]( std::size_t pos ) noexcept;

What do you do if you have an array of int (scalar) and someone
accesses an element in the middle and assigns a string to it?

Thanks

Niall Douglas via Boost

Sep 16, 2020, 11:49:36 AM
to bo...@lists.boost.org, Niall Douglas
On 16/09/2020 16:05, Peter Dimov via Boost wrote:
> Niall Douglas wrote:
>
>> I know what you're saying. However, comparing **C++ implementations**
>> of CBOR to JSON ones does not yield much differential. For example,
>> simdjson will happily sustain 10 Gbit of textual JSON parsing per
>> session which is enough for most NICs. CBOR parsers, at least the ones
>> available to C++, are no better than this.
>
> That's not a fair comparison, because you need fewer Gb to encode the
> same thing in CBOR.

This is true, but I didn't mention that I was accounting already for that.

CBOR had about 15% overhead from binary when I last tested it. JSON for
the data we were transmitting it was around 50%. However simdjson and
sajson are many many times faster than the CBOR library I tested, and
JSON compresses very easily.

I guess what I'm really saying here is that yes, JSON emits less dense
data than CBOR. But, a recent JSON parser + snappy compression produces
denser representation than CBOR, and yet is still faster overall.

I'm very sure that a faster CBOR library is possible than what we have.
But given what's currently available in the ecosystem, I'm saying a
recent JSON library + fast compression is both faster and smaller output
than currently available alternatives right now.

This is why I'm not keen on CBOR personally. I don't think it solves a
problem anyone actually currently has, rather it solves a problem people
think they have because they haven't considered adding a compression
pass to a fast JSON implementation.

Niall

Peter Dimov via Boost

Sep 16, 2020, 11:54:00 AM
to Vinnie Falco, Peter Dimov, bo...@lists.boost.org
Vinnie Falco wrote:

> That is easy to say but what do you do about this function which returns a
> reference:
>
> inline value& array::operator[]( std::size_t pos ) noexcept;
>
> What do you do if you have an array of int (scalar) and someone accesses
> an element in the middle and assigns a string to it?

The only possible answer is "copy the entire thing into value[]". This may
or may not be acceptable. I would think that if you have an array of 8044
ints, assigning a string somewhere in the middle would be a rare occurrence,
but who knows.

The upside is that an array of ints would consume significantly less memory
than today; this is often important and we've already seen that for a subset
of users, memory consumption is _the_ important metric. (It would also make
value_to<vector<int>> infinitely faster.)

The downside of course is that when someone assigns a single non-int
somewhere, or even just invokes the non-const op[] without assigning a
non-int, or the non-const begin(), the whole thing reallocates. This would
kill the noexcept, at the very minimum. There's a thing to be said here
about returning references to internal state, but consistency has its
benefits.

For CBOR 1.0, I'd just reject binary values on parsing. That's still much
more useful than not having it.
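
For what it's worth, a hypothetical sketch of that spill-on-mutation idea
(invented names, not Boost.JSON code; C++17 for std::variant):

#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Stand-in for the general json::value, only to keep the sketch self-contained.
using general = std::variant<std::nullptr_t, std::int64_t, double, std::string>;

class compact_array
{
    std::vector<unsigned char> compact_;  // used while every element is a byte
    std::vector<general>       general_;  // used after the first "spill"
    bool spilled_ = false;

public:
    explicit compact_array(std::vector<unsigned char> bytes)
        : compact_(std::move(bytes)) {}

    std::size_t size() const noexcept
    { return spilled_ ? general_.size() : compact_.size(); }

    // Const access never spills.
    general at(std::size_t i) const
    {
        return spilled_ ? general_[i] : general(std::int64_t(compact_[i]));
    }

    // Handing out a mutable reference forces the general representation,
    // copying everything once - the cost (and the lost noexcept) noted above.
    general& operator[](std::size_t i)
    {
        if (!spilled_)
        {
            general_.reserve(compact_.size());
            for (unsigned char b : compact_)
                general_.emplace_back(std::int64_t(b));
            compact_.clear();
            spilled_ = true;
        }
        return general_[i];
    }
};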

Gavin Lambert via Boost

Sep 16, 2020, 8:21:37 PM
to bo...@lists.boost.org, Gavin Lambert
On 17/09/2020 02:26, Peter Dimov wrote:
> "Binary string" is basically JSON string with UTF-8 validation turned
> off; it's a common thing to want to send/receive and the binary formats
> are arguably correct in offering it as a specific type. Of course this
> doesn't change the fact that it's not representable in standard JSON.

The standard JSON representation would be a base64 string. Though of
course both sides have to agree to that.

And you probably wouldn't want a library to non-semantically auto-decode
anything that looks vaguely like a base64 string to bytes (but you'd
need something like that if you wanted to round-trip CBOR to JSON to CBOR).
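
For reference, a minimal base64 encoder of the kind that would produce such
a string (illustration only; real code would also need decoding and
agreement on padding):

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

std::string to_base64(std::vector<std::uint8_t> const& in)
{
    static char const tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    std::size_t i = 0;
    for (; i + 3 <= in.size(); i += 3)   // whole 3-byte groups -> 4 chars
    {
        std::uint32_t n = (in[i] << 16) | (in[i + 1] << 8) | in[i + 2];
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += tbl[(n >> 6) & 63];
        out += tbl[n & 63];
    }
    if (i + 1 == in.size())              // one byte left -> 2 chars + "=="
    {
        std::uint32_t n = in[i] << 16;
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += "==";
    }
    else if (i + 2 == in.size())         // two bytes left -> 3 chars + "="
    {
        std::uint32_t n = (in[i] << 16) | (in[i + 1] << 8);
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += tbl[(n >> 6) & 63];
        out += '=';
    }
    return out;
}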

Paul A Bristow via Boost

Sep 18, 2020, 4:19:17 AM
to bo...@lists.boost.org, pbri...@hetp.u-net.com

> -----Original Message-----
> From: Boost <boost-...@lists.boost.org> On Behalf Of Pranam Lashkari via Boost
> Sent: 14 September 2020 08:30
> To: boost <bo...@lists.boost.org>
> Cc: Pranam Lashkari <plashk...@gmail.com>
> Subject: [boost] [review][JSON] Review of JSON starts today: Sept 14 - Sept 23
>
> Boost formal review of Vinnie Falco and Krystian Stasiowski's library JSON starts today and will run for 10
> days ending on 23 Sept 2020. Both of these authors have already developed a couple of libraries which
> are accepted in Boost(boost beast and Static String)
>
> This library focuses on a common and popular use-case for JSON. It provides a container to hold parsed
> and serialised JSON types. It provides more flexibility and better benchmark performance than its
> competitors.
>
> JSON highlights the following features in the documentation:
>
> - Fast compilation
> - Require only C++11
> - Fast streaming parser and serializer
> - Easy and safe API with allocator support
> - Constant-time key lookup for objects
> - Options to allow non-standard JSON
> - Compile without Boost, define BOOST_JSON_STANDALONE
> - Optional header-only, without linking to a library

What I knew about JSON could have been written on a postage stamp, but I
have at least read the documentation.

It is an example of how it should be done. It has good examples and good
reference info. I could quickly see how to use it, but didn't have a need,
and didn't feel it necessary to try it myself as others already have; there
are benchmarks too.

On that basis alone, my view is ACCEPT, FWIW.

> (a point I would like to add in highlight: it has cool Jason logo 😝)

(😝 indeed - My only recommendation is to replace this with a Boost logo ASAP ! No - more than that - I make it a condition for acceptance.)

Paul

PS That there are other libraries doing similar (but fairly different) things is no reason to reject this library.

Pranam Lashkari via Boost

Sep 19, 2020, 9:43:38 AM
to boost, Pranam Lashkari
Again, I request everyone to spare some time to review this library. The
last day to submit an official review is the 23rd of September. There is no
way this review can be extended, as there are other reviews lined up, so
please submit your review as soon as possible. Every review is really
important to Boost.

Thank you very much for your time in advance.


--
Thank you,
Pranam Lashkari, https://lpranam.github.io/

Alexander Grund via Boost

Sep 21, 2020, 5:04:54 AM
to bo...@lists.boost.org, Alexander Grund

> Please provide in your review information you think is valuable to
> understand your choice to ACCEPT or REJECT including JSON as a
> Boost library. Please be explicit about your decision (ACCEPT or REJECT).
ACCEPT. Few minor issues to iron out, but nothing holding back acceptance.
> Some other questions you might want to consider answering:
>
> - What is your evaluation of the design?
Sound. The introduction of a new string type raises eyebrows, but it
seems to be worth it given the improvements. However, a conversion to
std::string should likely be added (or documented as already supported).
I'm not convinced about the as_double, value_to<double> and
number_cast methods, their differences, and the exceptions users should
intuitively expect. I'd expect that to be documented at "Using numbers",
or at least cross-referenced from there.
> - What is your evaluation of the implementation?
After the discussions in Slack and on the ML it improved considerably,
especially due to the use of inline namespaces and the is_/as_ function
changes. Things left:
- `basic_parser` being declared in a detail header feels wrong. The
implementation (which includes the "private" declaration) is "public",
which is kind of the opposite of what I expected, and indeed other
reviewers found the same. Especially as the docs list it as "defined in
detail". I'd put both in public, maybe using "basic_parser_impl.hpp".
- There seem to be multiple ways of doing things, and some seem to make
great differences in exception handling, conversions and performance.
Sometimes this doesn't seem to be clear (see other reviews and above).
> - What is your evaluation of the documentation?
Very good with few improvements:
- "Quick Look" should be a top-level link. I struggled to find it in my
first pass. I'd expect that to be the very first link when opening the
docu or even at the front-page
- "This storage is freed when the parser is destroyed, allowing the
parser to cheaply re-use this memory on subsequent parses, improving
performance." - What does this mean? How can a destroyed parser reuse
anything? Or why does it need highlighting that freed memory can be reused?
- The example `handler` assigns "-1" to a std::size_t. I guess that
deserves a comment at least for beginners wondering about the
signed->unsigned conversion
- "`finish` Parse JSON incrementally." is lacking a clearer summary
- "parser::release" says "UB if the parser is not done" but "If !
this->done(), an exception is thrown.", which seems contradictory. I'd
remove the UB here, maybe use an EC overload if exceptions should be avoided
- "write/write_some" both say "Parse JSON incrementally.". The summary
should make it clearer what they do and the full description should
contain, well, a full description
- concepts like "ToMapLike" are explained after they are used w/o a
link/reference
> - What is your evaluation of the potential usefulness of the library?
Very useful, especially after reading Peter's review about extending the
parser etc.
> - Did you try to use the library? With which compiler(s)? Did you have
> any problems?
Yes with some small tests. GCC 10.0, no problems
> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
A couple of hours checking the docs and code.
> - Are you knowledgeable about the problem domain?
Ok, I guess. Just as anyone who has used JSON anywhere.

Vinnie Falco via Boost

Sep 21, 2020, 2:46:32 PM
to boost@lists.boost.org List, Vinnie Falco
I've gotten some great feedback on this library, thanks to everyone who
spent the time to look at this work and comment on it. We have a little
under three days, and I wanted to address something that is obviously
important as it has been brought up a few times here and elsewhere.

It seems there are two desired use-cases for JSON:

1. "JSON-DOM" - parse and serialize to and from a variant-like hierarchical
container provided by the library

2. "JSON Serialization" - parse and serialize directly to and from
user-defined types.

My thoughts on this are as follows:

* Both of these use-cases are useful, and desirable
* Most of the time, a user wants one or the other - rarely both
* Optimized implementations of these use-cases are unlikely to share much
code
* These are really two different libraries

Boost.JSON is designed to offer 1. above and has no opinion on 2.

Some of the less-than-positive reviews argue that both of these use-cases
should be crammed into one library, otherwise users should not have access
to either. Here are some related facts:

* No one has submitted a JSON library of any kind for review to Boost *ever*
* The most popular JSON libraries implement JSON-DOM, not JSON Serialization
* Even one of the most popular serialization libraries,
Boost.Serialization, does not offer a JSON archive implementation
* Boost.Property tree supports JSON-DOM out of the box, but not JSON
Serialization

I find it interesting that people are coming out of the woodwork who claim
to have written their own JSON libraries and say REJECT to Boost.JSON,
because they feel that conversion between JSON and user-defined types is of
the utmost importance and that Boost can't have a JSON library without it.

* If this is so important, why does Boost.Serialization not have it?
* Why is no one submitting a pull request to Boost.Serialization for a JSON
archive?
* Why has no one proposed a library to Boost which implements JSON
Serialization?
* Why does Boost.PropertyTree have JSON-DOM but no JSON Serialization?
* Where are the immensely popular JSON Serialization libraries?

Meanwhile the JSON For Modern C++ repository has over 20,000 GitHub stars
and Tencent's RapidJSON repository has almost 10,000 GitHub stars. In a
sense these libraries have become standards, which is a shame since they
both have defects which I believe Boost.JSON addresses. Clearly the
JSON-DOM use case is popular (these libraries also do not offer JSON
Serialization). Where are the immensely popular JSON Serialization
libraries?

Not to put too fine a point on it...but these arguments against Boost.JSON
do not withstand any sort of scrutiny; rational readers should find them
entirely unconvincing.

Thanks

Robert Ramey via Boost

Sep 21, 2020, 4:02:50 PM
to bo...@lists.boost.org, Robert Ramey
On 9/21/20 11:46 AM, Vinnie Falco via Boost wrote:

>
> I find it interesting that people are coming out of the woodwork who claim
> to have written their own JSON libraries, that say REJECT to Boost.JSON,
> because they feel that conversion between JSON and user-defined types is of
> the utmost importance and that Boost can't have a JSON library without it.
>

> * If this is so important, why does Boost.Serialization not have it?

Good question. FYI - no one has EVER requested or even inquired about
this. I don't think it would be very hard (the xml_archive is hard).
Maybe everyone who felt they needed it just made their own. It's also
possible that no one uses the serialization library anymore. I would
have no way of knowing if that were the case. FWIW I'm hoping this year
to do a re-boot of the serialization documentation. The main focus is
to convert it from raw html to boost.book. But I also want to add more
examples: using it for "deep copy", using co-routines to convert from
one serialization format to another, layering other boost libraries like
encryption and compression, generating "editable" archives, etc. None
of these will alter the library itself - just provide more examples.

I have always been puzzled why no one has ever asked for any of these
things.

> * Why is no one submitting a pull request to Boost.Serialization for a JSON
> archive?

ditto

> * Why has no one proposed a library to Boost which implements JSON
> Serialization?
> * Why doesn't Boost.PropertyTree have JSON-DOM but no JSON Serialization?
> * Where are the immensely popular JSON Serialization libraries?

ditto

Robert Ramey

Janek Kozicki via Boost

Sep 21, 2020, 5:01:46 PM
to bo...@lists.boost.org, Janek Kozicki
Robert Ramey via Boost said: (by the date of Mon, 21 Sep 2020 13:01:58 -0700)

> > * If this is so important, why does Boost.Serialization not have it?
>
> Good question. FYI - no one has EVER requested or even inquired about
> this. I don't think it would be very hard (the xml_archive is hard).
> Maybe everyone who felt the needed it just made there own. It's also
> possible that no one uses the serialization library anymore.

I think that it's so good that there is nothing left to ask about
(except for problems with serializing boost::optional ;) #165


Vinnie Falco via Boost said: (by the date of Mon, 21 Sep 2020 11:46:13 -0700)

> Not to put too fine a point on it...but these arguments against Boost.JSON
> do not withstand any sort of scrutiny; rational readers should find them
> entirely unconvincing.


Agreed.

--
# Janek Kozicki http://janek.kozicki.pl/

Vinícius dos Santos Oliveira via Boost

Sep 21, 2020, 9:11:49 PM
to Boost, Vinícius dos Santos Oliveira
On Mon, Sep 21, 2020 at 15:46, Vinnie Falco via Boost
<bo...@lists.boost.org> wrote:

> * Where are the immensely popular JSON Serialization libraries?

My gopher friend actually makes fun of us because we don't have
json.Unmarshal(): https://blog.golang.org/json

It's him making fun of us/me that gave me the idea to integrate
Boost.Serialization and Boost.Hana's Struct.

```go
type Bid struct {
	Price     string
	Size      string
	NumOrders int
}
type OrderBook struct {
	Sequence int64 `json:"sequence"`
	Bids     []Bid `json:"bids"`
	Asks     []Bid `json:"asks"`
}

...
var book OrderBook
json.Unmarshal(buffer, &book)
```

But that's a subject that I'll want to discuss with Robert Ramey after
Boost.JSON's review.

In my job, we have our own in-house serialization framework. I guess
that's what many others do as well in C++, but the requirements on the
JSON library to be usable in serialization frameworks would be quite
similar. Compare Boost.Serialization and QDataStream from Qt.


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Peter Dimov via Boost

Sep 21, 2020, 10:02:11 PM
to bo...@lists.boost.org, Peter Dimov
Vinícius dos Santos Oliveira wrote:

> In my job, we have our own in-house serialization framework. I guess
> that's what many others do as well in C++, but the requirements on the
> JSON library to be usable in serialization frameworks would be quite
> similar.

The easiest way to make a JSON input archive for Boost.Serialization is to
use Boost.JSON and go through json::value. Boost.Serialization typically
wants fields to appear in the same order they were written, but JSON allows
arbitrary reordering. Reading a json::value first and then deserializing
from that is much easier than deserializing directly from JSON.

For output, it's easier to bypass json::value and write JSON directly.
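
A rough sketch of that input direction (hypothetical struct and keys, just
to show the shape of going through json::value):

#include <boost/json.hpp>
#include <string>

struct person
{
    std::string name;
    int age;
};

// Parse to json::value first, then pull fields by key, so the order in
// which they appear in the JSON text does not matter.
person load_person(std::string const& text)
{
    boost::json::value jv = boost::json::parse(text);
    auto const& obj = jv.as_object();

    person p;
    p.name = boost::json::value_to<std::string>(obj.at("name"));
    p.age  = static_cast<int>(obj.at("age").as_int64());
    return p;
}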

Vinícius dos Santos Oliveira via Boost

Sep 22, 2020, 6:24:33 AM
to Boost, Vinícius dos Santos Oliveira, Peter Dimov
On Mon, Sep 21, 2020 at 23:02, Peter Dimov via Boost
<bo...@lists.boost.org> wrote:

> The easiest way to make a JSON input archive for Boost.Serialization is to
> use Boost.JSON and go through json::value. Boost.Serialization typically
> wants fields to appear in the same order they were written, but JSON allows
> arbitrary reordering. Reading a json::value first and then deserializing
> from that is much easier than deserializing directly from JSON.
>
> For output, it's easier to bypass json::value and write JSON directly.

It may be easier, but it's also the wrong way. There are libraries
whose usage patterns leak some structure into user code itself. We
have Lua and its virtual stack, for instance. Boost.Serialization is
one such library. That's really a topic whose explanation I'd rather
delay.

Boost.Serialization's un-capturable structure makes it impossible to
untangle the serialization format from ordered trees. For a JSON
archive, this means arrays everywhere, and json::value doesn't really
help here.

But there's a catch. The user can have very valid reasons to restrict
serialization to just one archive or another. That's where he'll
overload his types directly for a single archive, and where he can use
archive extensions from one model. For the JSON iarchive, the only
extension we need to expose is the pull parser object. This, plus some
accompanying algorithms (json::partial::skip(), json::partial::scanf(),
...), is a much better answer than what you propose.

It's not really hard if you understand the pull parser model. But I'm
more excited about Boost.Hana's Struct integration actually. We can
discuss all in detail after Boost.JSON's review. Please don't
misdirect discussions with comments such as "it'd be easier with
json::value". That's hardly a comment from somebody who explored the
subject.


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Hans Dembinski via Boost

unread,
Sep 22, 2020, 7:22:47 AM9/22/20
to Boost Devs, Hans Dembinski

> On 21. Sep 2020, at 20:46, Vinnie Falco via Boost <bo...@lists.boost.org> wrote:
>
> * Both of these use-cases are useful, and desirable
> * Most of the time, a user wants one or the other - rarely both

The point of the proponents of a serialisation solution is that it can potentially also handle the DOM case, but obviously not vice versa. We may end up in a situation where Boost.JSON is accepted, but it does not address all use-cases and so yet another library has to be written. This will lead to code duplication. Perhaps Boost.JSON could build on this hypothetical library once it is there, but are you going to change your implementation if you then have to rely on another Boost library?

> * Optimized implementations of these use-cases are unlikely to share much
> code

I think the whole parsing can be shared.

> * These are really two different libraries

I don't think so. If we can find a solution that allows one to deserialize JSON to the dynamic json::value, then this would make reading into the DOM type a special case of normal serialization. In fact, we could then even deserialize JSON into Boost.Any or std::any.

> * No one has submitted a JSON library of any kind for review to Boost *ever*
> * The most popular JSON libraries implement JSON-DOM, not JSON Serialization
> * Even one of the most popular serialization libraries,
> Boost.Serialization, does not offer a JSON archive implementation
> * Boost.Property tree supports JSON-DOM out of the box, but not JSON
> Serialization
>
> I find it interesting that people are coming out of the woodwork who claim
> to have written their own JSON libraries, that say REJECT to Boost.JSON,
> because they feel that conversion between JSON and user-defined types is of
> the utmost importance and that Boost can't have a JSON library without it.
>
> * If this is so important, why does Boost.Serialization not have it?
> * Why is no one submitting a pull request to Boost.Serialization for a JSON
> archive?
> * Why has no one proposed a library to Boost which implements JSON
> Serialization?
> * Why doesn't Boost.PropertyTree have JSON-DOM but no JSON Serialization?
> * Where are the immensely popular JSON Serialization libraries?

The reason is that some Boost libraries have been successfully cloned by outside people and the development is happening there. Boost.Python is replaced by pybind11 and Boost.Serialization by cereal.

https://github.com/USCiLab/cereal

The most obvious appeal of cereal is that it is C++11. Boost.Serialization has a lot of pre-C++11 code and is accordingly a bit difficult to work with. Both pybind11 and cereal jumped on the opportunity to rewrite popular Boost libraries in C++11.

cereal has a JSON archive. cereal also has 2.5k stars on Github, so it is dramatically popular.

Best regards,
Hans

Peter Dimov via Boost

unread,
Sep 22, 2020, 8:22:36 AM9/22/20
to bo...@lists.boost.org, Peter Dimov
Hans Dembinski wrote:

> The point of the proponents of a serialisation solution is that it can
> potentially also handle the DOM case, but obviously not vice versa.

I don't think so. In fact the reverse is true. Deserialization from DOM is
trivial. Making a deserialization library build a DOM is decidedly
nontrivial. I'm not sure how it could be done.

Peter Dimov via Boost

unread,
Sep 22, 2020, 8:23:34 AM9/22/20
to Vinícius dos Santos Oliveira, Boost, Peter Dimov
Vinícius dos Santos Oliveira wrote:

> But I'm more excited about Boost.Hana's Struct integration actually.

Is this like

https://pdimov.github.io/describe/doc/html/describe.html#example_serialization

or do you have something else in mind?

Bjorn Reese via Boost

unread,
Sep 22, 2020, 8:48:55 AM9/22/20
to Peter Dimov via Boost, Bjorn Reese
On 2020-09-22 14:20, Peter Dimov via Boost wrote:

> I don't think so. In fact the reverse is true. Deserialization from DOM
> is trivial. Making a deserialization library build a DOM is decidedly
> nontrivial. I'm not sure how it could be done.

It is easy with a pull parser. The following header shows both direct
serialization from DOM to JSON, and direct deserialization from JSON to
DOM:


https://github.com/breese/trial.protocol/blob/develop/include/trial/protocol/json/serialization/dynamic/variable.hpp

Vinícius dos Santos Oliveira via Boost

unread,
Sep 22, 2020, 8:52:13 AM9/22/20
to Peter Dimov, Vinícius dos Santos Oliveira, Boost
On Tue, Sep 22, 2020 at 9:22 AM, Peter Dimov <pdi...@gmail.com> wrote:
> > But I'm more excited about Boost.Hana's Struct integration actually.
>
> Is this like
>
> https://pdimov.github.io/describe/doc/html/describe.html#example_serialization
>
> or do you have something else in mind?

Yes and no. If you rely on universal serialization, you fall back to
Boost.Serialization's implied ordered trees.

The TMP code could look like the example you linked (e.g. one
mp11::mp_for_each() here and there). Are you willing to submit a new
reflection library to Boost? I'd be glad to hear all about it after
Boost.JSON's review. One such question would be: how does it
compare to Boost.Hana's Struct?

The reason I'm trying to delay this debate is that I know we'll
have lots of noise from non-interested parties and I'd rather not deal
with that. A couple of extra days and all the unwanted noise would
vanish.

There are a few questions to sort out:

- Do you want to enable Boost.Hana by default or use an opt-in mechanism?
- Overload rules to choose the most specific serialization.
- Integration with Boost.Serialization's extra features. This will
require a larger time investment, but has nothing to do with
Boost.Hana or integration to any other reflection library.
- And of course, concerns raised by interested stakeholders.

The end game would be:

```json
{
    "foo": 42,
    "bar": "hello world"
}
```

(de)serializes effortlessly to:

```cpp
struct Foobar
{
    int foo;
    std::string bar;
};
```

just like Go's json.Unmarshal()


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Peter Dimov via Boost

unread,
Sep 22, 2020, 9:21:47 AM9/22/20
to bo...@lists.boost.org, Peter Dimov
Bjorn Reese wrote:
> On 2020-09-22 14:20, Peter Dimov via Boost wrote:
>
> > I don't think so. In fact the reverse is true. Deserialization from DOM
> > is trivial. Making a deserialization library build a DOM is decidedly
> > nontrivial. I'm not sure how it could be done.
>
> It is easy with a pull parser. The following header shows both direct
> serialization from DOM to JSON, and direct deserialization from JSON to
> DOM:
>
> https://github.com/breese/trial.protocol/blob/develop/include/trial/protocol/json/serialization/dynamic/variable.hpp

Yes, you're right, it's indeed easy with a pull parser, if the value and the
archive cooperate.

This still doesn't make one approach a superset of the other though. You can
feed Boost.JSON's push parser incrementally. A pull parser, being a pull
one, reverses the flow of control and asks you (its stream) for data when
you pull from it. Yes, it makes things like the above possible, but I don't
think it entirely supersedes push parsers.

I remember this being brought up before, so you may have a solution for the
incremental case. A pull parser could f.ex. return token::need_input or
something like that when it's starved, like a nonblocking socket. There'll
still be a disconnect between its buffer size and the amount of incoming
data though, unless I'm missing something.
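As a purely illustrative toy (not Boost.JSON and not trial.protocol; the names token, need_input and toy_pull_parser are made up for this sketch), the control flow being discussed might look like this, including a token being reassembled across a chunk boundary:

```cpp
// Toy pull-style tokenizer over comma-separated integers. It reports
// need_input when it runs out of buffered data, so the caller feeds the next
// chunk -- the "nonblocking socket" style of starvation mentioned above.
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

enum class token { need_input, number, end };

class toy_pull_parser
{
    std::string buf_;
    bool finished_ = false;

public:
    void feed(std::string const& chunk) { buf_ += chunk; }
    void finish() { finished_ = true; }   // no more input will ever arrive

    token next(long& out)
    {
        std::size_t i = 0;                // drop leading separators
        while (i < buf_.size() && !std::isdigit((unsigned char)buf_[i]))
            ++i;
        buf_.erase(0, i);

        std::size_t j = 0;                // scan one (possibly partial) number
        while (j < buf_.size() && std::isdigit((unsigned char)buf_[j]))
            ++j;

        if (buf_.empty() || (j == buf_.size() && !finished_))
            return finished_ ? token::end : token::need_input;

        out = std::stol(buf_.substr(0, j));
        buf_.erase(0, j);
        return token::number;
    }
};

int main()
{
    // The stream is "12,34,56," -- note that 34 and 56 straddle chunk breaks.
    std::vector<std::string> chunks = {"12,3", "4,5", "6,"};
    std::size_t n = 0;
    toy_pull_parser p;
    long v;
    for (;;)
    {
        token t = p.next(v);
        if (t == token::number) { std::cout << v << "\n"; continue; }
        if (t == token::end) break;
        // Starved: supply the next chunk, or signal that the input is done.
        if (n < chunks.size()) p.feed(chunks[n++]); else p.finish();
    }
}
```

It prints 12, 34 and 56, and also shows the buffer-size disconnect mentioned above: a partial token simply stays buffered until enough input arrives.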

Peter Dimov via Boost

unread,
Sep 22, 2020, 9:22:14 AM9/22/20
to Vinícius dos Santos Oliveira, Peter Dimov, Boost
Vinícius dos Santos Oliveira wrote:
> The end game would be:
>
> ```json
> {
>     "foo": 42,
>     "bar": "hello world"
> }
> ```

Switch "foo" and "bar" here for generality.

> (de)serializes effortlessly to:
>
> ```cpp
> struct Foobar
> {
>     int foo;
>     std::string bar;
> };
> ```

It's already possible to make this work using Boost.JSON.
https://pdimov.github.io/describe/doc/html/describe.html#example_from_json

Just add the `parse` call.
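For instance, a sketch along the lines of the linked example with the parse call added. This follows the describe repository's documented BOOST_DESCRIBE_STRUCT/describe_members interface, but it is an illustrative reconstruction, not the exact code from that page:

```cpp
// Deserialize a described struct from JSON text via json::value, pulling each
// public member out of the object by name, so key order does not matter.
#include <boost/describe.hpp>
#include <boost/json.hpp>
#include <boost/mp11.hpp>
#include <iostream>
#include <string>
#include <type_traits>

struct Foobar
{
    int foo;
    std::string bar;
};
BOOST_DESCRIBE_STRUCT(Foobar, (), (foo, bar))

template<class T>
T from_json(boost::json::value const& v)
{
    T t{};
    auto const& obj = v.as_object();

    boost::mp11::mp_for_each<
        boost::describe::describe_members<T, boost::describe::mod_public>>(
        [&](auto D)
        {
            using M = std::remove_reference_t<decltype(t.*D.pointer)>;
            t.*D.pointer = boost::json::value_to<M>(obj.at(D.name));
        });
    return t;
}

int main()
{
    auto v = boost::json::parse(R"({"bar": "hello world", "foo": 42})");
    auto x = from_json<Foobar>(v);
    std::cout << x.foo << " " << x.bar << "\n";   // 42 hello world
}
```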

> Are you willing to submit a new reflection library to Boost?

Yes, I'm waiting for the review to end to not detract from the Boost.JSON
discussions.

Vinícius dos Santos Oliveira via Boost

unread,
Sep 22, 2020, 9:37:06 AM9/22/20
to Peter Dimov, Vinícius dos Santos Oliveira, Boost
On Tue, Sep 22, 2020 at 10:21 AM, Peter Dimov <pdi...@gmail.com> wrote:
> It's already possible to make this work using Boost.JSON.
> https://pdimov.github.io/describe/doc/html/describe.html#example_from_json
>
> Just add the `parse` call.

You're missing the point. I don't want the DOM intermediate
representation. It's not needed. The right choice for the archive
concept is to use a pull parser and read directly from the buffer.

The DOM layer adds a high cost here.

> Yes, I'm waiting for the review to end to not detract from the Boost.JSON
> discussions.

Looking forward to it.


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Bjorn Reese via Boost

unread,
Sep 22, 2020, 9:46:00 AM9/22/20
to Peter Dimov via Boost, Bjorn Reese
On 2020-09-22 15:17, Peter Dimov via Boost wrote:

> This still doesn't make one approach a superset of the other though. You
> can feed Boost.JSON's push parser incrementally. A pull parser, being a
> pull one, reverses the flow of control and asks you (its stream) for
> data when you pull from it. Yes, it makes things like the above
> possible, but I don't think it entirely supersedes push parsers.
>
> I remember this being brought up before, so you may have a solution for
> the incremental case. A pull parser could f.ex. return token::need_input
> or something like that when it's starved, like a nonblocking socket.
> There'll still be a disconnect between its buffer size and the amount of
> incoming data though, unless I'm missing something.

https://github.com/breese/trial.protocol/tree/develop/example/json/chunked_push_parser

The current limitation is that the buffer must be large enough to hold
the largest string or number.

Peter Dimov via Boost

unread,
Sep 22, 2020, 10:23:52 AM9/22/20
to bo...@lists.boost.org, Peter Dimov
Bjorn Reese wrote:

> https://github.com/breese/trial.protocol/tree/develop/example/json/chunked_push_parser
>
> The current limitation is that the buffer must be large enough to hold the
> largest string or number.

You can lift it if you introduce string_part tokens. This will bring it even
closer to Boost.JSON. It will make the pull interface less pure though.

Peter Dimov via Boost

unread,
Sep 22, 2020, 10:24:06 AM9/22/20
to Vinícius dos Santos Oliveira, Peter Dimov, Boost
Vinícius dos Santos Oliveira wrote:

> You're missing the point. I don't want the DOM intermediate
> representation. It's not needed.

I get that. I also get the appeal and the utility of pull parsers. But my
point is that I can make that work today, quite easily, using Boost.JSON.

It's 2020. Boost has zero pull JSON parsers. (I counted them, twice.) The
"boost" implementation on https://github.com/kostya/benchmarks#json uses
PropertyTree and is leading from behind. Maybe Boost.JSON can be refactored
and a pull parser can be inserted at the lowest level. But in the meantime,
there are people who have actual need for a nlohmann/json (because of speed)
or RapidJSON (because of interface) replacement, and we don't have it. It
doesn't make much sense to me to wait until 2032 to have that in Boost.

Pranam Lashkari via Boost

unread,
Sep 22, 2020, 12:07:57 PM9/22/20
to boost, Pranam Lashkari
Tomorrow (23rd Sept) is the last day to submit an official
review of JSON, so if you are willing to submit a review, hurry up.


Vinnie Falco via Boost

unread,
Sep 22, 2020, 1:44:54 PM9/22/20
to boost@lists.boost.org List, Vinnie Falco
On Mon, Sep 21, 2020 at 2:05 AM Alexander Grund via Boost
<bo...@lists.boost.org> wrote:
> ACCEPT. Few minor issues to iron out, but nothing holding back acceptance.

Thank you for your thoughtful review and comments. I have broken out
each of the problems you raised into separate issues in the repository
so they can be tracked.

Regards

Hans Dembinski via Boost

unread,
Sep 23, 2020, 5:05:30 AM9/23/20
to Boost Devs, Hans Dembinski, Peter Dimov

> On 22. Sep 2020, at 16:23, Peter Dimov via Boost <bo...@lists.boost.org> wrote:
>
> But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.

My rough count of accept votes indicates that Boost.JSON is going to be accepted, so you get what you want, but I feel we gave up on trying to achieve the best possible technical solution for this problem out of a wrong sense of urgency (also considering the emails by Bjørn and Vinícius, it does not seem like we need to wait until 2032 for a different approach).

This is C++: we strive for "zero overhead" and "maximum efficiency for all reasonable use cases", and achieving that requires careful interface design. I only worry about interfaces, because implementations can be improved at any time. However, if an interface is designed to require more work than absolutely necessary, then this cannot be fixed afterwards.

You said yourself that Boost.JSON is not as efficient as it could be during the conversion of "my data type" to JSON, because the existing data has to be copied into the json::value first. I am a young member of the Boost family, but my feeling is that this would have been a reason to reject the design in the past. Designing abstractions that enable users to get maximum performance if they want it is a core value of C++.

As my previous examples of pybind11 and cereal have shown, the lasting legacies of Boost are excellent interfaces. Making good interfaces is very difficult and that's where the review process really shines. We have not achieved that here, since valid concerns are pushed aside by the argument: we have to offer a solution right now.

Best regards,
Hans

Peter Dimov via Boost

unread,
Sep 23, 2020, 6:06:03 AM9/23/20
to Hans Dembinski, Boost Devs, Peter Dimov
Hans Dembinski wrote:

> You said yourself that Boost.JSON is not as efficient as it could be
> during the conversion of "my data type" to JSON, because the existing data
> has to be copied into the json::value first. I am a young member of the
> Boost family, but my feeling is that this would have been a reason to
> reject the design in the past.

I don't think so. As a previous reviewer correctly observed, an apple has
been submitted, and you're complaining that it isn't an orange.

Were it a bad apple, that would have been a reason to reject. If we didn't
need apples, if users didn't need apples, if the two most popular fruits
weren't apples, that might have been a reason to reject. Not being an orange
isn't.

One possible objection (that has been used in the past) is that if we accept
an apple, nobody will submit an orange anymore. That's where the calendar
argument comes in. It's 2020, we didn't have an apple, and nobody has
submitted an orange. Ten years should have been enough time for orange
proponents. And this has nothing to do with "urgency".

Niall Douglas via Boost

unread,
Sep 23, 2020, 7:56:13 AM9/23/20
to bo...@lists.boost.org, Niall Douglas
On 23/09/2020 10:04, Hans Dembinski via Boost wrote:
>
>> On 22. Sep 2020, at 16:23, Peter Dimov via Boost <bo...@lists.boost.org> wrote:
>>
>> But in the meantime, there are people who have actual need for a nlohmann/json (because of speed) or RapidJSON (because of interface) replacement, and we don't have it. It doesn't make much sense to me to wait until 2032 to have that in Boost.
>
> My rough count of accept votes indicates that Boost.JSON is going to be accepted, so you get what you want, but I feel we gave up on trying to achieve the best possible technical solution for this problem out of a wrong sense of urgency (also considering the emails by Bjørn and Vinícius, it does not seem like we need to wait until 2032 for a different approach).

For the record, I've had offlist email discussions about proposed
Boost.JSON with a number of people where the general feeling was that
there was no point in submitting a review, as negative review feedback
would be ignored, possibly with personal retribution thereafter, and the
library was always going to be accepted in any case. So basically it
would be wasted effort, and they haven't bothered.

I haven't looked at the library myself, so I cannot say if the concerns
those people raised with it are true, but what you just stated above
about lack of trying for a best possible technical solution is bang on
the nail if one were to attempt summarising the feeling of all those
correspondences.

Me personally, if I were designing something like Boost.JSON, I'd
implement it using a generator emitting design. I'd make the supply of
input destructive gather buffer based, so basically you feed the parser
arbitrary sized chunks of input, and the array of pointers to those
discontiguous input blocks is the input document. As the generator
resumes, emits and suspends during parse, it would destructively modify
in-place those input blocks in order to avoid as much dynamic memory
allocation and memory copying as possible. I'd avoid all variant
storage, all type erasure, by separating the input syntax lex from the
value parse (which would be on-demand, lazy), that also lets one say "go
get me the next five key-values in this dictionary" and that would
utilise superscalar CPU concurrency to go execute those in parallel.

I would also attempt to make the whole JSON parser constexpr, not
necessarily because we need to parse JSON at compile time, but because
it would force the right kind of design decisions (e.g. all literal
types) which generate significant added value to the C++ ecosystem. I
mean, what's the point of yet another N+1 JSON parser when we could
have a Hana Dusíková all-constexpr regex style JSON parser?

Consider this: a Hana Dusíková type all-constexpr JSON parser could let
you specify to the compiler at compile time "this is the exact structure
of the JSON that shall be parsed". The compiler then bangs out optimum
parse code for that specific JSON structure input. At runtime, the
parser tries the pregenerated canned parsers first, if none match, then
it falls back to runtime parsing. Given that much JSON is just a long
sequence of identical structure records, this would be a very compelling
new C++ JSON parsing library, a whole new better way of doing parsing.
*That* I would get excited about.

Niall

Vinnie Falco via Boost

unread,
Sep 23, 2020, 8:14:52 AM9/23/20
to boost@lists.boost.org List, Vinnie Falco
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
<bo...@lists.boost.org> wrote:
> Me personally, if I were designing something like Boost.JSON, I'd
> implement it using a generator emitting design. I'd make the supply of
> input destructive gather buffer based

So you would implement an orange instead of an apple. Note that the
C++ ecosystem already has the flavor of orange you are describing, it
is called SimdJSON and it is quite performant. As with your approach,
it produces a read-only document. Quite different from Boost.JSON.
There's nothing wrong with implementing a library that only offers a
parser and a read-only DOM, and there are certainly use-cases for it.
Personally I would not try to compete with SimdJSON myself but perhaps
you can do better than me, especially in the area of "parallel
execution utilising superscalar CPU concurrency."

Regards

Niall Douglas via Boost

unread,
Sep 23, 2020, 9:14:06 AM9/23/20
to boost@lists.boost.org List, Niall Douglas
On 23/09/2020 13:14, Vinnie Falco wrote:
> On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
> <bo...@lists.boost.org> wrote:
>> Me personally, if I were designing something like Boost.JSON, I'd
>> implement it using a generator emitting design. I'd make the supply of
>> input destructive gather buffer based
>
> So you would implement an orange instead of an apple. Note that the
> C++ ecosystem already has the flavor of orange you are describing, it
> is called SimdJSON and it is quite performant. As with your approach,
> it produces a read-only document. Quite different from Boost.JSON.
> There's nothing wrong with implementing a library that only offers a
> parser and a read-only DOM, and there are certainly use-cases for it.
> Personally I would not try to compete with SimdJSON myself but perhaps
> you can do better than me, especially in the area of "parallel
> execution utilising superscalar CPU concurrency."

I never said what I'd do is comparable to what you've done. You're
absolutely right it's an apple vs orange difference.

What I was describing is the sort of design which would get me excited
because it's novel and opens up all sorts of new opportunities not
currently well served by existing solutions in the ecosystem.

There was a larger point made though, regarding the tradeoff of optimal
design vs getting it done. Historically you've favoured getting it done,
and have not been a warm recipient to suggestions regarding alternative
designs which would require you throwing most or all of what you've done
away and starting again. Quite a few people have perceived this about
you, and no longer bother to comment on anything you're doing or seeking
advice upon.


You asked off list what I meant by "personal retribution". I'd prefer to
answer that on list. I am referring to you having, in the past, been
perceived as harassing and persecuting individuals whose technical
opinion you disagree with across multiple internet forums over multiple
months, especially if they have ever publicly criticised a technical
design that you personally believe doesn't deserve that criticism.

That has caused those people, who perceive that about you, to not be
willing to interact with anything you touch or are involved with,
because they aren't willing to be followed all over the internet for the
next few months by you.

Now, personally speaking, I think it's more a case of you being rather
enthusiastic and passionate in your beliefs and not thinking through
other people's perception of you applying those beliefs, rather than you
being malevolent. I have stated that opinion about you on several
occasions when my opinion was privately sought. But you also need to
accept that you reap what you sow in how you're perceived to treat
people, just the same way as I knew I'd permanently make most of the
technical Boost leadership hate me for life when I decided to go rattle
their cages here so many years ago.

You may take this reply personally. It was not meant as a personal
attack. It was meant as a statement of facts to my best understanding,
because I don't think many people tell you this stuff, and if nobody
tells you, then there's no progress possible.

Niall

Peter Dimov via Boost

unread,
Sep 23, 2020, 9:21:21 AM9/23/20
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:

> For the record, I've had offlist email discussions about proposed
> Boost.JSON with a number of people where the general feeling was that
> there was no point in submitting a review, as negative review feedback
> would be ignored, possibly with personal retribution thereafter, and the
> library was always going to be accepted in any case.

Personal retribution, really?

> Consider this: a Hana Dusíková type all-constexpr JSON parser could let
> you specify to the compiler at compile time "this is the exact structure
> of the JSON that shall be parsed". The compiler then bangs out optimum
> parse code for that specific JSON structure input. At runtime, the parser
> tries the pregenerated canned parsers first, if none match, then it falls
> back to runtime parsing.

That's definitely an interesting research project, but our ability to
imagine it does not mean that people have no need for what's being
submitted - a library with the speed of RapidJSON and the usability of JSON
for Modern C++, with some additional and unique features such as incremental
parsing.
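
(For reference, incremental parsing with Boost.JSON looks roughly like the sketch below. It uses the stream_parser spelling from the library's later released documentation; the exact class name at review time may have differed, so treat this as an approximation rather than a quote of the reviewed API.)

```cpp
// Feed a JSON document to the push parser in arbitrary chunks; the parser
// keeps its state between calls, which is what "incremental" means here.
#include <boost/json.hpp>
#include <iostream>

int main()
{
    boost::json::stream_parser p;

    p.write("{\"seque");
    p.write("nce\": 1, \"bids\": [[\"100.0\", ");
    p.write("\"2\", 3]]}");
    p.finish();                                   // signal end of input

    boost::json::value v = p.release();
    std::cout << v.at("sequence").as_int64() << "\n";   // prints 1
}
```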

To go on your tangent, I, personally, think that compile-time parsing is
overrated because it's cool. Yes, CTRE is a remarkable accomplishment, and
yes, Tim Shen, the author of libstdc++'s <regex> also thinks that
compile-time regex parsing is the cure for <regex>'s ills. But I don't think
so. In my unsubstantiated opinion, runtime parsing can match CTRE's
performance, and the only reason current engines don't is because they are
severely underoptimized.

Similarly, I doubt that a constexpr JSON parser will even match simdjson,
let alone beat it.

Peter Dimov via Boost

unread,
Sep 23, 2020, 10:51:12 AM9/23/20
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:

> For the record, I've had offlist email discussions about proposed
> Boost.JSON with a number of people where the general feeling was that
> there was no point in submitting a review, as negative review feedback
> would be ignored, possibly with personal retribution thereafter, [...]

FWIW, the Boost review process allows reviews to be submitted directly to
the review manager, in private, without being posted on the list. That's
mostly a mechanism to avoid the exact same scenario as described above. I
don't think we've ever needed it ("so far"), but it exists and can be taken
advantage of, by those who are so inclined.

Rainer Deyke via Boost

unread,
Sep 23, 2020, 11:09:28 AM9/23/20
to bo...@lists.boost.org, Rainer Deyke
On 23.09.20 11:04, Hans Dembinski via Boost wrote:
> You said yourself that Boost.JSON is not as efficient as it could be during the conversion of "my data type" to JSON, because the existing data has to be copied into the json::value first. I am a young member of the Boost family, but my feeling is that this would have been a reason to reject the design in the past. Designing abstractions that enable users to get maximum performance if they want it is a core value of C++.

Actually, some of the oldest Boost libraries have (had) atrocious
performance. Early Boost libraries were often more concerned with how
to do something at all (within the limitations of C++03) than with how
to do it efficiently.


--
Rainer Deyke (rai...@eldwood.com)

Richard Hodges via Boost

unread,
Sep 23, 2020, 11:26:56 AM9/23/20
to boost@lists.boost.org List, Richard Hodges
On Wed, 23 Sep 2020 at 15:14, Niall Douglas via Boost
<bo...@lists.boost.org> wrote:

I think you highlight one of the problems plaguing C++ decision-making in
the present day.

Far too much concern over feelings, ruffled feathers, politics and imagined
thought crimes - and not enough technical discussions backed with facts and
benchmarks.

I have worked with Vinnie for the past 9 months. During that time he has
spoken to me directly concerning design and implementation choices I have
made. Often indicating strong disagreement in the most uncompromising
terms.

However, because I am an adult, able to draw upon my experience and not
immediately exhibiting an emotional reaction in response to being
challenged, I have been quite able to make my view quite firmly known and
understood. Without any fear whatsoever of "personal retribution".

I have not experienced "being followed all over the internet" in any way
whatsoever.

I am also often told by Peter in his deadpan way that, "I am wrong". I
actually find this quite humorous, for me the comedy is in expecting to
hear the reason, which never comes - presumably because he thinks I ought
to be bright enough to see why I am wrong without being told (I am often
not).

I think there have been some very valid observations made about the design
choices of Boost.JSON. I have made some myself, even going so far as to
push code to see if I could do better than the existing implementation. I
think everyone concerned in its development is quite open that the code is
a compromise between "correctness" and performance. In fact while
maintaining Boost.Beast, I have been reminded that even in 2020, sometimes
you have to break a few rules to get things to go fast. I have been an
advocate of "speed is a hardware problem, elegance is a software problem"
all my life - but with imperfect compilers sometimes brute force is
unfortunately the answer.

It seems to me that some of the more scathing criticisms of the library
have been made *without offering an alternate implementation.* This might
be material to the strength of resistance to these observations - I have
personally watched Boost.JSON evolve over the past six (more?) months. The
testing, writing, rewriting and retesting has consumed at least 18
man-months (this term used advisedly and unashamedly) of effort. That's 18
months of people's lives dedicated to producing the best possible tool for
the most common use-case of a JSON library. There has been an almost
messianic effort made to match and exceed the performance of RapidJSON, so
that no-one can say that the Boost offering is anything other than best in
class where it matters, *at the point of use*.

If people are going to argue that the underlying design choices are wrong,
I think they are morally compelled to offer a demonstration of why -
preferably as a PR. Knowing Vinnie as I do, I am quite convinced that no
matter who submitted the code, if it improved the final result, it would be
*welcomed* and the author given full credit, regardless of any previous
crossed words spoken in the heat of the moment.

I think it's worth reminding everyone that most of us do what we do because
we are passionate about it. It is therefore natural for people to argue
strongly for what they believe to be right for the good of all. I
personally hope that people who are unwilling to contribute for fear of
hurt feelings can find it in themselves to speak up - they may end up
improving a useful library. Maybe even discovering that through their
shared interest in C++ that they can cross boundaries and find interesting
friends in unlikely places.

R





--
Richard Hodges
hodg...@gmail.com
office: +442032898513
home: +376841522
mobile: +376380212

Niall Douglas via Boost

unread,
Sep 23, 2020, 2:20:52 PM9/23/20
to bo...@lists.boost.org, Niall Douglas
On 23/09/2020 14:21, Peter Dimov via Boost wrote:
> Niall Douglas wrote:
>
>> For the record, I've had offlist email discussions about proposed
>> Boost.JSON with a number of people where the general feeling was that
>> there was no point in submitting a review, as negative review feedback
>> would be ignored, possibly with personal retribution thereafter, and
>> the library was always going to be accepted in any case.
>
> Personal retribution, really?

Some have interpreted it as such, yes. I have a raft of private email
that arrived just after my post here, recounting their stories
about being on the receiving end of Vinnie's behaviour, and/or thanking
me for writing that post.

Richard, I appreciate your "they're being snowflakes" response and
standing up for your friend, that was good of you. You should be aware
that I've known Vinnie longer than you, possibly as long as anyone here.
I think you'll find you're in the "get it done" philosophical camp (at
least that's my judgement of you from studying the code you write), so
Vinnie's fine with you. I have noticed, from watching him on the
internet, he tends to find most issue with those in the "aim for
perfection" philosophical camp. Vinnie particularly dislikes other
people's visions of abstract perfection if they make no sense to him, if
they're obtuse, or he doesn't understand them. If you're in that camp, then
you might have a very different experience than what you've had.
Nevertheless, I believe Vinnie's opinion is important as representative
of a significant minority of C++ users, and I think it ought to continue
to be heard. I might add that the said "snowflakes" that I've spoken to
have all to date agreed with that opinion, we're perfectly capable of
withstanding severe technical criticism, indeed some of us serve on WG21
where every meeting is usually a battering of oneself.

Anyway, I have no wish to discuss this further, all I want to say has
been said.

>> Consider this: a Hana Dusíková type all-constexpr JSON parser could
>> let you specify to the compiler at compile time "this is the exact
>> structure of the JSON that shall be parsed". The compiler then bangs
>> out optimum parse code for that specific JSON structure input. At
>> runtime, the parser tries the pregenerated canned parsers first, if
>> none match, then it falls back to runtime parsing.
>

> To go on your tangent, I, personally, think that compile-time parsing is
> overrated because it's cool. Yes, CTRE is a remarkable accomplishment,
> and yes, Tim Shen, the author of libstdc++'s <regex> also thinks that
> compile-time regex parsing is the cure for <regex>'s ills. But I don't
> think so. In my unsubstantiated opinion, runtime parsing can match
> CTRE's performance, and the only reason current engines don't is because
> they are severely underoptimized.

Hana's runtime benchmarks showed her regex implementation far outpacing
any of those in the standard libraries. Like, an order of magnitude in
absolute terms, linear scaling to load instead of exponential for common
regex patterns. A whole new world of performance.

Part of why her approach is so fast is because she didn't implement all
of regex. But another part is because she encodes the parse into
relationships between literal types which the compiler can far more
aggressively optimise than complex types. So basically the codegen is
way better, because the compiler can eliminate a lot more code.

> Similarly, I doubt that a constexpr JSON parser will even match
> simdjson, let alone beat it.

simdjson doesn't have class leading performance anymore. There are
faster alternatives depending on your use case.

Niall

Glen Fernandes via Boost

unread,
Sep 23, 2020, 2:41:26 PM9/23/20
to bo...@lists.boost.org, Glen Fernandes
On Wed, Sep 23, 2020 at 2:20 PM Niall Douglas wrote:
>
> >> For the record, I've had offlist email discussions about proposed
> >> Boost.JSON with a number of people where the general feeling was that
> >> there was no point in submitting a review, as negative review feedback
> >> would be ignored, possibly with personal retribution thereafter, and
> >> the library was always going to be accepted in any case.
> >
> > Personal retribution, really?
>
> Some have interpreted it as such yes. I have a raft of private email
> that just arrived there after my post here recounting their stories
> about being on the receiving end of Vinnie's behaviour, and/or thanking
> me for writing that post.

Niall, I think this might be the first time I've seen you review the
submitter, instead of just the submission, so I have to assume it is
for a reason this is being raised in public on the Boost mailing list
during this review. What is the genuine concern here, with the
submission, reviews so far, review manager, etc.? i.e. It seems like
there's some subtext that I'm not grasping here, and it would be
better if spoken plainly. Mateusz is an active review wizard who can
intervene, if you want to raise an issue that something is
compromised.

Otherwise, I'm not sure why the off-list opinions of people about
Vinnie matter to a Boost review of this library.

Glen

Vinnie Falco via Boost

unread,
Sep 23, 2020, 2:44:24 PM9/23/20
to boost@lists.boost.org List, Vinnie Falco
On Wed, Sep 23, 2020 at 4:55 AM Niall Douglas via Boost
<bo...@lists.boost.org> wrote:
> For the record, I've had offlist email discussions...
> where the general feeling was that there was no point in submitting a review,
> as negative review feedback would be ignored, possibly with personal retribution
> thereafter, and the library was always going to be accepted in any case.

Citing "anonymous sources with knowledge of the matter" to disparage
someone is bad enough, but then to make the unfounded allegation that
the Boost review process is rigged in advance to favor a particular
outcome is disrespectful to everyone who has invested time in the
process, including the review manager and the review wizards. As was
stated, reviews can be submitted anonymously and they will be
evaluated on their merit. Acting as a proxy for anonymous attacks on a
submitter is unprofessional and entirely inappropriate. We don't do
that here.

> ...I've known Vinnie longer than you...
> ...Vinnie's fine with you. I have noticed, from watching him...
> ...he tends to find most issue with...
> ...Vinnie particularly dislikes other people's visions...

I would appreciate it if you didn't speak for me, or pretend to think
you know what I find issues with, or what visions I like or I dislike.

You owe the list an apology.

Thanks

Maximilian Riemensberger via Boost

unread,
Sep 23, 2020, 2:51:08 PM9/23/20
to Niall Douglas via Boost, Maximilian Riemensberger

On 9/23/20 1:55 PM, Niall Douglas via Boost wrote:
> Consider this: a Hana Dusíková type all-constexpr JSON parser could let
> you specify to the compiler at compile time "this is the exact structure
> of the JSON that shall be parsed". The compiler then bangs out optimum
> parse code for that specific JSON structure input. At runtime, the
> parser tries the pregenerated canned parsers first, if none match, then
> it falls back to runtime parsing. Given that much JSON is just a long
> sequence of identical structure records, this would be a very compelling
> new C++ JSON parsing library, a whole new better way of doing parsing.
> *That* I would get excited about.

Great. There's really quite a lot of things to imagine about future C++
and libraries to be written in future C++ that get me excited.

There are also things about C++ that really don't excite me and probably
most other people as well. To name a few examples: std::vector, std::string,
etc. They are not perfect, they are not fancy, they are not even pretty.

But they are useful. Almost every day. To many, if not most C++ developers.
And they perform well. In many ordinary use-cases.

That's where I could see Boost.JSON: it's not perfect and probably also not
pretty in parser-aesthetic terms (judging from some of the review comments).
But for me it combines a simple and widely-used user interface (similar to
nlohmann's) with decent performance (similar to rapidjson). That gives me
90% of both worlds. And I get it now / soon. As a user of Json libraries,
I find this a worthwhile trade-off.

Max

Peter Dimov via Boost

unread,
Sep 23, 2020, 3:07:51 PM9/23/20
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:
> Hana's runtime benchmarks showed her regex implementation far outpacing
> any of those in the standard libraries. Like, an order of magnitude in
> absolute terms, linear scaling to load instead of exponential for common
> regex patterns. A whole new world of performance.
>
> Part of why her approach is so fast is because she didn't implement all of
> regex. But another part is because she encodes the parse into
> relationships between literal types which the compiler can far more
> aggressively optimise than complex types. So basically the codegen is way
> better, because the compiler can eliminate a lot more code.

I've looked at CTRE, I know what it does, how it does it, and how well it
performs. It is, as I said, a remarkable piece of engineering, and I respect
Hana for her work.
https://pdimov.github.io/blog/2020/05/15/its-a-small-world/

Nevertheless, I have a not-entirely-uninformed hunch that a runtime engine
can perform on par. Of course, until/unless I can substantiate this more
thoroughly, by for example writing a runtime regex engine that exhibits
similar performance to CTRE, you can file this under "idle speculation".

Tom Honermann via Boost

unread,
Sep 23, 2020, 6:36:41 PM9/23/20
to bo...@lists.boost.org, Tom Honermann, Peter Dimov
Please let me know when/if evidence becomes available that elevates this
beyond "idle speculation".

Tom.

Vinícius dos Santos Oliveira via Boost

unread,
Sep 23, 2020, 8:30:54 PM9/23/20
to Boost, Vinícius dos Santos Oliveira
On Wed, Sep 23, 2020 at 3:44 PM, Vinnie Falco via Boost
<bo...@lists.boost.org> wrote:
> [...] You owe the list an apology.

wow, quite intimidating words. calm down, fella.

I know you're under big pressure. Boost's review is no small
undertaking. And pressure does blind the best of our judgments.
However... you're being a little paranoid here. Just like you were
paranoid when you demanded a second review manager, just like you were
being paranoid when you opened this whole thread on the premise of "I
find it interesting that people are coming out of the woodwork" (and
that was your response to a **single** reject vote). That's a pattern
here.

Niall never said the process is being rigged. You're imagining it.
Niall just said people were discouraged to send any review thanks to
your past behaviours. I find that quite easy to believe, actually.
I've spent 50 hours on a review for which your answer amounted to
cheap baits. It was so depressing that I didn't even reply. And you
had my feedback... a long time ago. Your answer? "Let's put it up for
review". You never actually considered my feedback. So yeah, you're
not exactly warm to receiving feedback. I wouldn't be surprised to hear
a declaration such as Niall's.

There's no need for an apology here. Just move on (you all -- Niall
included), and try to learn something out of this event.


--
Vinícius dos Santos Oliveira
https://vinipsmaker.github.io/

Mateusz Loskot via Boost

unread,
Sep 24, 2020, 6:19:24 AM9/24/20
to bo...@lists.boost.org, Mateusz Loskot
Niall,

It is unfortunate that you decided to discuss in great detail
the issues of personal interactions with Vinnie *before* the JSON
review curtain drops.
Such pursuits just make the review manager's job (and the wizards')
much harder than it has to be.

No, I don't object to the public collective catharsis.
I just think the timing for it was very unfortunate.

A reminder to all, we are currently trying to answer a simple question:
"Do you think the [JSON] library should be accepted as a Boost library?"
based on constructive criticism and technical evidence.

Best regards,
--
The Review Wizard for Boost C++ Libraries
Mateusz Loskot, http://mateusz.loskot.net

Jeff Garland via Boost

unread,
Sep 24, 2020, 7:02:59 AM9/24/20
to Boost Developers List, Jeff Garland
On Tue, Sep 22, 2020 at 4:22 AM Hans Dembinski via Boost <
bo...@lists.boost.org> wrote:

>
>
> > * Where are the immensely popular JSON Serialization libraries?
>
> The reason is that some Boost libraries have been successfully cloned by
> outside people and the development is happening there. Boost.Python is
> replaced by pybind11 and Boost.Serialization by cereal.
>
> https://github.com/USCiLab/cereal
>
> The most obvious appeal of cereal is that it is C++11. Boost.Serialization
> has a lot of pre-C++11 code and is accordingly a bit difficult to work
> with. Both pybind11 and cereal jumped on the opportunity to rewrite popular
> Boost libraries in C++11.
>
> cereal has a JSON archive. cereal also has 2.5k stars on Github, so it is
> dramatically popular.
>
>
I believe you're correct -- we moved to cereal for json serialization
because it wasn't in Boost. In our case it gets extensive use to marshal
objects in and out of Mongo databases, web service/socket requests, config
files.

Jeff

Daniela Engert via Boost

unread,
Sep 24, 2020, 7:45:00 AM9/24/20
to bo...@lists.boost.org, Daniela Engert
On 24.09.2020 at 13:02, Jeff Garland via Boost wrote:
> On Tue, Sep 22, 2020 at 4:22 AM Hans Dembinski via Boost <
> bo...@lists.boost.org> wrote:
>
>> cereal has a JSON archive. cereal also has 2.5k stars on Github, so
>> it is
>> dramatically popular.
>>
> I believe you're correct -- we moved to cereal for json serialization
> because it wasn't in Boost. In our case it gets extensive use to marshall
> objects in and out of Mongo databases, web service/socket requests, config
> files.

We switched almost completely from Boost.Serialization over to cereal,
for similar reasons. Pre-C++11 libs get phased out from projects in
active maintenance (wherever possible) and are discouraged for new
projects where C++17 is the targeted standard.

Ciao
  Dani

--
PGP/GPG: 2CCB 3ECB 0954 5CD3 B0DB 6AA0 BA03 56A1 2C4638C5

Pranam Lashkari via Boost

unread,
Sep 24, 2020, 10:51:52 AM9/24/20
to boost, Pranam Lashkari
The formal review of JSON has now officially ended. I will take a couple of
days to compile and declare the results.

Thank you very much to everyone who has invested time to review this
library.


Vinnie Falco via Boost

unread,
Aug 8, 2022, 12:40:05 PM8/8/22
to boost@lists.boost.org List, Vinnie Falco
This was one piece of feedback posted during the Boost.JSON review of
September 2020:

> For the record, I've had offlist email discussions about proposed
> Boost.JSON with a number of people where the general feeling was that
> there was no point in submitting a review, as negative review feedback
> would be ignored, possibly with personal retribution thereafter, and the
> library was always going to be accepted in any case. So basically it
> would be wasted effort, and they haven't bothered.

Unless an impassioned on-list reply counts as "personal retribution", I
think it is safe to say that the aforementioned retribution never took
place. However, the false claim that "the library was always going to
be accepted in any case" is really harmful to the reputation of the
Boost Formal Review process.

As I believe that the review process is a vital piece of social
technology that has made the Boost C++ Library Collection the best of
breed, I'd like to avoid having the review of the upcoming proposed
Boost.URL submission tainted with similar aspersions.

Therefore let me state unequivocally, I have no interest in
persecuting individuals for criticizing my library submissions. In
fact I welcome negative feedback as it affords the opportunity to make
the library better - regardless of who is providing the feedback. I am
very happy to hear criticisms of my libraries even from those
individuals who are actively hostile.

However I do have an interest in vigorously opposing bad ideas, such
as this one which was tacked on to the end of the message quoted
above:

> Consider this: a Hana Dusíková type all-constexpr JSON parser could
> let you specify to the compiler at compile time "this is the exact
> structure of the JSON that shall be parsed". The compiler then
> bangs out optimum parse code for that specific JSON structure
> input. At runtime, the parser tries the pregenerated canned parsers
> first, if none match, then it falls back to runtime parsing

The totality of the experience gained in developing Boost.JSON
suggests that this proposed design is deeply flawed. The bulk of the
work in achieving performance comparable to RapidJSON went not
into the parsing but into the allocation and construction of the DOM
objects during the parse. This necessitated a profound coupling
between parsing and creation of json::value objects.

I realize of course that this will invite contradictory replies ("all
you need to do is...") but as my conclusion was achieved only after
months of experimentation culminating in the production of a complete,
working prototype, I would just say: show a working prototype then
let's talk.

Regards

Gavin Lambert via Boost

unread,
Aug 8, 2022, 11:48:49 PM8/8/22
to bo...@lists.boost.org, Gavin Lambert
On 9/08/2022 04:39, Vinnie Falco wrote:
> This was one piece of feedback posted during the Boost.JSON review of
> September 2020:

It does seem a bit peculiar to bring this up again two years later.

(Also FWIW because this was a reply it ends up buried deep in the old
thread, where some people may overlook it.)

> As I believe that the review process is a vital piece of social
> technology that has made the Boost C++ Library Collection the best of
> breed, I'd like to avoid having the review of the upcoming proposed
> Boost.URL submission tainted with similar aspersions.
[...]
> I realize of course that this will invite contradictory replies ("all
> you need to do is...") but as my conclusion was achieved only after
> months of experimentation culminating in the production of a complete,
> working prototype, I would just say: show a working prototype then
> let's talk.

These two positions seem at odds -- you're inviting and encouraging
review, but then trying to set an extremely high bar ("implement at
least a skeletal competing library first") for that review to be
considered worthwhile. You can't have it both ways.

While granted, "why not do it like X?" can be annoying when you did
already consider that and found it didn't work for whatever reason (and
even more so if you hadn't considered it and it's actually better, but
you're a long way down a different path); the proper response is not to
dismiss it but to interpret this as feedback that your documentation
does not sufficiently clearly explain why you didn't do it like X.

John Maddock via Boost

unread,
Aug 9, 2022, 4:00:10 AM8/9/22
to Vinnie Falco via Boost, John Maddock
On 08/08/2022 17:39, Vinnie Falco via Boost wrote:
> This was one piece of feedback posted during the Boost.JSON review of
> September 2020:
>
>> For the record, I've had offlist email discussions about proposed
>> Boost.JSON with a number of people where the general feeling was that
>> there was no point in submitting a review, as negative review feedback
>> would be ignored, possibly with personal retribution thereafter, and the
>> library was always going to be accepted in any case. So basically it
>> would be wasted effort, and they haven't bothered.

I'm concerned that folks feel that way: Boost has always had a robust
and frankly sometimes bruising review process, but IMO we have ended up
with better libraries as a result.  So I hope everyone will feel free to
submit reviews as they feel fit.


> I realize of course that this will invite contradictory replies ("all
> you need to do is...") but as my conclusion was achieved only after
> months of experimentation culminating in the production of a complete,
> working prototype, I would just say: show a working prototype then
> let's talk.

We are at heart empiricists.  Working code always triumphs!

Best, John.

Vinnie Falco via Boost

unread,
Aug 9, 2022, 9:34:49 AM8/9/22
to bo...@lists.boost.org, Vinnie Falco, Gavin Lambert
On Mon, Aug 8, 2022 at 8:48 PM Gavin Lambert via Boost
<bo...@lists.boost.org> wrote:
> While granted, "why not do it like X?" can be annoying when you did
> already consider that and found it didn't work for whatever reason

My esteemed colleague Darrell Wright enlightened me as to the meaning
of the aforementioned jargonspeaux. It wasn't a matter of "it
didn't work" but rather that there are several different flavors of
JSON libraries which are mutually incompatible in terms of API: DOM,
user-type based, and in-situ (SIMDJSON) come to mind.
offered is the DOM variety. The other types are perfectly valid and
useful, I just felt that I personally was both lacking in knowledge
and insufficiently enthusiastic to also deliver the other flavors.
There is still plenty of room in Boost for the type of JSON library
that Niall describes, which is to go directly to and from user defined
types and serialized JSON. I still assert that these different
approaches to "implementing JSON" each belong in their own library -
because optimizing for one case necessarily disadvantages the others.

But the point I was trying to make originally had nothing to do with
the particulars of the various JSON approaches. Rather, the point is
that we must be ever-vigilant never to conflate vigorous and spirited
technical debate with "personal retribution", because this false
accusation dilutes the value and assaults the reputation of the review
process.

Thanks