On Friday, October 30, 2015 at 9:41:39 PM UTC-4, Öö Tiib wrote:
>
> Indeed I was interested in context. Context here seems to be that it is a
> deserializer? I actually still don't know why the object to deserialize
> is a data member (component) of the deserializer. It seems that it is
> so for achieving that deserialization and transfer of deserialization
> results can then happen as separate steps/calls. May be that is useful
> but it is not immediately apparent why.
The rationale is to decouple parsing from construction of the in-memory representation. The parser feeds parse events to an input handler, and the input handler handles them. This way we can have different parsers that generate json parse events, e.g. a csv_reader that knows how to parse csv files or a json_reader that knows how to parse json files, and use the same input handler, json_deserializer, to construct an in-memory representation. Or, if someone wants to use my csv_reader to read a csv file but has no interest in my json representation, they can implement a custom input handler that produces, say, a jsoncpp or rapidjson object. There is also an opportunity to introduce filters and validators between the source and the destination. This borrows from the ideas of SAX parsers in the XML space.
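
To make the event flow concrete, here is a minimal sketch of the idea. It is not the library's actual interface: input_handler, event_recorder, skip_empty_filter and read_csv are hypothetical names invented for this illustration, and the toy CSV reader ignores quoting entirely.

```cpp
#include <sstream>
#include <string>

// Hypothetical event interface (illustrative only, not the real library API).
struct input_handler {
    virtual ~input_handler() = default;
    virtual void begin_array() = 0;
    virtual void end_array() = 0;
    virtual void string_value(const std::string& s) = 0;
};

// One possible destination: records the events as text instead of
// building an in-memory tree.
struct event_recorder : input_handler {
    std::string log;
    void begin_array() override { log += "["; }
    void end_array() override { log += "]"; }
    void string_value(const std::string& s) override { log += "'" + s + "'"; }
};

// A filter sits between source and destination and forwards only the
// events it wants to keep (here: it drops empty strings).
struct skip_empty_filter : input_handler {
    explicit skip_empty_filter(input_handler& next) : next_(next) {}
    void begin_array() override { next_.begin_array(); }
    void end_array() override { next_.end_array(); }
    void string_value(const std::string& s) override {
        if (!s.empty()) next_.string_value(s);
    }
private:
    input_handler& next_;
};

// Toy CSV source: emits one outer array, one inner array per line, and
// one string event per comma-separated field. No quoting or escaping.
inline void read_csv(const std::string& text, input_handler& handler) {
    std::istringstream lines(text);
    std::string line;
    handler.begin_array();
    while (std::getline(lines, line)) {
        handler.begin_array();
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ',')) handler.string_value(field);
        handler.end_array();
    }
    handler.end_array();
}
```

The source knows nothing about what is built from the events, so the same read_csv can drive a recorder, a filter chained in front of a recorder, or a handler that builds some other library's object.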
>
> Parsers typically produce/allocate the objects during parsing and
> transfer ownership right as result (return value) of parsing. Less
> popular alternative is to pass the object as IN-OUT parameter to parser.
>
Here the parser only produces parse events; the input handler consumes them and either constructs something or filters them and passes them along. An alternative design would be to pass an IN-OUT parameter to the json_deserializer.
> You deserialize into 'basic_json' that looks somewhat like 'boost::variant'.
> Real 'boost::variant' does some more tricks to reduce levels of indirection;
> it may perform slightly better (less news/deletes). Also it does some
> tricks to achieve more compile-time optimizations and type-safety but that
> all is perhaps beyond the point.
>
Indeed, boost::variant would be an option. I checked its memory footprint (a big concern, since these things become heavily nested), and for the same types that are in my union, the footprint is no larger.
One problem with using boost for a utility like this, though, is that some of my current users wouldn't want it and would drop my library if I did that. Unfortunately, introducing dependencies on other libraries isn't always appreciated, so I only use boost for the test suite. I did look into using a boost-like implementation for the variant, but I got a headache from looking at the boost headers :-). I think I understand what they're doing, but for now I want to keep it simpler. There would be a bigger benefit if I could fit whole objects (string, array, associative map) into the variant, but that blows up the memory too much for null, bool, double, long long. Since the number of types to be supported is not large, it's not difficult to cover the safety concerns with a blanket of unit tests.
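
As a rough illustration of the trade-off, here is a sketch of a tagged union that keeps the small types inline and holds the large ones through a pointer. It assumes that is roughly how the union is laid out; the class name value and its members are hypothetical, and only a string is shown as the "large" type to keep it short.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical tagged union: small values inline, large values on the heap.
class value {
public:
    enum class kind : std::uint8_t { null_t, bool_t, integer_t, double_t, string_t };

    value() : kind_(kind::null_t) { data_.i = 0; }
    explicit value(bool b) : kind_(kind::bool_t) { data_.b = b; }
    explicit value(long long i) : kind_(kind::integer_t) { data_.i = i; }
    explicit value(double d) : kind_(kind::double_t) { data_.d = d; }
    explicit value(const std::string& s) : kind_(kind::string_t) {
        data_.s = new std::string(s);  // only a pointer lives in the union
    }

    value(const value& other) : kind_(other.kind_), data_(other.data_) {
        if (kind_ == kind::string_t) data_.s = new std::string(*other.data_.s);  // deep copy
    }
    value(value&& other) noexcept : kind_(other.kind_), data_(other.data_) {
        other.kind_ = kind::null_t;  // the moved-from object no longer owns the string
    }
    value& operator=(value other) noexcept { swap(other); return *this; }
    ~value() { if (kind_ == kind::string_t) delete data_.s; }

    void swap(value& other) noexcept {
        std::swap(kind_, other.kind_);
        std::swap(data_, other.data_);
    }

    kind type() const { return kind_; }
    long long as_integer() const { assert(kind_ == kind::integer_t); return data_.i; }
    const std::string& as_string() const { assert(kind_ == kind::string_t); return *data_.s; }

private:
    union storage {
        bool b;
        long long i;
        double d;
        std::string* s;  // a std::string member here would enlarge every value
    };
    kind kind_;
    storage data_;
};
```

With this layout, a null or bool costs no more than a tag plus one pointer-sized slot (typically 16 bytes on a 64-bit platform), at the price of one extra allocation and indirection for strings. Storing std::string directly in the union would remove the indirection but make every value at least as large as a std::string.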
> >
> > But I don't think that that context matters very much. Yes, some things you can build up with T:T(...), and other things you can't. My question had to do with the latter, and there's not much else to be said about it.
>
> Context can always matter since we have no silver bullets that are
> equally good in every context. If you want to keep your deserialization as
> two steps and result of it as by-value data member then you may want
> to consider:
>
> T t = builder.give_result(); // returns r-value reference to result
>
> instead of:
>
> T t = std::move(builder.result); // result is public data member
>
> Both technically move from data member but former is simpler to
> extend or to change.
That's actually not a bad suggestion :-)
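
For the record, a minimal sketch of what I understand the suggestion to be. The builder class, its add() method and the element type are made up for the example; only give_result() comes from the suggestion above.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical builder that accumulates a result as a by-value data member.
class builder {
public:
    void add(std::string item) { result_.push_back(std::move(item)); }

    // Returns an rvalue reference to the member, so that
    //   T t = b.give_result();
    // move-constructs t from it. The member is left in a valid but
    // unspecified state afterwards.
    std::vector<std::string>&& give_result() { return std::move(result_); }

private:
    std::vector<std::string> result_;  // no longer a public data member
};
```

Since callers go through the function, the member can later be renamed, built lazily, or reset after the hand-off without touching any call site.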
Thanks for your feedback.
Daniel