Sending "raw" values


Walter Both

Sep 8, 2020, 8:15:08 AM
to Jansson users
Would it be possible to add a per-field property that controls whether a string value is surrounded by ' " ' when it is serialized? Then we could pass numbers in as strings and still have them treated as numbers in the output. We could also send a string like 'undefined' without the enclosing ' " '.

Graeme Smecher

Sep 8, 2020, 12:47:24 PM
to jansso...@googlegroups.com
Hi Walter,
I think both this and your previous post are aiming at similar goals. If
so, I think I'm responding to both e-mails here.

Personally, I think a custom serialization/deserialization layer for
Jansson would be interesting, BUT would be a fairly fundamental
departure from Jansson's current approach. The JSON spec specifies
string encodings but does not mandate a particular approach to
deserialized encodings. It does acknowledge that double-precision floats
are an obvious choice and allows room for implementation-specific
limitations.

So, in terms of the encoding:

* There is room in the JSON spec for a more flexible interpretation of
what a number is and how to interpret one from a string. However,

* Double-precision floats are the de-facto standard for JSON encoders and
decoders. Even if you coax Jansson into producing the JSON you want, the
decoder at the other side is likely to dump your data back into
double-precision floats with all their wonderful surprises.
(https://floating-point-gui.de/basic/) If this is to be a public API,
interoperability is essential and depending on features of an unusual
JSON implementation is problematic.

* Fundamentally, you may be conflating encoding (double-precision floats
serialized into a string) with presentation (two decimal places for a
currency). There is a wide body of good advice and best practices
on-line, especially regarding safe handling of currency. You may find
your specific questions are no longer relevant after backing up a few
steps and checking with the assumptions you're making by asking them.
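
The decoder-side concern in the second bullet can be illustrated with a minimal sketch. It uses Python's standard json module rather than Jansson, and the payload is made up for illustration. A default decoder drops the number into a double, while a decoder that is explicitly told to keep decimals preserves the text as sent; a public API cannot assume the second behaviour.

```python
import json
from decimal import Decimal

# Hypothetical payload: a price with more digits than a double can hold.
payload = '{"price": 0.1000000000000000055}'

# Default decoding: the number lands in a double and the extra digits are lost.
as_double = json.loads(payload)["price"]

# Opt-in decoding: parse_float receives the raw number text and keeps it exact.
as_decimal = json.loads(payload, parse_float=Decimal)["price"]

print(repr(as_double))   # 0.1
print(as_decimal)        # 0.1000000000000000055

# Presentation is a separate step from encoding: two places for display.
print(format(as_double, ".2f"))   # 0.10
```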

In terms of the implementation:

* You are aiming at a specific problem and using Jansson to solve it.
That's OK, but if it requires changes to Jansson someone has to do the
work. I don't think it will happen without an enthusiastic and
experienced coder. Please consider if that's you.

* Changes to the serializer imply mirrored changes to the parser. These
are tricky: you're proposing JSON structures augmented with some kind of
type annotations that tell the serializer how to manage them. The parser
has to load from strings without any such annotations. It's not clear
(to me, offhand) how to inject this information in an idiomatic way.
Something like JSON Schema seems necessary but it's currently out of
scope for Jansson.

* The impacts on the upstream API are also likely sprawling. Before you
dive in and start writing code it will be essential to understand (and
get buy-in for) the scope of the changes you're proposing. This is meant
to save you time and suffering and I think it's best done via e-mail.

* Finally: if you are looking for a JSON library that's easy to twist
into the shape you need, a streaming design might be a better starting
point. Jansson's object model gives you a friendly API but requires
fairly profound changes to get what you want.

All of this is not meant to discourage you, but to let you know the
likely landscape for your suggestion. I don't speak for Jansson's
maintainer but I've been around long enough I don't expect I'm far off here.

best,
Graeme Smecher

Petri Lehtinen

Sep 8, 2020, 2:26:07 PM
to jansso...@googlegroups.com
Thanks Graeme, well said.

I would like to add that if you still want to customize the serialization the
way you described, I would suggest writing the code that generates the JSON
yourself. Generating valid JSON is not such a big deal really, especially
because it seems you fully control the input.
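
As a rough sketch of what hand-generating the JSON could look like, here is one way to splice a price in as an unquoted number literal. It is written in Python for brevity even though the backend in question is C; the helper names `price_literal` and `invoice_line` and the field names are hypothetical, and the scaled-integer input follows the representation described elsewhere in this thread.

```python
import json

def price_literal(amount: int, decimals: int) -> str:
    """Render a scaled integer (e.g. 1999 with 2 decimals) as a JSON number literal."""
    sign = "-" if amount < 0 else ""
    # Pad so there is always at least one digit before the decimal point.
    digits = str(abs(amount)).rjust(decimals + 1, "0")
    if decimals == 0:
        return sign + digits
    return f"{sign}{digits[:-decimals]}.{digits[-decimals:]}"

def invoice_line(description: str, amount: int, decimals: int) -> str:
    # json.dumps takes care of string escaping; the price is spliced in unquoted.
    return '{"description": %s, "price": %s}' % (
        json.dumps(description), price_literal(amount, decimals))

line = invoice_line('Widget "XL"', 1999, 2)
print(line)   # {"description": "Widget \"XL\"", "price": 19.99}
```

The output is valid JSON with exactly the digits the sender chose, but a receiver that parses numbers into doubles will still see an approximation.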

Petri

Walter Both

Sep 8, 2020, 3:51:29 PM
to Jansson users
Thanks Graeme and Petri for your quick answers.

As you guessed correctly Graeme, I am looking primarily at having more control over double precision numbers with all their unexpected behaviour. Our backend application is written in C, so we know all too well what practical implications can show up if you use double precision numbers for storing prices. For that reason we use the common solution for prices in our backend: we store them in integers, and separately we store how many decimals they contain. Of course we could create a similar representation in JSON, but it requires extra documentation and explanation for developers connecting to our API. Sending the prices as floats would be more self-explanatory. But as said, with double precision numbers we would need more control over the number of decimals used in the 'printf' call in the function 'json_dumps'. That would (for a large part?) hide the rounding errors. Otherwise we probably have to define the prices as strings, but that looks less attractive and elegant. On the other hand, you are right to mention that the receiving end will still have the problem of not being able to store that value exactly in a double.

We are a group of a couple of C coders with quite some years of experience, so we could make the changes. The primary aim of the question is to ask whether this change would break the specs to which JSON should conform and, if we pushed the changes, whether they might be considered for acceptance or would be rejected out of hand.

P.S.
Do you have any insight into which approach people often take when they need to put double precision values in JSON, given the problems described at https://floating-point-gui.de/basic/ ?

Graeme Smecher

Sep 14, 2020, 1:11:23 PM
to jansso...@googlegroups.com
Hi Walter,

Apologies for the delay. I drafted a response here and sat on it longer
than I meant to. You'll see why: it is far out of bounds for a mailing
list about Jansson, and I'm out of my domain here.

Please be cautious. Waltzing through a minefield is a bad idea, even if
someone has told you it's probably do-able.

Comments inline.

On 2020-09-08 12:51 p.m., Walter Both wrote:
> Thanks Graeme and Petri for your quick answers.
>
> As you guessed correctly Graeme, I am looking primarily at having more
> control over double precision numbers with all their unexpected
> behaviour. Our backend application is written in C, so we know all too
> well what practical implications can show up if you use double precision
> numbers for storing prices. For that reason we use the common solution
> for prices in our backend: we store them in integers, and separately we
> store how many decimals they contain. Of course we could create a
> similar representation in JSON, but it requires extra documentation and
> explanation for developers connecting to our API. Sending the prices as
> floats would be more self-explanatory. But as said, with double precision
> numbers we would need more control over the number of decimals used in
> the 'printf' call in the function 'json_dumps'. That would (for a large
> part?) hide the rounding errors. Otherwise we probably have to define the
> prices as strings, but that looks less attractive and elegant. On the
> other hand, you are right to mention that the receiving end will
> still have the problem of not being able to store that value exactly
> in a double.

Here's as concise a statement of the problem as I can make: consider a
fraction like 0.10. As a double, this has the approximate encoding

0x3fb999999999999a

The repeating binary digit is visible in the trailing .....999999a's: if
you replaced the double with a longer floating-point format it would
keep trailing 9's forever. Depending on your fraction, the repeating
pattern may be longer or shorter (or your encoding may be exact, like 0.25.)
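
For anyone who wants to check the bit pattern themselves, here is a small sketch using only Python's standard library. It reinterprets the bytes of the double and then prints the exact decimal value that the double stores.

```python
import struct
from decimal import Decimal

# Reinterpret the 8 bytes of the double 0.10 as a big-endian unsigned integer.
bits = struct.unpack(">Q", struct.pack(">d", 0.10))[0]
print(hex(bits))          # 0x3fb999999999999a

# Decimal(float) converts exactly, exposing the value the double really holds.
stored = Decimal(0.10)
print(stored)             # slightly more than 0.1
print(stored == Decimal("0.10"))   # False

# 0.25 is a power of two, so its stored value is exact.
print(Decimal(0.25) == Decimal("0.25"))   # True
```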

You are trying to avoid any user-visible impacts from this
inaccuracy in your JSON. Your back-end implementation uses safer design
practices, so this may be the first time your currencies are coaxed into
a floating-point number at all.

Even if I can't precisely encode decimals like 0.10 in floating point, I
(and most double-backed JSON implementations) will serialize the numbers
the way you want anyhow. Jansson and Python do the same thing here:

>>> import json
>>> json.dumps(0.10)
'0.1'

You just have to be aware of nearby encodings:

>>> import json
>>> import numpy as np
>>> json.dumps(0.01 + 0.09)
'0.09999999999999999' # ....oh no

>>> json.dumps(np.round(0.01 + 0.09, 2))
'0.1' # ...ok

This works for dimes (0.10) and pennies (0.01). Other fractions have
different approximations, so e.g. internationalization adds its own
problems.

Increasing the absolute size of your price also eventually wrecks the
encoding ($1000000000000000000.01 is problematic) because floating-point
has finite dynamic range. If your quantities are bounded this may be
avoidable.
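
A quick sketch of the magnitude problem, again in plain Python. The price is the one from the paragraph above, and the 2**53 line is an extra illustration of where whole numbers stop being exact.

```python
import json

# The literal has more significant digits than a double can hold,
# so the trailing cent is gone as soon as the value is parsed.
price = 1000000000000000000.01
print(price == 1e18)          # True
print(json.dumps(price))      # 1e+18

# Above 2**53, not even every whole number is representable.
print(float(2**53 + 1) == float(2**53))   # True
```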

Finally, you are relying on more than just Jansson for consistency here:
your float-to-string conversion is ultimately part of libc, so you're
exposing yourself to an unexpected impact from a deeper part of the
runtime than seems logical at first.

But: you may be able to coax Jansson into serializing quantities
appropriately without deep changes to Jansson itself, just by
understanding what you're doing and doing it carefully. You'll need to
pre-round your prices (similar to the last Python example) so trailing
digits don't creep in. You will want to be very deliberate about your
regression testing so that the assumptions you're making during design
are not violated later on.
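
A minimal sketch of the pre-rounding and regression-testing idea, in Python rather than C and using the built-in round instead of NumPy. The helper names `dump_price` and `check_all_cents` and the limit of 100,000 cents are assumptions chosen for illustration; a real test would cover the range and number of decimals your system actually uses, and would run against Jansson on each libc you ship with.

```python
import json
from decimal import Decimal

def dump_price(value: float, decimals: int = 2) -> str:
    """Pre-round a double before serializing so stray trailing digits do not leak."""
    return json.dumps(round(value, decimals))

def check_all_cents(limit_cents: int) -> None:
    """Regression check: every cent amount below the limit must survive serialization."""
    for cents in range(limit_cents):
        expected = Decimal(cents) / 100
        text = dump_price(cents / 100)
        # Compare the emitted text as an exact decimal, not as a double.
        assert Decimal(text) == expected, (cents, text)

print(dump_price(0.01 + 0.09))   # 0.1
check_all_cents(100_000)
```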

Thoughts?

> We are a group of a couple of C coders with quite some years of
> experience, so we could make the changes. The primary aim of the
> question is to ask whether this change would break the specs to which
> JSON should conform and, if we pushed the changes, whether they might
> be considered for acceptance or would be rejected out of hand.

We're now switching gears to consider a contribution to Jansson. I
recommend you start with the "bignum" bug report here:

https://github.com/akheron/jansson/issues/69

This is a very closely related set of problems to yours. The fact
that this bug report is open (and was opened by Jansson's primary
developer) indicates that we're eager for a well-crafted bignum patch.
OTOH it's still open, hinting it's non-trivial and that some very strong
coders have not been successful yet.

It is possible (but not guaranteed) that a bignum patch would give you
the hooks you need. It's also possible this is the wrong approach for
you, but that these materials are a good starting point anyways.

> P.S.
> Do you have any insight into which approach people often take when they
> need to put double precision values in JSON, given the problems
> described at https://floating-point-gui.de/basic/ ?

Facetiously: denial, anger, bargaining, depression, and acceptance. I
hope the approach I first sketched above falls into the "acceptance"
category. There is certainly nothing tidy about pushing your currencies
into strings. Strings are more future-proof for internationalization. On
the other hand, strings are a wholesale dodge around serialization and
do not offer the most friendly or logical API.

best,
Graeme

Walter Both

Sep 16, 2020, 4:07:00 AM
to Jansson users
Hello Graeme,

Thank you very much for your elaborate answer and the energy you put in.

This mostly confirms what we already suspected. We have over 25 years of coding experience. During that time we have had our moments of fun with unexpected behaviour of floating point numbers as you described.
For that reason we designed our database/business logic in such a way that prices are mainly stored as integers, as I mentioned. We could define our prices in exactly the same way in JSON: as an object containing one field with the number and another field with the number of decimals needed. This could work conveniently for the REST APIs we use internally. If we have to send data to another party, we could send prices as strings. We generate those strings with our business logic, and that logic does the correct rounding to circumvent all the problems mentioned. We already do that when we send invoices to a REST API system for customs declarations; they defined the prices as strings.
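
To make the two representations concrete, here is a rough Python sketch. The field names `amount` and `decimals`, the keys `internal` and `external`, and the helper functions are hypothetical and only illustrate the idea; our actual backend is in C.

```python
import json
from decimal import Decimal

def price_object(amount: int, decimals: int) -> dict:
    # Internal representation: scaled integer plus the number of decimals.
    return {"amount": amount, "decimals": decimals}

def price_string(amount: int, decimals: int) -> str:
    # External representation: an exact decimal string, e.g. "19.90".
    return format(Decimal(amount).scaleb(-decimals), "f")

def parse_price_object(obj: dict) -> Decimal:
    # The receiver rebuilds the exact value without touching a double.
    return Decimal(obj["amount"]).scaleb(-obj["decimals"])

wire = json.dumps({
    "internal": price_object(1990, 2),
    "external": price_string(1990, 2),
})
print(wire)

decoded = json.loads(wire)
print(parse_price_object(decoded["internal"]))   # 19.90
print(Decimal(decoded["external"]))              # 19.90
```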

The only 'quick and dirty' solution I can see is a new method for adding string values to the JSON: we would add the prices as strings, correctly rounded by our software, and the serializer would omit the enclosing ' " ' around the values in that field. But the receiving end would still need to be aware that the numbers received are not always exactly representable in binary. As you mentioned, this is always an annoying aspect of working with floats. In all my years as a coder, the only ways I have encountered to tackle this problem are workarounds.

Regards,
Walter