Re: [fava] Regression (?) Number formatting lost (#205)

244 views
Skip to first unread message

Martin Blais

unread,
Mar 12, 2016, 1:09:55 PM3/12/16
to aumayr/fava, aumayr/fava, Beancount
I have a lot to say in this thread, and to be honest I'm not following all the details from your comments.
Let me provide some context and terminology to explain what might be going on.

First, it's important to realize how these numbers are represented in memory. They are using the Decimal representation which beyond being able to accurately representing decimal numbers (as opposed to the approximation that binary floats provides) also contains a specific precision. That is, the number 2.00 is represented differently than the numbers 2.0 and 2.000. The numbers "remember" which precision they are represented up to. This is important. When I say rendering the numbers to their "natural precision" I mean the precision with which they are represented, i.e., 2.0 renders as "2.0", 2.000 renders as "2.000".

Then, there are two DISTINCT topics: (1) tolerances, and (2) precision.
- "Tolerances" are values used to determine how much imprecision is acceptable in balancing transactions. This is used in the verification stage, to determine how much looseness to allow. It should not affect how numbers are rendered.
- "Precision" is perhaps a bit of a misnomer: By that I'm referring to is how many digits the numbers are to be rendered with.

Once upon a time - after the shell was already written - these concepts weren't well defined in Beancount and I wasn't dealing with these things consistently. At some point it became clear what I needed to do and I created a class called "DisplayContext" which could contain appropriate settings for rendering the precision of numbers for each currency (each currency tends to have its own most common rendering precision, e.g. two digits for USD, one digit for MXN, no digits for JPY and in reports we're typically fine rounding the actual numbers to that precision). So an instance of this DisplayContext is automatically instantiated in the parser and in order to avoid the user having to set these values manually - for Beancount to "do the right thing" by default - it is able to accumulate (https://bitbucket.org/blais/beancount/src/c349150908078720c7e5ee6bfda644b51ac9543e/src/python/beancount/core/display_context.py?at=default&fileviewer=file-view-default#display_context.py-190) the numbers seen and to deduce the most common and maximum number of digits used from the input, and to use that as the default number of digits for rendering numbers. The most common format/number of digits is used to render the number of units, and the maximum number of digits seen is used to render costs and prices. In addition, this class also has capabilities for aligning to the decimal dot and to insert commas on thousands as well. It separates the control of the formatting from the numbers themselves.

MOST of the code that renders numbers uses the DisplayContext (via the to_string() methods) to convert the numbers into strings, such as the web interface and explicit text reports. But NOT ALL... there's a bit of HISTORY here... the SQL shell uses some old special-purpose code (https://bitbucket.org/blais/beancount/src/c349150908078720c7e5ee6bfda644b51ac9543e/src/python/beancount/query/query_render.py?at=default&fileviewer=file-view-default#query_render.py-166) to render numbers that I never bothered to convert to the DisplayContext class. There's a TODO item here: https://bitbucket.org/blais/beancount/src/c349150908078720c7e5ee6bfda644b51ac9543e/TODO?at=default&fileviewer=file-view-default#TODO-312. It needs to get converted at some point, but I've neglected doing this so far because I have much bigger plans for the SQL query engine that involve a full rewrite of it with many improvements and I figured I'd do that then. If you recall, the SQL query engine was a prototype, and actually it works, but it is not well covered by unit tests. My purpose with it was to discover through usage what would be useful and to then write a v2 of it that would be much better.

Now, about that PRINT command... this is not intended as a reporting tool. The printer's purpose is to print input that accurately represents the content of the transactions. In order to do this, it needs to render the numbers at their "natural" precision, so that when they get read back in, they parse into the very same number, that is, with the same number of digits (even if zeros). For this reason, the PRINT command does not attempt to render using the DisplayContext instance derived from the input file - this is on purpose. I could change that, but then round-trip would break: the rounding resulting from formatting using the display context may output transactions which don't balance anymore.

As you can see, it's not an obvious topic... Hopefully this should allow you to understand what comes out of Beancount in terms of the precision of the numbers it renders.

Note: "default_tolerances" has been renamed to "inferred_tolerance_default" recently because the name was too general and confusing. Old name will work but generate a warning.

I just noticed from your comments and some grepping around that the "render_commas" option is not used anymore. I'm not sure how that happened, but I'll go ad fix that right away and set the default value of the DisplayContext derived from the input.

I should probably also convert the SQL shell rendering to use the display context regardless of future plans, so that it renders consistently with all the rest. Not sure I can do that this weekend, but I'll log a ticket.

I hope this helps. You're welcome to ask questions if the above isn't clear. I'm sorry if this isn't entirely obvious... there's been a fair bit of history there and there's a lot of code. I should review the naming of options, I think the tolerance options all have "tolerance" in their name, but there aren't options to override the rendering and when I add them they should all have a common name as well.




On Sat, Mar 12, 2016 at 5:38 AM, Jakob Schnitzer <notifi...@github.com> wrote:

use the previous behavior (. as thousands separator) as the default option?

The problem is that beancount already sets the default for us, so we if no way of knowing if the user didn't set the render_commas option or they intentionally set it to false. There is no code in beancount actually using/respecting that option, so if we want to change the default, I guess we would have a good cause. I don't really care what the default is.

@blais: What's your position regarding render_commas? Is it intended to be used in more places in the beancount source as well at some point?


Reply to this email directly or view it on GitHub.


Martin Blais

unread,
Mar 12, 2016, 2:39:56 PM3/12/16
to Martin Blais, aumayr/fava, aumayr/fava, Beancount
I just fixed the bug with render_commas not being honored.
It should now affect the web interface and the PRINT command.

I think I'm not rendering the cost and price consistently with maximum precision everywhere though, I'll need to review this at some point later.


Martin Blais

unread,
Mar 12, 2016, 2:56:11 PM3/12/16
to Martin Blais, aumayr/fava, aumayr/fava, Beancount
I just realized further that the reports under beancount.report aren't honoring the display context either.
I need to complete this work and just use it consistently throughout.

I'll do this later... I need to wrap up the ingest branch.
This will take a while.



Martin Blais

unread,
Mar 12, 2016, 3:15:23 PM3/12/16
to Martin Blais, aumayr/fava, aumayr/fava, Beancount
To be specific: Here I mean the "text report".
That same code is used to render the web reports, which do honor it.

Martin Blais

unread,
Mar 12, 2016, 6:22:38 PM3/12/16
to Martin Blais, aumayr/fava, aumayr/fava, Beancount
I just fixed the balances and income text reports:

At some point I'll review the entire reports code to ensure the DisplayContext is used absolutely everywhere.
Apologies for the confusion.

Reply all
Reply to author
Forward
0 new messages