reds capgains classifier: div by zero

79 views
Skip to first unread message

Eric Altendorf

unread,
Aug 12, 2024, 7:38:22 PM8/12/24
to bean...@googlegroups.com
I have occasionally gotten division-by-zero errors from this plugin; I finally decided to look into what's going on.  I am having trouble understanding this code from long_short.py:

            # ensure our replacement postings sum up to the original capital gains postings we removed
            diff = orig_sum - (short_gains + long_gains)
            # divide this diff among short/long. TODO: warn if this is over tolerance threshold, because it
            # means that the transaction is probably not accounted for correctly
            if abs(diff) >= entry.meta['__tolerances__'][p.units.currency]:
                total = short_gains + long_gains
                short_gains += (short_gains/total) * diff
                long_gains += (long_gains/total) * diff

A couple specific questions:

- How could the replacement postings not add up to the original posting?
- If they do, do we really want to be robust to this case, or should that "TODO" really be TODO: make this a fatal error?

Assuming that there are good reasons to enter this block of code to divvy up unaccounted for gains/losses, the following seem like problems:

- If short_gains and long_gains are both zero, this code will fail with div-by-zero.
- If short_gains and long_gains are equal and opposite, this code will also fail.

Thoughts?

Red S

unread,
Aug 20, 2024, 12:28:28 AM8/20/24
to Beancount
Hi there,

I have occasionally gotten division-by-zero errors from this plugin; I finally decided to look into what's going on.  I am having trouble understanding this code from long_short.py:

Good to know. These seem like valid edge cases you ran into, even if rare, that the code doesn't handle. I whipped up this plugin for a temporary need, but have used it permanently since then, and surprisingly haven't run into these cases (yet).
 
            # ensure our replacement postings sum up to the original capital gains postings we removed
            diff = orig_sum - (short_gains + long_gains)
            # divide this diff among short/long. TODO: warn if this is over tolerance threshold, because it
            # means that the transaction is probably not accounted for correctly
            if abs(diff) >= entry.meta['__tolerances__'][p.units.currency]:
                total = short_gains + long_gains
                short_gains += (short_gains/total) * diff
                long_gains += (long_gains/total) * diff

A couple specific questions:

- How could the replacement postings not add up to the original posting?

Rounding errors from various sources.
 
- If they do, do we really want to be robust to this case, or should that "TODO" really be TODO: make this a fatal error?

I believe a fatal error beyond a certain threshold is a good idea. Whether the threshold is the same as the tolerance, some multiple of it, or a fixed amount is something I meant to figure out but haven't yet. Typically, brokerages (mine at least) do annoying things like maintain 4 digits of precision internally but expose only two rounded digits in their ofx/csv, and thing of that nature. This leads to several cents that I automatically book to a "Rounding-Error" account in my importers. This I find, works very well in practice. A rounding error of this nature is typically well above the transaction's tolerance, and will need to be handled gracefully by this long_short plugin.

Assuming that there are good reasons to enter this block of code to divvy up unaccounted for gains/losses, the following seem like problems:

- If short_gains and long_gains are both zero, this code will fail with div-by-zero.
- If short_gains and long_gains are equal and opposite, this code will also fail.

Agree, rare but valid edge cases. PR welcome if you get around to fixing it :).

Eric Altendorf

unread,
Aug 22, 2024, 12:31:02 PM8/22/24
to bean...@googlegroups.com
On Mon, Aug 19, 2024 at 9:28 PM Red S <redst...@gmail.com> wrote:
Hi there,

I have occasionally gotten division-by-zero errors from this plugin; I finally decided to look into what's going on.  I am having trouble understanding this code from long_short.py:

Good to know. These seem like valid edge cases you ran into, even if rare, that the code doesn't handle. I whipped up this plugin for a temporary need, but have used it permanently since then, and surprisingly haven't run into these cases (yet).

I think what's happening for me is stablecoins -- if you buy crypto with stablecoins (e.g. USDT) -- it's a taxable disposal of a non-USD asset.  It's just that the price (USDT/USD) tends to only fluctuate from say 0.9994 to 1.0007.  In such a tight range, you can end up selling the USDT at basically the same price you bought.

 
            # ensure our replacement postings sum up to the original capital gains postings we removed
            diff = orig_sum - (short_gains + long_gains)
            # divide this diff among short/long. TODO: warn if this is over tolerance threshold, because it
            # means that the transaction is probably not accounted for correctly
            if abs(diff) >= entry.meta['__tolerances__'][p.units.currency]:
                total = short_gains + long_gains
                short_gains += (short_gains/total) * diff
                long_gains += (long_gains/total) * diff

A couple specific questions:

- How could the replacement postings not add up to the original posting?

Rounding errors from various sources.

Ah, OK, so conceptually the postings should exactly balance, and this case is purely to handle numerical instability.

 
- If they do, do we really want to be robust to this case, or should that "TODO" really be TODO: make this a fatal error?

I believe a fatal error beyond a certain threshold is a good idea. Whether the threshold is the same as the tolerance, some multiple of it, or a fixed amount is something I meant to figure out but haven't yet. Typically, brokerages (mine at least) do annoying things like maintain 4 digits of precision internally but expose only two rounded digits in their ofx/csv, and thing of that nature. This leads to several cents that I automatically book to a "Rounding-Error" account in my importers. This I find, works very well in practice. A rounding error of this nature is typically well above the transaction's tolerance, and will need to be handled gracefully by this long_short plugin.

Got it, yes, totally makes sense.
 
Assuming that there are good reasons to enter this block of code to divvy up unaccounted for gains/losses, the following seem like problems:

- If short_gains and long_gains are both zero, this code will fail with div-by-zero.
- If short_gains and long_gains are equal and opposite, this code will also fail.

Agree, rare but valid edge cases. PR welcome if you get around to fixing it :).

Now that I confirmed the goal here, I'll take a look at what would be the best approach.  Thanks! 

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/bbe2cb97-a411-4374-853b-c5cf6d38a589n%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages