Hi. I've been using Beancount and Fava to report on microinvestment transactions. I'm hitting serious performance issues, as the journal for a single account is approaching 11 MB. (This is no criticism of either Fava or Beancount, as I think this use case is probably far beyond their intended usage.)
* Are there any big performance hits I could avoid (e.g. does relying on auto-posting have a significant impact)?
* Does anyone know of any tools out there for aggregating journal entries into summary journals (or has anyone had any success using Beancount's API to do this)?
Cheers.
Mick.
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/d778316a-e755-4d7f-94e4-1969280e8bdd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
On Sun, Feb 10, 2019 at 11:07:03PM -0500, Martin Blais wrote:
> You can view the breakdown in time with the -v option to bean-check:
You've probably already thought about that, so out of curiosity: how
much of this is potentially parallelizable, as an avenue for "easily"
getting a performance boost? I guess not much, due to either I/O
constraints or the GIL, right? I'm curious about whether
validation, booking, and plugins might be made parallelizable in the
future.
Will the rewrite in C++ really help speed that much?
I mean, C++ does come with a number of additional costs, so do you believe that, ultimately, the benefit of C++ (execution speed) for an accounting tool like beancount really outweighs those costs?
Here are some of my thoughts:
- C++ cross-platform dependency management & build - I personally use beancount on a FreeBSD system, and I do have to build it manually (even when installing from pip) because there are some C/C++ library dependencies for the parser etc. I can say that part is not very fun. If the entire thing is written in C++, care would have to be taken not to use "fancy" C++ features, because that means not being able to use it on certain systems (because they have older compilers or don't have the specific). Perhaps bazel solves that?
- Ease of development & hacking on the code - One prime reason I chose beancount over ledger was the fact that the data structures and algorithms used were written in Python and so were easier to grok. I am fairly adept in C++, but running through .h & .cpp & Make & inheritance hierarchies is much more work in C++ than in other languages. It was difficult for me to follow the datatypes available in ledger and how the Python integration really worked. I mean, perhaps some more documentation would have helped. Also, C++ bugs may give segfaults a lot more often than Python code does - a different beast from the stack-trace bugs in Python. I'm not saying it's not possible to write segfault-free code, but it gets harder very fast as the complexity goes up.
- Also, I'm not sure what design you have in mind, but if you are going to expose Python bindings for plugins (which, according to the docs, are a fundamental part of beancount's extension model), won't you need to be constantly converting between Python objects & C++ objects anyway? That might nullify all the benefits of C++. Caveat here: I'm not very familiar with Python/C++ bindings, and there may be a way to do this efficiently. And maybe google/clif solves that problem superbly.
Finally, I reckon that you can gain a lot in execution speed by using another compiled language. Have you considered Go? It should give much faster execution for integers/decimals, with easier development, maintenance (and package management) etc. Caveat here: I have not used Go very much; that is, I know only the basics and what I've heard from others. It may work really well to solve the problem beancount is facing in an elegant manner.
Anyway, I do hope you take these points in good spirit - as they were well intentioned. Beancount is a great product and I can't wait till it gets even better with all the features you listed out here!
Hello Martin,
(trying a second time because Google Groups seems to have lost my
previous message)
On 18/02/2019 11:22, Martin Blais wrote:
> - Beancount core, parser, booking and plugins get rewritten in simple
> C++ (no boost/templates, but rather on top of a bazel + absl +
> protobuf + clif base with functional-style and a straightforward subset
> of C++, no classes), providing its parsed and booked contents as a
> stream of protobuf objects.
> - All tests would remain in Python (I'm not rewriting those).
> Comprehensive clean Python bindings for beancount.core would be
> provided, to do as much scripting as is done today, except with types
> implemented fully in C++.
How do you see the possibility of using Cython instead of C++?
Advantages would include the possibility of an (easier) piecewise
conversion instead of a rewrite and not having to solve the problem of
generating Python bindings from a C++ codebase.
Cheers,
Dan
Just out of curiosity - would changing the data format shorten the time required for processing? I know this is plain-text accounting, but it would be interesting to see what effect using SQLite would have on the performance. It might help in reducing the load time of transactions simply due to the nature of the technology.
That sort of defeats the purpose of plain-text accounting, though. I think the product would lose something special if it moved to some non-plain-text format. If it were to go this route, I think the better solution would be to try some sort of caching.
For example, it could be interesting to cache files in a serialized format. It could check whether the file size or timestamp has changed (and invalidate the cache if so). This would give you the option to move your older transactions into separate files (for example, a different file for each year).
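To make the idea concrete, here is a minimal sketch of a cache keyed on file size and modification time. It is only an illustration of the approach described above, not how Beancount itself does it: `load_with_cache`, the `parse` callback, and the `.cache.pickle` suffix are all hypothetical names chosen for this example.

```python
import os
import pickle


def load_with_cache(path, parse):
    """Return parse(path), reusing a pickled result while the file is unchanged.

    `parse` is any callable that takes a path and returns a picklable
    result (hypothetical; stands in for whatever does the slow loading).
    """
    cache_path = path + ".cache.pickle"  # hypothetical cache location
    stat = os.stat(path)
    fingerprint = (stat.st_size, stat.st_mtime_ns)

    try:
        with open(cache_path, "rb") as f:
            cached_fingerprint, cached_result = pickle.load(f)
        if cached_fingerprint == fingerprint:
            return cached_result  # size and timestamp match: skip parsing
    except (OSError, pickle.UnpicklingError, EOFError, ValueError, TypeError):
        pass  # missing or unreadable cache: fall through and re-parse

    result = parse(path)
    with open(cache_path, "wb") as f:
        pickle.dump((fingerprint, result), f)
    return result
```

With one file per year, only the file you are still editing would be re-parsed; the older files would keep hitting their caches as long as their size and timestamp stay the same.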
On Mon, Apr 15, 2019 at 3:02 PM Alen Šiljak <alen....@gmx.com> wrote:
> Just out of curiosity - would changing the data format shorten the time required for processing? I know this is plain-text-accounting but it would be interesting to see what effect would using SQLite have on the performance. It might help in reducing the load time of transactions simply due to the nature of the technology.
That's already done. There's a pickle cache right next to the top-level input file.
Hmm, not for me. Maybe it's something specific to my version of Python or the way it's set up on my computer. Here's the output I see ...
But every time I insert a new transaction [...]
I don't know if it's relevant, but I'm using Beancount with Fava.