Those tiers are:
- Utility code
There's lots of general utility in Ledger for doing time parsing, using
Boost.Regex, error handling, etc. It's all done in a way that can be
reused in other projects as needed.
- Commoditized Amounts (amount_t, commodity_t and friends)
A numerical abstraction combining multi-precision rational numbers (via
GMP) with commodities. These structures can be manipulated like regular
numbers in either C++ or Python (as Amount objects).
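To make the idea concrete, here is a minimal Python sketch of a commoditized amount. It is not Ledger's amount_t or its Python Amount class: the name SketchAmount and its methods are hypothetical, and fractions.Fraction stands in for GMP rationals.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SketchAmount:
    """Hypothetical stand-in for a commoditized amount (not Ledger's amount_t)."""
    quantity: Fraction   # exact rational, standing in for a GMP rational
    commodity: str       # e.g. "USD"

    def __add__(self, other):
        # In this sketch, addition is only defined within one commodity.
        if self.commodity != other.commodity:
            raise ValueError("cannot add amounts with different commodities")
        return SketchAmount(self.quantity + other.quantity, self.commodity)

    def __mul__(self, factor):
        # Scaling by a plain number keeps the commodity.
        return SketchAmount(self.quantity * Fraction(factor), self.commodity)


a = SketchAmount(Fraction(1, 3), "USD")
b = SketchAmount(Fraction(2, 3), "USD")
total = a + b   # exact rational arithmetic, no floating-point drift
```

Mixing commodities in one sum is what the Balances layer described below is for; this sketch simply refuses it.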
- Commodity Pool
Commodities are all owned by a commodity pool, so that future parsing of
amounts can link to the same commodity and establish a consistent price
history and record of formatting details.
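The ownership idea can be sketched as a small interning table. Everything here (SketchCommodity, SketchCommodityPool, the precision and prices fields) is a hypothetical illustration of the pattern, not the layout of Ledger's own classes.

```python
class SketchCommodity:
    """Hypothetical commodity record shared by every amount that uses it."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.precision = 0   # formatting detail learned while parsing (assumed field)
        self.prices = []     # price history entries (assumed shape)


class SketchCommodityPool:
    """Hypothetical pool: at most one commodity object per symbol."""

    def __init__(self):
        self._by_symbol = {}

    def find_or_create(self, symbol):
        # Later parses of the same symbol get the very same object back,
        # so formatting details and prices accumulate in one place.
        commodity = self._by_symbol.get(symbol)
        if commodity is None:
            commodity = SketchCommodity(symbol)
            self._by_symbol[symbol] = commodity
        return commodity


pool = SketchCommodityPool()
first = pool.find_or_create("USD")
first.precision = 2
second = pool.find_or_create("USD")   # same object, precision already known
```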
- Balances
Adds the concept of multiple amounts with varying commodities. Supports
simple arithmetic, and multiplication and division with non-commoditized
values.
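A rough Python sketch of that behavior follows. SketchBalance is a hypothetical name, the dictionary layout is an assumption, and Fraction again stands in for exact quantities.

```python
from fractions import Fraction


class SketchBalance:
    """Hypothetical balance: one exact quantity per commodity."""

    def __init__(self, amounts=None):
        self.amounts = dict(amounts or {})

    def add(self, quantity, commodity):
        # Same commodity: quantities are summed. Different: kept side by side.
        result = dict(self.amounts)
        result[commodity] = result.get(commodity, Fraction(0)) + Fraction(quantity)
        return SketchBalance(result)

    def __mul__(self, factor):
        # A plain (non-commoditized) number scales every entry.
        return SketchBalance(
            {c: q * Fraction(factor) for c, q in self.amounts.items()})

    def __truediv__(self, divisor):
        return SketchBalance(
            {c: q / Fraction(divisor) for c, q in self.amounts.items()})


bal = SketchBalance().add(10, "USD").add(5, "EUR").add(2, "USD")
doubled = bal * 2
halved = bal / 2
```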
- Price history
Amounts have prices, and these are kept in a data graph which the amount
code itself is only dimly aware of (there are three points of access so an
amount can query its revalued price on a given date).
- Values
Often the higher layers in Ledger don't care if something is an amount or a
balance, they just want to add stuff to it or print it. For this, I
created a type-erasure class, value_t/Value, into which many things can be
stuffed and then operated on. They can contain amounts, balances, dates,
strings, etc. If you try to apply an operation between two values that
makes no sense (like dividing an amount by a balance), an error occurs at
runtime, rather than at compile-time (as would happen if you actually tried
to divide an amount_t by a balance_t).
This is the core data type for the value expression language.
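The runtime check can be illustrated with a small sketch. SketchValue is hypothetical and far simpler than value_t; since Python is dynamically typed anyway, it only shows the "fail when the operation runs" half of the contrast, not the compile-time half.

```python
from fractions import Fraction


class SketchValue:
    """Hypothetical type-erased container; not Ledger's value_t."""

    def __init__(self, payload):
        self.payload = payload   # a Fraction, a dict-based balance, a str, ...

    def __truediv__(self, other):
        # The payload types are inspected only when the division is attempted.
        if isinstance(self.payload, Fraction) and isinstance(other.payload, Fraction):
            return SketchValue(self.payload / other.payload)
        raise TypeError(
            f"cannot divide {type(self.payload).__name__} "
            f"by {type(other.payload).__name__}"
        )


amount = SketchValue(Fraction(10))
balance = SketchValue({"USD": Fraction(10), "EUR": Fraction(5)})

try:
    amount / balance
    outcome = "no error"
except TypeError as exc:
    outcome = str(exc)   # the nonsensical operation is rejected at runtime
```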
- Value expressions
The next layer up adds functions and operators around the Value concept.
This lets you apply transformations and tests to Values at runtime without
having to bake it into C++. The set of functions available is defined by
each object type in Ledger (posts, accounts, transactions, etc.), though
the core engine knows nothing about these. At its base, it only knows how
to apply operators to values, and how to pass them to and receive them from
functions.
- Query expressions
Expressions can be onerous to type at the command-line, so there's a
shorthand for reporting called "query expressions". These add no
functionality of their own, but are purely translated from the input string
(cash) down to the corresponding value expression (account =~ /cash/).
This is a convenience layer.
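As a toy illustration of that translation, the sketch below handles only the single case mentioned above, one bare term becoming an account match. The function name and the escaping of "/" are assumptions made for the example; the real query parser handles much more.

```python
def sketch_query_to_expr(term):
    """Hypothetical translation of one bare query term into a value expression."""
    # Assumption: escape "/" so the term cannot end the regex literal early.
    pattern = term.replace("/", "\\/")
    return f"account =~ /{pattern}/"


expr = sketch_query_to_expr("cash")
```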
- Format strings
Format strings let you interpolate value expressions into strings, with the
requirement that any interpolated value have a string representation.
Really all this does is calculate the value expression in the current
report context, call the resulting value's "to_string()" method, and stuff
the result into the output string. It also provides printf-like behavior,
such as min/max width, right/left justification, etc.
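The evaluate, stringify, pad sequence can be sketched as follows. The directive syntax %[-][min][.max](name) is a simplification chosen for this example, and the "expression" is just a name looked up in a dictionary rather than a real value expression; sketch_format is a hypothetical helper.

```python
import re

# Simplified, hypothetical directive: %[-][min][.max](name)
_DIRECTIVE = re.compile(r"%(-?)(\d*)(?:\.(\d+))?\((\w+)\)")


def sketch_format(fmt, context):
    """Replace each directive with the padded string form of context[name]."""

    def substitute(match):
        left, min_width, max_width, name = match.groups()
        text = str(context[name])          # every value needs a string form
        if max_width:
            text = text[:int(max_width)]   # truncate to the maximum width
        if min_width:
            width = int(min_width)
            # "-" means left-justify; otherwise right-justify.
            text = text.ljust(width) if left else text.rjust(width)
        return text

    return _DIRECTIVE.sub(substitute, fmt)


line = sketch_format(
    "%-10(account)|%8(amount)|%.3(payee)",
    {"account": "Cash", "amount": "12.50", "payee": "Grocer"},
)
```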
- Journal items
Next is a base type shared by anything that can appear in a journal: an
item_t. It contains details common to all such parsed entities, like what
file and line it was found on, etc.
- Journal posts
The most numerous object found in a Journal, postings are a type of item
that contain an account, an amount, a cost, and metadata. There are some
other complications, like the account can be marked virtual, the amount
could be an expression, etc.
- Journal transactions
Postings are owned by transactions, always. This subclass of item_t knows
about the date, the payee, etc. If a date or metadata tag is requested
from a posting and it doesn't have that information, the transaction is
queried to see if it can provide it.
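That fallback might look roughly like this. SketchXact, SketchPost, effective_date and get_tag are hypothetical names for illustration, not the members of the actual classes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class SketchXact:
    """Hypothetical transaction: carries the date, payee and shared tags."""
    date: date
    payee: str
    tags: dict = field(default_factory=dict)


@dataclass
class SketchPost:
    """Hypothetical posting: always belongs to exactly one transaction."""
    xact: SketchXact
    account: str
    date: Optional[date] = None
    tags: dict = field(default_factory=dict)

    def effective_date(self):
        # Use the posting's own date if it has one, else ask the transaction.
        return self.date if self.date is not None else self.xact.date

    def get_tag(self, name):
        # Same rule for metadata: posting first, then the owning transaction.
        if name in self.tags:
            return self.tags[name]
        return self.xact.tags.get(name)


xact = SketchXact(date(2012, 3, 1), "Grocer", tags={"trip": "weekly"})
post = SketchPost(xact, "Expenses:Food")
dated = SketchPost(xact, "Assets:Cash", date=date(2012, 3, 2))
```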
- Journal accounts
Postings are also shared by accounts, though the actual memory is managed
by the transaction. Each account knows all the postings within it, but
contains relatively little information of its own.
- The Journal object
Finally, all transactions with their postings, and all accounts, are owned
by a journal_t object. This is the go-to object for querying and reporting
on your data.
- Textual journal parser
There is a textual parser, wholly contained in textual.cc, which knows how
to parse text into journal objects, which then get "finalized" and added to
the journal. Finalization is the step that enforces the double-entry
guarantee.
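A minimal sketch of such a check, assuming the simplest case: every posting carries an explicit amount, and each commodity has to sum to zero on its own. The function name and the tuple shape are hypothetical, and real finalization does considerably more than this.

```python
from collections import defaultdict
from fractions import Fraction


def sketch_finalize(postings):
    """Check a list of (account, quantity, commodity) tuples for balance.

    Hypothetical helper: raises ValueError if any commodity has a
    non-zero total, otherwise returns the postings unchanged.
    """
    totals = defaultdict(Fraction)   # missing commodities start at 0
    for _account, quantity, commodity in postings:
        totals[commodity] += Fraction(quantity)
    unbalanced = {c: q for c, q in totals.items() if q != 0}
    if unbalanced:
        raise ValueError(f"transaction does not balance: {unbalanced}")
    return postings


balanced = sketch_finalize([
    ("Expenses:Food", "12.50", "USD"),
    ("Assets:Cash", "-12.50", "USD"),
])
```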
- Iterators
Every journal object is "iterable", and these iterators are defined in
iterators.h and iterators.cc. This iteration logic is kept out of the
basic journal objects themselves for the sake of modularity.
- Comparators
Another abstraction isolated to its own layer, this class encapsulates the
comparison of journal objects, based on whatever value expression the user
passed to --sort.
- Temporaries
Many reports bring pseudo-journal objects into existence, like postings
which report totals in a "<Total>" account. These objects are created and
managed by a temporaries_t object, which gets used in many places by the
reporting filters.
- Option handling
There is an option handling subsystem used by many of the layers further
down. It makes it relatively easy for me to add new options, and to have
those option settings immediately accessible to value expressions.
- Session objects
Every journal object is owned by a session, with the session providing
support for that object. In GUI terms, this is the Controller object for
the journal Data object, where every document window would be a separate
session. They are all owned by the global scope.
- Report objects
Every time you create report output, a report object is created to
determine what you want to see. In the Ledger REPL, a new report object is
created every time a command is executed. In CLI mode, only one report
object ever comes into being, as Ledger immediately exits after displaying
the results.
- Reporting filters
The way Ledger generates data is this: it asks the session for the current
journal, and then creates an iterator applied to that journal. The kind of
iterator depends on the type of report.
This iterator is then walked, and every object yielded from the iterator is
passed to an "item handler", whose type is directly related to the type of
the iterator.
There are many, many item handlers, which can be chained together. Each
one receives an item (post, account, xact, etc.), performs some action on
it, and then passes it down to the next handler in the chain. There are
filters which compute the running totals; that queue and sort all the input
items before playing them back out in a new order; that filter out items
which fail to match a predicate, etc. Almost every reporting feature in
Ledger is related to one or more filters. Looking at filters.h, I see over
25 of them defined currently.
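The chaining pattern can be sketched in a few lines of Python. All class names here are hypothetical, items are plain dictionaries, and the flush() step is an assumption about how a sorting handler plays its queue back; this is not a transcription of filters.h.

```python
class SketchHandler:
    """Hypothetical base: receive an item, then pass it to the next handler."""

    def __init__(self, next_handler=None):
        self.next = next_handler

    def handle(self, item):
        if self.next:
            self.next.handle(item)

    def flush(self):
        if self.next:
            self.next.flush()


class FilterBy(SketchHandler):
    """Drops items that fail the predicate."""

    def __init__(self, predicate, next_handler):
        super().__init__(next_handler)
        self.predicate = predicate

    def handle(self, item):
        if self.predicate(item):
            super().handle(item)


class SortBy(SketchHandler):
    """Queues every item, then replays them in sorted order on flush()."""

    def __init__(self, key, next_handler):
        super().__init__(next_handler)
        self.key = key
        self.queue = []

    def handle(self, item):
        self.queue.append(item)

    def flush(self):
        for item in sorted(self.queue, key=self.key):
            self.next.handle(item)
        self.queue = []
        super().flush()


class RunningTotal(SketchHandler):
    """Annotates each item with the total seen so far."""

    def __init__(self, next_handler):
        super().__init__(next_handler)
        self.total = 0

    def handle(self, item):
        self.total += item["amount"]
        super().handle({**item, "total": self.total})


class Collect(SketchHandler):
    """Leaf handler: the end of the chain, it keeps what it receives."""

    def __init__(self):
        super().__init__(None)
        self.rows = []

    def handle(self, item):
        self.rows.append(item)


leaf = Collect()
chain = FilterBy(lambda p: p["account"].startswith("Expenses"),
                 SortBy(lambda p: p["amount"], RunningTotal(leaf)))

for post in [{"account": "Expenses:Food", "amount": 30},
             {"account": "Assets:Cash", "amount": -40},
             {"account": "Expenses:Rent", "amount": 10}]:
    chain.handle(post)
chain.flush()   # the sorter releases its queue; totals are computed in sorted order
```

Note that the order of wiring matters: placing RunningTotal before SortBy would total the items in input order instead, which is the kind of decision the chain logic has to get right.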
- The filter chain
How filters get wired up, and in what order, is a complex process based on
all the various options specified by the user. This is the job of the
chain logic, found entirely in chain.cc. It took a really long time to get
this logic exactly right, which is why I haven't exposed this layer to the
Python bridge yet.
- Output modules
Although filters are great and all, in the end you want to see stuff. This
is the job of special "leaf" filters called output modules. They are
implemented just like a regular filter, but they don't have a "next" filter
to pass the item on down to. Instead, they are the end of the line and
must do something with the item that results in the user seeing something
on their screen or in a file.
- Select queries
Select queries know a lot about everything, even though they implement
their logic by implementing the user's query in terms of all the other
features thus presented. Select queries have no functionality of their
own; they are simply a shorthand to provide access to much of Ledger's
functionality via a cleaner, more consistent syntax.
- The Global Scope
There is a master object which owns every other object, and this is
Ledger's global scope. It creates the other objects, provides REPL
behavior for the command-line utility, etc. In GUI terms, this is the
Application object.
- The Main Driver
This creates the global scope object, performs error reporting, and handles
command-line options which must precede even the creation of the global
scope, such as --debug.
And that's Ledger in a nutshell. All the rest are details, such as which
value expressions each journal item exposes, how many filters currently exist,
which options the report and session scopes define, etc.
John
> And that's Ledger in a nutshell. All the rest are details
Many thanks for this email. Does cl-ledger (Lisp version) have a similar
architecture? What are the differences? I don't know how many of the
ledger's users are programmers but making Ledger's architecture more
transparent will, for sure, help people understand and contribute to
ledger.
Best,
Alexandre Rademaker
http://arademaker.github.com/
> Many thanks for this email. Does cl-ledger (Lisp version) have a similar
> architecture? What are the differences? I don't know how many of the
> ledger's users are programmers but making Ledger's architecture more
> transparent will, for sure, help people understand and contribute to ledger.
CL-Ledger is based on the same essential design. On that platform, "report
filters" are SERIES functions, so that all evaluation is performed lazily.
Otherwise, everything else is quite similar.
John
The strict testing of layering at link time is pretty neat. hledger's layering emerged as needed to avoid GHC "import
cycle" errors. It's good to see the similarities between my layers and your (lower) layers. Our terminology has also
become pretty consistent. Some time I should do a similar writeup following this format. Your post gives me some nice
ideas and food for thought.
-Simon