Extensions to currency syntax


Martin Blais

Mar 20, 2021, 2:26:56 PM
to Beancount
In my continuing strategic diversification efforts, I have a growing flow of futures and futures options transactions, and the symbology has been giving me some trouble lately. To address this, I just introduced a backward-incompatible change to the v3 syntax which may warrant your attention.
It affects flags and currency symbols: it cleans up past issues with flag parsing and expands the valid set of currency tokens.

See
https://github.com/beancount/beancount/commit/d2d0a35e629408c9ce364eea5601839f8f582208



Implemented extensions to the Currency syntax, parsing and rendering.

This affects the syntax in the following backwards-incompatible ways (please
read the detail, it's a bit complicated):


- Single-character currency names are now supported in v3 C++. For example,

     Assets:Trading:Stocks     10 V {206.90 USD}

  for buying 10 shares of Visa, is now working as expected.

  You will not be able to enjoy this for now because the C++ lexer isn't used
  for the Python version of v3, only the GNU flex lexer. This change only takes
  effect in the C++ version.


- Any of the A-Z characters are now supported, but they require new syntax.
  Before this change, only the PSTCURM characters were supported.
  Now all characters are supported, BUT YOU HAVE TO QUOTE THEM, like this:

    2021-03-20 'S "My special transaction"
      ...

  This is new syntax. This introduces a small change that removes ambiguity.
  A flag is not either one of !&#?% or '<char>, where <char> is in A-Z.

  (When parsed, the single-character flags are parsed as before, that is, the
  leading quote will be stripped. 'S becomes "S" on the data structure.)

  Unfortunately, you may have to change your input file in order to make this
  work; THIS IS NOT A BACKWARD COMPATIBLE CHANGE. It was deemed more important
  to support single-character stock symbols than flags, and this also opens up
  all the characters to be used as flags, without this arbitrary list of
  supported ones (PSTCURM).


- Currency symbols now support a leading slash, for futures contracts. The
  following symbols now parse as valid currencies:

    AAPL                (stock)
    V                   (single-character stock)
    NT.TO               (stock on another market)
    TLT_040921C144      (equity option)
    /6J                 (currency futures)
    /NQH21              (commodity futures)
    /NQH21_QNEG21C13100 (futures option)


- The definition of 'CURRENCY_RE' has been updated accordingly. If you have
  scripts using that, be mindful of the change.
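
To make the list of accepted symbols above concrete, here is a rough Python sketch of a pattern that accepts those seven examples. The name `SKETCH_CURRENCY_RE`, the helper `looks_like_currency`, and the exact character classes are assumptions made for illustration only. This is not the actual 'CURRENCY_RE' from the commit; consult the commit itself if your scripts depend on the real definition.

```python
import re

# Hypothetical pattern, NOT Beancount's real CURRENCY_RE.
# Assumption: a symbol starts with an uppercase letter, or with a slash
# followed by an uppercase letter or digit; any further characters are
# uppercase letters, digits or one of ' . _ - and the symbol must end
# with an uppercase letter or digit.
SKETCH_CURRENCY_RE = re.compile(r"(?:[A-Z]|/[A-Z0-9])(?:[A-Z0-9'._-]*[A-Z0-9])?")


def looks_like_currency(symbol: str) -> bool:
    """Return True if the whole string matches the sketch pattern."""
    return SKETCH_CURRENCY_RE.fullmatch(symbol) is not None


# The examples listed in the post.
EXAMPLES = [
    "AAPL",
    "V",
    "NT.TO",
    "TLT_040921C144",
    "/6J",
    "/NQH21",
    "/NQH21_QNEG21C13100",
]

if __name__ == "__main__":
    for symbol in EXAMPLES:
        print(symbol, looks_like_currency(symbol))
```

A script that previously assumed at least two characters, or no leading slash, would reject `V` and `/6J`; that is the kind of assumption worth rechecking.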


The changes above do not affect branch 'v2', only 'v3'.
Single-character currency names are only available in the C++ version of v3.
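
As an illustration of the flag rule described above (a flag is either one of !&#?% or a quoted A-Z character, with the leading quote stripped when parsed), here is a minimal Python sketch. The function name `parse_flag` and the use of `ValueError` are hypothetical; the real lexer is written in C++ and may behave differently, and this sketch covers only the flag characters listed in this post.

```python
import re

# Hypothetical sketch of the flag rule from this post, not the real lexer.
# Assumption: a flag token is either one of the punctuation characters
# listed in the post, or a single quote followed by one uppercase letter.
PUNCTUATION_FLAGS = {"!", "&", "#", "?", "%"}
QUOTED_FLAG_RE = re.compile(r"'([A-Z])")


def parse_flag(token: str) -> str:
    """Return the flag character stored on the data structure.

    Punctuation flags are returned unchanged. A quoted letter flag such
    as 'S has its leading quote stripped. Anything else, including a
    bare unquoted letter, is rejected.
    """
    if token in PUNCTUATION_FLAGS:
        return token
    match = QUOTED_FLAG_RE.fullmatch(token)
    if match:
        return match.group(1)
    raise ValueError(f"not a valid flag token: {token!r}")


if __name__ == "__main__":
    print(parse_flag("'S"))
    print(parse_flag("!"))
```

Under this rule a bare `S` is no longer a flag token, which is what frees single letters such as `V` to be read as currencies.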


Daniele Nicolodi

Mar 20, 2021, 4:28:54 PM
to bean...@googlegroups.com
On 20/03/2021 19:26, Martin Blais wrote:

>   This is new syntax. This introduces a small change that removes ambiguity.
>   A flag is not either one of !&#?% or '<char>, where <char> is in A-Z.

I think you mean "A flag is _now_ either one of...".

I think the changes make a lot of sense.

Have you also reworked how flags at the beginning of a line cause the
line to be ignored? I always thought that this was a rather
counterintuitive and not very useful feature (and I don't remember if it
made it to the RE/flex parser).

Cheers,
Dan

Martin Blais

Mar 20, 2021, 4:46:45 PM
to Beancount
On Sat, Mar 20, 2021 at 4:28 PM Daniele Nicolodi <dan...@grinta.net> wrote:
> On 20/03/2021 19:26, Martin Blais wrote:
>
> >   This is new syntax. This introduces a small change that removes ambiguity.
> >   A flag is not either one of !&#?% or '<char>, where <char> is in A-Z.
>
> I think you mean "A flag is _now_ either one of...".

Yes

> I think the changes make a lot of sense.

Thank you

> Have you also reworked how flags at the beginning of a line cause the
> line to be ignored? I always thought that this was a rather
> counterintuitive and not very useful feature (and I don't remember if it
> made it to the RE/flex parser).

Yes
This now raises an error.

> Cheers,
> Dan

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/df075220-f2a7-ccf3-df42-78e5ae60209f%40grinta.net.

Justus Pendleton

Mar 23, 2021, 10:02:19 AM
to Beancount
On Sunday, March 21, 2021 at 1:26:56 AM UTC+7 bl...@furius.ca wrote:
> - Any of the A-Z characters are now supported, but they require new syntax.
>   Before this change, only the PSTCURM characters were supported.
>   Now all characters are supported, BUT YOU HAVE TO QUOTE THEM, like this:

Is there a reason Beancount can't support non A-Z characters as well, or does that overly complicate the parser? I've always found it slightly strange that I can't use $, €, or ¥ as a currency. I wonder if any global exchanges use non A-Z ticker symbols or if everyone just follows ASCII to keep compatibility with the US. The Wikipedia page doesn't have a lot of detail but suggests that exchanges in non-Latin-script countries often use numbers instead of letters for tickers: https://en.wikipedia.org/wiki/Ticker_symbol

Martin Blais

Mar 23, 2021, 10:28:13 AM
to Beancount
I think the problem is ordering and character set.
$ comes before the number, not after. I'm not sure if the change would be trivial, but numbers appear in several different places, all of which would be affected (e.g. balance checks, costs, etc.).
The other issue is that with Flex I didn't have good support for e.g. Unicode encoding at the time (this changes in v3, as it's using Genivia's RE/flex).
Finally, when I wrote v2 I was really focused on simplifying and making everything as uniform as possible, and that drove the decision at the time.



