More functions for bean-query

326 views
Skip to first unread message

shreedha...@gmail.com

unread,
Nov 14, 2018, 10:15:52 PM11/14/18
to Beancount

Hi,


First let me say that beancount has been a blast to work with so far! The code is really easy to read and follow and extend and I'm very impressed with the overall architecture and design. Thanks a bunch for all the work that has gone into it!


I'm sorry for so many posts all at once! Recently, I've been "refactoring" my account structure to get my reporting a little easier and preparing for tax season next year.  But that means I'm writing all sorts of SQL queries for bean-query. There are some roadblocks I'm hitting that can be easily crossed with some additions to the list of functions in bean-query. Here are 2 that I have so far (with links for reference):

  1. SUB(pattern, replacement, string) - Basically this would be a wrapper on top of re.sub (link). This helps a lot in case you want to print things slightly differently or reorder something for a report.
  2. COALESCE(x, y) - This is a staple SQL function to deal with NULLs in the data (link). For example if you want to multiply price and number for a collection of postings but not all have a price and you want price for the NULL scenarios to be interpreted as 1. : SELECT number * COALESCE(price, 1) where narration ~ 'BLAH'. Now, SQL supports variadic arguments which might be tough to implement in beancount, so I think sticking to two arguments is enough because it can be nested to achieve the same effect.

So, I'd love to your hear your thoughts on these. As always, I can implement them and send a patch. I hope a hg export should work well for you? Also, I know that I said I could implement the date functions as well.  I'm planning on getting to all that this weekend since I would need them for my reports anyway.


As I said, beancount is an awesome product and I'm really happy to contribute to make it better in any way I can.


- Shreedhar

Martin Blais

unread,
Nov 15, 2018, 12:40:52 AM11/15/18
to bean...@googlegroups.com
On Wed, Nov 14, 2018 at 10:15 PM <shreedha...@gmail.com> wrote:

Hi,

First let me say that beancount has been a blast to work with so far! The code is really easy to read and follow and extend and I'm very impressed with the overall architecture and design. Thanks a bunch for all the work that has gone into it!

Thanks for the nice words.
I'm always happy when a new voice shows up and I find out it's working for them.

I'm sorry for so many posts all at once! Recently, I've been "refactoring" my account structure to get my reporting a little easier and preparing for tax season next year.  But that means I'm writing all sorts of SQL queries for bean-query. There are some roadblocks I'm hitting that can be easily crossed with some additions to the list of functions in bean-query. Here are 2 that I have so far (with links for reference):

  1. SUB(pattern, replacement, string) - Basically this would be a wrapper on top of re.sub (link). This helps a lot in case you want to print things slightly differently or reorder something for a report.
SGTM
Let's do it

 
  1. COALESCE(x, y) - This is a staple SQL function to deal with NULLs in the data (link). For example if you want to multiply price and number for a collection of postings but not all have a price and you want price for the NULL scenarios to be interpreted as 1. : SELECT number * COALESCE(price, 1) where narration ~ 'BLAH'. Now, SQL supports variadic arguments which might be tough to implement in beancount, so I think sticking to two arguments is enough because it can be nested to achieve the same effect.
Also SGTM
I think the dispatching mechanism supports multiple signatures, so you could define versions for 1, 2, 3, 4, 5 parameters (probably sufficient).
 

So, I'd love to your hear your thoughts on these.

Yes.
The SQL client is an experimental thing intended to absorb most feature requests. Very liberal about adding stuff.
Note: It's not tested well.
The goal is ultimately to reimplement vastly better as a separate project and to have a Beancount adapter for its input.
 

As always, I can implement them and send a patch.

Awesome
 

I hope a hg export should work well for you?

If you can wrangle bitbucket that's the no.1 way.
Otherwise an export or old-school patch works too.

Also, I know that I said I could implement the date functions as well.  I'm planning on getting to all that this weekend since I would need them for my reports anyway.


As I said, beancount is an awesome product and I'm really happy to contribute to make it better in any way I can.

Fab!
Thank you,

 


- Shreedhar

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/2591eb99-2d58-45f8-bbe2-06dffe6ae2cd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

shreedha...@gmail.com

unread,
Nov 18, 2018, 4:20:51 PM11/18/18
to Beancount


On Wednesday, November 14, 2018 at 11:40:52 PM UTC-6, Martin Blais wrote:
On Wed, Nov 14, 2018 at 10:15 PM <shreedha...@gmail.com> wrote:

Hi,

First let me say that beancount has been a blast to work with so far! The code is really easy to read and follow and extend and I'm very impressed with the overall architecture and design. Thanks a bunch for all the work that has gone into it!

Thanks for the nice words.
I'm always happy when a new voice shows up and I find out it's working for them.

I'm sorry for so many posts all at once! Recently, I've been "refactoring" my account structure to get my reporting a little easier and preparing for tax season next year.  But that means I'm writing all sorts of SQL queries for bean-query. There are some roadblocks I'm hitting that can be easily crossed with some additions to the list of functions in bean-query. Here are 2 that I have so far (with links for reference):

  1. SUB(pattern, replacement, string) - Basically this would be a wrapper on top of re.sub (link). This helps a lot in case you want to print things slightly differently or reorder something for a report.
SGTM
Let's do it


 
 
  1. COALESCE(x, y) - This is a staple SQL function to deal with NULLs in the data (link). For example if you want to multiply price and number for a collection of postings but not all have a price and you want price for the NULL scenarios to be interpreted as 1. : SELECT number * COALESCE(price, 1) where narration ~ 'BLAH'. Now, SQL supports variadic arguments which might be tough to implement in beancount, so I think sticking to two arguments is enough because it can be nested to achieve the same effect.
Also SGTM
I think the dispatching mechanism supports multiple signatures, so you could define versions for 1, 2, 3, 4, 5 parameters (probably sufficient).

I tried to implement this, but I believe there is a limitation in the dispatching framework. It can be fixed of course, but I wanted to let you here before I opened a PR with it. The problem is with the code:

    def get_function(self, name, operands):
       
"""Return a function accessor for the given named function.
        Args:
          name: A string, the name of the function to access.
        """

       
try:
            key
= tuple([name] + [operand.dtype for operand in operands])
           
return self.functions[key](operands)
       
except KeyError:
           
# If not found with the operands, try just looking it up by name.
           
try:
               
return self.functions[name](operands)
           
except KeyError:
                signature
= '{}({})'.format(name,
                                           
', '.join(operand.dtype.__name__
                                                     
for operand in operands))
               
raise CompilationError("Invalid function '{}' in {} context".format(
                    signature
, self.context_name))

It uses a "key" to look up a dictionary of functions. That means if you're overloading functions, the operands much match exactly. But COALESCE doesn't really care about the type of the operands at all! That is, operands should be of type object, just like the function STR(). But, get_function will try to look for the key ('coalesce', str, Amount) for the call `COALESCE(str, Amount)` - which won't find ('coalesce', object, object). I see a couple of ways of "fixing" this:

  1. Convert the dictionary to a simple list that you can iterate to find what you need. The number of functions is not that large for it to matter IMHO. But if you are worried about that, look up in the dictionary first, and if you don't find it there, iterate the keys and values till you find what you need. In any case, the code can then do is a issubclass(operand.dtype, intype) check for each argument to determine if the function is valid. That would be similar to the implementation in EvalFunction()
  2. Use a multi-dictionary. That is each key (function) maps to multiple (a list) of EvalFunctions.
  3. Change the key of the dictionary to include the number of arguments. That is, first search for ('Coalesce', str, amount) then ('Coalesce', 2) and pick the first you find.
  4. Always look for "object" arguments in the dictionary. That is, first search for ('Coalesce', str, amount) then ('Coalesce', str, object) then ('Coalesce', object, object). (This could lead to a lot of lookups! If you permute, it's even more!). Also this won't work for cases when the argument is neither a leaf nor base class but an intermediate class (I don't have an example for this today, though).
There might be other ways of course. I'd love to hear you thoughts on this. I think #3 would involve the least code change and should be quick to implement.


[Aside: I personally don't think COALESCE with 1 argument has any use, since it is equivalent to using the argument directly.]

There was another limitation I noticed, but i'll start another e-mail thread for that.
 
 

So, I'd love to your hear your thoughts on these.

Yes.
The SQL client is an experimental thing intended to absorb most feature requests. Very liberal about adding stuff.
Note: It's not tested well.
The goal is ultimately to reimplement vastly better as a separate project and to have a Beancount adapter for its input.
 

I'll bet that's a lot of work to rewrite. But, perhaps that would have helped us for COALESCE. Is it a system where you don't have to write you grammar rules at all - and SQL grammar is built-in?

 

As always, I can implement them and send a patch.

Awesome
 

I hope a hg export should work well for you?

If you can wrangle bitbucket that's the no.1 way.
Otherwise an export or old-school patch works too.

I think I have the PR system figured out. But, do let me know if I did anything wrong.
 

Cheers,
-Shreedhar
Reply all
Reply to author
Forward
0 new messages