csv Import in 2.1.3 errors with older csv config

Jonathan Salles

Nov 18, 2018, 9:25:42 PM
to Beancount
Hello All,

After running for a long time with Python 3.5.2 and a Beancount version that did not report a version number (sorry, I forgot which one, but it was more than a year old), I installed Python 3.7.1 on my Mac Mini running El Capitan 10.11.6.  I then updated Beancount and Fava using pip3, which pulled down 2.1.3.  Things looked good until I tried to do a csv import, which failed with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/bean-extract", line 4, in <module>
    from beancount.ingest.extract import main; main()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/beancount/ingest/extract.py", line 250, in main
    return scripts_utils.trampoline_to_ingest(sys.modules[__name__])
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/beancount/ingest/scripts_utils.py", line 132, in trampoline_to_ingest
    return run_import_script_and_ingest(parser)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/beancount/ingest/scripts_utils.py", line 191, in run_import_script_and_ingest
    mod = runpy.run_path(args.config)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "jfs.impcfg", line 32, in <module>
    fidelityvisa.Importer('Liabilities:US:FidelityVisa'),
  File "/Users/jonathan/Documents/Beancount/importers/fidelityvisa/__init__.py", line 34, in __init__
    'fidelity')
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/beancount/ingest/importers/csv.py", line 135, in __init__
    assert isinstance(skip_lines, int)
AssertionError

I noticed the csv.py in the Bitbucket source was slightly different, so I copied it into a local directory and imported that instead.  It tried to import the new mixins, so that didn't work. (I suspect that if I downloaded those, my importer would break with that version too.)  Is the "assert isinstance(skip_lines, int)" a new addition?  Is there an easy way to fix this?  I took a look, but didn't see anything obvious yet.  I can upload my __init__.py file if needed.  This broke for two accounts, both similar except for csv column details.

Thanks for all the great work, Martin and others.  Other than your code being a little hard to understand and my never having enough time to use it fully, I do enjoy using Beancount.

Jonathan 



Martin Blais

Nov 18, 2018, 11:32:48 PM
to bean...@googlegroups.com
You're probably passing positional arguments to the constructor.
Double-check the arguments and perhaps convert to using keyword arguments to be explicit.
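
To illustrate how this can produce the assertion in the traceback, here is a minimal sketch. StandInImporter is a hypothetical stand-in, not the real csv Importer: its parameter names and order, the empty config and the 'USD' currency are assumptions chosen only to reproduce the symptom. When a parameter sits ahead of the one you used to pass positionally, your value silently lands in the wrong slot.

```python
class StandInImporter:
    """Hypothetical stand-in; the parameter order is an assumption, not Beancount's."""

    def __init__(self, config, account, currency,
                 regexps=None, skip_lines=0, institution=None):
        # Same kind of check that raised the AssertionError in the traceback.
        assert isinstance(skip_lines, int)
        self.account = account
        self.skip_lines = skip_lines
        self.institution = institution


def build_positional():
    # 'fidelity' is the fifth positional argument, so it binds to skip_lines.
    return StandInImporter({}, 'Liabilities:US:FidelityVisa', 'USD', None, 'fidelity')


def build_keyword():
    # Naming the argument makes the call independent of parameter order.
    return StandInImporter({}, 'Liabilities:US:FidelityVisa', 'USD',
                           institution='fidelity')
```

In this sketch build_positional() trips the assertion, while build_keyword() succeeds and leaves skip_lines at its default.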


--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/bc2517dd-b75e-477d-943b-cbb4f80ef3ee%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jonathan Salles

Nov 21, 2018, 8:27:17 PM
to Beancount
Hi Martin,

Thanks for the quick reply. It helped me solve the problem (I later realized the error was telling me the same thing, but I didn't see it at first).  I took a look at the arguments in the new Importer class and compared them to the oldest version in the Bitbucket source (probably what I was using).  I saw that institution had been changed to a keyword argument, and changed that in my __init__.py.

Unfortunately that produced another error.  I stared at it for a bit, then fortunately looked on the group and found the posts by Shreedhar and Zhouyun describing the same thing and a temporary fix for it.  Thanks Shreedhar and Zhouyun!  Adding the $(pwd) to my input file path got me my parsed transactions.

Is there some newer documentation on using the csv importer?  The only documentation I see is the generic one about the whole importer process.  I received some instructions from you a couple of years ago when I first started, but I see things seem to have changed a bit.  I believe we still need the import config file, and do we still need the __init__.py files for each importer?  It looks like it.  The original csv.py file I used did not have an identify function, so I added one (I think it was in one of the examples).

    def identify(self, file):
        # Match if the filename is as downloaded and the header has the unique
        # fields combination we're looking for.
        return (re.match(r"\d\d\d\d\d\d\d\dFidVisa\.csv", path.basename(file.name)) and
                re.match("Trans Date,Post Date,Transaction,Description,Amount",
                         file.head()))

I see there is now one in the 2.1.3 release:

    def identify(self, file):
        if file.mimetype() != 'text/csv':
            return False
        return super().identify(file)

This seems to just check that it is a csv file, whereas mine checks the file name and header.  If I understand things correctly, the one in my __init__.py will override the one in the csv.py file in the Beancount source.  If that is so, would I have to add the csv check to my __init__.py if I want that behavior?

On another note, I think I read in one of your comments that the ofx importer will only work on one account id in a file, or something like that (sorry, I forgot where and couldn't find it).  My wife and I both have Capital One 360 accounts that show up in the same login.  I discovered that when downloading the ofx for one account I get both accounts in the same file, each in its own ofx 'envelope' with the account id.  I put an instance for each in my config file, and when I run bean-extract I get the transactions for both accounts, with the correct Beancount account for each transaction.  So your ofx importer works pretty well, maybe better than you thought.

Thanks again for all your hard work, apologies if I ran on too much, 
and Happy Thanksgiving to you and all the other Beancounters.

Jonathan

Martin Blais

Nov 22, 2018, 8:38:18 PM
to bean...@googlegroups.com
On Wed, Nov 21, 2018 at 8:27 PM Jonathan Salles <jfs...@gmail.com> wrote:
This seems to just check that it is a csv file, whereas mine checks the file name and header.  If I understand things correctly, the one in my __init__.py will override the one in the csv.py file in the Beancount source.  If that is so, would I have to add the csv check to my __init__.py if I want that behavior?

From memory, no idea.
Please refer to the source code, if you find a bug, send me a patch.
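
On the Python side, at least, a method defined in a subclass replaces the parent's method of the same name, so the parent's check runs only if the subclass calls it through super(). A minimal sketch with hypothetical stand-in classes (FakeFile, BaseCsvImporter and FidelityVisaImporter are illustrative assumptions, not Beancount's actual classes):

```python
import re
from os import path


class FakeFile:
    """Hypothetical stand-in for the file object handed to identify()."""

    def __init__(self, name, mimetype, head):
        self.name = name
        self._mimetype = mimetype
        self._head = head

    def mimetype(self):
        return self._mimetype

    def head(self):
        return self._head


class BaseCsvImporter:
    """Stand-in parent that only checks the MIME type."""

    def identify(self, file):
        return file.mimetype() == 'text/csv'


class FidelityVisaImporter(BaseCsvImporter):
    HEADER = "Trans Date,Post Date,Transaction,Description,Amount"

    def identify(self, file):
        # Without this call the parent's MIME-type check would be skipped.
        if not super().identify(file):
            return False
        return bool(re.match(r"\d{8}FidVisa\.csv", path.basename(file.name))
                    and file.head().startswith(self.HEADER))
```

Here the subclass keeps the parent's csv check and adds the filename and header checks on top of it.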


On another note, I think I read in one of your comments that the ofx importer will only work on one account id in a file, or something like that (sorry, I forgot where and couldn't find it).  My wife and I both have Capital One 360 accounts that show up in the same login.  I discovered that when downloading the ofx for one account I get both accounts in the same file, each in its own ofx 'envelope' with the account id.  I put an instance for each in my config file, and when I run bean-extract I get the transactions for both accounts, with the correct Beancount account for each transaction.  So your ofx importer works pretty well, maybe better than you thought.

From memory, each instance will extract the data for one account id.
Using two instances - one for each account - is the correct thing to do.
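
The one-instance-per-account idea can be sketched roughly as follows. StandInOfxImporter is a hypothetical stand-in, not the real ofx importer, and the account ids and account names are made up for illustration; the sketch only shows each instance picking its own account id out of a file that holds several.

```python
class StandInOfxImporter:
    """Hypothetical stand-in: one instance handles exactly one account id."""

    def __init__(self, acctid, account):
        self.acctid = acctid
        self.account = account

    def extract(self, statements):
        # statements: list of (account id, list of transaction descriptions),
        # one entry per OFX "envelope" in the downloaded file.
        return [(self.account, txn)
                for acctid, txns in statements if acctid == self.acctid
                for txn in txns]


# Made-up account ids and account names, one importer instance per account.
CONFIG = [
    StandInOfxImporter('1111', 'Assets:US:CapitalOne:Jonathan'),
    StandInOfxImporter('2222', 'Assets:US:CapitalOne:Spouse'),
]


def extract_all(statements):
    # Every instance sees the whole file and keeps only its own account.
    entries = []
    for importer in CONFIG:
        entries.extend(importer.extract(statements))
    return entries
```

Running extract_all() over a file with two envelopes yields each transaction tagged with the account of the instance that matched its account id.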


 