Need help troubleshooting an importer

no...@flipo.contact

Dec 21, 2021, 7:04:55 AM
to bean...@googlegroups.com
Hi,

I've been a happy user of beancount+fava for years. Although I'm not a developer, I managed to create and maintain an importer, but never took the time to learn how to create unit tests.

This September, my importer stopped working even though I hadn't touched it. Fava now lists the input file under "Non-importable Files", just like the Excel files I imported previously.

I guess the problem comes from a software upgrade (I'm now running fava 1.20.1-2 and beancount 2.3.4-2), and I'm wondering whether pull request #659 somehow broke my importer: [parser: Stop pretending we support encodings other than UTF-8](https://github.com/beancount/beancount/pull/659)

Here's the input file, ACCOUNT_ID.xls, generated by Boursorama (Unicode text, UTF-8 text):
```
"*** Période : 01/10/2021 - 02/10/2021"
"*** Compte : ACCOUNT_ID -EUR "

"DATE OPERATION" "DATE VALEUR" "LIBELLE" "MONTANT" "DEVISE"
" 17/09/2021" " 17/09/2021" "CARTE 11/09/21 PAYEE'S NAME CB*DIGITS" -00000000002,00 "EUR "
```

And the importer.py:
```
import os
import re
import codecs
import csv
import datetime
from beancount.core import number, data, amount
from beancount.ingest import importer
from smart_importer.detector import DuplicateDetector

class Target(object):

    def __init__(self, account, payee=None, narration=None):
        self.account = account
        self.payee = payee
        self.narration = narration

class Importer(importer.ImporterProtocol):
    def __init__(self, account):
        self.account = account

    def identify(self, fname):
        return re.match(r"ACCOUNT_ID",
                        os.path.basename(fname.name))

    def file_account(self, fname):
        return self.account

    def file_date(self, fname):
        fp = codecs.open(fname.name, 'r', 'us-ascii')
        line = fp.readlines()[0]

        period = line.strip('"*** Période : ')
        start = datetime.datetime.strptime(period[0:10], '%d/%m/%Y').date()

        return start

    def file_name(self, fname):
        fp = codecs.open(fname.name, 'r', 'us-ascii')
        line = fp.readlines()[0]

        period = line.strip('"*** Période : ')
        start = datetime.datetime.strptime(period[0:10], '%d/%m/%Y').date()
        end = datetime.datetime.strptime(period[13:23], '%d/%m/%Y').date()
        filename = end.strftime('%Y-%m-%d') + ' ACCOUNT_ID since ' + start.strftime('%Y-%m-%d') + '.xls'

        return filename

    def extract(self, fname, existing_entries):
        fp = codecs.open(fname.name, 'r')
        lines = fp.readlines()

        # drop top and bottom stuff
        lines = lines[4:]
        entries = []

        def fix_decimals(s):
            return s.replace(',', '.')

        for index, row in enumerate(csv.reader(lines, delimiter='\t')):
            meta = data.new_metadata(fname.name, index)
            date = datetime.datetime.strptime(row[0], ' %d/%m/%Y').date()
            desc = row[2].rstrip()
            payee = ""
            currency = row[4]
            num = number.D(fix_decimals(row[3]))
            units = amount.Amount(num, currency)

            frm = data.Posting(self.account, units, None, None, None, None)
            txn = data.Transaction(meta, date, "*", payee, desc,
                                   data.EMPTY_SET, data.EMPTY_SET, [frm])

            entries.append(txn)

        return entries
```
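
One thing that may be worth ruling out independently of beancount and Fava: `file_date` and `file_name` open the file as `us-ascii`, while the first line contains "Période", which is not ASCII when the file is UTF-8. The standalone sketch below writes that header line to a temporary file and tries both encodings. The helper name `try_decode` and the temporary file are only for illustration, and I have not confirmed that this is what makes Fava list the file as non-importable.

```python
import codecs
import os
import tempfile

# First line of the sample file above; it is written to disk as UTF-8 below.
HEADER = '"*** Période : 01/10/2021 - 02/10/2021"\n'


def try_decode(path, encoding):
    """Return the first line decoded with `encoding`, or the error raised.

    Hypothetical helper: it mirrors the codecs.open() call in the importer.
    """
    try:
        with codecs.open(path, 'r', encoding) as fp:
            return fp.readline()
    except UnicodeDecodeError as exc:
        return exc


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'ACCOUNT_ID.xls')
    with open(path, 'w', encoding='utf-8') as fp:
        fp.write(HEADER)

    ascii_result = try_decode(path, 'us-ascii')
    utf8_result = try_decode(path, 'utf-8')

# The 'é' is stored as two non-ASCII bytes, so the us-ascii attempt fails.
print(type(ascii_result).__name__)
print(repr(utf8_result))
```

If the real file behaves the same way, opening it with `'utf-8'` in `file_date` and `file_name` would be the first change to try.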

Any idea? Thank you.