Hi!
It's been a while since I've done much, but a few weekends ago I rewrote all my CSV importers.
My code needed updating for some new changes of my own, and I was also behind on picking up recent changes in beangulp.
Some nice experience came out of it.
I had been unhappy for a long time with the object-oriented mixins and the CSV importer that are in beangulp.
Looking around for which file provided which implementation was always a bit annoying.
It's a lot simpler to have a single protocol (beangulp.Importer) with all abstract methods and just implementations of that (no inheritance of functionality).
In fact, even if I have to duplicate some code in the implementation, I'm still happier with the result that way.
The simplicity is worth the repetition, and having all the code locally visible in a single file is advantageous, especially since this is the type of thing you end up doing reluctantly. In general, when I'm doing accounting imports, the last thing I want is to have to hack on code to adapt to changed file formats; the easier I can make that, the better.
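To make the "single protocol, no inherited functionality" idea concrete, here is a minimal sketch. The `ImporterProtocol` class below is a hypothetical stand-in written only for illustration; it is not beangulp's actual `Importer` class, and the method names and signatures are assumptions. The bank file layout and the `ExampleBankImporter` name are made up as well.

```python
import abc
import csv


class ImporterProtocol(abc.ABC):
    """Hypothetical stand-in for a single importer protocol (not beangulp's class)."""

    @abc.abstractmethod
    def identify(self, filepath):
        """Return True if this importer can handle the file."""

    @abc.abstractmethod
    def account(self, filepath):
        """Return the account name associated with the file."""

    @abc.abstractmethod
    def extract(self, filepath):
        """Return the rows extracted from the file."""


class ExampleBankImporter(ImporterProtocol):
    """Everything this importer does is visible in this one class."""

    def __init__(self, account_name):
        self.account_name = account_name

    def identify(self, filepath):
        # Recognize the file by its (made-up) header line.
        with open(filepath, newline="") as infile:
            return infile.readline().strip() == "Date,Description,Amount"

    def account(self, filepath):
        return self.account_name

    def extract(self, filepath):
        # Return plain dicts; a real importer would build Beancount entries.
        with open(filepath, newline="") as infile:
            return [dict(row, account=self.account_name)
                    for row in csv.DictReader(infile)]
```

The trade-off is the one described above: each importer repeats a little file-handling code, but nothing about its behavior is hidden in a mixin defined in another file.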
As it turns out, a heavily configurable CSV importer is not best served by a class + config abstraction. It's a lot simpler to read and massage the input table with "petl" to convert the types (dates and numbers, mostly), normalize the column names and then call a generic little helper function to construct Transaction instances. For many of my simple CSVs, I've been using this extremely simple helper:
and these parser functions:
The petl code really is as simple as, and much more powerful than, a custom configuration that attempts to support all variations and think ahead about all the possibilities.
This is the key: that code *is* the transformation configuration, and the petl API is quite elegant and minimal in that way.
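As a rough illustration of "the code is the configuration", here is a sketch of the same shape of pipeline (normalize column names, convert types, build transactions) written with only the standard library so that it runs without dependencies. It is a plain-Python stand-in, not petl code and not the helper from this post. The column names, the `Txn` tuple (a hypothetical substitute for a Beancount Transaction), and all function names are assumptions.

```python
import csv
import datetime
import io
from decimal import Decimal
from typing import NamedTuple


class Txn(NamedTuple):
    """Hypothetical stand-in for a Beancount Transaction."""
    date: datetime.date
    narration: str
    amount: Decimal
    account: str


def parse_date(text):
    # Assumes US-style dates such as 01/15/2024.
    return datetime.datetime.strptime(text.strip(), "%m/%d/%Y").date()


def parse_amount(text):
    # Strip currency symbols and thousands separators before converting.
    return Decimal(text.replace("$", "").replace(",", "").strip())


# The "configuration" is just these two mappings plus the code below.
COLUMNS = {"Trans. Date": "date", "Description": "narration", "Amount ($)": "amount"}
CONVERTERS = {"date": parse_date, "narration": str.strip, "amount": parse_amount}


def read_table(fileobj):
    """Yield one dict per row with normalized names and converted values."""
    for row in csv.DictReader(fileobj):
        renamed = {COLUMNS[name]: value for name, value in row.items() if name in COLUMNS}
        yield {name: CONVERTERS[name](value) for name, value in renamed.items()}


def make_transactions(rows, account):
    """Generic helper: turn normalized rows into Txn instances."""
    return [Txn(account=account, **row) for row in rows]


source = io.StringIO(
    'Trans. Date,Description,Amount ($)\n'
    '01/15/2024, Grocery store ,"-1,234.56"\n'
)
txns = make_transactions(read_table(source), "Liabilities:CreditCard")
```

Adapting to a changed file format then means editing a mapping or a parser function in place, rather than finding the right option on a configurable class.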
Here's an example of such a CSV importer using petl (but not the helper above, this one creates transactions for groups of rows with the same id):
What I ended up with is so much easier to work with when debugging is needed that I'm tempted to declare the CSV importer implementation that's in beangulp deprecated.
I have no intention of adding to that functionality going forward.
I think we should probably even delete it and the mixins in the next release. I have a feeling nobody's been using them anyway (nobody ever asked questions about them; I was probably alone in using them), and it's less code to maintain. If you rely on them, say something.
We could add a tag for the last version with them available.
Any thoughts?