Hi all,
I have been improving and polishing CubETL, the tool I use for ETL processes. It's still alpha, but I wanted to share it.
The main focus was to be able to model OLAP schemas. It can inspect SQL databases, generate a default OLAP schema for them, and then generate a Cubes model and config from that schema. CubETL supports arbitrarily nested dimensions, which it flattens when exporting to Cubes.
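To give a rough idea of what flattening a nested dimension means, here is a small Python sketch. This is not CubETL's actual code or API: the dict layout, the dotted naming and the flatten_dimension helper are all made up for illustration.

```python
def flatten_dimension(dim, prefix=""):
    """Return a flat list of attribute names for a (possibly nested) dimension.

    `dim` is a hypothetical dict with a "name", an optional list of
    "attributes" and an optional list of nested "dimensions".
    """
    name = prefix + dim["name"]
    # Attributes of this level, qualified with the full path so far.
    flat = [f"{name}.{attr}" for attr in dim.get("attributes", [])]
    # Recurse into nested dimensions, carrying the path as a prefix.
    for child in dim.get("dimensions", []):
        flat.extend(flatten_dimension(child, prefix=name + "."))
    return flat


# Hypothetical example: customer -> country -> region
customer = {
    "name": "customer",
    "attributes": ["id", "name"],
    "dimensions": [
        {
            "name": "country",
            "attributes": ["code", "name"],
            "dimensions": [{"name": "region", "attributes": ["name"]}],
        },
    ],
}

flat_attributes = flatten_dimension(customer)
print(flat_attributes)
```

The nested structure ends up as a single flat list of qualified attribute names, which is the general shape a non-nested dimension model expects.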
You can also define your OLAP schema and your own SQL mappings manually, or let CubETL generate the SQL mappings for you.
It also has a variety of other facilities typical of ETL tools. It reads and writes files, CSV, JSON and XML, handles in-memory tables, SQL tables, query lookups, table lookups and caching, and reads the PcAxis and SDMX multidimensional data formats.
I have done a major refactor and some parts of the code have not yet been migrated, so it is still in its early stages. But it does work, I'm using it for a few ETL processes, and I wanted to share it and see if anyone is interested.
Note: when generating a schema from a database and serving it with Cubes, you may need to use the "alias-issue" branch of my cubes repository (https://github.com/jjmontesl/cubes/), as the current Cubes pip version has a bug related to table aliases. Clone my cubes repo, check out the alias-issue branch and run 'python setup.py develop' from there.
Best regards!