I want to use Refine as part of a batch ETL process: take a CSV exported from an Oracle table, clean and transform it as necessary with Refine expressions, convert it to XML, and then load it into an XML DB, where I will denormalise the multiple normalised relational tables. The tables could run into millions of rows.
I can see how, using the Python/Ruby scripts, I could batch up a large CSV into 100K-row projects, then load and refine each one. What I can't see is how I could then export using the templating exporter to turn the output into XML. Is there a URL I can post to in order to do a custom template export? Do I need to write a custom extension?
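A minimal sketch of the batching step described above, assuming the CSV has a single header row that each chunk should repeat. The function name `split_csv` and the 100,000-row default are illustrative choices, not part of any OpenRefine client library; each yielded chunk would then be uploaded as its own project.

```python
import csv
import io
from itertools import islice


def split_csv(source, rows_per_chunk=100_000):
    """Yield CSV text chunks of at most rows_per_chunk data rows.

    `source` is any iterable of lines (e.g. an open file). The header row
    is repeated at the top of every chunk so each chunk can be loaded as
    a separate project. (Hypothetical helper, for illustration only.)
    """
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to yield
    while True:
        rows = list(islice(reader, rows_per_chunk))
        if not rows:
            break
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        yield buf.getvalue()
```

Reading through `csv.reader` rather than splitting on raw newlines keeps quoted fields that contain line breaks intact within a single chunk.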
I've given the whole process I am trying to perform above so that if anyone has any other thoughts on this then please shout!
Thanks,
Dave
Thanks Martin. I've seen other posts commenting that OpenRefine isn't an ETL tool, and I understand. It is, however, a potentially very useful step when putting your own custom ETL process together, which one often has to do. To add a little more colour to my previous post, this library: wraps HTTP calls into OpenRefine, such as: /command/core/apply-operations?project=#{@project_id}
If there was a URI for the templater I could use this in a batch process. That's all I need.
I guess I can go work that out from the source code, however it would be great if the project published these URLs as an API.
I am trying to get https://github.com/felixlohmeier/openrefine-batch#options working for JSON processing, but I will also want to interact with the templater. Can anyone share solutions they have found?
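One approach that may work, sketched under assumptions: existing HTTP client scripts appear to drive the templating exporter by POSTing to `/command/core/export-rows` with `format=template` plus `template`, `prefix`, `suffix` and `separator` fields. That endpoint and those field names are not confirmed anywhere in this thread, so check them against your OpenRefine version (newer releases may also require a CSRF token on some commands). The helper `build_template_export`, the host and port, the project id and the `name` column are all hypothetical. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_template_export(base_url, project_id, template,
                          prefix="", suffix="", separator=""):
    """Build (but do not send) a POST request for a templated export.

    Assumption: the server exposes /command/core/export-rows and accepts
    these form fields; verify against the OpenRefine version in use.
    """
    fields = {
        "project": str(project_id),
        "format": "template",
        # Empty engine config: no facets, export every row.
        "engine": '{"facets":[],"mode":"row-based"}',
        "prefix": prefix,
        "suffix": suffix,
        "separator": separator,
        "template": template,
    }
    data = urlencode(fields).encode("utf-8")
    url = base_url.rstrip("/") + "/command/core/export-rows"
    return Request(url, data=data, method="POST")


# Hypothetical values for illustration only.
request = build_template_export(
    "http://127.0.0.1:3333",
    project_id=1234567890,
    template='<row><name>{{escape(cells["name"].value, "xml")}}</name></row>',
    prefix="<rows>\n",
    suffix="\n</rows>",
    separator="\n",
)
```

Passing `request` to `urllib.request.urlopen` would send it; if the assumptions hold, the response body would be the XML document, which a batch script could write to disk for each chunked project.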