Automating OpenRefine

Felix Lohmeier

Jun 18, 2021, 4:16:48 AM
to OpenRefine
Hi OpenRefine Community,

I would like to let you know that new templates for automating OpenRefine on Linux are available. Over the past few years I have tried different approaches (including openrefine-batch), and I have now settled on a combination of the command-line tool openrefine-client and the task runner go-task (an alternative to GNU Make).
https://github.com/opencultureconsulting/openrefine-task-runner
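
For readers who have not used openrefine-client before, here is a rough sketch of the create/apply/export sequence that a task runner would chain together. It only builds and prints the command lines. The flags (`--create`, `--projectName`, `--apply`, `--export`, `--output`) are assumed to work as described in the openrefine-client README, so please check them against `openrefine-client --help`. The file names, the project name and the helper `build_commands` are made up for illustration and are not taken from the templates.

```python
import shlex


def build_commands(source_file, project, history_file, output_file,
                   client="openrefine-client"):
    """Return the command lines for one create/apply/export run.

    Assumption: the flags below match the openrefine-client README;
    verify them with `openrefine-client --help` before use.
    """
    return [
        # 1. Import the source file into a new OpenRefine project.
        [client, "--create", source_file, f"--projectName={project}"],
        # 2. Apply a saved operation history (JSON) to that project.
        [client, "--apply", history_file, project],
        # 3. Export the transformed project to a file.
        [client, "--export", f"--output={output_file}", project],
    ]


if __name__ == "__main__":
    # Hypothetical file and project names, printed rather than executed.
    for command in build_commands("input.csv", "demo",
                                  "history.json", "output.tsv"):
        print(shlex.join(command))
```

Each of the three steps would typically become one task in the task runner, so that a failed step stops the run before the next one starts.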

To try it out, you can launch an environment directly via Binder:
https://notebooks.gesis.org/binder/v2/gh/opencultureconsulting/openrefine-task-runner/main?urlpath=lab/tree/demo.ipynb

Please be aware that OpenRefine is not designed for automation. The internal API is not versioned and may change. The operation history in JSON format is not very convenient to work with and may change as well. Error handling could also be better. The development team therefore recommends against using OpenRefine for complex workflows (see, for example, Tom's comment: https://groups.google.com/g/openrefine-dev/c/42mdP8gyt4M/m/s21fJ3W6BQAJ).
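
Since error handling is limited, it can help to sanity-check an operation history before handing it to OpenRefine. The following is a minimal sketch, assuming the history is the JSON array that OpenRefine's Undo / Redo "Extract" dialog produces, where each step is an object with an `op` key. The sample data and the function name `summarize_history` are invented for illustration, and the format itself may change, as noted above.

```python
import json
from collections import Counter

# Invented sample history; the last step is deliberately malformed.
SAMPLE_HISTORY = """
[
  {"op": "core/text-transform", "columnName": "name",
   "expression": "value.trim()",
   "description": "Trim whitespace in column name"},
  {"op": "core/column-removal", "columnName": "notes",
   "description": "Remove column notes"},
  {"columnName": "broken"}
]
"""


def summarize_history(text):
    """Parse an operation history and return (op counts, problems)."""
    operations = json.loads(text)
    if not isinstance(operations, list):
        raise ValueError("expected a JSON array of operations")
    counts = Counter()
    problems = []
    for index, operation in enumerate(operations):
        # Every step is assumed to be an object carrying an "op" identifier.
        op_id = operation.get("op") if isinstance(operation, dict) else None
        if not op_id:
            problems.append(f"step {index}: no 'op' key")
            continue
        counts[op_id] += 1
    return counts, problems


if __name__ == "__main__":
    counts, problems = summarize_history(SAMPLE_HISTORY)
    for op_id, number in sorted(counts.items()):
        print(f"{op_id}: {number}")
    for problem in problems:
        print("WARNING:", problem)
```

A check like this does not validate the operations themselves; it only catches structurally broken files early, before a long-running job fails halfway through.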

Nevertheless, I have had good experiences with it in various projects and thought that the templates might be interesting for others as well.

Best regards,
Felix

Antonin Delpeuch (lists)

Jun 18, 2021, 5:25:43 AM
to openr...@googlegroups.com
Hi Felix,

Thank you very much!

Although your caveat about the stability of the internal API is correct,
personally I recognize this sort of automation as an important use case
that we should try to support better.

Best,

Antonin

On 18/06/2021 10:16, Felix Lohmeier wrote:
> Hi OpenRefine Community,
>
> I would like to let you know that new templates for automating
> OpenRefine on Linux are available. In the last years I tried different
> approaches (including openrefine-batch) and now I finally work with a
> combination of the command line tool openrefine-client and the task
> runner go-task (an alternative to GNU Make).
> https://github.com/opencultureconsulting/openrefine-task-runner
>
> To try it out, you can directly start an environment via binder:
> https://notebooks.gesis.org/binder/v2/gh/opencultureconsulting/openrefine-task-runner/main?urlpath=lab/tree/demo.ipynb
>
> Please be aware that OpenRefine is not designed for automation. The
> internal API is not versioned and may change. The operation history in
> JSON format is not very handy and may change as well. Error handling
> could also be better. The development team therefore recommends to
> rather not use OpenRefine for complex workflows (see for example Tom's
> comment
> <https://groups.google.com/g/openrefine-dev/c/42mdP8gyt4M/m/s21fJ3W6BQAJ>).
>
> Nevertheless, I have had good experiences with it in various projects
> and thought that the templates might be interesting for others as well.
>
> Best regards,
> Felix