Five things not to do with OpenRefine

Antonin Delpeuch (lists)

Sep 28, 2021, 8:22:45 AM
Hi all,

Here is a really interesting article (in German) about a few things that
do not work well in OpenRefine 3.5:

I am working on making the first two use cases easier to handle in
OpenRefine. In particular, we should have a beta release of the 4.0
version soon, which tackles their second point (handling large datasets).



Thad Guidry

Sep 28, 2021, 8:52:31 AM
I agree with the author on all his points.
I acknowledge that I had other decent tools to help with all those points at the time I was helping David Huynh and Stefano Mazzocchi shape OpenRefine's design, so they weren't a big deal for me. Well, regarding handling larger datasets, I did complain to David a few times, but then just bought more memory :-)
What we didn't have were any open source tools to really help clean datasets, align them easily to Freebase's schema, and batch upload them.
And the most expensive enterprise tools that had clustering started at $100,000, but Stefano's extra work also made that open source and free in OpenRefine.

I'm excited about future improvements to OpenRefine, but I also try never to forget that the world owes a debt to both of them for the features they delivered for free, not for what was intentionally left on the design floor.

You received this message because you are subscribed to the Google Groups "OpenRefine Development" group.