Draft - Announcing Google Refine 2.1


Tom Morris

Jul 18, 2011, 12:36:23 PM
to google-refine-dev
Sorry for the delay in getting this drafted.  Please find below the proposed announcement.  Please proofread and send me corrections/additions.

Per David's suggestion, if you've contributed code to this release (patches count), please vote +1, +1 with comments, or -1 to signify your approval or disapproval.  I'll distribute when we've got three +1s (after addressing any comments).

Tom
-------------
The Google Refine team is pleased to announce version 2.1 of Google Refine, a free, open source, data hacker's power tool for working with messy data, cleaning it, transforming it, and linking it to databases like Freebase or OpenCorporates.  

This is a maintenance release which incorporates fixes and enhancements from both the community and the core team.

Some new features of this release include:
  • HTML parsing functions (based on JSoup) 
  • Metaphone3 (American English) & Cologne Phonetic (German) coders & clustering 
  • Google Fusion Table import support
  • Facet for exact duplicates 
  • Ability to save favorite transforms for reuse later
  • Latest Apache POI library including a number of Excel bug fixes

Of course we've also fixed a bunch of bugs, particularly around character encoding and multinational character support.  You can see a full list of changes at http://code.google.com/p/google-refine/wiki/ChangesFor2p1

Google Refine is designed to be extensible and a number of extensions have been written by users over the past eight months, e.g. the RDF extension from DERI in Galway or the CKAN extension.  These are not bundled with Google Refine, but are easily available and significantly enhance the functionality of the base product.  The wiki has documentation on writing your own extension or reconciliation service.

Other than backing up your data, there is no special upgrade procedure required.  If you are an existing user, you will get prompted with the option to upgrade the next time you run Google Refine.

The kits for Linux, Mac, and Windows are available at:

The project is completely open sourced, liberally licensed, and community driven.  If you'd like to be involved, we'd love to have you.  Check the wiki for ways to participate.

The Google Refine Team

Iain Sproat

Jul 18, 2011, 12:39:33 PM
to google-r...@googlegroups.com
+1
Is there a wiki page which explains how to back up your data? It would
be good to link to that from "backing up your data".

Iain

David Huynh

Jul 18, 2011, 1:04:01 PM
to google-r...@googlegroups.com
+1. I like that you're pointing out the extensions, too!

I'd only change "Ability to save favorite transforms for reuse later" to "Ability to star favorite expressions for reuse later".

(BTW, I also have new versions of those 3 screencasts to upload very soon. I need to figure out how to properly replace the old ones without changing their URLs.)

David

Paul Makepeace

Jul 18, 2011, 1:18:29 PM
to google-r...@googlegroups.com
+1

Few minor tweaks below...

s/and/;/

> been written by users over the past eight months, e.g. the RDF
> extension from DERI in Galway or the CKAN extension.  These are not bundled

s/or/and/

> with Google Refine, but are easily available and significantly enhance the
> functionality of the base product.

How about something in the positive sense, e.g.,

"Extensions are separate downloads and significantly..."

(I've never used an extension(!) so some elaboration/improvement on
"separate downloads" might be better.)

>  The wiki has documentation on writing

How about linking 'wiki' to the wiki home page, on the off chance
that's a new concept for anyone.

Thad Guidry

Jul 18, 2011, 1:24:32 PM
to google-r...@googlegroups.com
+1 Yeah, we need pointers to backing up your data; specifically, the whole Refine workspace folder itself, with all the projects in it, should be zipped up and put somewhere safe.  I mention this since I found a bug with ReconciliationManager keeping failed Standard Reconciliation services, and David said he will patch it after 2.1.
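
For anyone who wants to script the zip-up step described above, here is a rough sketch in Python. It is not an official Refine tool: the function name `backup_workspace` and the archive naming are made up for illustration, and you have to supply the workspace folder location yourself, since it differs by platform and the sketch does not try to guess it.

```python
import shutil
import time
from pathlib import Path


def backup_workspace(workspace_dir, backup_dir):
    """Zip the whole workspace folder into backup_dir; return the archive path.

    Hypothetical helper: both paths are supplied by the caller, and
    backup_dir should live outside the workspace folder.
    """
    workspace = Path(workspace_dir)
    if not workspace.is_dir():
        raise FileNotFoundError(f"workspace folder not found: {workspace}")

    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps earlier backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = target / f"refine-workspace-{stamp}"

    # Archive the folder itself (not just its contents), so unzipping
    # recreates a single top-level workspace directory.
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=str(workspace.parent),
        base_dir=workspace.name,
    )
    return Path(archive)
```

Call it with the path to your own workspace folder and a destination outside that folder. It is probably safest to shut Refine down first so that no project is being saved while the archive is written, though that is an assumption rather than something stated in this thread.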
--
-Thad
http://www.freebase.com/view/en/thad_guidry

Iain Sproat

Jul 18, 2011, 1:39:31 PM
to google-r...@googlegroups.com
On Mon, Jul 18, 2011 at 5:39 PM, Iain Sproat <iains...@gmail.com> wrote:
> Is there a wiki page which explains how to back up your data? It would
> be good to link to that from "backing up your data".

I could only find details in the page about upgrading from 1.0 to 2.0,
so I copied out the relevant bits and made a new page:
http://code.google.com/p/google-refine/wiki/BackUpGoogleRefineData

Iain

Tom Morris

Jul 18, 2011, 4:16:26 PM
to google-refine-dev
OK, we're live!  Thanks for everyone's help with the review and comments -- and of course all the work you did on the release.

There was a tiny last-minute glitch with the auto update notification, but we were able to patch it in the releases.js file (which is good because it's too late to patch 2.0!).

Tom

David Huynh

Jul 18, 2011, 4:23:23 PM
to google-r...@googlegroups.com
Thanks, Tom! Much appreciated!

David

Stefano Mazzocchi

Jul 18, 2011, 5:36:41 PM
to google-r...@googlegroups.com
yay!

Great job everybody!
--
Stefano Mazzocchi  <stef...@google.com>
Software Engineer, Google Inc.
