Announcing Google Refine 2.1

Tom Morris

Jul 18, 2011, 4:13:40 PM
to google-refine
The Google Refine team is pleased to announce version 2.1 of Google Refine, a free, open source, data hacker's power tool for working with messy data, cleaning it, transforming it, and linking it to databases like Freebase or OpenCorporates.

This is a maintenance release that incorporates fixes and enhancements from both the community and the core team.

Some new features of this release include:
  • HTML parsing functions (based on JSoup) 
  • Metaphone3 (American English) & Cologne Phonetic (German) coders & clustering 
  • Google Fusion Table import support
  • Facet for exact duplicates 
  • Ability to star favorite expressions for reuse later
  • Latest Apache POI library including a number of Excel bug fixes
Of course we've also fixed a bunch of bugs, particularly around character encoding and multinational character support. You can see a full list of changes at http://code.google.com/p/google-refine/wiki/ChangesFor2p1

Google Refine is designed to be extensible; a number of extensions have been written by users over the past eight months, e.g. the RDF extension from DERI in Galway and the CKAN extension. Extensions are distributed separately by their publishers and significantly enhance the functionality of the base product.  The wiki has documentation on writing your own extension or reconciliation service.

Other than backing up your data, there is no special upgrade procedure required. If you are an existing user, you will be prompted with the option to upgrade the next time you run Google Refine.

The kits for Linux, Mac, and Windows are available for download from:

The project is completely open sourced, liberally licensed, and community driven.  If you'd like to be involved, we'd love to have you.  Check the wiki for ways to participate.

The Google Refine Team
