NEW: OpenRefine Named-Entity Recognition extension


Ruben Verborgh

Dec 20, 2012, 10:18:46 AM
to openr...@googlegroups.com, Seth van Hooland, made...@ulb.ac.be, to...@google.com
Dear OpenRefine enthusiasts,

The Free Your Metadata team has a special Christmas treat for you all:
today, we've released a named-entity recognition extension for OpenRefine.

With this extension, you can enrich text fields right from your workspace.
Currently, we support AlchemyAPI, DBpedia Lookup and Zemanta.

Give this extension a try and let us know what you think!

Best wishes,

Ruben for the Free Your Metadata team

PS The source code is available on GitHub.

Thad Guidry

Dec 20, 2012, 10:28:42 AM
to openr...@googlegroups.com
Great job, Ruben!

BTW, are you at Ghent U., or somewhere else?


Tom Morris

Dec 20, 2012, 10:29:51 AM
to openr...@googlegroups.com, Seth van Hooland, made...@ulb.ac.be, to...@google.com
On Thu, Dec 20, 2012 at 10:18 AM, Ruben Verborgh <ruben.v...@ugent.be> wrote:
> Dear OpenRefine enthusiasts,
>
> The Free Your Metadata team has a special Christmas treat for you all:
> today, we've released a named-entity recognition extension for OpenRefine.
>
> With this extension, you can enrich text fields right from your workspace.
> Currently, we support AlchemyAPI, DBpedia Lookup and Zemanta.
>
> Give this extension a try and let us know what you think!

Very cool!  Thanks for contributing.
 
For those who aren't familiar with the lingo, it might be useful if you gave a concrete example or two, either here or on the web page, to help folks decide whether they should be interested. e.g. If you start with <this text> and use <this service> you'll get <N> new columns containing <this information>.

Tom

Ruben Verborgh

Dec 20, 2012, 10:32:55 AM
to openr...@googlegroups.com
Hi Thad,

> Great job, Ruben!

Thanks a lot! 

> BTW, are you at Ghent U., or somewhere else?

I’m affiliated with Multimedia Lab, a research group of Ghent University and iMinds.
Free Your Metadata is a joint project with the MaSTIC group from the Université Libre de Bruxelles.

Best,

Ruben 

Ruben Verborgh

Dec 20, 2012, 10:39:51 AM
to openr...@googlegroups.com, Seth van Hooland, made...@ulb.ac.be, to...@google.com
Hi Tom,

> For those who aren't familiar with the lingo, it might be useful if you gave a concrete example or two, either here or on the web page, to help folks decide whether they should be interested. e.g. If you start with <this text> and use <this service> you'll get <N> new columns containing <this information>.

That's a great idea. I'll write it down here and see how I can incorporate it in our website.

What does the named-entity recognition extension do?
Suppose you have a project that contains a column with textual descriptions. You want structured data, but machines cannot process plain text yet.
This extension will extract important terms (called named entities) from the text.
For example, if one of the columns contains the text:
"Borut Pahor (born 2 November 1963) is a Slovenian politician who was Prime Minister of Slovenia from 2008 to 2012. He was elected as President of Slovenia in December 2012."
then named-entity extraction will create a new column with the terms "Borut Pahor", "2 November", "politician", "Prime Minister of Slovenia", "President of Slovenia" (in the case of DBpedia Lookup). Additionally, each of those terms will be associated with a link that represents it.
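
As a rough illustration of the transformation described above (not the extension's actual code), the sketch below adds an "entities" column to a row. The recognizer is a hypothetical stub with a hard-coded term list standing in for a real service, and the example.org links are placeholders rather than the links a service such as DBpedia Lookup would return.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Entity:
    term: str  # the extracted named entity
    uri: str   # link representing the entity (placeholder value in this sketch)


def stub_recognizer(text: str) -> List[Entity]:
    """Hypothetical stand-in for a NER service: reports known terms found in text."""
    known_terms = [
        "Borut Pahor",
        "2 November",
        "politician",
        "Prime Minister of Slovenia",
        "President of Slovenia",
    ]
    return [
        Entity(term, "http://example.org/entity/" + term.replace(" ", "_"))
        for term in known_terms
        if term in text
    ]


def add_entity_column(
    rows: List[Dict[str, object]],
    source: str,
    target: str,
    recognize: Callable[[str], List[Entity]],
) -> List[Dict[str, object]]:
    """Return copies of the rows with a new column holding the recognized entities."""
    return [{**row, target: recognize(str(row[source]))} for row in rows]


rows = [{
    "description": (
        "Borut Pahor (born 2 November 1963) is a Slovenian politician who was "
        "Prime Minister of Slovenia from 2008 to 2012. He was elected as "
        "President of Slovenia in December 2012."
    )
}]

enriched = add_entity_column(rows, "description", "entities", stub_recognizer)
for entity in enriched[0]["entities"]:
    print(entity.term, "->", entity.uri)
```

The original column is left untouched; the new column carries each term together with its link, which is the shape of result described in the message.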

Best,

Ruben 

David Huynh

Dec 20, 2012, 2:34:56 PM
to openr...@googlegroups.com
This is really great, FYM team! NER has indeed been a major feature area that Refine is lacking. If you have resources, an end-to-end use case scenario to illustrate the power of this feature would be awesome. By end-to-end I mean something like: get realistic data from somewhere, do NER, do further analysis, arrive at some insights.

David



Ruben Verborgh

Dec 21, 2012, 3:01:23 AM
to openr...@googlegroups.com
Hi David,

Thanks. We're working on resources right now.

On the one hand, there is our forthcoming publication: Seth van Hooland, Max De Wilde, Ruben Verborgh, Thomas Steiner, and Rik Van de Walle. Named-Entity Recognition: A Gateway Drug for Cultural Heritage Collections to the Linked Data Cloud?
In this article, we will focus on a cultural heritage collection and analyze the results and added value in detail.

On the other hand, we plan to provide an introductory video and dataset on our website to get people started, similar to our previous Refine tutorials.

Best,

Ruben

Martin Magdinier

Dec 21, 2012, 8:01:03 AM
to openrefine
Kudos for this work. Free Your Metadata has done a great job promoting OpenRefine and is now extending it! Thanks.

Part of my Christmas break plan is to get the landing page and wiki updated. It will include links to all working extensions (download, doc and code). Of course you're welcome to update the wiki doc regarding your extension. 

Martin





Mateja Verlic

Jan 14, 2013, 8:19:20 AM
to openr...@googlegroups.com
Hi Ruben,

this is just amazing! Well, I have a question/suggestion for you...

I am one of the developers working on a LOD-enabled version of OpenRefine (LODRefine) as part of the LOD2 project. LODRefine is basically OpenRefine plus integrated extensions from DERI and Zemanta, which provide additional functionality for the Linked Open Data community.
Would you mind if I included your extension as well? We have already implemented named-entity extraction using the Zemanta API, but your extension offers additional NER services, which is just great. :)

Please let me know what you think about my suggestion.

Kind regards,
Mateja

Ruben Verborgh

Jan 14, 2013, 8:50:45 AM
to openr...@googlegroups.com
Dear Mateja,

Great idea. We'd love to have our extension in LODRefine :-)
Please keep us updated, and don't hesitate to contact us if you need any help!

Best,

Ruben

brind...@gmail.com

Apr 6, 2020, 9:05:30 AM
to OpenRefine
Hi Ruben,
Could you please help me with a tutorial on extracting entities from running plain text and converting them to RDF using OpenRefine?


Thanks,
Brindha