Remove URL Encoding

Sean Leahy

Jan 9, 2014, 11:07:27 AM
to openr...@googlegroups.com

Hi,

I have a column in my project that contains a lot of URL-encoded characters that I would like to decode. If I perform a single replace, that particular encoded character is replaced as expected, e.g.:
replace(value, "%26", "&")
But given that there are so many characters, I would like to handle them all in one transform statement rather than one character type at a time.

I have also tried the following, but it is not recursive (only the first matching replacement is applied):

if (value.contains('%20'),replace(value, "%20", " "),
if (value.contains('%21'),replace(value, "%21", "!"),
if (value.contains('%22'),replace(value, "%22", "\""),
if (value.contains('%23'),replace(value, "%23", "#"),
if (value.contains('%24'),replace(value, "%24", "$"),
if (value.contains('%25'),replace(value, "%25", "%"),
if (value.contains('%26'),replace(value, "%26", "&"),
if (value.contains('%27'),replace(value, "%27", "'"),
if (value.contains('%28'),replace(value, "%28", "("),
if (value.contains('%29'),replace(value, "%29", ")"),
if (value.contains('%2A'),replace(value, "%2A", "*"),
if (value.contains('%2B'),replace(value, "%2B", "+"),
if (value.contains('%2C'),replace(value, "%2C", ","),
if (value.contains('%2D'),replace(value, "%2D", "-"),
if (value.contains('%2E'),replace(value, "%2E", "."),
if (value.contains('%2F'),replace(value, "%2F", "/"), value)))))))))))))))))

How can I make this recursive, so that they are all processed in one GREL transform statement? Or is there a function for this in GREL?

Thanks,

slahy

Tom Morris

Jan 9, 2014, 11:14:54 AM
to openr...@googlegroups.com
The "if" is implicit and you can chain functions, so you could just do

  value.replace("%20", " ").replace("%21", "!") ...

but even that is more work than you need to do since we have 

  value.unescape("url")

That function will also handle html, xml, csv, and javascript escaping (with the appropriate mode parameter).
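
To illustrate the difference between the three approaches in this thread, here is a minimal sketch in Python rather than GREL. It is only a model of the behavior described above: the `ESCAPES` table and the function names `first_match_only` and `chained` are hypothetical, and Python's `urllib.parse.unquote` stands in for a built-in URL decoder. It is not OpenRefine's implementation, and the two may differ in edge cases.

```python
from urllib.parse import unquote

# Hypothetical subset of the escapes listed in the question.
ESCAPES = {"%20": " ", "%26": "&", "%2F": "/"}


def first_match_only(value):
    """Model of the nested if(...) expression: only the first matching branch runs."""
    for code, char in ESCAPES.items():
        if code in value:
            return value.replace(code, char)
    return value


def chained(value):
    """Model of value.replace(...).replace(...): every replacement is applied in turn."""
    for code, char in ESCAPES.items():
        value = value.replace(code, char)
    return value


sample = "a%20b%26c%2Fd"
print(first_match_only(sample))  # a b%26c%2Fd
print(chained(sample))           # a b&c/d
print(unquote(sample))           # a b&c/d
```

The nested version stops after the first escape it finds, which is why the original expression appeared not to be "recursive". The chained version and the library decoder both handle every escape in the sample, but the decoder needs no hand-maintained table.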

Tom

Sean Leahy

Jan 9, 2014, 11:22:56 AM
to openr...@googlegroups.com
Well, that is much simpler. Thanks, Tom.

Is there documentation for these functions, just to see what else is available?

Tom Morris

Jan 9, 2014, 11:25:46 AM
to openr...@googlegroups.com
On Thu, Jan 9, 2014 at 11:22 AM, Sean Leahy <sle...@gmail.com> wrote:

> Is there documentation for these functions just to see what else is available?

The Help tab in the Transform dialog has a complete list. There is additional documentation available on the wiki at https://github.com/OpenRefine/OpenRefine/wiki (which is linked to from our main site at http://openrefine.org/documentation.html).

Tom

Tom Morris

Jan 9, 2014, 11:27:31 AM
to openr...@googlegroups.com
P.S. unescape(value, "url") and value.unescape("url") are equivalent, so you can choose whichever style is more comfortable or familiar for you, but the latter works much better for chaining multiple functions together.

Tom