How to keep Regex selection in a new column ?

174 views
Skip to first unread message

Marco RUDI

unread,
May 17, 2019, 12:07:33 PM5/17/19
to OpenRefine
Hi,

I have lot information in column number 1. I use Regex expression in openrefine to select information i would. My purpose is to keep this information in a new column.

Any ideas ? I'd try value.find($$$) and i have godd information but i don"t know how to keep regex result in a new comlumn.

Thanks for your help.

Owen Stephens

unread,
May 17, 2019, 12:24:53 PM5/17/19
to OpenRefine
Hi Marco,

It may depend on exactly what you want to do, but I'd suggest the following:

  • Use Edit Column -> Add Column based on this column
  • In GREL use the `match` function with a regular expression
Unlike `find` the `match` function requires you to write a complete match from the whole cell content - essentially it silently puts a ^ at the start of your expression and a $ at the end.
To 'capture' the bits of the expression you want you can use the usual regular expression capture groups with parentheses (  and ) 
The `match` function captures these into an array, and you can access the bits of that array and store them. 

This is probably easier to show with an example. Lets imagine cell with the content "This is a match TERM match". The GREL:

value.match(/This is a match ([A-Z]+) match/)

will give you an array with a single entry - the word TERM. To store the content in a cell, you need a string (or number/date) not an array - so you need to do a bit more GREL to get to a string  -e.g.

value.match(/This is a match ([A-Z]+) match/)[0]

The [0] extracts the first thing in the array generated by `match` - which is the word "TERM"

The find function works in a very similar way and you could use:

value.find(/ [A-Z]+ /)

This would also result in an array containing a single entry with the word "TERM". But because find will find all matches to your pattern in the cell you have to be careful - a very similar expression:

value.find(/[A-Z]+/)

Would find two matches in the example string I gave "T" and "TERM" - so you'd have an array with two matches instead of one.

Sorry this seems like a complicated explanation for a simple question!
If you need any additional help please supply a specific example of the starting content of the cell and the desired captured content and I can try and give you a more specific piece of GREL that will work

Owen
Message has been deleted

Marco

unread,
May 20, 2019, 7:19:05 AM5/20/19
to OpenRefine
Thanks Owen for your response.

Here is an example of what I want. Purpose is simple, put this result in a new column.

open_re.JPG


Ettore RIZZA

unread,
May 20, 2019, 7:29:49 AM5/20/19
to openrefine
Hello Marco,

When the transformation window displays something in square brackets, such as ["x", "y", "z"], it is an array. OpenRefine cannot display arrays in a column. You can either:

- Choose an item from the list using its zero-based index, like this: value.find(/your regex/)[0] <-- select the first element in the array. 
- Use .join("SEPARATOR") to concatenate the array into a string, for example like this: value.find(/your regex/).join(",")

Cheers,

Ettore Rizza


--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openrefine/615b3171-5950-40ce-8f9c-cb460e14e98f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Marco

unread,
May 20, 2019, 8:46:40 AM5/20/19
to OpenRefine
@Ettore:

Nice ! exactly what i want. Thanks a lot.

Do you know a nice book to learn this language ?

Ettore Rizza

unread,
May 20, 2019, 10:55:50 AM5/20/19
to OpenRefine
There is only one book on OpenRefine, "Using OpenRefine" by Max De Wilde and Ruben Verborgh. I recommend it for beginners, but it only describes the basics and does not cover advanced topics such as GREL. Library Carpentry lessons or Martin Magdinier's video course can be used as a complement. 
Reply all
Reply to author
Forward
0 new messages