GREL filter find NOT "this string"


Jonathon Paarlberg

Aug 27, 2015, 2:50:48 PM
to OpenRefine
How do I invert a search for a character? For example, I have a multipurpose text field whose sub-fields, which I've split into rows within the original record, are usually delimited by a code followed by an equals sign (=). How do I search only for those rows that do not contain the "="?

Thanks once again for any help you can offer. I really appreciate this forum.

Joe Wicentowski

Aug 27, 2015, 3:10:06 PM
to openr...@googlegroups.com
The regex for "a non-= character" is:

[^=]

For learning regex, I'd suggest reading Chapter 8 of the TextWrangler
manual (http://www.barebones.com/support/textwrangler/manual.html).
You'll find an explanation of this character class exclusion on page
138.
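
One caveat worth spelling out, as a rough sketch rather than anything from the manual: [^=] matches a single character that is not "=", so a plain search succeeds on any row that has at least one such character, including rows that also contain "=". To keep only rows with no "=" anywhere, the class has to cover the whole string. The Python below uses the standard re module to illustrate the difference; the sample rows are made up.

```python
import re

# Hypothetical sample rows, not taken from the original data.
rows = ["a=1", "b=2", "free text", "==", ""]

# A bare search for [^=] succeeds whenever the row has at least one
# character other than "=", so "a=1" and "b=2" still match.
has_non_equals_char = [r for r in rows if re.search(r"[^=]", r)]

# Requiring the class to span the entire string keeps only rows
# that contain no "=" at all (including the empty string).
no_equals_at_all = [r for r in rows if re.fullmatch(r"[^=]*", r)]

print(has_non_equals_char)
print(no_equals_at_all)
```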

Joe

Thad Guidry

Aug 27, 2015, 3:21:56 PM
to openrefine
You can also use a Text Facet (even several at once, on different columns) to filter on them quickly. Clicking include or exclude on a facet value then shows or hides those rows.

Facets are powerful in OpenRefine. Use them as your first-pass filtering mechanism whenever you can.
--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jonathon Paarlberg

Aug 27, 2015, 4:43:19 PM
to OpenRefine
Thanks, guys. I'll add that manual to my reference library, Joe.

Thad, how would you write a custom filtering expression that would find only those rows containing a "="? If I can do that, then I can get the rows I want by inverting the facet. Otherwise, I could just as easily use the GREL filtering search that Joe pointed out.

Thad Guidry

Aug 27, 2015, 5:07:21 PM
to openrefine
We have docs....you just have to read them thoroughly. :)

Here's probably what you're asking about directly... but that whole page on Filtering/Faceting is highly important.

I would suggest going through all of the "Feature Areas: Essential" pages from here: https://github.com/OpenRefine/OpenRefine/wiki/Documentation-For-Users

We never did put the Wiki docs in a true tutorial form, but we might one day. :)  For now, just go step by step through the Essentials.


Thad Guidry

Aug 27, 2015, 5:12:36 PM
to openrefine
And we collect MANY external resources that are valuable to further your learning of OpenRefine:


David Hay

Jul 8, 2017, 12:22:13 AM
to OpenRefine
I had this problem and solved it with a Custom Text Facet and the indexOf string function

Go to: Facet > Custom Text Facet...

Enter the formula:

value.indexOf("=")

(replace "=" with any other character or substring you wish to filter out)

The text facet will show the position of the character "=" within each string as an integer, but if the character is absent it shows -1.
In the results, click on -1 to filter your dataset down to the strings that do not contain it. Works like a dream.
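
To make the -1 behaviour concrete, here is a small sketch in plain Python, whose str.find follows the same convention (zero-based position, or -1 when the substring is absent). The sample rows and the helper name index_of are invented for illustration; this only models what the facet counts, not how OpenRefine implements it.

```python
from collections import defaultdict

# Hypothetical sample rows; real column values will differ.
rows = ["a=1", "note only", "title=Report", "=x"]

def index_of(value, needle="="):
    # str.find returns the zero-based position of needle, or -1 if absent.
    return value.find(needle)

# Group rows by their facet value, the way the facet panel lists choices.
facet = defaultdict(list)
for row in rows:
    facet[index_of(row)].append(row)

# Selecting the -1 choice leaves only the rows without "=".
rows_without_equals = facet[-1]
print(rows_without_equals)
```

One thing to expect with this approach: the facet lists one choice per distinct position (0, 1, 5, ...), so only the -1 choice is useful for this particular filter.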

Owen Stephens

Jul 10, 2017, 4:57:53 AM
to OpenRefine
Thanks David,

You could also use:

value.contains("=")

which would give you a true/false result instead of a numeric one.
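
As a rough illustration of the true/false version and its inverse, here is the same test in plain Python. The sample rows and the function name contains_equals are made up for the example.

```python
# Hypothetical sample rows; real column values will differ.
rows = ["a=1", "note only", "title=Report"]

def contains_equals(value):
    # True when the row has "=" anywhere, False otherwise.
    return "=" in value

# The "true" and "false" facet choices, respectively.
with_equals = [r for r in rows if contains_equals(r)]
without_equals = [r for r in rows if not contains_equals(r)]

print(with_equals)
print(without_equals)
```

Inside OpenRefine you would click the false choice in the facet, or invert the test in the expression itself. In GREL that should be not(value.contains("=")), and in the Python/Jython expression mode something like return "=" not in value, but check both against your version.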

Jonathon Paarlberg

Aug 27, 2019, 6:07:45 AM
to OpenRefine
Thanks, all. Sorry it's been so long since I looked back. I figured out a faceting technique that worked and then forgot to check the forum post. :-( Nevertheless, I do appreciate your help!