regex help - do not match optional character

olivia solis

Dec 15, 2017, 11:11:55 AM
to OpenRefine
Hi all,

I'm trying to normalize a lot of manually entered dates, including changing "c." --> "circa". The "c." may appear anywhere in the date. I'm using a filter with regular expression checked. The problem here is that a lot of the dates include "Dec." as well. I'm probably missing something obvious, but how do I match:

c. 1985
1985 to c. 1990

But not

Dec. 1998

There are definitely some instances of dates like
c. Dec. 1998
But I'll deal with those limited use cases when I get to them.

[^e]*c\. won't work in cases where the "c." is at the beginning of the expression.

Thanks,
Olivia 

olivia solis

Dec 15, 2017, 11:16:47 AM
to OpenRefine
^^ Correction:
[^e]c\.

won't work.
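
For anyone following along, here is a minimal sketch of why `[^e]c\.` misses the first case. It uses Python's `re` module as a stand-in for the OpenRefine regex filter (an assumption; the thread itself only uses the filter box), and `samples` is just the three dates from the question.

```python
import re

# Pattern from the post: one character that is not "e", then a literal "c."
pattern = re.compile(r"[^e]c\.")

samples = ["c. 1985", "1985 to c. 1990", "Dec. 1998"]

# re.search scans the whole string, like a "contains" style filter.
results = {s: bool(pattern.search(s)) for s in samples}

for text, matched in results.items():
    print(f"{text!r}: {matched}")
```

`[^e]` has to consume a real character, so a "c." at the very start of the value has nothing in front of it to match and the row is skipped.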

Owen Stephens

Dec 15, 2017, 11:17:50 AM
to OpenRefine
Hi Olivia,

Would:

(^|\s)c\.

work? (I.e. either the start of the string or whitespace before the 'c.')
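
A quick way to check this outside OpenRefine is sketched below, again assuming Python's `re` module rather than the OpenRefine filter itself. The helper name `to_circa` is hypothetical. Because the group `(^|\s)` captures the leading whitespace, the replacement puts it back with `\1` so the spacing is preserved.

```python
import re

# Suggested pattern: start of string OR a whitespace character, then "c."
CIRCA = re.compile(r"(^|\s)c\.")

def to_circa(text: str) -> str:
    # \1 restores whatever the group captured (empty at start, else the space).
    return CIRCA.sub(r"\1circa", text)

samples = ["c. 1985", "1985 to c. 1990", "Dec. 1998", "c. Dec. 1998"]
converted = [to_circa(s) for s in samples]

for before, after in zip(samples, converted):
    print(f"{before!r} -> {after!r}")
```

"Dec." is left alone because the "c." there is preceded by "e", which is neither the start of the string nor whitespace.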

John Little

Dec 15, 2017, 11:20:36 AM
to openr...@googlegroups.com
I suggest a text filter on "dec. " or "Dec. " and then invert (or exclude) those rows. Then normalize your target "c."

text-filter_invert.png
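
The same filter-then-invert idea can be sketched in plain Python, as an illustration only and not how OpenRefine implements it. The `rows` list and the `has_december` helper are made up for the example.

```python
rows = ["c. 1985", "1985 to c. 1990", "Dec. 1998", "dec. 2001"]

def has_december(text: str) -> bool:
    # Case-insensitive stand-in for filtering on "dec. " or "Dec. "
    return "dec. " in text.lower()

# "Invert" the filter: keep only the rows that do NOT mention December.
kept = [r for r in rows if not has_december(r)]

# With the December rows out of the way, a plain text replace is safe.
normalized = [r.replace("c.", "circa") for r in kept]

print(normalized)
```

Note that a value like "c. Dec. 1998" is also excluded by the inverted filter, so those rows still need separate handling.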

olivia solis

Dec 15, 2017, 11:31:13 AM
to OpenRefine
Thank you, Owen! Yes, that works. And holy moly, John, you just opened my eyes to a new world. I never knew you could use an invert on a text filter.

Owen Stephens

Dec 15, 2017, 11:32:38 AM
to OpenRefine

On Friday, December 15, 2017 at 4:31:13 PM UTC, olivia solis wrote:
Thank you, Owen! Yes, that works.
No problem!

And holy moly, John, you just opened my eyes to a new world. I never knew you could use an invert on a text filter.
That's probably because it's only just been added - new in 2.8!

Owen

John Little

Dec 15, 2017, 11:44:06 AM
to openr...@googlegroups.com
So nice to get a "holy moly" on this fine day -- glad I could open that feature up to you.  But, really good catch Owen!  It's important to note that "invert" is a new 2.8 feature -- thanks to the ongoing hard work of the OR developer community.  

I've used the "include/exclude" feature of facets so much that adding the "invert" feature not only made great sense (!!), but I had already forgotten that "invert" hasn't always been there and is new in 2.8.

olivia solis

Dec 15, 2017, 2:19:13 PM
to OpenRefine
Great idea!