Expression


John Otto Knoke

Nov 25, 2015, 2:49:46 PM
to openr...@googlegroups.com
Hello,

I have been trying this simple expression without success. I am trying to filter for cells that contain only characters and no numbers.

Appreciate any help.


Inline image 1

Tom Morris

Nov 25, 2015, 3:07:12 PM
to openr...@googlegroups.com
The best regular expression will probably depend on the context, because regular expressions are used slightly differently in different places, but it looks like you're working with a Text Filter facet.

The first thing I'd try is: 

    ^[^0-9]*$

which will restrict the entire cell value to zero or more non-digit characters.

The Text Filter facet is more permissive than other contexts in that it matches substrings by default, whereas GREL functions typically match against the entire value (so if you want the more common substring-style matching there, you need to explicitly wrap your regular expression with .* on either side).
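
The difference between substring-style and whole-value matching can be sketched outside OpenRefine. The snippet below is only an illustration, using Python's `re` module as a stand-in (an assumption: OpenRefine itself uses Java regular expressions, though patterns this simple should behave the same way). The sample cell values and the helper names `substring_filter` and `whole_value_filter` are made up for the example.

```python
import re

# Hypothetical cell values, for illustration only.
cells = ["abc", "abc123", "123", ""]

def substring_filter(pattern, values):
    """Substring-style matching: a hit anywhere in the value counts."""
    return [v for v in values if re.search(pattern, v)]

def whole_value_filter(pattern, values):
    """Whole-value matching: the pattern must cover the entire value."""
    return [v for v in values if re.fullmatch(pattern, v)]

# Unanchored, substring style: any value containing a non-digit passes.
loose = substring_filter(r"[^0-9]", cells)

# Anchored with ^ and $, substring style: only digit-free values pass.
strict = substring_filter(r"^[^0-9]*$", cells)

# Whole-value style needs .* on both sides to act like a substring match.
wrapped = whole_value_filter(r".*[^0-9].*", cells)
```

Here `loose` keeps "abc123" because it contains at least one non-digit, while `strict` drops it. Note that `strict` also keeps the empty value, since `*` allows zero characters.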

Tom


John Otto Knoke

Nov 25, 2015, 3:21:41 PM
to openr...@googlegroups.com
Thanks Tom! It works perfectly in the Text Filter. I understand the *$ at the end but don't understand the ^ at the beginning.

Thanks again!

Tom Morris

Nov 25, 2015, 3:25:28 PM
to openr...@googlegroups.com
It's a little confusing because ^ has two different meanings in this expression: negation (in your original character class) and beginning-of-string anchor (which I added).  $ is the end-of-string anchor and * means zero or more (which you may want to change to + for 1 or more, depending on what you're trying to achieve).
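
As a rough illustration of the two roles of ^ and of the * versus + choice, here is a small sketch in Python's `re` module (an assumption: it stands in for the Java regular expressions OpenRefine uses, and the test strings and the `matches` helper are invented for the example).

```python
import re

def matches(pattern, value):
    """Return True if the pattern is found in the value."""
    return re.search(pattern, value) is not None

# Inside brackets, ^ negates the class: [^0-9] finds the first non-digit.
first_non_digit = re.search(r"[^0-9]", "12a").group()

# Outside brackets, ^ anchors to the start: ^[0-9] finds a leading digit.
leading_digit = re.search(r"^[0-9]", "12a").group()

# * means zero or more, so an empty value passes the filter.
empty_with_star = matches(r"^[^0-9]*$", "")

# + means one or more, so an empty value is rejected.
empty_with_plus = matches(r"^[^0-9]+$", "")
```

With these inputs, `first_non_digit` is "a" and `leading_digit` is "1". The empty string passes the `*` version but not the `+` version, which is the case to think about when choosing between them.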

Tom

John Otto Knoke

Nov 25, 2015, 3:29:57 PM
to openr...@googlegroups.com
Got it, thanks.

John Otto Knoke

Dec 1, 2015, 11:35:16 AM
to openr...@googlegroups.com
Hello,

Can someone help me understand what I am doing wrong with this expression using value.replace?

value.replace(\^[^a-zA-Z0-9]\,'')

My goal is to remove any non-word characters at the beginning. 

Thanks for the help

Inline image 1

Owen Stephens

Dec 1, 2015, 12:22:03 PM
to openr...@googlegroups.com
I think this is as simple as using the wrong direction slash at the start/end of your regular expression. Try:

value.replace(/^[^a-zA-Z0-9]/, '')

N.B. this will replace just a single non-alphanumeric character. To remove any number of leading non-alphanumeric characters you need to add a * to the expression:

value.replace(/^[^a-zA-Z0-9]*/, '')
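
The effect of adding the * can be sketched with Python's `re.sub`, used here only as a stand-in for GREL's replace (an assumption: OpenRefine evaluates Java regular expressions, and the sample value is made up).

```python
import re

# Hypothetical cell value with leading punctuation and a space.
value = "--- Title 1"

# Without *, only the first leading non-alphanumeric character is removed.
one_removed = re.sub(r"^[^a-zA-Z0-9]", "", value)

# With *, the whole leading run of non-alphanumeric characters is removed.
all_removed = re.sub(r"^[^a-zA-Z0-9]*", "", value)

# A value that already starts with a letter is left unchanged.
untouched = re.sub(r"^[^a-zA-Z0-9]*", "", "Title 1")
```

Because the pattern is anchored with ^, characters later in the value (such as the space inside "Title 1") are not affected.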


Owen

John Otto Knoke

Dec 1, 2015, 1:03:56 PM
to openr...@googlegroups.com
Thank you, Owen. Five minutes after I posted, I fixed the slashes.



