How to trim period at end of field?

2,577 views
Skip to first unread message

disne...@gmail.com

unread,
Nov 13, 2015, 2:08:17 PM11/13/15
to OpenRefine
Hi,

Does anyone know how to accomplish trimming a period off of the end of data in a column? It is not on all cells and there are other periods in the data at other locations (other than the end) that I want to keep so replace on its own wont work and just taking off the last character wont work.

Thank you!

Owen Stephens

unread,
Nov 13, 2015, 2:33:24 PM11/13/15
to OpenRefine
There are different ways to do this. A few of options:

Option 1
-----------
Use a text filter on the column, click the 'regular expression' option and in the filter type:
\.$

This will filter to all rows where the last character in the cell is a period '.'
Then carryout a transform on the column of:
value[0,-1]

or

value.substring(0,-1)

Either of these will trim the last character from the string.

Option 2
-----------
Use a match expression to extract everything from the value with the exception of a trailing period:

value.match(/(.*?)\.?$/)[0]

Option 3
-----------
Use an 'if' to test if the last character in the cell is a period, and if so trim it off:

if(value[-1]==".",value[0,-1],value)


Owen

Thad Guidry

unread,
Nov 13, 2015, 2:33:42 PM11/13/15
to openrefine
Use regex to "look" for the pattern that you need... then replace() with whatever other pattern you need...like nothing. "" double quotes with nothing in between.

replace() GREL function happens to be handy in that it also can take a regex pattern ! (encapsulated with 2 forward slashes such as  /..some pattern../   See https://github.com/OpenRefine/OpenRefine/wiki/GREL-String-Functions#replacestring-s-string-f-string-r

So your usecase looks like this in a single GREL expression:

value.replace(/\.$/,"")

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Owen Stephens

unread,
Nov 13, 2015, 2:34:45 PM11/13/15
to OpenRefine
Doh - I forgot the most obvious option :)

Thanks Thad !

Chad Nelson

unread,
Nov 13, 2015, 2:52:13 PM11/13/15
to OpenRefine
Hi,

You can use the extra parameters in replace and some regular expressions to do this. The replace help documentations explains:
replace(string s, string or regex f, string r)returns: stringReturns the string obtained by replacing f with r in s
So, for your use case we can say, if the last character before the end of the string is a period, replace it (the last character) with an empty string. Otherwise, do nothing.

replace(value, /[.]$/, '')

Hope that helps,

Chad

--

disne...@gmail.com

unread,
Nov 13, 2015, 3:40:59 PM11/13/15
to OpenRefine
Thank you! These are great! I have been having a hard time coming up with the script on my own, these examples will help me with other bits I have too :)
Message has been deleted

Thad Guidry

unread,
Nov 19, 2015, 2:17:55 PM11/19/15
to openrefine
Hmm, works just fine for me.

Screenshot: ​



On Thu, Nov 19, 2015 at 12:31 PM, <disne...@gmail.com> wrote:
Hi again, I tried using this

value.replace(/\.$/,"")

and it seems to replace every period not just the one at the end. This came up because now I need to remove a ; at the beginning of the line and so tried value.replace(/\;?/,"") and it removed all the semi-colons

What am I getting wrong here?


Thank you!

Thad Guidry

unread,
Nov 19, 2015, 2:20:58 PM11/19/15
to openrefine
Your regex is probably not correct for what your trying to accomplish...you need to prefix with ^ which means from the start....and $ means from the end

value.replace(/^\;/,"")

John Little

unread,
Nov 19, 2015, 2:48:14 PM11/19/15
to openrefine
Thad is right.  If you had examples of your text, that might help troubleshoot.  
In any case, RegEx has a lot of character class and boundary options.  

Give these a try, maybe they will work with your data:
  • line ends with a period:   value.replace(/(\.)(\B)/,"$2")
  • sentence begins with a semicolon:  value.replace(/(\;)(\w+)/,"$2")

To learn more about the regular expression options you might try the RegExr tool.  
OpenRefine developers will certainly know best but I think RegExr tool is closely aligned with the flavor of regular expressions used by OpenRefine.  




Reply all
Reply to author
Forward
0 new messages