Remove the beginning and ending white spaces in a whole column


Mauricio Salazar

Oct 23, 2013, 5:51:59 AM
to openr...@googlegroups.com
Hi there, I want to remove the leading and trailing white space in a particular column. I was using Edit Cells/Transform with the expression value.trim().

But when I check, some of the cells still have a blank space at the beginning. Is there another way to do this?

Thanks in advance 

Tom Morris

Oct 23, 2013, 9:17:26 AM
to openr...@googlegroups.com
What character(s) is/are being left behind? If it is something that trim() doesn't handle, you can get rid of it using replace('<char>',''). I don't remember off the top of my head whether trim() deals with non-breaking space, zero width space, and all the other weird space characters (if it doesn't, it should).

Tom 
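
For readers who want to see the idea outside OpenRefine, here is a minimal sketch in plain Python 3 (not GREL) of trimming a wider set of space characters from both ends of a string. The function name strip_all_spaces and the particular list of characters are assumptions made for this illustration; adjust the list to whatever is actually left behind in your cells.

```python
import re

# Hypothetical character class for this sketch; extend it as needed.
#   \s      ordinary whitespace (for str patterns this already includes U+00A0)
#   \u00A0  NO-BREAK SPACE, listed explicitly for clarity
#   \u200B  ZERO WIDTH SPACE (not matched by \s)
#   \uFEFF  ZERO WIDTH NO-BREAK SPACE / byte order mark (not matched by \s)
EDGE_SPACES = r"[\s\u00A0\u200B\uFEFF]+"
_EDGE_RE = re.compile(rf"^{EDGE_SPACES}|{EDGE_SPACES}$")


def strip_all_spaces(value: str) -> str:
    """Remove space-like characters from both ends, leaving inner ones alone."""
    return _EDGE_RE.sub("", value)


if __name__ == "__main__":
    cell = "\u00a0 foo bar\u200b "
    print(repr(strip_all_spaces(cell)))
```

Only the ends are touched, so spaces between words survive. Whether a given tool's built-in trim covers these characters varies, which is why the list is spelled out here.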

Thad Guidry

Oct 23, 2013, 11:04:49 AM
to openr...@googlegroups.com
No, trim() does not. We had plans to add an additional parameter, say trim(mystring, all), which would remove all types of whitespace. I did have a quick Clojure script around that does it in one line, but I forgot where I put it... bah...

You can do what I documented in the Quick Recipe instead, however (modify it to suit the type of whitespace that you find after you do a unicode(value)).





Thad Guidry

Oct 23, 2013, 11:12:00 AM
to openr...@googlegroups.com
Forgot to mention: another way that I sometimes do it is to create a new column based on your column with the custom expression:

value.escape("javascript")

which will show all the weird space characters you have. Then you can use split() or chomp() or whatever method to cut them out en masse.
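
The same "make the invisible characters visible" trick can be sketched in plain Python 3 (again, not GREL), in case you are checking an exported file rather than working inside OpenRefine. The helper names reveal and name_of are made up for this example.

```python
import unicodedata


def reveal(value: str) -> str:
    """Show non-ASCII and control characters as backslash escapes."""
    return value.encode("unicode_escape").decode("ascii")


def name_of(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


if __name__ == "__main__":
    cell = "\u00a0foo\u200b"
    print(reveal(cell))       # escapes show up where the hidden characters are
    print(name_of(cell[0]))   # identifies the first character
    print(name_of(cell[-1]))  # identifies the last character
```

Once you know which code points are sitting at the ends of the cell, you can target exactly those in a replace or split.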

Owen Stephens

Oct 23, 2013, 6:27:37 PM
to openr...@googlegroups.com
Also worth looking at the 'Recipes' on the OpenRefine wiki:
-----------

If you have non-breaking spaces (&nbsp;) on both ends then you might try this instead:

  split(escape(value,'xml'),"&#160;")[0]
-------------