Case insensitive comparing and deleting cells


gruen...@gmail.com

Feb 14, 2018, 12:53:13 PM
to OpenRefine
Dear OpenRefiners,

To clean up identical film titles, I'm using Edit cells > Transform with this code to compare the values of two cells in different columns and to delete one of them in case they're identical.

if(cells["TitelCombo 1"].value == cells["TitelCombo 2"].value, cells["TitelCombo 2"].value.replace(cells["TitelCombo 2"].value,""), cells["TitelCombo 2"].value)

However, some of the titles differ only in upper and lower case and are therefore not detected as duplicates, e.g.:

This Was a Woman
This was a woman

Since I'd like to eliminate those as well, I want to modify my code so that the comparison is case-insensitive.

I've tried adding regular expressions such as /\?i/ after the values in the if expression, but haven't been successful. Can anybody help? That would be great!

Many thanks and best
Christiane

Ettore RIZZA

Feb 14, 2018, 1:05:16 PM
to openrefine
Hi Christiane, 

Why not put your titles in lowercase letters before comparing them?
if(cells["TitelCombo 1"].value.toLowercase() == cells["TitelCombo 2"].value.toLowercase() , cells["TitelCombo 2"].value.replace(cells["TitelCombo 2"].value,""), cells["TitelCombo 2"].value)
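A shorter form of the same idea, offered as a sketch rather than a tested recipe: replacing a string with itself by "" always yields an empty string, so the empty string can be returned directly. This assumes the transform is run on the "TitelCombo 2" column (so that `value` refers to that column's cell) and that both cells contain text.

```grel
if(cells["TitelCombo 1"].value.toLowercase() == value.toLowercase(), "", value)
```

The lowercase copies are only used for the comparison; the value written back is either the empty string or the original, unchanged title.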



--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

gruen...@gmail.com

Feb 15, 2018, 10:31:28 AM
to OpenRefine
Wow, thanks so much, Ettore!

It works perfectly. I thought it would change my data to lowercase, but that's not the case.

Best regards
Christiane

Ettore RIZZA

Feb 15, 2018, 10:40:47 AM
to openrefine
You're welcome, Christiane. If toLowercase() is ever not enough, you can also try the fingerprint() function.

value.fingerprint() converts everything to lowercase, deletes the punctuation, and sorts the words in alphabetical order. This way, "Love, actually" is treated as identical to "love actually".
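As a sketch of how fingerprint() could slot into the expression from earlier in the thread (untested against your data, and assuming both cells hold text):

```grel
if(cells["TitelCombo 1"].value.fingerprint() == cells["TitelCombo 2"].value.fingerprint(), "", cells["TitelCombo 2"].value)
```

Because the words are sorted before comparing, two titles that differ only in word order would also count as identical, which may be more aggressive than you want for film titles.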
