Programmatically delete columns


mj.ke...@gmail.com

May 9, 2018, 2:08:22 PM
to OpenRefine
I would like to be able to do something in a script (GREL or Jython, doesn't matter): I specify a list of column names, and any column in the "current" table that does not match one in my list gets deleted.

Does anyone have any suggestions for this kind of operation?

Thanks

Thad Guidry

May 9, 2018, 2:50:44 PM
to openr...@googlegroups.com
You could do this in Python itself, outside of OpenRefine, by using Paul Makepeace's Python client to perform the remove-column action against your OpenRefine project, based on a list provided in your Python script.

https://github.com/OpenRefine/refine-client-py

-Thad

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ettore Rizza

May 9, 2018, 3:17:50 PM
to OpenRefine
Hi,

If you do not know Python, you can use this hackish script in OpenRefine to create a JSON that you can then apply in Undo/Redo.

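# Columns to keep; every other column gets a removal operation.
my_list = ["name1"]

col_to_delete = []
result = []
# Collect every column of the current project that is not in my_list.
for el in row['columnNames']:
    if el not in my_list:
        col_to_delete.append(el)

# Build one "core/column-removal" operation per column to delete.
# Names are inserted as-is: a name containing a double quote or a
# backslash would need escaping to produce valid JSON.
for el in col_to_delete:
    result.append("""
  {
    "op": "core/column-removal",
    "description": "Remove column %s",
    "columnName": "%s"
  }
""" % (el, el))

# Join the operations into a JSON array for Undo/Redo > Apply...
return "[" + ",\n".join(result) + "]"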


Just replace the elements of "my_list" with the names of the columns you want to keep. Here is an example of use (screencast):

[screencast]

Hope this helps,

Ettore
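
For anyone who would rather build the same JSON outside OpenRefine, here is a minimal sketch in plain Python. It assumes you already have the project's column names as a list; the names used below are hypothetical, and the helper `removal_operations` is made up for illustration. Using `json.dumps` instead of string formatting means column names containing quotes or backslashes are escaped properly.

```python
import json


def removal_operations(current_columns, columns_to_keep):
    """Build one core/column-removal operation per column not in the keep list."""
    keep = set(columns_to_keep)
    return [
        {
            "op": "core/column-removal",
            "description": "Remove column %s" % name,
            "columnName": name,
        }
        for name in current_columns
        if name not in keep
    ]


# Hypothetical column names, for illustration only.
current = ["name1", "notes", 'size "raw"']
operations = removal_operations(current, ["name1"])

# json.dumps escapes quotes and backslashes in column names.
history_json = json.dumps(operations, indent=2)
print(history_json)
```

The printed text can then be pasted into Undo/Redo > Apply... in the same way as the output of the script above.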

mj.ke...@gmail.com

May 9, 2018, 3:29:41 PM
to OpenRefine
That is awesome, I had not considered that. Thank you! For what I want to do right now, that works.

Ettore Rizza

May 9, 2018, 3:38:55 PM
to OpenRefine
You're welcome. Obviously, the same trick is easier to do if you start from a list of columns that must be deleted (and not kept). In this case, simply put the list of column names in a column of OpenRefine and use Templating to reproduce an Undo/Redo JSON.

Felix Lohmeier

May 10, 2018, 3:36:38 PM
to OpenRefine
The JSON code of the column-reorder function (All > Edit columns > Re-order / remove columns...) actually contains a list of columns that should be kept.

[
  {
    "op": "core/column-reorder",
    "description": "Reorder columns",
    "columnNames": [
      "friend",
      "address"
    ]
  }
]

You could simply put your list of column names there and use Undo/Redo > Apply...
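
If the keep list comes from somewhere else (a file, another script), the reorder operation can also be generated rather than typed by hand. A minimal sketch in plain Python, assuming only the JSON shape shown above; the helper name `reorder_operation` is made up for illustration:

```python
import json


def reorder_operation(columns_to_keep):
    """Build a core/column-reorder operation listing the columns to keep, in order."""
    return [
        {
            "op": "core/column-reorder",
            "description": "Reorder columns",
            "columnNames": list(columns_to_keep),
        }
    ]


# Column names taken from the example above.
reorder_json = json.dumps(reorder_operation(["friend", "address"]), indent=2)
print(reorder_json)
```

The output has the same structure as the JSON above and can be pasted into Undo/Redo > Apply...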

Ettore RIZZA

May 10, 2018, 3:45:08 PM
to openrefine
Ouch, seems right. So Templating could do the job in any case.

mj.ke...@gmail.com

May 10, 2018, 3:49:38 PM
to OpenRefine
Holy Canola oil... that's even better.

Thanks!

MRB

Jun 18, 2019, 2:03:13 PM
to OpenRefine
I actually find this destructive behavior of column reordering a bad thing, because while exploring data I may have left off a column or two that I need later. If I want to keep a column, I need to edit the script I used to be sure it's not erased. It seems it would be better to explicitly remove a column so that it's in the script, rather than delete new ones. Thanks, Mark
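
One way to guard against this is to compare the saved keep list with the project's current columns before applying the script. A minimal sketch in plain Python, assuming you can get the current column names as a list; the column names and the helper `columns_dropped_by_reorder` are hypothetical:

```python
def columns_dropped_by_reorder(current_columns, reorder_column_names):
    """Return the current columns that a column-reorder step would not keep."""
    kept = set(reorder_column_names)
    return [name for name in current_columns if name not in kept]


# Hypothetical project state: "phone" was added after the script was saved.
current = ["friend", "address", "phone"]
saved_keep_list = ["friend", "address"]

dropped = columns_dropped_by_reorder(current, saved_keep_list)
if dropped:
    print("Would be removed: " + ", ".join(dropped))
```

If the list is not empty, the saved script needs editing before it is applied.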


