Escape characters and GREL histories

Ryan Wheeler

Feb 27, 2020, 10:58:12 AM
to OpenRefine
Hi, all-

I'm working on transforming a set of data in OpenRefine to produce MARC bibliographic records in MarcEdit. One element I'm trying to include is a MARC 008 field with custom data for each record, such as year of publication - by default, MarcEdit supplies a generic 008. The 008 format MarcEdit expects uses backslashes to represent blank bytes (the default it supplies is s9999\\\\xx\\\\\\\\\\\\000\0\und\d, for example). I've gotten this to work just fine by escaping the backslashes. To keep better track of string length, I started with a string using pound signs:
grel:"sYEAR####xx#RUN############muund#d"
And then replaced the pound signs with escaped backslashes, while also filling in the data I want from other columns:
grel:value.replace("#","\\").replace("YEAR",cells["008 year"].value).replace("RUN",cells["008 runtime"].value)

Which gives me the custom 008 strings I'm looking for. My problem is that I'm trying to make this process repeatable by others, and when I try reproducing it with a new set of data by pasting the extracted JSON history into the "Apply..." dialog box, most of the steps related to producing this 008 disappear - they're gone entirely if I look at the newly-created history. I was thinking the presence of backslashes in the string might be at fault, but even the original step of creating the string with pound signs vanishes. Oddly, if I segregate the JSON having to do with the 008 and apply it as a separate, second step, it works. I'm wondering if I'm missing anything obvious that would make this work as a single JSON history, or if there are any common pitfalls when using a long JSON history to reproduce a transformation process.
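
For anyone who wants to check the string outside OpenRefine, here is a minimal Python sketch of the two GREL steps above. It is only an illustration: the function name `build_008` and the sample values are hypothetical and do not come from the project, and the length check is an extra that the GREL does not perform.

```python
# Template from the post: '#' marks a blank byte, YEAR and RUN are placeholders.
TEMPLATE = "sYEAR####xx#RUN############muund#d"


def build_008(year: str, runtime: str, template: str = TEMPLATE) -> str:
    """Mirror the GREL chain: '#' -> backslash, then fill in YEAR and RUN."""
    # "\\" in Python source, as in a GREL string literal, is ONE backslash.
    filled = template.replace("#", "\\")
    filled = filled.replace("YEAR", year).replace("RUN", runtime)
    # Each placeholder is as wide as the value it stands for, so the length
    # should not change; a mismatch means a value has the wrong width.
    if len(filled) != len(template):
        raise ValueError(
            f"expected {len(template)} characters, got {len(filled)}"
        )
    return filled


if __name__ == "__main__":
    # "1975" and "090" are made-up sample values, not data from the thread.
    print(build_008("1975", "090"))
```

The template is 34 characters long, so a four-character year and a three-character runtime keep every later position in the 008 where it belongs.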

I'll also note that there are ways to build this field from the MarcEdit side as an alternative solution, but they require more manual intervention; the hope is that staff can paste this history in and get to the end result with minimal steps. Thanks for any thoughts on this!
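
The workaround mentioned above (applying the 008 steps as a separate, second pass) could be scripted roughly as follows. This is a sketch under assumptions: it treats the extracted history as a JSON list of operation objects and assumes that each operation names its column under `columnName`, `newColumnName`, or `baseColumnName`. The function name `split_history` is hypothetical. Moving the 008 steps to the end is only safe if no other step depends on the "008" column.

```python
import json


def split_history(operations, column="008"):
    """Partition operations into (others, column_ops), keeping their order."""

    def touches(op):
        # Assumed key names; adjust them to match the extracted history.
        keys = ("columnName", "newColumnName", "baseColumnName")
        return any(op.get(key) == column for key in keys)

    others = [op for op in operations if not touches(op)]
    column_ops = [op for op in operations if touches(op)]
    return others, column_ops


if __name__ == "__main__":
    # Tiny made-up history for illustration; not taken from hfa.json.
    sample = [
        {"op": "core/text-transform", "columnName": "Title"},
        {"op": "core/column-addition", "baseColumnName": "Title",
         "newColumnName": "008"},
        {"op": "core/text-transform", "columnName": "008"},
    ]
    first_pass, second_pass = split_history(sample)
    print(json.dumps(first_pass, indent=2))
    print(json.dumps(second_pass, indent=2))
```

Each printed list can then be pasted into the "Apply..." dialog in turn, first the general steps and then the 008 steps.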

Ryan Wheeler

Thad Guidry

Feb 27, 2020, 11:40:53 AM
to openr...@googlegroups.com
Hi Ryan!

OK! This might be a huge problem or a very tiny one... it depends on which version of OpenRefine you are running, since a few bugs slipped through after 3.0 (such as serialization changes that affect JSON).

So, first off...
1. Which version of OpenRefine (and which OS)? Can you try with the latest from our website, 3.3?
2. Which importer are you using? How are you getting the data into OpenRefine cells: by fetching from URLs, manually through the clipboard importer, or something else? Which would be the most standard, easiest process for your users to repeat?

If you are using the latest 3.3, can you see if the console window shows any errors or a stack trace and, if so, copy that into your reply?

Incidentally, having repeatable steps is very much what we want to improve for our Librarian community!
Your workflows are fabulously boring at times, hehe, but yield the most amazing results for humankind! I worked in a library for 7 years, and a few previous OpenRefine developers and our own Owen Stephens have library backgrounds too, so we know the problems within your domain well. In fact, it is because many of us have prior library experience that you now see the MARC, Wikitext, and XML importers we have!



--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openrefine/c90db33d-9c77-42b4-9f4f-f7f3b2c72333%40googlegroups.com.

Ryan Wheeler

Feb 27, 2020, 12:36:54 PM
to OpenRefine
Thad-

Thanks for your response! To answer your questions:
1. This is OpenRefine 3.3 on Windows 10. I was using the previous version when I started this project, so some of the GREL I'm trying to reuse would have been written under it - not sure if that could be related to the problem.
2. The data is coming from an internal FileMaker database; I've been exporting selected records from there in UTF-8 CSV format, and importing the data into OpenRefine using those CSV files (technically .mer files, since that's the FileMaker export option that includes column headers).

I just tried repeating the whole process, and I'm not seeing errors or anything unusual in the console window. Let me know if there's more info I can provide.
I've really been enjoying getting to know OpenRefine for library purposes - it's certainly helped make this project possible!

Thanks again-
Ryan

Thad Guidry

Feb 27, 2020, 1:08:33 PM
to openr...@googlegroups.com
Ah!  Our GREL hasn't changed, BUT other things under the covers did between versions, and as I said we had a few regression bugs.

@Antonin, don't we have an existing issue for this, from other users who hit a similar problem with a broken "Apply JSON history" after upgrading? Is there any way for Ryan to recover or manually tweak his JSON history to get it to work again in version 3.3?




Owen Stephens

Feb 27, 2020, 5:54:38 PM
to OpenRefine
Hi Ryan,

Would you be able to share a couple of rows of example data and the history that you've extracted?

Ryan Wheeler

Feb 28, 2020, 9:29:25 AM
to OpenRefine
Owen-

Sure thing; they're attached. I just lopped off the majority of the .mer file, so hopefully it works; the original batch is around 150 records. The JSON history is pretty long and largely unremarkable (or fabulously boring, to borrow Thad's phrase) - it's mostly a lot of text transformation to get internal data closer to cataloging standards. (I'm sure it's also riddled with inefficiencies, as I'm relatively new to OpenRefine.) Most, but not all, of the steps related to creating and editing the column named "008" are the ones that seem to disappear when I retry the process.

Thanks for taking the time to look!
hfa.json
hfa.mer