Hi all,
We recently turned on the OAI extracts for our Symphony ILS and I've been testing different methods of working with the data. I've got it mostly working with pymarc but am running into an encoding error when writing out the record for
https://catalog.libraries.psu.edu/catalog/2844997
(which contains W 78⁰22ʹ30ʺ--W 77⁰07ʹ35ʺ/N 41⁰10ʹ00ʺ--N 40⁰41ʹ00ʺ, which is what's causing the issue).
My code, written as a step-by-step script rather than broken into functions, is here:
https://gitlab.com/-/snippets/2186472
Anyone should be able to harvest that file from our OAI, but I can provide the OAI response as well.
I'm getting: UnicodeEncodeError: 'latin-1' codec can't encode character '\u2070' in position 27: ordinal not in range(256)
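To show the failure in isolation, here is a minimal sketch that uses only the standard library, not pymarc. It encodes the coordinate string from the record with latin-1 and then with UTF-8. The helper name try_encode is made up for illustration, and the reported position differs from the traceback above because only the coordinate string is encoded here, not the whole field.

```python
# Coordinate string from the record, written with escapes so the code points are explicit:
# U+2070 SUPERSCRIPT ZERO, U+02B9 MODIFIER LETTER PRIME, U+02BA MODIFIER LETTER DOUBLE PRIME.
coords = (
    "W 78\u207022\u02b930\u02ba--W 77\u207007\u02b935\u02ba/"
    "N 41\u207010\u02b900\u02ba--N 40\u207041\u02b900\u02ba"
)


def try_encode(text, codec):
    """Return the encoded bytes, or the error message if the codec cannot represent the text.

    Hypothetical helper for this illustration only.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError as err:
        return str(err)


latin1_result = try_encode(coords, "latin-1")  # fails: latin-1 only covers U+0000..U+00FF
utf8_result = try_encode(coords, "utf-8")      # succeeds: UTF-8 covers all of Unicode

print(latin1_result)
print(len(utf8_result), "bytes as UTF-8")
```

The characters are fine as Python text; the error only appears at the step where text becomes bytes in an encoding that has no byte for them. In MARC 21, leader position 9 flags the character coding scheme ("a" for UCS/Unicode, blank for MARC-8), so that byte may be worth checking on the harvested record.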
Basically, I request the record, use fromstring to turn the response into an etree root and then a tree, select the MARC record (represented by dangMARC), and use marcxml_to_array(params)[0] to turn that into a pymarc record object. I then write the record object to a .mrc file. It's not elegant, but it works fine. It round-trips nicely in both pymarc and MarcEdit, and I thought I was done until I hit this encoding issue.
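For anyone who doesn't want to open the snippet, the XML half of that process looks roughly like the sketch below. It is not the actual code: the OAI response is a trimmed-down, made-up sample (SAMPLE_RESPONSE), it uses the standard library's ElementTree rather than whatever the snippet imports, and the hand-off to pymarc and the .mrc write are left out.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical, heavily trimmed OAI-PMH GetRecord response (not the real PSU output).
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header><identifier>example:1</identifier></header>
      <metadata>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nam a2200000 a 4500</leader>
          <controlfield tag="001">example1</controlfield>
          <datafield tag="255" ind1=" " ind2=" ">
            <subfield code="c">(W 78\u207022\u02b930\u02ba)</subfield>
          </datafield>
        </record>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""

# An HTTP response body arrives as bytes, so parse bytes here as well.
root = ET.fromstring(SAMPLE_RESPONSE.encode("utf-8"))

# Select the MARCXML record element nested inside the OAI <metadata> wrapper.
# The OAI-level <record> is in a different namespace, so it does not match.
marc_record = root.find(f".//{{{MARC_NS}}}record")

# Serialize just the MARC record; this is the piece that would go on to pymarc.
marc_xml = ET.tostring(marc_record, encoding="utf-8")

# The problem characters survive XML parsing intact as Python text.
subfield = marc_record.find(f".//{{{MARC_NS}}}subfield[@code='c']")
print(subfield.text)
```

Up to this point everything is Unicode text or UTF-8 bytes, which matches what I see: the parsing works and the error only shows up when the record is written out.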
Because it's bytes, not text, I got stumped. Would appreciate any help.
Thanks,
Ruth