> On Jan 7, 2025, at 12:07 PM, Patrick H <pathar...@gmail.com> wrote:
>
> I recognize this is a very niche (and unfortunate) need for our system to still be using this encoding, so if it's not feasible to encode Records as anything other than UTF-8 then I don't expect changes to PyMarc to accommodate that. I just thought perhaps the "as_marc()" or RawField tools in Pymarc might allow me to use the native support in Python to encode the bytes in one of those other encodings prior to saving the Records to a file.
Thanks for this, Patrick. Sadly, even after all these years of Unicode, it seems it is not very niche to be working with an ILS that still only supports MARC-8 encoded records. Just out of curiosity, what ILS are you working with? From your examples, it looked like you were reading in UTF-8 encoded records? Were these exported from your ILS?
After reading the MarcEdit post you shared, it seemed like Terry was saying that MARC-8 can easily be mistaken for cp1252 when trying to automatically detect the encoding of a MARC record, not that the two overlap enough to be considered equivalent. But I could be wrong about that.
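To make the "mistaken for" point concrete, here is a minimal sketch using only Python's built-in codecs. Python has no MARC-8 codec, so the MARC-8 bytes below are written by hand, on the assumption that MARC-8 (ANSEL) encodes "é" as a combining acute (0xE2) placed before the base letter. The sample word is purely illustrative.

```python
# Hand-built MARC-8 bytes for "café" (assumption: ANSEL combining acute is
# 0xE2 and comes BEFORE the base letter it modifies).
marc8_bytes = b"caf\xe2e"

# The same word in the two encodings Python supports natively.
utf8_bytes = "café".encode("utf-8")      # b"caf\xc3\xa9"
cp1252_bytes = "café".encode("cp1252")   # b"caf\xe9"

# cp1252 decodes the MARC-8 bytes without raising, which is why automatic
# detection can be fooled, but the resulting text is wrong.
misread = marc8_bytes.decode("cp1252")
print(misread)  # cafâe

# The byte sequences differ, so encoding to cp1252 does not produce MARC-8.
print(cp1252_bytes == marc8_bytes)  # False
```

So a clean cp1252 decode says nothing about whether the bytes are really cp1252, and anything outside plain ASCII comes out differently in the two encodings.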
If you want to experiment with a parameter that would allow Record.as_marc('cp1252'), I could try adding it to an experimental version of pymarc. My only worry with simply introducing it is that pymarc’s handling of encodings is already so complicated, and adding another knob would just make it worse…
As an alternative, you might want to consider writing out your modified data as UTF-8 with pymarc. Then you could use yaz-marcdump, which is part of the YAZ toolkit [1], to convert your records to MARC-8:
$ yaz-marcdump -f utf8 -t marc8 -o marc utf8-records.raw > marc8-records.raw
yaz has been around a long time and has been heavily used over the years, so it should be about as reliable as you can get for going backwards from UTF-8 to MARC-8. Maybe others have found different approaches to this; if so, please chime in. It’s one of the most difficult areas to work with in pymarc.
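If you would rather drive that conversion from the same Python script that writes the UTF-8 file, something like the following rough sketch might work. The function names are made up for illustration, the flags are copied from the command above, and it assumes yaz-marcdump is installed and on your PATH.

```python
import shutil
import subprocess


def build_yaz_command(src):
    """Return the yaz-marcdump argument list for a UTF-8 to MARC-8 conversion.

    Hypothetical helper; the flags mirror the shell command shown above.
    """
    return ["yaz-marcdump", "-f", "utf8", "-t", "marc8", "-o", "marc", src]


def convert_to_marc8(src, dst):
    """Run yaz-marcdump on src and write its stdout to dst."""
    if shutil.which("yaz-marcdump") is None:
        raise FileNotFoundError("yaz-marcdump not found on PATH")
    # yaz-marcdump writes the converted records to stdout, so redirect
    # stdout into the destination file, as the shell ">" does.
    with open(dst, "wb") as out:
        subprocess.run(build_yaz_command(src), stdout=out, check=True)
```

Usage would be convert_to_marc8("utf8-records.raw", "marc8-records.raw") once pymarc has written the UTF-8 file.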
//Ed
[1] https://www.indexdata.com/resources/software/yaz/