Merging fields from two MARC Files


Chelsea Patella

Sep 30, 2022, 3:56:40 PM
to pymarc Discussion
Hi all,

I was wondering if anyone can help with this. I need to merge certain fields from one MARC file to another by matching on the 001.

Anyone know how to go about this?

Thanks!
Chelsea

Andy Kohler

Oct 2, 2022, 7:37:00 PM
to pym...@googlegroups.com
Hi Chelsea -

I suggest reading both files into memory, making a dictionary from each, keyed on the 001 values.  If memory is a concern, you could do this with just the target records (the ones you will merge fields into).

Then iterate through your source records (the ones you're pulling fields from).  Grab each source 001 and see if it's a key in the target dictionary.  If so, pull whatever fields you need from the source record and add them to the matching target record.  Finally, write the target records to file or whatever destination is needed.
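
A minimal sketch of the steps above, not a tested production script. The function names (control_number, merge_fields, merge_files) and the file paths are made up for illustration. It assumes pymarc's MARCReader, MARCWriter, Record.get_fields and Record.add_ordered_field, and it reads only the target file into memory while streaming the source file.

```python
def control_number(record):
    """Return the 001 value of a pymarc Record, or None if it has no 001."""
    fields = record.get_fields("001")
    return fields[0].data if fields else None


def merge_fields(targets, sources, tags):
    """Copy fields with the given tags from each source record into the
    target record that has the same 001. Targets are modified in place."""
    # Dictionary of target records keyed on the 001 value.
    by_001 = {control_number(rec): rec for rec in targets}
    by_001.pop(None, None)  # targets without a 001 cannot be matched
    for src in sources:
        target = by_001.get(control_number(src))
        if target is None:
            continue  # no target record with this 001
        for field in src.get_fields(*tags):
            # add_ordered_field keeps the target's fields in tag order
            target.add_ordered_field(field)
    return targets


def merge_files(target_path, source_path, out_path, tags):
    """Hypothetical file-level wrapper around merge_fields."""
    # Imported here so merge_fields itself has no hard dependency on pymarc.
    from pymarc import MARCReader, MARCWriter

    with open(target_path, "rb") as fh:
        # MARCReader can yield None for records it fails to parse.
        targets = [rec for rec in MARCReader(fh) if rec is not None]
    with open(source_path, "rb") as fh:
        sources = (rec for rec in MARCReader(fh) if rec is not None)
        merge_fields(targets, sources, tags)
    writer = MARCWriter(open(out_path, "wb"))
    for rec in targets:
        writer.write(rec)
    writer.close()
```

Something like merge_files("target.mrc", "source.mrc", "merged.mrc", ["650", "856"]) would then copy the 650 and 856 fields across; the tags and file names here are placeholders.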

If the source file could contain duplicates, you may need extra steps to avoid updating a target record multiple times.
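
One possible way to handle that, sketched under the assumption that the first source record seen for a given 001 should win and later duplicates should be ignored (your data may call for a different rule). The name first_per_001 is hypothetical; the only pymarc call it relies on is Record.get_fields.

```python
def first_per_001(records):
    """Yield only the first record seen for each 001 value."""
    seen = set()
    for rec in records:
        fields = rec.get_fields("001")
        if not fields:
            continue  # assumption: a record without a 001 cannot be matched
        key = fields[0].data
        if key in seen:
            continue  # duplicate 001: skip so the target is updated only once
        seen.add(key)
        yield rec
```

Wrapping the source records in first_per_001(...) before merging means each target record receives fields from at most one source record.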

If any questions, feel free to ask! --Andy


Tomasz Kalata

Oct 4, 2022, 9:26:25 AM
to pymarc Discussion
I second Andy's suggestions. His approach will work well for small to moderately sized MARC files.

If you are dealing with two large sets, then I would advise a more elaborate approach that involves creating a database (simple SQLite should do). Instead of creating an in-memory dictionary of source records/fields, it would be better to parse that file and store the 001 control numbers and the fields you would like to transfer. You can pickle a list of pymarc field objects and store it in the db (as a BLOB in SQLite). When that is ready, you iterate over the target MARC file, find a match in the database based on the 001 tag, retrieve the stored fields, and add them to the target record. Append each modified record to a new file. In your db you can also track progress, so if something goes wrong you can easily troubleshoot it and pick up exactly where your batch job tripped.
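
A rough sketch of how this could look with the standard-library sqlite3 and pickle modules. The table name, column names, and function names are all invented for illustration, and the "done" flag is just one simple way to track progress. The records are assumed to be pymarc Record objects (only get_fields and add_ordered_field are used). Only unpickle blobs that you wrote yourself, since pickle is not safe on untrusted data.

```python
import pickle
import sqlite3


def open_store(db_path):
    """Open (or create) the SQLite file that holds the source fields."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS source_fields ("
        "control_no TEXT PRIMARY KEY, "
        "fields BLOB NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def store_fields(conn, source_records, tags):
    """Store the wanted fields of each source record, pickled, keyed on 001."""
    for rec in source_records:
        ids = rec.get_fields("001")
        wanted = rec.get_fields(*tags)
        if not ids or not wanted:
            continue  # nothing to match on, or nothing to transfer
        # OR REPLACE: if the source has duplicate 001s, the last one wins.
        conn.execute(
            "INSERT OR REPLACE INTO source_fields (control_no, fields) "
            "VALUES (?, ?)",
            (ids[0].data, pickle.dumps(wanted)),
        )
    conn.commit()


def apply_fields(conn, target_record):
    """Add stored fields to one target record; return True on a match.

    The caller is expected to commit periodically and to write the
    record to the output file.
    """
    ids = target_record.get_fields("001")
    if not ids:
        return False
    row = conn.execute(
        "SELECT fields FROM source_fields WHERE control_no = ?",
        (ids[0].data,),
    ).fetchone()
    if row is None:
        return False
    for field in pickle.loads(row[0]):
        target_record.add_ordered_field(field)
    # Mark the row as processed so an interrupted run can be inspected.
    conn.execute(
        "UPDATE source_fields SET done = 1 WHERE control_no = ?",
        (ids[0].data,),
    )
    return True
```

In a batch job you would call store_fields once over the source file, then loop over the target file with pymarc's MARCReader, call apply_fields on each record, and write each record to a new file with MARCWriter. Rows still at done = 0 afterwards are source records that never found a target.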

Cheers,
Tomasz
