Control fields giving a traceback error--is this right?


Ben W.

Apr 6, 2020, 5:46:00 PM
to pymarc Discussion
Hello all,

I've built a script that reaches out to the govinfo.gov API to grab "all the Congressional publications released today" in XML format. The XML data for each record is then combed for specific elements, which are placed into a list of Field objects for that publication, and then that list of Field objects is "add_ordered_field"ed into a Record object which is then written to a MARC file.

[background]
The script will need to handle publications from one of eight different document collections, so I've got a base metaclass, "Collections", then a subclass for the major collection type, and then a variety of subclasses for each type of document within that particular document collection. The basic structure looks something like this:
Collections (base metaclass)
    CHRG (hearings)
        House hearings
        Senate hearings
        Joint hearings
    CRPT (reports)
        House reports
        Senate reports
        Senate Executive reports
    And so on...
[/background] 

My main problem is that I'm getting an IndexError when the system tries to write the records to file ("list index out of range"). In debugging the problem, I noticed that when the system builds the list of Field objects, the control fields are always marked as though something went wrong in the subfield indicators, even though I'm calling them correctly via the "data" element. BTW, I'm using VSCode, hence the debugger image:

2020-04-06_17-06-17.png

Here's that code building the list of Field objects (or the start of it, at least):
    def base_fields(self):
        base_list = [
            Field(tag="006", data="m     o  d f      "),
            Field(tag="007", data="cr"),
            Field(tag="008", data=self.date_008()),
            Field(.....

I'm adding about 20 Fields to each record (it will vary according to the needs of the particular document class) with the following bit of code:
    @abstractmethod
    def build_record(self):
        self.marc = Record()
        for b in self.base_list:
            self.marc.add_ordered_field(b)
        for s in self.spec_list:
            self.marc.add_ordered_field(s)
        self.marc.leader.type_of_record = "a"
        self.marc.leader.bibliographic_level = "m"
        self.marc.leader.coding_scheme = "a"
        self.marc.leader.encoding_level = " "
        self.marc.leader.cataloging_form = "i"
        return self

Is there a reason those control fields are marked as having an error? Note that the 024 field at the bottom of the first image doesn't seem concerned about the subfield indicators.

Is my metaclass--subclass--subsubclass structure the issue? Is it the way the fields are being built? Any help is absolutely appreciated (despite what feels like spaghetti code at this point).

Regards,

Ben

Geoffrey Spear

Apr 6, 2020, 5:56:45 PM
to pym...@googlegroups.com
Ben,

00x fields don't have indicators. I'm not sure why your debugger is trying to show them, let alone why it's showing a traceback inside a string as the value, but I suspect the bug is in the debugger.
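
For reference, the rule behind this comes from the MARC 21 format itself: the 00X tags are control fields that carry a single data string, while tags 010 and up carry two indicators plus subfields. A minimal sketch of that distinction in plain Python follows. The helper names `is_control_tag` and `describe` are hypothetical and not part of pymarc.

```python
def is_control_tag(tag: str) -> bool:
    """Return True for MARC 21 control field tags (001-009)."""
    return len(tag) == 3 and tag.isdigit() and "001" <= tag <= "009"


def describe(tag: str) -> str:
    """Say which parts a field with this tag is expected to carry."""
    if is_control_tag(tag):
        return f"{tag}: data only, no indicators or subfields"
    return f"{tag}: two indicators plus subfields"


# The tags mentioned in this thread: three control fields and one data field.
for tag in ("006", "007", "008", "024"):
    print(describe(tag))
```

Anything a debugger shows under "indicators" or "subfields" for a 006, 007, or 008 is therefore not meaningful data for that field.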

--
You received this message because you are subscribed to the Google Groups "pymarc Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pymarc+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pymarc/6aa71e83-b3fe-48cb-8fe4-e094fe98da8e%40googlegroups.com.

Ben W.

Apr 6, 2020, 6:45:15 PM
to pymarc Discussion
Thanks for that. After further work, I think it's likely that the script is expecting a piece of data in the returned XML that isn't actually there, and that's what's causing the issue. It appears the debugger *always* gives a traceback on the (non-existent) subfield indicators for the control fields.
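
One way to guard against that kind of missing XML element is to build the subfield list only from values that are actually present. The sketch below uses only the standard library and assumes the flat `[code, value, ...]` subfield list that pymarc used at the time of this thread. The element names, the sample XML, and the helper `paired_subfields` are all hypothetical and do not reflect govinfo's actual schema or the script discussed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <title> is present, <congress> is missing.
SAMPLE_XML = "<package><title>Example hearing</title></package>"


def paired_subfields(root, wanted):
    """Build a flat [code, value, ...] list, skipping absent elements.

    `wanted` is a list of (subfield_code, element_name) tuples.
    Returns the flat list plus the names of elements that were missing.
    """
    flat = []
    missing = []
    for code, element_name in wanted:
        value = root.findtext(element_name)  # None when the element is absent
        if value is None or not value.strip():
            missing.append(element_name)
            continue  # skip the code too, so the list stays in pairs
        flat.extend([code, value.strip()])
    return flat, missing


root = ET.fromstring(SAMPLE_XML)
flat, missing = paired_subfields(root, [("a", "title"), ("n", "congress")])
print(flat)
print(missing)
```

Reporting the missing element names alongside the list makes it easier to see which publication lacks which piece of data, rather than finding out at write time.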

But you saw that nice bit with the explicit Leader field declarations, right? OOOOHHHHH, AAAAHHHHHHH, so nice! ;)

Ben W.

Apr 6, 2020, 7:30:39 PM
to pymarc Discussion
All that said, still getting an "IndexError: list index out of range" when the script hits the MARC writer for this particular record. The Fields are all built correctly, and there's only one record being built for the test day being used, but still this error when the script hits the writer:
    join_file = "JOINED.mrc"
    if os.path.isfile(join_file):
        os.remove(join_file)
    writer = MARCWriter(open(join_file, "wb"))
    # Now write the records to file
    for item in new_obj_list:
        item.build_record()
        writer.write(item.marc)  # <-- here's where the IndexError is occurring

Ben W.

Apr 6, 2020, 7:33:54 PM
to pymarc Discussion
Here's a better look at the actual error message: 

  File "C:\Users\bwebb\AppData\Local\pypoetry\Cache\virtualenvs\octo-fort-_NVY1WrV-py3.8\lib\site-packages\pymarc\field.py", line 139, in __next__
    subfield = (self.subfields[self.__pos], self.subfields[self.__pos + 1])

Geoffrey Spear

Apr 7, 2020, 8:38:45 AM
to pym...@googlegroups.com
The only way I could see this happening would be if len(self.subfields) is odd, which shouldn't happen since there should always be two elements for each subfield (the subfield code and the value).
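
The odd-length case can be reproduced without pymarc. The sketch below is a simplified stand-in for the pairwise walk in the traceback line quoted earlier, not the library's actual code. The function names and the `(tag, subfields)` tuples are hypothetical and used only for illustration.

```python
def iter_pairs(subfields):
    """Walk a flat [code, value, code, value, ...] list two items at a time."""
    pos = 0
    while pos < len(subfields):
        # Raises IndexError when the last code has no value after it.
        yield subfields[pos], subfields[pos + 1]
        pos += 2


def odd_length_tags(fields):
    """Return the tags whose flat subfield list has an odd number of items.

    `fields` is a list of (tag, subfields) tuples standing in for Field objects.
    """
    return [tag for tag, subfields in fields if len(subfields) % 2 != 0]


good = ["a", "Example title", "c", "Example statement"]
bad = ["a", "Example title", "c"]  # code "c" has no value

print(list(iter_pairs(good)))

try:
    list(iter_pairs(bad))
except IndexError as exc:
    print("IndexError:", exc)

print(odd_length_tags([("245", good), ("500", bad)]))
```

Running a check like `odd_length_tags` over the fields of the failing record before it reaches the writer should point at the field whose list is unbalanced, if that is indeed the cause.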


Ben W.

Apr 7, 2020, 12:49:33 PM
to pymarc Discussion
That's what led me in the initial direction of questioning the debugger's traceback on the control field subfields. I've got a report that consistently breaks the script with the error above, but I watched the creation of that record and did not notice anything untoward going on...I'll try again later while watching the subfield creation, maybe something will jump out at me.

Thanks for the help, Geoffrey!

Edward Summers

Apr 8, 2020, 12:08:00 PM
to pym...@googlegroups.com
Hi Ben,

Of course, please feel free to continue to discuss this here. But if you are able to provide an example that gets pymarc to throw an exception that doesn't make sense, please add it as an issue over on GitLab:

https://gitlab.com/pymarc/pymarc/issues

I hope everyone is keeping healthy & safe.

//Ed

🏡 👩‍💻👨‍💻🌳


Ben W.

Apr 8, 2020, 12:41:24 PM
to pymarc Discussion
Nothing would please me more than to find a bug in pymarc that's not of my own making! If I find something to that effect, I'll definitely do that--thanks for the heads up.