problem writing records


Pablo

Jun 5, 2012, 11:22:51 AM
to pym...@googlegroups.com
Dear list members,

I am trying to remove some fields from some authority records that I need to import into my system.
For this I read the record and write every field (except those that I need to skip) to a new file.
The problem is that when I try to write the leader information I get the following exception:

exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 58: ordinal not in range(128)


this is my code and attached is the marc file I am using to test the script:

from pymarc import MARCReader, Record

output = open('file.dat','w')
record = Record()

reader = MARCReader(file('espionage.mrc') , to_unicode=False)
for readrecord in reader:

    l = list(record.leader)
    l[5] = 'n'
    l[6] = 'z'
    l[9] = 'a'
    l[17]= 'n'

    record.leader = ("".join(l))
    for field in readrecord:
        if field.tag != '016' and field.tag != '697':
            record.add_field(field)

output.write(record.as_marc())
output.close()

It seems it does not like the encoding on the leader but I don't know how to change it.
It is probably obvious that I am not very experienced in scripting, so any help is greatly appreciated.
Pablo
espionage.mrc

Godmar Back

Jun 5, 2012, 11:33:01 AM
to pym...@googlegroups.com

You set leader[9] to 'a'. When leader[9] is 'a', pymarc requires the record's data to be in Unicode when it is written to a file (meaning that its fields are stored as Python unicode objects, which can be UTF-8 encoded on output). Your record is never decoded to unicode, because you set to_unicode=False.

Solution: set to_unicode=True, or don't set leader[9] to 'a'.

 - Godmar
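
For later readers, here is a rough sketch of the first suggestion (reading with to_unicode=True and keeping leader[9] as 'a'). It is not the original poster's script: it assumes Python 3 and a reasonably recent pymarc, and the names patch_leader, ascii_decodable and strip_fields are made up for this example. It also writes each record inside the loop, which is an assumption about what the script was meant to do. The exact behaviour of record.leader may differ between pymarc versions, so treat it as a starting point only.

```python
def ascii_decodable(data):
    """Return True if the byte string contains only ASCII bytes."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        # This is the error from the question: a byte such as 0xc3
        # (the first byte of a two-byte UTF-8 sequence) is not ASCII.
        return False


def patch_leader(leader, changes):
    """Return a copy of a MARC leader string with some positions replaced.

    `changes` maps a zero-based leader position to its new character.
    """
    chars = list(leader)
    for position, value in changes.items():
        chars[position] = value
    return "".join(chars)


def strip_fields(in_path, out_path, skip_tags=("016", "697")):
    """Copy records from in_path to out_path without the fields in skip_tags.

    Hypothetical helper; assumes pymarc is installed.
    """
    # Imported here so the two helpers above work without pymarc.
    from pymarc import MARCReader, MARCWriter

    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        writer = MARCWriter(dst)
        # to_unicode=True asks pymarc to decode the field data to text,
        # so it can be UTF-8 encoded again when leader[9] is 'a'.
        for record in MARCReader(src, to_unicode=True):
            if record is None:
                # Some pymarc versions yield None for unparsable records.
                continue
            record.remove_fields(*skip_tags)
            record.leader = patch_leader(
                str(record.leader), {5: "n", 6: "z", 9: "a", 17: "n"}
            )
            writer.write(record)
```

Something like strip_fields('espionage.mrc', 'file.dat') would then produce the cleaned file. The files are opened in binary mode because MARC records are written as bytes.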