import codecs
import csv

fin = codecs.open(fname, "r", encoding="UTF-8")
reader = csv.DictReader(fin)
for values in reader:
    pass
results in:
  File "run.py", line 23, in process_file
    for values in reader:
  File "/usr/local/lib/python2.5/csv.py", line 83, in next
    row = self.reader.next()
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 13: ordinal not in range(128)
As you can see, the exception is raised in csv.py. How is that possible?
csv.DictReader should not use the ascii codec for anything, because the
file encoding is UTF-8.
Please help.
Best,
Laszlo
The csv reader works with byte strings, not unicode objects.
--
Jarek Zgoda
Skype: jzgoda | GTalk: zg...@jabber.aster.pl | voice: +48228430101
"We read Knuth so you don't have to." (Tim Peters)
The csv module doesn't support unicode. Read the values as byte strings and
decode afterwards.
Peter
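A note on where the 'ascii' codec comes from. As far as I understand it, codecs.open hands unicode objects to the reader, and the Python 2 csv reader then converts each line to a byte string. That conversion uses the interpreter's default encoding, which is ASCII, regardless of the encoding the file was opened with. The sketch below only mimics that step with an explicit encode call; the function name and the sample string are made up for illustration.

```python
def implicit_ascii_encode(text):
    """Mimic the unicode -> byte string conversion with the default ASCII codec."""
    return text.encode("ascii")


# Hypothetical sample value containing u'\xf6', the character from the traceback.
sample = u"K\xf6ln"

error = None
try:
    implicit_ascii_encode(sample)
except UnicodeEncodeError as exc:  # the "as" form assumes Python 2.6 or later
    # Same exception class and codec name as in the traceback above.
    error = exc
```

If this explanation is right, the file's UTF-8 encoding never enters the picture at the point where the exception is raised.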
Thanks,
Laszlo
>> Read the values as byte strings and decode afterwards.
Or monkey-patch:
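import csv

def make_reader(fin, encoding="UTF-8"):
    reader = csv.DictReader(fin)
    # Wrap the underlying csv.reader so that every cell is decoded to unicode.
    reader.reader = ([col.decode(encoding) for col in row] for row in reader.reader)
    return reader

# Open the file as plain bytes; the decoding happens in make_reader.
fin = open("example.csv")
for record in make_reader(fin):
    print record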
> Is there a plan to make csv reader compatible with unicode?
I don't know.
Peter
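For what it's worth, the "read bytes, decode afterwards" advice can also be followed without touching DictReader's internals. The sketch below is only one possible way to do it: the names iter_unicode_records and _to_text are hypothetical, and it assumes Python 2.6 or later (where bytes is an alias for str) and rows that have the same number of columns as the header. On Python 3 the csv module works with text directly, so there the file is simply opened in text mode and no per-cell decoding takes place.

```python
import csv
import io
import sys


def _to_text(value, encoding):
    # Decode byte strings; pass through anything that is already text (or None).
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def iter_unicode_records(path, encoding="utf-8"):
    """Yield each CSV record as a dict with text keys and values."""
    if sys.version_info[0] == 2:
        # Python 2: give csv raw bytes and decode each cell afterwards.
        fin = open(path, "rb")
    else:
        # Python 3: csv reads text, so let the file object do the decoding.
        fin = io.open(path, "r", encoding=encoding, newline="")
    try:
        for row in csv.DictReader(fin):
            yield dict((_to_text(key, encoding), _to_text(value, encoding))
                       for key, value in row.items())
    finally:
        fin.close()
```

Usage would be along the lines of `for record in iter_unicode_records("example.csv"): ...`, with every key and value arriving as unicode text.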