reading CSV file with non ASCII characters

elcaiaimar

Feb 5, 2016, 9:14:01 PM
to Django users
Hello,

I have a CSV file that I want to read. The problem is that it contains non-ASCII characters such as 'Ñ' and accented letters, and I need them to be recognised so that I can save the CSV content in a DB.

To simplify, I've reduced my Django code to the following:

import csv

reader = csv.DictReader(open("file.csv", "rb"))
for row in reader:
    title=row['title']
    country=row['country']
    print title
    print country

This code prints, for example, Espa�a or Mediterr�neo, but I want to get España or Mediterráneo.
Thank you very much!

Bill Blanchard

Feb 5, 2016, 10:38:28 PM
to django...@googlegroups.com


Erik Cederstrand

Feb 5, 2016, 11:00:55 PM
to Django Users

> On 6 Feb 2016, at 09:14, elcaiaimar <sep...@gmail.com> wrote:
>
> Hello,
>
> I have a CSV file that I want to read. The problem is that it contains non-ASCII characters such as 'Ñ' and accented letters, and I need them to be recognised so that I can save the CSV content in a DB.
>
> To simplify, I've reduced my Django code to the following:
>
> import csv
>
> reader = csv.DictReader(open("file.csv", "rb"))

You should always convert bytestrings to unicode as early as possible in Python. You need to specify the encoding of your file. In Python 3, open it in text mode (binary mode does not accept an encoding argument), e.g.:

reader = csv.DictReader(open("file.csv", newline='', encoding='utf-8'))

See https://docs.python.org/3/library/functions.html#open

For Python 2, have a look at the notes about encodings in https://docs.python.org/2/library/csv.html
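
As a rough sketch of the Python 3 approach described above: the helper name read_csv_dicts and the sample data are made up for illustration, and the temporary file only stands in for the real file.csv. It assumes the file really is UTF-8 encoded.

```python
import csv
import os
import tempfile


def read_csv_dicts(path, encoding="utf-8"):
    """Return every row of a CSV file as a dict keyed by the header line."""
    # Text mode with an explicit encoding; newline="" lets the csv module
    # handle line endings itself, as the csv documentation recommends.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.DictReader(f))


# Stand-in for the real file: write UTF-8 encoded sample data to a temp file.
raw = "title,country\nMediterráneo,España\n".encode("utf-8")
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    sample_path = tmp.name

try:
    rows = read_csv_dicts(sample_path)
finally:
    os.remove(sample_path)

# rows[0]["title"] is "Mediterráneo" and rows[0]["country"] is "España"
```

If the file is not UTF-8, pass the actual encoding as the second argument instead.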

Erik

elcaiaimar

Feb 6, 2016, 6:06:31 AM
to Django users
Thank you for your quick answers. I've tried what you recommended. My code now looks like this:

import unicodecsv as csv

csvfile=open("mediosdigitales.csv")
reader = csv.reader(csvfile, encoding='utf-8', delimiter=',')
for row in reader:
    titulo=row['titulo']
    pais=row['pais']
    print titulo
    print pais

However, when I run it I get this error: list indices must be integers, not str

How could I solve this problem?

Thank you again

Tim Chase

Feb 6, 2016, 7:41:54 AM
to django...@googlegroups.com
On 2016-02-06 03:06, elcaiaimar wrote:
> import unicodecsv as csv
>
> csvfile=open("mediosdigitales.csv")

> > reader = csv.DictReader(open("file.csv", "rb"))

> reader = csv.reader(csvfile, encoding='utf-8', delimiter=',')
> for row in reader:
> titulo=row['titulo']
> pais=row['pais']
>
> However, when I compile it I get this error: *list indices must be
> integers, not str*

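You switched from csv.DictReader() to csv.reader(). csv.reader() yields each row as a plain list, so it can only be indexed by position, not by column name.

Just switch back to DictReader and your string-indexing should work again.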
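
To illustrate the difference, here is a small Python 3 sketch using the standard-library csv module and made-up in-memory data (the same distinction applies to unicodecsv's reader and DictReader):

```python
import csv
import io

data = "titulo,pais\nEl País,España\n"

# csv.reader yields lists: the header is just the first row.
list_rows = list(csv.reader(io.StringIO(data, newline="")))
first_data_row = list_rows[1]
pais_by_position = first_data_row[1]  # index by column position

# Indexing a list with a column name raises TypeError,
# which is the error reported above.
try:
    first_data_row["pais"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# csv.DictReader consumes the header and yields dicts keyed by column name.
dict_rows = list(csv.DictReader(io.StringIO(data, newline="")))
pais_by_name = dict_rows[0]["pais"]
```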

-tkc



paul.her...@gmail.com

Feb 6, 2016, 2:20:50 PM
to django...@googlegroups.com
On Sat, Feb 6, 2016 at 5:38 AM, Tim Chase
<django...@tim.thechases.com> wrote:
> On 2016-02-06 03:06, elcaiaimar wrote:
>> import unicodecsv as csv
>>
>> csvfile=open("mediosdigitales.csv")
>
>> > reader = csv.DictReader(open("file.csv", "rb"))
>
>> reader = csv.reader(csvfile, encoding='utf-8', delimiter=',')

Do you actually know that the character encoding of the file is UTF-8?

If the file comes from a Western European (including English)
system, it could easily be cp1252 (Windows-1252).

https://docs.python.org/3/library/codecs.html
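
If the encoding is unknown, one rough heuristic (not a reliable detector) is to try UTF-8 first and fall back to cp1252 when decoding fails. The sketch below is Python 3; the function name decode_with_fallback and the sample bytes are made up for illustration.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_name) for the first candidate that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# 'ñ' is the single byte 0xF1 in cp1252, which is not valid UTF-8 here.
raw = "España".encode("cp1252")

text, used = decode_with_fallback(raw)

# Decoding the same bytes as UTF-8 with errors="replace" substitutes U+FFFD,
# one way the replacement character seen in the question can arise.
replaced = raw.decode("utf-8", errors="replace")
```

Trying UTF-8 first is deliberate: text in cp1252 containing accented letters rarely happens to be valid UTF-8, whereas almost any byte sequence decodes under cp1252, so the reverse order would hide mistakes.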