mongoimport error -> exception:Invalid UTF8 character detected


zeevikm

Oct 25, 2010, 7:26:19 PM
to mongodb-user
Hi All,

I am trying to import a CSV file (MaxMind geo data) with mongoimport.
Every line containing non-ASCII characters (e.g. Myggsjön) produces an
exception (bash on Mac OS X). What am I doing wrong?

Output example:

Mon Oct 25 15:58:35 got line:291169,"NL","15","Wiene","",52.2500,6.6500,,
Mon Oct 25 15:58:35 got line:291170,"IR","24","Shahresduneh","",34.3242,50.7353,,
Mon Oct 25 15:58:35 got line:291171,"IT","05","Camposonaldo","",43.9500,11.8833,,
Mon Oct 25 15:58:35 got line:291172,"SE","10","?ngarna","",60.3833,15.7167,,
exception:Invalid UTF8 character detected
291172,"SE","10","?ngarna","",60.3833,15.7167,,
Mon Oct 25 15:58:35 got line:291173,"SE","10","Myggsj?n","",60.3333,15.3000,,
exception:Invalid UTF8 character detected
291173,"SE","10","Myggsj?n","",60.3333,15.3000,,
Mon Oct 25 15:58:35 got line:291174,"JP","23","Higashi-kurobe","",34.5833,136.6000,,
Mon Oct 25 15:58:35 got line:291175,"CN","04","Daijiacun","",31.6083,119.7111,,

Thanks for your help!

Kristina Chodorow

Oct 25, 2010, 7:37:50 PM
to mongod...@googlegroups.com
Make sure the file is encoded as UTF-8.  A lot of text editors will let you choose a new encoding for a file, so try saving it (under a different name, just in case) as UTF-8 and importing that.
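
One way to confirm an encoding problem before re-saving is to try decoding the file as UTF-8 and list the lines that fail. A minimal Python sketch; the function name `find_invalid_utf8_lines` and the `limit` parameter are illustrative and not part of mongoimport:

```python
def find_invalid_utf8_lines(path, limit=10):
    """Return (line_number, raw_bytes) pairs for lines that are not valid UTF-8.

    Hypothetical helper for illustration; stops after `limit` bad lines.
    """
    bad = []
    # Open in binary mode so Python does not decode anything implicitly.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad.append((lineno, raw))
                if len(bad) >= limit:
                    break
    return bad
```

Calling `find_invalid_utf8_lines("GeoLiteCity-Location.csv")` should return the same line numbers that mongoimport complains about if the encoding is the cause; an empty list means the file already decodes cleanly as UTF-8.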



--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


zeevikm

Oct 25, 2010, 8:36:56 PM
to mongodb-user
Kristina,

Thanks for your quick reply and pointing in the right direction. The
original file is not UTF8 encoded as you mentioned. I got this
resolved with iconv:

$ file GeoLiteCity-Location.csv
GeoLiteCity-Location.csv: ISO-8859 text

$ iconv -f ISO-8859-1 -t UTF-8 GeoLiteCity-Location.csv > GeoLiteCity-Location-UTF8.csv

$ file GeoLiteCity-Location-UTF8.csv
GeoLiteCity-Location-UTF8.csv: UTF-8 Unicode text
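
Where iconv is not available, the same conversion can be sketched in Python. This assumes the source really is ISO-8859-1, as in the iconv call above; note that decoding as ISO-8859-1 never raises an error, so a wrong guess about the source encoding produces wrong characters silently rather than a failure. The function name `convert_to_utf8` is illustrative:

```python
def convert_to_utf8(src_path, dst_path, src_encoding="iso-8859-1"):
    """Re-encode a text file as UTF-8, writing the result to a new file.

    Hypothetical helper for illustration; src_encoding is an assumption
    the caller must get right.
    """
    # newline="" leaves line endings exactly as they are in the source.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

For example, `convert_to_utf8("GeoLiteCity-Location.csv", "GeoLiteCity-Location-UTF8.csv")` writes a new file and leaves the original untouched, so the import can be retried against the copy.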


Kristina Chodorow

Oct 26, 2010, 5:11:45 AM
to mongod...@googlegroups.com
Cool, glad it worked out.




Mark Clancy

Feb 21, 2014, 12:38:02 AM
to mongod...@googlegroups.com
Thanks zeevikm - also worked for me.

prasad.ad...@gmail.com

Jul 27, 2016, 8:33:54 PM
to mongodb-user
Hello Team,

I am trying to import a JSON file using this command:

C:\MONGO\bin>mongoimport --jsonArray --db padir_mongodb --collection employees < C:\mongo\employees.json

but I am getting the error below:

Wed Jul 27 09:50:34 exception:Invalid UTF8 character detected

Could you please help me fix this error?



Thanks for your help!

Raghunadh Prasad

Jul 28, 2016, 3:51:42 PM
to mongodb-user@googlegroups com
Hi Prasad,

I believe this is related to the encoding of the input file.

Make sure the file is encoded as UTF-8.

If the file is small, try opening it in Notepad++ or another text editor, saving it as UTF-8, and running the import again.

Thanks
Raghunadh


--
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/