The Mint with unicode/ascii


Matthias Liffers

Nov 30, 2015, 2:24:12 AM
to redbo...@googlegroups.com

Hi everyone,

 

We’ve just added our first person with a non-ASCII character in their name to ReDBox (blasted Germans!). Andreas Bollhöfer is giving the Mint some grief. If I try to harvest a CSV with his name in it, the Mint (and subsequently ReDBox) shows an ugly black diamond with a question mark in it (the Unicode replacement character, U+FFFD) in place of the ö.

 

Has anyone else encountered this issue? I’ve tried saving the CSV in various character encodings, without much luck.

 

Thanks,

 

Matthias Liffers
BCompSc W.Aust. MInfoStud CSturt AALIA (CP)
Coordinator, Research Services | University Library

Curtin University
Tel | +61 8 9266 2439
Fax | +61 8 9266 4185

Email | matthias...@curtin.edu.au
Web | http://library.curtin.edu.au

ORCID | http://orcid.org/0000-0002-3639-2080



 

Curtin University is a trademark of Curtin University of Technology.
CRICOS Provider Code 00301J

 

Grant Jackson

Nov 30, 2015, 6:25:22 PM
to redbo...@googlegroups.com
Hi Matthias,

Yes, I've encountered this issue in Mint and other systems in the past. As you suggest, it is usually caused by incorrect character encoding.

In my case it was a CSV file exported from Microsoft Excel in a Windows character encoding (e.g. WINDOWS-1250), whereas Mint expects UTF-8. This is how I fixed it under Linux:

1/ Convert the CSV from WINDOWS-1250 to UTF-8 using "iconv", e.g.

iconv -f WINDOWS-1250 -t UTF-8 input_win1250.csv > output_utf8.csv

2/ Import/harvest output_utf8.csv into Mint.
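
If iconv isn't available (for example on a Windows desktop), the conversion in step 1 can be sketched in Python. This is only a rough equivalent, not something Mint provides: the function name convert_to_utf8 is made up for illustration, and "cp1250" is Python's codec name for WINDOWS-1250, so swap in whatever encoding your file was really saved in.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1250"):
    """Re-encode a text file as UTF-8 (hypothetical helper mirroring the iconv step)."""
    # Decode the raw bytes using the encoding the file was actually saved in.
    text = Path(src).read_bytes().decode(source_encoding)
    # Write the same characters back out as UTF-8. Working on bytes keeps
    # the original line endings untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
    return dst
```

Calling convert_to_utf8("input_win1250.csv", "output_utf8.csv") should give a file equivalent to the iconv output above. If the source encoding is guessed wrongly, the decode step may either raise an error or quietly produce the wrong characters, so it is worth eyeballing a few names afterwards.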

Hope this helps.

Cheers, Grant


Matthias Liffers

Nov 30, 2015, 6:59:06 PM
to redbo...@googlegroups.com

Thanks Grant, that did the trick.

 

It looks like Excel doesn’t actually save files as UTF-8 when you think you’ve told it to save in UTF-8, which is why I was so confused! iconv is exactly what I needed.
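
For anyone who wants to check what a spreadsheet export really contains before harvesting, here is a small Python sketch. The helper name is_valid_utf8 is hypothetical, and "cp1252" is only an assumed example of a Windows encoding (ö happens to be the same single byte in WINDOWS-1250). It shows why a Windows-encoded ö turns into the replacement character when the bytes are read as UTF-8.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "Andreas Bollh\u00f6fer"  # \u00f6 is the letter o with umlaut

# The same name as a Windows code page would store it, and as UTF-8.
windows_bytes = name.encode("cp1252")  # assumed Windows encoding
utf8_bytes = name.encode("utf-8")

print(is_valid_utf8(windows_bytes))  # False
print(is_valid_utf8(utf8_bytes))     # True

# Reading the Windows bytes as UTF-8 swaps the bad byte for U+FFFD,
# which is the black diamond with a question mark.
print(ascii(windows_bytes.decode("utf-8", errors="replace")))  # 'Andreas Bollh\ufffdfer'
```

To test a real export, pass the file's raw bytes (for example from open(path, "rb").read()) to is_valid_utf8. A True result only means the bytes are well-formed UTF-8, not that every character is the one you intended.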

 

Matthias.
