How to open a textfile in UNICODE UTF-8 format in MapInfo Pro 9.0?

kartmann

Jan 24, 2008, 6:49:47 AM
to MapInfo-L
Hi!

I am trying to open a file containing place names for the whole world.
(See http://www.geonames.org/export/#dump)
It's in Unicode UTF-8 to cover all the world's character sets.

UTF-8 isn't mentioned in the list of Charsets in the MI documentation.

Doesn't MIpro 9.0 support text files in UTF-8?

And can MIpro9.0 display all Unicode glyphs by using the Arial Unicode
MS font?

BR
-alf

Bill Thoen

Jan 24, 2008, 10:43:49 AM
to mapi...@googlegroups.com
I don't think MI 9.0 supports Unicode yet, and earlier versions don't.
But one thing that I've tried that works is to open the file as Binary
and read the characters as SmallInts and convert the two bytes to a
character. I don't remember which byte is which, but one will be zero
and the other translates directly to the Windows-equivalent character
with Chr$(). This technique is a kludge, and MapInfo needs to complete
the MapBasic XML library and support UTF-8, but until then, this will
get the job done.
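
For illustration, the two-byte approach described above can be sketched in Python rather than MapBasic. This is only a sketch under assumptions: the file is taken to hold 16-bit units in little-endian order (a guess, since the byte order isn't recalled above), and the name `decode_two_byte_units` is made up for this example. It applies to two-byte data, which is not how UTF-8 is laid out.

```python
import struct


def decode_two_byte_units(raw: bytes, fallback: str = "?") -> str:
    """Decode 16-bit units, keeping only those whose high byte is zero.

    Assumption: little-endian byte order. Units above 0xFF have no
    single-byte equivalent and are replaced with `fallback`.
    """
    if len(raw) % 2:
        raise ValueError("expected an even number of bytes")
    chars = []
    for (unit,) in struct.iter_unpack("<H", raw):
        if unit == 0xFEFF:  # skip a byte-order mark if present
            continue
        # chr() here plays the role of Chr$() for values 0..255
        chars.append(chr(unit) if unit < 0x100 else fallback)
    return "".join(chars)
```

Anything outside the 0..255 range is lost with this approach, which is why it only suits data that is mostly Western European text.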

You can also easily convert a UTF-8 encoded file to Windows by using a
good text editor (like UltraEdit).
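
If no such editor is at hand, the same conversion can be scripted. A minimal sketch in Python, assuming the target is the Western European Windows code page (`cp1252`); the function names are hypothetical, and characters that the code page cannot represent are replaced with `?`, so the conversion is lossy for a worldwide data set.

```python
from pathlib import Path


def utf8_to_windows(data: bytes, codepage: str = "cp1252") -> bytes:
    """Re-encode UTF-8 bytes into a Windows code page.

    "utf-8-sig" drops a leading byte-order mark if the file has one.
    Characters missing from the code page become "?".
    """
    text = data.decode("utf-8-sig")
    return text.encode(codepage, errors="replace")


def convert_file(src: str, dst: str, codepage: str = "cp1252") -> None:
    """Convert a whole file; `src` and `dst` are paths chosen by the caller."""
    Path(dst).write_bytes(utf8_to_windows(Path(src).read_bytes(), codepage))
```

The converted file can then be opened with the matching Windows charset selected.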

Peter Horsbøll Møller

Jan 24, 2008, 10:59:34 AM
to mapi...@googlegroups.com
As far as I understand, MapInfo does support UTF-8, but only through the XML library.

Peter Horsbøll Møller
GIS Developer, MTM GeoInformatics
Geographical Information & IT

COWI A/S
Odensevej 95
DK-5260 Odense S.
Denmark

Tel +45 6311 4900
Direct +45 6311 4908
Mob +45 5156 1045
Fax +45 6311 4949
E-mail p...@cowi.dk
http://www.cowi.dk/gis

gasenngo

Jan 24, 2008, 11:46:51 AM
to MapInfo-L
In our case we had to have our programmer write a conversion
procedure that converts all UTF-8 text into our own language's code
page so it can be displayed with a local font.

We never had any luck with UTF-8 in MI. Our solution works fine for small
datasets, but for tables with tens of thousands of records it performs really badly.

Lars I. Nielsen (GisPro)

Jan 24, 2008, 1:55:11 PM
to mapi...@googlegroups.com
What you're describing, Bill, is Unicode, not UTF-8.

UTF-8 characters are encoded as 1 to 4 bytes, depending on their value.
US ASCII, i.e. 7-bit characters, are encoded as is (i.e. 1 byte), and 8-bit
characters are encoded as a double byte.

It's described in detail here - http://en.wikipedia.org/wiki/UTF-8
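
The variable length is easy to check. A small Python sketch (the helper name `utf8_length` is just for this example) that counts the bytes UTF-8 uses for a few sample characters:

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


samples = [
    ("A", "7-bit ASCII"),
    ("\u00f8", "Latin-1 range (o with stroke)"),
    ("\u20ac", "euro sign"),
    ("\U0001F600", "outside the Basic Multilingual Plane"),
]

for ch, label in samples:
    # Prints the code point, the byte count and a short description
    print(f"U+{ord(ch):04X}  {utf8_length(ch)} byte(s)  {label}")
```

The four samples take 1, 2, 3 and 4 bytes respectively.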

If MapInfo 9 supports UTF-8 as Peter says, it's entirely due to the
implementation of the external XML library. But it's unfortunately not
exposed for use in MapBasic (yet), as far as I know.

It would have been nice, now that MapInfo was making a non-backward-compatible
break in the data format anyway, if they'd introduced
storing texts as Unicode in the DAT file in v9. Even UTF-8 would have
been an improvement.

Best regards / Med venlig hilsen
Lars I. Nielsen
GIS & DB Integrator
GisPro

Bill Thoen

Jan 24, 2008, 2:23:38 PM
to mapi...@googlegroups.com
Ah, thanks... you're right. My knowledge of the wild and wacky world of
character set encodings and code pages is pretty limited. Looking at the
NRSI site on this subject
(http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IWS-Chapter03)
shows me that the legacy of the biblical Tower of Babel is still working
its curse on us humans. It's amazing to me that something so central to
human communication as the written word has become so complicated to
produce. So much for one of the most crowed-about features of XML: that
it is simple and portable because it is pure text!

Warren Vick

Jan 25, 2008, 2:49:23 AM
to mapi...@googlegroups.com
Hello Lars,

The "U" in UTF does actually stand for Unicode. Unicode is a character
representation system and UTF-8 is one of several systems to encode it in
a (mostly) compact way. You're right that ASCII is essentially "pass
thru". So, to support a UTF-8 encoded data set fully, Pro would need to
be able to decode UTF-8 and then do something with the content which (in
all probability with a global gazetteer) will utilise Unicode
characters. The best Pro can do (and I have not played with v9.0 yet -
still on the shelf) is represent characters that are native to the
Windows version (or deceived with "AppLoc") and simplify the rest. For
example, run Pro in Japanese and you will lose all Western European
charset accents.
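
The effect can be imitated outside Pro. A rough Python sketch, assuming `cp1252` stands in for a Western European Windows setup and `cp932` for a Japanese one (the function name `representable` is invented for this example). Python marks unsupported characters with `?`, whereas Windows may substitute a similar-looking character instead, so this only approximates what Pro would show.

```python
def representable(text: str, codepage: str) -> str:
    """Show what survives when `text` is squeezed into one code page.

    Characters the code page lacks are replaced with "?".
    """
    return text.encode(codepage, errors="replace").decode(codepage)


name = "M\u00e1laga \u6771\u4eac"  # "Malaga" with an accent, then "Tokyo" in kanji

western = representable(name, "cp1252")   # keeps the accent, loses the kanji
japanese = representable(name, "cp932")   # keeps the kanji, loses the accent
```

No single legacy code page keeps both halves of the sample string, which is the limitation described above.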

The trick to get UTF-8 data into Pro is to go via Access. We have
Japanese and others working here, but until full Unicode support is
added (with all the compatibility headaches associated with that), you
will not see all the character sets of the world running in a single Pro
session. If that has changed in v9.0, as a vendor of global data, I will
be delighted!

Regards,
Warren Vick
Europa Technologies Ltd.
http://www.europa-tech.com

kartmann

Feb 21, 2008, 7:08:41 AM
to MapInfo-L
SUM:
MIpro, up to and including version 9.0, is a "non-Unicode" application.
How characters are treated depends on:
1) Which language version of MapInfo you are using
2) Your localization settings in Windows (in version 9.0 at least).

Question to PB MapInfo:
Do you plan to add full Unicode support in future versions?

BR
-alf

jie guan

Oct 18, 2017, 5:23:49 AM
to MapInfo-L
Hey,
I found a new method to solve the problem:

In the *.mif file, change
Charset "utf-8" to Charset "WindowsSimpChinese"

and then try opening it again.

Eric Blasenheim

Oct 18, 2017, 4:45:37 PM
to MapInfo-L
This seems like a very old post so I will just be brief.
Charset "utf-8" is not the same as Charset "WindowsSimpChinese". Not even close. I am not sure that this post was for real, but if it was, it is not true.
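
A short Python sketch of why relabeling alone does not work. It assumes Python's `gbk` codec is a reasonable stand-in for the Simplified Chinese Windows code page; that mapping and the function name `reinterpret` are assumptions made for this illustration.

```python
def reinterpret(text: str, stored_as: str, read_as: str) -> str:
    """Encode `text` one way, then read the same bytes back another way."""
    return text.encode(stored_as).decode(read_as, errors="replace")


original = "\u5317\u4eac"  # "Beijing" in Chinese characters

# Same bytes, different label: the text comes back scrambled.
mislabeled = reinterpret(original, "utf-8", "gbk")

# Bytes actually re-encoded to match the label: the text survives.
reencoded = reinterpret(original, "gbk", "gbk")
```

Changing the Charset clause changes only how the bytes are interpreted; the bytes themselves would have to be re-encoded for the new label to be correct.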

However, MapInfo Pro is a Unicode program in versions 15 and 16 and supports reading and writing data encoded in UTF-8 and UTF-16. In addition, a data format called Extended TAB (technically, NativeX) supports the storage of data in those encodings, as do databases that are set up for them. Also, the GeoPackage format is always UTF-8.
Regards,
Eric Blasenheim
Pitney Bowes Software