What is ANSI_CHARSET?

Art Decco

Jul 14, 2000, 3:00:00 AM

In the context of specifying a font via a LOGFONT struct as follows:

LOGFONT lf;
memset( &lf, 0, sizeof(LOGFONT) );
lf.lfHeight = 240; // tenths of a point
_tcscpy( lf.lfFaceName, _T("Arial") );
lf.lfCharSet = ANSI_CHARSET;

VERIFY( m_Font.CreatePointFontIndirect(&lf) );

...what is the meaning of ANSI_CHARSET? I know the (mis)use of "ANSI" to
mean "not Unicode" in its usual Win32 API context. In that "not Unicode"
sense, "ANSI" covers Shift-JIS as well as CP1252: on a Japanese system the
FooA() APIs take Shift-JIS, while FooW() takes Unicode, for example.

The above usage is just a way of specifying a subset of fonts installed on
the current system. Since there is a separate SHIFTJIS_CHARSET, as well as
Korean and Chinese equivalents, does the above (mis)use really just mean
"only CP1252", and therefore not Japanese or Chinese on a US system, or does
it mean whatever the standard script system is on the current OS?

Thanks.

Michael (michka) Kaplan

Jul 15, 2000, 3:00:00 AM

Actually, in this sense ANSI_CHARSET is not the misused term, it is the
actual term... meaning it is only CP1252. You should be using
DEFAULT_CHARSET to define the charset referring to the default code page on
the machine.
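
To make the distinction concrete, here is a minimal, portable C++14 sketch of how the lfCharSet values discussed in this thread relate to code pages. The numeric constants mirror the values in wingdi.h, but they are redeclared under hypothetical names so the sketch compiles without the Windows headers, and the helper codePageForCharset is an illustration only, covering just the charsets mentioned here. On Windows itself the real lookup is done by TranslateCharsetInfo.

```cpp
// Charset identifiers with the same numeric values as in wingdi.h
// (redeclared here under hypothetical names for portability).
constexpr int kAnsiCharset        = 0;    // ANSI_CHARSET
constexpr int kDefaultCharset     = 1;    // DEFAULT_CHARSET
constexpr int kShiftJisCharset    = 128;  // SHIFTJIS_CHARSET
constexpr int kHangulCharset      = 129;  // HANGUL_CHARSET
constexpr int kGb2312Charset      = 134;  // GB2312_CHARSET
constexpr int kChineseBig5Charset = 136;  // CHINESEBIG5_CHARSET

// Hypothetical helper: returns the Windows code page tied to a charset.
// Returns 0 when the answer depends on the machine (DEFAULT_CHARSET)
// or when the charset is not covered by this sketch.
constexpr int codePageForCharset(int charset) {
    switch (charset) {
        case kAnsiCharset:        return 1252;  // Western European only
        case kShiftJisCharset:    return 932;   // Japanese
        case kHangulCharset:      return 949;   // Korean
        case kGb2312Charset:      return 936;   // Simplified Chinese
        case kChineseBig5Charset: return 950;   // Traditional Chinese
        case kDefaultCharset:     return 0;     // resolved from the system locale
        default:                  return 0;     // not covered here
    }
}
```

The point of the sketch is that ANSI_CHARSET always resolves to the same code page, while DEFAULT_CHARSET has no fixed answer until the system locale is known.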

--
MichKa

random junk of dubious value at the multilingual
http://www.trigeminal.com/ and a new book on
i18N in VB at http://www.trigeminal.com/michka.asp

"Art Decco" <ple...@dontemail.com> wrote in message
news:u3gxpRh7$GA....@cppssbbsa02.microsoft.com...

Brendan Murray

Jul 15, 2000, 3:00:00 AM

"Michael (michka) Kaplan" <forme...@spamfree.trigeminal.nospam.com> wrote in message news:uCqRhSl7$GA.195@cppssbbsa04...

> Actually, in this sense ANSI_CHARSET is not the misused term, it is the
> actual term... meaning it is only CP1252. You should be using

In fact anything referring to ANSI in Windows terms *is* a misused term. No Windows codepage is ANSI- or ISO-compliant.

B

Michael (michka) Kaplan

Jul 15, 2000, 3:00:00 AM

Well, ANSI is often misused because the ~Unicode (non-Unicode) APIs have an
"A" suffix.

CP1252 is closer to it than the "A" functions, which actually will accept any
code page depending on the system default. So it is less of a misuse of the
term (jaywalking rather than assault? <g>).

In any case, DEFAULT_CHARSET is what you want to use here.

--
MichKa

random junk of dubious value at the multilingual
http://www.trigeminal.com/ and a new book on
i18N in VB at http://www.trigeminal.com/michka.asp

"Brendan Murray" <bpmu...@no-spam-At-mediaone.net> wrote in message
news:Ww%b5.26914$DJ2.1...@typhoon.ne.mediaone.net...

Michael (michka) Kaplan

Jul 16, 2000, 3:00:00 AM

"Brendan Murray" <bpmu...@no-spam-At-mediaone.net> wrote in message

news:h2sc5.30168$Q8.2...@typhoon.ne.mediaone.net...
> Well actually Michael, my point was that MS appear to have
> hijacked the term, even though none of the MS-defined
> charsets are ANSI-compliant, the C1 region being populated
> with graphics in direct contravention of ISO-2022.

Agreed. But since they use it in a way that is obviously not true, I do
not have as hard of a time with it as some people do. In summary (MS term
and its actual meaning):

*ANSI (for charsets)= all western european code pages (fonts map all of
that)
*ANSI (for everything else) = ~Unicode ("Not" Unicode)
*Unicode (<Win2000) = UCS-2, Little endian
*Unicode (>=Win2000) = UTF-16, Little endian

Seems simple enough, even if it is not standard terminology. :-)
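
The practical difference between the last two entries in that summary shows up only for code points above U+FFFF. The following is a minimal C++14 sketch, not taken from any Windows header; all function names are hypothetical. It shows the standard UTF-16 surrogate-pair arithmetic and the little-endian byte order of each 16-bit unit.

```cpp
// Number of 16-bit code units UTF-16 needs for one code point.
// UCS-2 has no surrogate mechanism, so it can only represent the
// code points for which this returns 1.
constexpr int utf16UnitCount(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// High (leading) surrogate for a code point above U+FFFF.
constexpr char16_t highSurrogate(char32_t cp) {
    return static_cast<char16_t>(0xD800 + ((cp - 0x10000) >> 10));
}

// Low (trailing) surrogate for a code point above U+FFFF.
constexpr char16_t lowSurrogate(char32_t cp) {
    return static_cast<char16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF));
}

// Little-endian serialization stores the low byte of each unit first.
constexpr unsigned char firstByteLE(char16_t unit) {
    return static_cast<unsigned char>(unit & 0xFF);
}
constexpr unsigned char secondByteLE(char16_t unit) {
    return static_cast<unsigned char>(unit >> 8);
}
```

For example, U+10400 needs two units, 0xD801 followed by 0xDC00, and the first of those is stored as the bytes 0x01 0xD8. Text restricted to code points below U+10000 is byte-for-byte identical under both readings.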

> And because of their continued insistence on tagging CP1252
> data with "charset=iso-8859-1", this is an assault on all other
> purveyors of better-quality messaging systems who have to
> accommodate this abuse.

What programs do this? I do not think any of the ones I run do, but I only
use Exchange 5.0 and Outlook Express for e-mail (and the latter has separate
tags for Western European (ISO) and Western European (Windows)).

michka

Brendan Murray

Jul 17, 2000, 3:00:00 AM

"Michael (michka) Kaplan" <forme...@spamfree.trigeminal.nospam.com> wrote in message news:u0AdDDp7$GA.242@cppssbbsa05...

> CP1252 is closer to it than the "A" functions, which actually will accept any
> code page depending on the system default. So it is less of a misuse of the
> term (jaywalking rather than assault? <g>).

Well actually Michael, my point was that MS appear to have hijacked the term, even though none of the MS-defined charsets are ANSI-compliant, the C1 region being populated with graphics in direct contravention of ISO-2022. And because of their continued insistence on tagging CP1252 data with "charset=iso-8859-1", this is an assault on all other purveyors of better-quality messaging systems who have to accommodate this abuse.
B
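
The C1 point above can be illustrated with a small, portable C++14 sketch. It assumes the published Windows-1252 mapping for bytes 0x80 through 0x9F; the function and table names are hypothetical, and U+FFFD is used here as a placeholder for the five bytes that CP1252 leaves undefined.

```cpp
// ISO-8859-1 maps every byte to the code point with the same value, so
// 0x80..0x9F become the C1 control characters U+0080..U+009F.
constexpr char32_t latin1ToUnicode(unsigned char b) {
    return static_cast<char32_t>(b);
}

// Windows-1252 assignments for bytes 0x80..0x9F. 0xFFFD marks the bytes
// that CP1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
constexpr char32_t kCp1252C1[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

// Windows-1252 agrees with ISO-8859-1 everywhere except 0x80..0x9F,
// where it assigns printable characters instead of C1 controls.
constexpr char32_t cp1252ToUnicode(unsigned char b) {
    return (b >= 0x80 && b <= 0x9F) ? kCp1252C1[b - 0x80]
                                    : static_cast<char32_t>(b);
}
```

So a byte such as 0x93, written by a CP1252 application as a left double quotation mark (U+201C), is read as the control character U+0093 by a receiver that takes a "charset=iso-8859-1" label at face value.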

Brendan Murray

Jul 17, 2000, 3:00:00 AM

"Michael (michka) Kaplan" <forme...@spamfree.trigeminal.nospam.com> wrote in message news:#5KST947$GA.281@cppssbbsa04...

> What programs do this? I do not think any of the ones I run do, but I only
> use Exchange 5.0 and Outlook Express for e-mail (and the latter has separate
> tags for Western European (ISO) and Western European (Windows)).

I appear to stand corrected - Outlook Express, Outlook 98 (and predecessors), Exchange, FrontPage, etc. all were guilty of this. Apparently they have at long last fixed this: I need to retry it.

B

Michael (michka) Kaplan

Jul 17, 2000, 3:00:00 AM

FrontPage 2000 definitely does not do this..... and Outlook Express 5.0
works for me. I have never really used Outlook except when forced to for bug
repros, and although I use Exchange 5.0 for two mail accounts I never send
multilingual text from it...

--
MichKa

random junk of dubious value at the multilingual
http://www.trigeminal.com/ and a new book on
i18N in VB at http://www.trigeminal.com/michka.asp

"Brendan Murray" <bpmu...@no-spam-At-mediaone.net> wrote in message
news:TxCc5.31999$DJ2.1...@typhoon.ne.mediaone.net...

Antoine Leca

Jul 20, 2000, 3:00:00 AM

Michael Kaplan wrote:
>
> In summary (MS term and its actual meaning):
>
> *ANSI (for charsets)= all western european code pages (fonts map all of
> that)

Agreed

> *ANSI (for everything else) = ~Unicode ("Not" Unicode)
> *Unicode (<Win2000) = UCS-2, Little endian
> *Unicode (>=Win2000) = UTF-16, Little endian

Widely seen is "Wide" being a synonym for what you call "Unicode".

> Seems simple enough, even if it is not standard terminology. :-)

> > And because of their continued insistence on tagging CP1252
> > data with "charset=iso-8859-1", this is an assault on all other
> > purveyors of better-quality messaging systems who have to
> > accommodate this abuse.
>

> What programs do this?

Just about all the old ones. BTW, competitors were no better (and since
MS entered this market later, this is hardly MS's fault).

This has been corrected in the products that presently ship (thanks!)

Antoine

Michael (michka) Kaplan

Jul 20, 2000, 3:00:00 AM

"Antoine Leca" <Antoin...@renault.fr> wrote in message
news:3976D195...@renault.fr...

> > *ANSI (for everything else) = ~Unicode ("Not" Unicode)
> > *Unicode (<Win2000) = UCS-2, Little endian
> > *Unicode (>=Win2000) = UTF-16, Little endian
>
> Widely seen is "Wide" being a synonym for what you call "Unicode".

Well, yes.... but not in any MS documentation or product UIs.... they call
it Unicode there. Even UTF-8 is JAM (just another multibyte? <g>).