This bug is not limited to Windows; it occurs in MS-DOS and OS/2 as well.
Code page 437 is the default character set for US English. It contains a
rich set of graphical symbols and a number of accented letters for writing
West European languages.
Code page 850 is the default character set for most West European languages
other than US English. It contains all the characters found in ISO 8859-1,
though not in the same order, as well as a number of graphical symbols,
though not as many as code page 437.
Code page 852 is the default character set for Central European languages
using the Latin alphabet. It contains all the characters found in ISO
8859-2, again not in the same order, and mostly the same graphical symbols
found in code page 850.
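The coverage claims for code pages 850 and 852 can be checked with a short script. This is only a sketch: it assumes Python's bundled cp850, cp852, latin_1 and iso8859_2 codecs match the code pages as they actually ship, and the helper names missing_from and same_position are made up for illustration.

```python
def missing_from(oem_codepage, iso_charset):
    """List the printable ISO characters (0xA0-0xFF) the OEM code page lacks."""
    missing = []
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode(iso_charset)
        try:
            ch.encode(oem_codepage)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


def same_position(oem_codepage, iso_charset):
    """Count bytes 0xA0-0xFF that decode to the same character in both sets."""
    return sum(
        bytes([b]).decode(oem_codepage) == bytes([b]).decode(iso_charset)
        for b in range(0xA0, 0x100)
    )


# An empty list means every ISO character has a slot in the OEM code page.
print(len(missing_from("cp850", "latin_1")))
print(len(missing_from("cp852", "iso8859_2")))
```

A same_position count below 96 for either pair confirms the "not in the same order" part.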
Some other notable character sets for the Windows console box are:
    Code page   A permutation of   Used for
    737         ISO 8859-7         Greek
    857         ISO 8859-9         Turkish
    866         none               Russian
IBMgraphics uses seven symbols that are present in code page 437 but missing
from most other PC character sets:
    CP437   Unicode        Used for
    0xAD    U+00A1 '¡'     Rogue level potion
    0xE7    U+03C4 'τ'     Rogue level wand
    0xF0    U+2261 '≡'     Iron bars and Rogue level stairs
    0xF1    U+00B1 '±'     Tree
    0xF4    U+2320 '⌠'     Fountain
    0xF7    U+2248 '≈'     Water and lava
    0xFA    U+00B7 '·'     Floor of a room
(If your newsreader supports Unicode, and your font has the necessary
glyphs, you'll see the map symbols between the quotes. If not, just
pretend they're not there. Flames for posting Unicode to a newsgroup will
be ignored.)
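If you'd rather not trust your newsreader, the table can be regenerated from the byte values. A minimal sketch, assuming Python's cp437 codec is a faithful copy of the code page; IBM_SPECIALS and decode_positions are names I made up here, not anything in the NetHack source. It prints Unicode names instead of glyphs so it is safe on any terminal.

```python
import unicodedata

# The seven code page 437 positions listed in the table above.
IBM_SPECIALS = [0xAD, 0xE7, 0xF0, 0xF1, 0xF4, 0xF7, 0xFA]


def decode_positions(codepage, positions=IBM_SPECIALS):
    """Return {byte: character} for the given positions in one code page."""
    return {b: bytes([b]).decode(codepage) for b in positions}


cp437 = decode_positions("cp437")
for byte, ch in cp437.items():
    # Names are plain ASCII, so this prints even on a non-Unicode console.
    print(f"0x{byte:02X} U+{ord(ch):04X} {unicodedata.name(ch)}")
```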
In Mariusz's screenshot at http://republika.pl/amiro/nethack/screen1.png,
you can see a small red glyph that looks like a low hook. This is '¸', or
U+00B8 CEDILLA. Code page 852 has this character at position 0xF7. But
IBMgraphics expects this position to be '≈', or U+2248 ALMOST EQUAL TO,
used to display lava and pools. Mariusz was running NetHack in wizard mode
and had created a lava pool there.
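The mismatch in the screenshot is easy to reproduce away from the console. A small sketch, again assuming Python's cp437 and cp852 codecs match the real code pages; glyph_at is a hypothetical helper name.

```python
import unicodedata

LAVA_BYTE = 0xF7  # the byte IBMgraphics emits for water and lava


def glyph_at(codepage, position):
    """Return the character a code page assigns to one byte position."""
    return bytes([position]).decode(codepage)


for codepage in ("cp437", "cp852"):
    ch = glyph_at(codepage, LAVA_BYTE)
    print(codepage, f"U+{ord(ch):04X}", unicodedata.name(ch))
# cp437 U+2248 ALMOST EQUAL TO
# cp852 U+00B8 CEDILLA
```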
When using a bitmap font and a windowed console, you are limited to the
repertoire of the starting character set, even after changing the code
page. Using "chcp" to change the code page is thus not a solution.
Furthermore, a user may want to type names, notes, and such in his native
language using the configured character set.
Spanish NetHack has a possible fix for this problem. It makes IBMgraphics a
numeric setting from 0 to 3:
    0 - ASCII only
    1 - line drawing characters only
    2 - the full set with the above seven characters removed
    3 - the full set
For backward compatibility, IBMgraphics without a number is equivalent to
IBMgraphics:3, and !IBMgraphics and its synonyms are equivalent to
IBMgraphics:0.
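The backward-compatibility rule can be sketched as a small parser. This is not the Spanish NetHack code (which is C); parse_ibmgraphics is a hypothetical helper, and treating only "!" and "no" as the negating prefixes, and matching case-insensitively, are my assumptions.

```python
def parse_ibmgraphics(option):
    """Return the IBMgraphics level (0-3) for one option string.

    Hypothetical helper: handles "IBMgraphics", "IBMgraphics:N",
    "!IBMgraphics" and "noIBMgraphics" only.
    """
    text = option.strip().lower()
    negated = False
    if text.startswith("!"):
        negated, text = True, text[1:]
    elif text.startswith("no"):
        negated, text = True, text[2:]

    name, sep, value = text.partition(":")
    if name != "ibmgraphics":
        raise ValueError(f"not an IBMgraphics option: {option!r}")
    if negated:
        if sep:
            raise ValueError("a negated option takes no value")
        return 0          # !IBMgraphics is IBMgraphics:0
    if not sep:
        return 3          # bare IBMgraphics is IBMgraphics:3
    level = int(value)    # raises ValueError on a non-numeric value
    if not 0 <= level <= 3:
        raise ValueError(f"level out of range: {level}")
    return level
```

For example, parse_ibmgraphics("IBMgraphics:2") gives 2, while an old config file that just says IBMgraphics still gets the full set.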
OEM code pages compatible with IBMgraphics:3 are:
    437 - US English; also default for South Africa, Zimbabwe, the
          Philippines, and Swahili
    862 - Hebrew
OEM code pages compatible with IBMgraphics:2 are all of the above, plus:
    737 - Greek
    775 - Estonian, Lithuanian, Latvian and Maori
    850 - West European languages
    852 - Central European languages using the Latin alphabet
    857 - Turkish
    866 - Cyrillic
OEM code pages compatible with IBMgraphics:1 are all of the above, plus:
708 - Arabic
(Yeah, I know, but at the time I defined this it was KOI8-R that I had in
mind. :-/)
OEM code pages not compatible with IBMgraphics are:
    874 - Thai
    932 - Japanese
    936 - Simplified Chinese
    949 - Korean
    950 - Traditional Chinese
    1258 - Vietnamese
This list includes only OEM code pages that are the default for some locale.
Others exist, but are not the default for their locales, and using them
runs into the same difficulties as using code page 437 in these locales.
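The lists above boil down to a lookup table. A minimal sketch: the table is transcribed from this post, but MAX_LEVEL and usable_level are hypothetical names, and treating a code page that is not listed as ASCII-only is my assumption, not something Spanish NetHack is known to do.

```python
# Highest usable IBMgraphics level per OEM code page, from the lists above.
MAX_LEVEL = {
    437: 3, 862: 3,
    737: 2, 775: 2, 850: 2, 852: 2, 857: 2, 866: 2,
    708: 1,
    874: 0, 932: 0, 936: 0, 949: 0, 950: 0, 1258: 0,
}


def usable_level(requested, codepage):
    """Clamp a requested IBMgraphics level to what the code page supports.

    Assumption: a code page missing from the table is treated as level 0.
    """
    if not 0 <= requested <= 3:
        raise ValueError(f"level out of range: {requested}")
    return min(requested, MAX_LEVEL.get(codepage, 0))


print(usable_level(3, 852))   # 2
print(usable_level(3, 437))   # 3
```

On Windows the active console code page could be read with the Win32 call GetConsoleOutputCP and passed in as codepage; that part is left out here so the sketch runs anywhere.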
--
--------------===============<[ Ray Chason ]>===============--------------
The War on Terra is not meant to be won.
Delendae sunt RIAA, MPAA et Windoze