A Unicode enabled IRCjr is available for testing

Michael Brutman

Jan 26, 2023, 8:42:56 PM
to mTCP

There is a detailed readme.txt file in the zip.  Here is the short version:
  • This version supports UTF-8 encoding and decoding.
  • The mapping from Unicode to your local character set is defined in a file.  I've included a mapping for CP437.
  • Enable it by setting an environment variable.
If it works correctly, you should be able to use Unicode in messages and on channel topics.
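
As a rough illustration of the receive path described above (decode UTF-8, then map each code point to the local character set), here is a minimal Python sketch. It is not IRCjr's code: the table is built from Python's built-in cp437 codec as a stand-in for the mapping file, and the names `UNICODE_TO_CP437` and `utf8_to_local`, as well as the `?` fallback, are assumptions made for illustration.

```python
# Hypothetical stand-in for a Unicode -> CP437 mapping file:
# build the table from Python's built-in cp437 codec (high half only).
UNICODE_TO_CP437 = {
    ord(bytes([b]).decode("cp437")): b for b in range(0x80, 0x100)
}


def utf8_to_local(data: bytes, fallback: int = ord("?")) -> bytes:
    """Decode UTF-8 and map each code point to a single CP437 byte.

    Code points with no CP437 equivalent (and invalid UTF-8 sequences)
    become the fallback byte. The fallback choice is an assumption.
    """
    out = bytearray()
    for ch in data.decode("utf-8", errors="replace"):
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)  # ASCII passes through unchanged
        else:
            out.append(UNICODE_TO_CP437.get(cp, fallback))
    return bytes(out)


print(utf8_to_local("caf\u00e9 \u03b1".encode("utf-8")))
```

Characters outside the code page, such as Thai, simply fall through to the fallback byte in this sketch.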

If you are not using CP437 or are using a non-US keyboard layout, you need to set your code page and keyboard layout as you usually do, and also create your own mapping file from Unicode to your code page.  I'd like to include more mapping files but I wanted to get some early feedback first.

Thanks,
Mike

Michael Brutman

Jan 29, 2023, 4:24:45 PM
to mTCP
I have added a Unicode-enabled Telnet to the same zip file. Just set the environment variable to the mapping file and it will interpret Unicode sent via UTF-8. Sending Unicode also works, but I have not implemented a "compose" sequence for arbitrary Unicode code points, so you are limited to what your keyboard can produce.

The text rendering speed is fine even on my slowest system.  I tested by editing a local copy of the UTF-8 demo file at https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html; Thai and Amharic are never going to work, but the more common symbols are available and even a full page of Unicode characters renders quickly.

I'm looking for feedback - if you try them please let me know what you thought of the experience.


-Mike

LLG

Jun 16, 2023, 10:02:58 AM
to mTCP
I've tested it and noticed that it does not allow entering the character 0xE0 on the command line.

Michael Brutman

Jun 16, 2023, 11:36:50 AM
to mTCP
That depends on how you are trying to generate 0xE0 ...  If your keyboard code page mapping will generate an 0xE0 then IRCjr will map it to Unicode and send it.  Otherwise, you are stuck for the moment.  (And my apologies for that ...)

Telnet has a "compose" function that will let you generate any Unicode code point even if your keyboard code page mapping does not.  IRCjr doesn't have that function yet - it's going to be a pain to implement it given the way I handle the keyboard in IRCjr. (It can be done, it's just more complex than I was willing to tackle at that time.)


-Mike

LLG

Jun 16, 2023, 9:38:42 PM
to mTCP
> If your keyboard code page mapping will generate an 0xE0 then IRCjr will map it to Unicode and send it.  Otherwise, you are stuck for the moment.

The key (plus shift state) that produces 0xE0 is not registered by the input line.

Michael Brutman

Jun 16, 2023, 10:51:39 PM
to mTCP
I just tested it using the following:
  • VMware Player (because VirtualBox has known limitations with keyboard handling).
  • IRCjr set up to use code page 437.
If I use ALT and the three digit code at the numerical keypad (hold ALT and hit 2 2 4) that generates the 0xE0 character, which in code page 437 is alpha.  That gets sent as Unicode 0x3B1 to the server.  I don't have a Greek layout keyboard but if I did, using the KEYB.COM command to tell DOS that I had a Greek layout keyboard and then hitting the corresponding key should have the same result.  I've tested it with French and German and the process is the same.
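
To make the Alt+224 example concrete, here is a small Python sketch of the send path: a keyboard byte is looked up in the local code page to get a Unicode code point, which is then encoded as UTF-8. It uses Python's built-in cp437 codec rather than an IRCjr mapping file, and `local_to_utf8` is a made-up name, so treat it as an illustration only.

```python
def local_to_utf8(byte_value: int, codepage: str = "cp437"):
    """Map one code page byte to (Unicode code point, UTF-8 bytes).

    Uses Python's built-in codec tables as a stand-in for a mapping file.
    """
    ch = bytes([byte_value]).decode(codepage)
    return ord(ch), ch.encode("utf-8")


code_point, wire_bytes = local_to_utf8(0xE0)
print(hex(code_point), wire_bytes.hex())  # prints: 0x3b1 ceb1
```

With a different code page selected, the same byte would map to a different code point, which is why the mapping file has to match the code page DOS is actually using.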

Have you set the environment variable to point at a code page mapping file before starting IRCjr?  (I provided two sample files, one for CP437 and one for CP850.)

What keyboard layout are you using and did you use KEYB.COM to tell DOS about it?


-Mike

LLG

Jun 17, 2023, 9:32:26 AM
to mTCP
> If I use ALT and the three digit code at the numerical keypad (hold ALT and hit 2 2 4) that generates the 0xE0 character, which in code page 437 is alpha.

Interesting, I tried in 86box and it seems to work. That seems specific to VirtualBox. However, Alt+### works for me in VB, unlike what the manual states (except for that particular character). 

LLG

Jun 17, 2023, 11:30:35 AM
to mTCP
> Interesting, I tried in 86box and it seems to work. That seems specific to VirtualBox.

On further testing, there are some DOS apps where that key and Alt-224 do not work, but it DOES work in the command shell even inside VirtualBox. That suggests there is an error in the keyboard reading routine that IRCjr and Norton Commander use, but which COMMAND.COM does not use (COMMAND.COM works properly in VirtualBox and other emulators).

Michael Brutman

Jun 17, 2023, 12:33:46 PM
to mTCP
I'm using VirtualBox 6.1.0.  The Alt-keypad trick for entering characters does not work at all; you can read about it at http://www.brutman.com/Adventures_In_Code/Adventures_In_Code.html.  If you are using a different version of VirtualBox we should test that to see what its behavior is.  (COMMAND.COM from IBM PC DOS 6.3 doesn't see the keycodes either, nor should it - that's a BIOS function.)

None of the mTCP programs use the OpenWatcom keyboard BIOS routines because they had a bug that, when combined with the VirtualBox behavior, made it look like the programs were freezing up.  They use the BIOS directly.  If you can't generate a character using Alt-numpad, that's a BIOS bug.

LLG

Jun 18, 2023, 9:49:12 PM
to mTCP
> I'm using VirtualBox 6.1.0.  The Alt-keypad trick for entering characters does not work at all

I see what's going on here. The KEYB utility from MS-DOS 5+ seems to handle the Alt input by itself; the bugged int 16h routine probably cannot handle the character 224, because it is the only character that cannot be input by the Alt+### method while in IRCjr.

Some DOS applications (like COMMAND.COM and MS-WORKS) do not have that issue even in VirtualBox. Maybe they use another method of keyboard input that bypasses that bug?

LLG

Jun 18, 2023, 10:37:36 PM
to mTCP
> probably the bugged int 16h routine cannot handle the character 224

On further investigation, it does seem to be an error in int 16h fn 0 in VBox: it probably works like fn 10h (which works correctly) internally and then zeroes the low byte if it is 224, so the return value ends up being 0.
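
The hypothesis above can be written down as a tiny Python model. This is only a sketch of the behavior guessed at in this post, not VirtualBox's BIOS code: the function names are invented, AX is modeled as a plain integer, and the assumption that an Alt+### keystroke arrives with a scan code of 0 in the high byte is mine (it is what would make the whole return value 0).

```python
def int16_fn10h(scan_code: int, ascii_code: int) -> int:
    """Model of fn 10h as described here: keystroke returned unchanged.

    Returns AX with the scan code in AH and the character in AL.
    """
    return ((scan_code & 0xFF) << 8) | (ascii_code & 0xFF)


def int16_fn00h_hypothesized(scan_code: int, ascii_code: int) -> int:
    """Model of the suspected fn 0 behavior: call fn 10h, then zero AL
    whenever AL is 0xE0 (224), regardless of what is in AH."""
    ax = int16_fn10h(scan_code, ascii_code)
    if ax & 0xFF == 0xE0:
        ax &= 0xFF00
    return ax


# Alt+224 (assumed scan code 0): fn 10h keeps the character, fn 0 loses it.
print(hex(int16_fn10h(0x00, 0xE0)), hex(int16_fn00h_hypothesized(0x00, 0xE0)))
```

Under this model every other Alt+### value passes through fn 0 untouched, which matches 224 being the only character that fails.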