Character encoding of redirected output

Anton Shepelev

Aug 26, 2022, 8:32:05 AM
Hello, all.

Is there a way to affect the encoding of the log file
with the output of the `net' command:

net start d >> log.txt 2>&1

Invoking `chcp' has no effect, so that

chcp 1251
net start d >> log.txt 2>&1

still produces `log.txt' with the default Cyrillic
encoding, 866.

--
() ascii ribbon campaign - against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

JJ

Aug 27, 2022, 12:06:18 AM
On Fri, 26 Aug 2022 15:32:03 +0300, Anton Shepelev wrote:
> Hello, all.
>
> Is there a way to affect the encoding of the log file
> with the output of the `net' command:
>
> net start d >> log.txt 2>&1
>
> Invoking `chcp' has no effect, so that
>
> chcp 1251
> net start d >> log.txt 2>&1
>
> still produces `log.txt' with the default Cyrillic
> encoding, 866.

Not possible using batch file alone.

You'll need an additional tool to convert the text encoding, e.g. GNU's
recode, which can be downloaded as part of a Unix utilities bundle.

https://sourceforge.net/projects/unxutils/

Anton Shepelev

Aug 27, 2022, 10:26:08 AM
JJ to Anton Shepelev:

> > Invoking `chcp' has no effect, so that
> >
> > chcp 1251
> > net start d >> log.txt 2>&1
> >
> > still produces `log.txt' with the default Cyrillic
> > encoding, 866.
>
> Not possible using batch file alone.

Do you mean that codepage 866 is built into the
program, and the terminal emulator does not apply any
conversion?

> You'll need additional tools to convert the text
> encoding, e.g. GNU's recode, which can be downloaded
> as part of a Unix utilities bundle.

I have never heard of `recode' but am familiar with
`iconv' and can use it. Thank you.


--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

JJ

Aug 27, 2022, 12:56:35 PM
On Sat, 27 Aug 2022 17:26:05 +0300, Anton Shepelev wrote:
> JJ to Anton Shepelev:
>
>>> Invoking `chcp' has no effect, so that
>>>
>>> chcp 1251
>>> net start d >> log.txt 2>&1
>>>
>>> still produces `log.txt' with the default Cyrillic
>>> encoding, 866.
>>
>> Not possible using batch file alone.
>
> Do you mean that codepage 866 is built into the
> program, and the terminal emulator does not apply any
> conversion?

Oh, wait. That code _should_ work. Even if NET's output contains a character
which is not within code page 1251's character set, the character will be
converted to `?`. Everything else should be converted properly.

Can you post the actual text output (not screenshot) of that NET command?

> I have never heard of `recode' but am familiar with
> `iconv' and can use it. Thank you.

Anything that can convert between character sets can be used. It will be
needed for manual conversion, or as a fallback when something doesn't work.
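
As a minimal sketch of such a manual conversion, assuming Python 3 is
available and the log really is in CP866: the helper names `recode_bytes`
and `recode_file` below are made up for illustration and are not part of any
tool mentioned in this thread. Characters with no CP1251 equivalent (for
example box-drawing characters) come out as `?`.

```python
from pathlib import Path


def recode_bytes(data, src_enc="cp866", dst_enc="cp1251"):
    """Re-encode raw bytes from src_enc to dst_enc."""
    # Decode with the code page the bytes were actually written in ...
    text = data.decode(src_enc)
    # ... then encode for the target; unmappable characters become b"?".
    return text.encode(dst_enc, errors="replace")


def recode_file(src, dst, src_enc="cp866", dst_enc="cp1251"):
    """Write a re-encoded copy of the file src to the file dst."""
    # Work on raw bytes so that no platform-default decoding is applied.
    Path(dst).write_bytes(recode_bytes(Path(src).read_bytes(), src_enc, dst_enc))
```

For example, `recode_file("log.txt", "log-1251.txt")` would write a CP1251
copy of the log; the output file name is hypothetical.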

Anton Shepelev

Aug 27, 2022, 4:12:46 PM
AS: Invoking `chcp' has no effect, so that

chcp 1251
net start d >> log.txt 2>&1

still produces `log.txt' with the default Cyrillic
encoding, 866.

JJ: Not possible using batch file alone.

AS: Do you mean that codepage 866 is built into the
program, and the terminal emulator does not apply
any conversion?

JJ: You'll need additional tools to convert the text
encoding, e.g. GNU's recode, which can be downloaded
as part of a Unix utilities bundle.

AS: I have never heard of `recode' but am familiar
with `iconv' and can use it. Thank you.

JJ: Oh, wait. That code _should_ work. Even if NET's
output contains a character which is not within
code page 1251's character set, the character will
be converted to `?`. Everything else should be
converted properly.

I never said it didn't work, all I said was that chcp
had no effect on the actual encoding of the resulting
file.

> Can you post the actual text output (not screenshot)
> of that NET command?

As I said, the actual text output is encoded in CP866.
How shall I post it here and what meaning will it
have? The best thing I can do is to attach the actual
file. I have no idea how or why attachments work in
Usenet, but they do... Also for your consideration,
my test script is shown below:

8<-------------------- test.bat ----------------------
@echo off

chcp 866 > NUL
net start d 2> 866.txt

chcp 1251 > NUL
net start d 2> 1251.txt

ECHO N|comp 866.txt 1251.txt > NUL 2>&1
IF NOT ERRORLEVEL 1 (ECHO 866.txt == 1251.txt) ^
ELSE (ECHO 866.txt != 1251.txt)
>8-------------------- test.bat ----------------------
8<------------------ .bat output ---------------------
866.txt == 1251.txt
>8------------------ .bat output ---------------------
[Attachments: 866.txt, 1251.txt]

JJ

Aug 29, 2022, 12:47:20 AM
On Sat, 27 Aug 2022 23:12:43 +0300, Anton Shepelev wrote:
> AS: Invoking `chcp' has no effect, so that
[snip]

I've tested it in a VM using the Russian language pack and the Russian system
code page. CHCP is not effective in that case, so a manual conversion is
still needed.

Anton Shepelev

Aug 29, 2022, 3:25:25 AM
JJ to Anton Shepelev:

> > chcp 1251
> > net start d >> log.txt 2>&1
>
> I've tested it in a VM using the Russian language pack
> and the Russian system code page. CHCP is not effective
> in that case, so a manual conversion is still needed.

Huge thanks for taking the trouble to confirm it, JJ.
Have you an idea why CHCP may not work? What is it
supposed to do? Shall console programs query the
effective encoding (as set by CHCP) and recode their
output themselves?

--
() ascii ribbon campaign - against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

JJ

Aug 30, 2022, 2:14:55 AM
On Mon, 29 Aug 2022 10:25:23 +0300, Anton Shepelev wrote:
>
> Huge thanks for taking the trouble to confirm it, JJ.
> Have you an idea why CHCP may not work? What is it
> supposed to do? Shall console programs query the
> effective encoding (as set by CHCP) and recode their
> output themselves?

I think the main reason is that all programs are run using the system code
page, which is the global code page setting. In other words, the system code
page is the default code page for all programs.

There is also the active code page, which is a per-process setting. CMD's
`CHCP` command simply changes the active code page. But that setting is not
inherited by child processes, unlike the working directory and environment
variables, which are (by default) inherited.

In your case, the system code page is 866. Even though CMD's code page has
been changed to 1251 using `CHCP`, the `NET` program is run using code page
866. The `NET` program cannot know the destination code page (which is
1251), because the standard output handle is basically a temporary data
store which is not aware of code pages or data formats (i.e. it's just a
dumb binary data store), and the system treats it as non-Unicode text
storage with an unknown code page. Thus the system converts the `NET`
program's Unicode text to non-Unicode using its own code page, 866. CMD
doesn't and can't know which code page the received data is supposed to be
treated as. CMD only knows one code page: 1251. So, even if a conversion
were applied, the source and destination code pages would be the same and
the resulting data would be unchanged.
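
To make the byte-level difference concrete, here is a small Python sketch.
The function names `encode_both` and `misread` are illustrative
assumptions. It only shows that the same Cyrillic text is stored as
different bytes under code pages 866 and 1251, and that reading one as the
other does not give the text back. It does not model how Windows chooses a
code page for a program.

```python
def encode_both(text):
    """Return the CP866 and CP1251 byte encodings of the same text."""
    return text.encode("cp866"), text.encode("cp1251")


def misread(data, assumed="cp1251"):
    """Decode bytes with an assumed code page, as a viewer would."""
    # errors="replace" because some byte values are undefined in CP1251.
    return data.decode(assumed, errors="replace")


# The Russian word "da" ("yes"), written with escapes to keep this ASCII.
sample = "\u0434\u0430"
oem, ansi = encode_both(sample)
print(oem.hex(), ansi.hex())   # prints "a4a0 e4e0"
# CP866 bytes viewed as CP1251 no longer match the original text.
print(misread(oem) == sample)  # prints "False"
```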

Unfortunately, Windows doesn't provide a built-in feature to specify which
code page a program should be run with. Microsoft *did* provide the feature
as a separate downloadable tool called Microsoft AppLocale back in the
Windows XP era, but it is now discontinued and no longer supported. While
it's still usable in newer Windows versions, it requires installation.
That's less convenient than using e.g. `iconv` or `recode`.
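
If a separate converter is not at hand, the run-then-recode step can also be
sketched in Python. This assumes Python 3 is installed and that the command
really emits CP866. `run_and_log` is a hypothetical helper, not an existing
tool. It merges stderr into stdout and appends to the log, mirroring
`>> log.txt 2>&1` from the original batch line.

```python
import subprocess


def run_and_log(cmd, log_path, src_enc="cp866", dst_enc="cp1251"):
    """Run cmd, re-encode its combined output, and append it to log_path."""
    # stderr=STDOUT merges both streams, like `2>&1` in the batch file.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    text = proc.stdout.decode(src_enc, errors="replace")
    # Append in binary mode so the bytes are written exactly as encoded.
    with open(log_path, "ab") as log:
        log.write(text.encode(dst_enc, errors="replace"))
    return proc.returncode
```

A call such as `run_and_log(["net", "start", "d"], "log.txt")` would then
stand in for the redirected `net` line.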

Anton Shepelev

Aug 30, 2022, 7:05:36 PM
JJ to Anton Shepelev:

> > Have you an idea why CHCP may not work? What is it
> > supposed to do? Shall console programs query the
> > effective encoding (as set by CHCP) and recode
> > their output themselves?
>
> I think the main reason is that all programs are
> run using the system code page, which is the global
> code page setting. In other words, the system code
> page is the default code page for all programs.
>
> There is also the active code page, which is a
> per-process setting. CMD's `CHCP` command simply
> changes the active code page. But that setting is
> not inherited by child processes, unlike the working
> directory and environment variables, which are (by
> default) inherited.
> [...]

Many thanks for the detailed and humane explanation,
JJ. Very rarely these days is one honoured with a
coherent answer of several fluent paragraphs. The
modern norm is a sloppy one-liner in a stinking chat
or social network. Usenet must be the last resort of
truly educated people.

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]