cscs.exe changes console output encoding to UTF8 unexpectedly


bog...@gmail.com

Apr 15, 2015, 2:33:50 PM
to cs-s...@googlegroups.com
Hello,

If you open Command Prompt and type:
powershell -command [System.Console]::OutputEncoding.EncodingName
the output will be something like:
Cyrillic (DOS)

But if you run cscs.exe with any parameters, e.g:
cscs.exe /s
the output of the above powershell command becomes:
Unicode (UTF-8)

Unfortunately, this behavior of cscs.exe triggers some undesirable features (or maybe bugs) of powershell. For example, once cscs has been run at least once, powershell emits a UTF-8 BOM when its output is redirected. E.g. the following command
powershell -command [System.Console]::OutputEncoding.EncodingName > MyFile.txt
inserts a UTF-8 BOM at the beginning of MyFile.txt if cscs was run at least once.

To work around this problem I have to add the line:
Console.OutputEncoding = Encoding.ASCII;
to my scripts.
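
To check whether a redirected file such as MyFile.txt really picked up the BOM, a small script along these lines could help. This is only a sketch: the class and method names are made up for illustration, and the only fact it relies on is that the UTF-8 BOM is the byte sequence EF BB BF.

```csharp
using System;
using System.IO;

static class BomCheck
{
    // True when the data begins with the UTF-8 byte order mark (EF BB BF).
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF
            && data[1] == 0xBB
            && data[2] == 0xBF;
    }

    static void Main(string[] args)
    {
        // The file name is an assumption; pass another path as the first argument.
        string path = args.Length > 0 ? args[0] : "MyFile.txt";
        bool hasBom = StartsWithUtf8Bom(File.ReadAllBytes(path));
        Console.WriteLine(hasBom ? "UTF-8 BOM found" : "no UTF-8 BOM");
    }
}
```

Running it once before and once after a cscs call should show whether the redirect behavior changed.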

Oleg Shilo

Apr 15, 2015, 11:46:06 PM
to cs-s...@googlegroups.com
Correct. CS-Script always sets Console.OutputEncoding to utf-8. This feature was implemented at the request of Chinese users.
However, I do see how others may want to use an alternative encoding. The latest release (v3.9.7) contains a new setting, ConsoleEncoding, which you can set to the name of the desired encoding as it is named in System.Text.Encoding.GetEncodings(). The setting can be accessed via ConfigConsole.

bog...@gmail.com

Apr 17, 2015, 5:57:17 PM
to cs-s...@googlegroups.com, osh...@gmail.com
I have tried the new setting, but without success. I set ConsoleEncoding to "ascii", "windows-1251", "koi8-r" and "Cyrillic (Windows)", but cscs.exe continues to set System.Console.OutputEncoding to utf-8.
In any case, I don't think it is right that cscs changes the console encoding on its own and offers no way to disable this behavior. The console encoding is an external environment setting that is visible to other programs and can affect them (e.g. the powershell issue that I mentioned in my previous post). So changing this setting without explicit instructions from the user is undesirable behavior IMHO. I think the cleanest solution is to let cscs set the utf-8 console encoding only if a command line argument is specified or a setting is explicitly set in ConfigConsole.

On Thursday, April 16, 2015 at 6:46:06 AM UTC+3, Oleg Shilo wrote:

Oleg Shilo

Apr 18, 2015, 8:12:55 PM
to cs-s...@googlegroups.com, bog...@gmail.com, osh...@gmail.com
I don't know what exactly went wrong in your test, but I have just tested a default installation from Chocolatey:

C:\> choco install cs-script


and it works just fine:



Keep in mind that the encoding name is supposed to be exactly the name required by System.Text.Encoding.GetEncoding(). Not the DisplayName but the Name.
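
For anyone unsure which string is the Name and which is the DisplayName, a short script like the following lists both side by side and checks whether a candidate is accepted by Encoding.GetEncoding(). It is a sketch only: the class and helper names are invented for illustration, and the list it prints depends on the .NET runtime in use.

```csharp
using System;
using System.Text;

static class EncodingNames
{
    // True when 'name' is accepted by Encoding.GetEncoding(string).
    public static bool IsValidEncodingName(string name)
    {
        try
        {
            Encoding.GetEncoding(name);
            return true;
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported names end up here.
            return false;
        }
    }

    static void Main()
    {
        // Name is the value to use in settings; DisplayName is the human-readable label.
        foreach (EncodingInfo info in Encoding.GetEncodings())
            Console.WriteLine("{0,-6} {1,-25} {2}", info.CodePage, info.Name, info.DisplayName);

        Console.WriteLine(IsValidEncodingName("utf-8"));
    }
}
```

Passing a display label such as "Unicode (UTF-8)" to the helper is a quick way to see whether a given value would be usable as a setting.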

>Console encoding is external environment setting that is visible to other programs and can affect them. So changing this setting without explicit instructions from user is undesirable behavior IMHO.
I actually agree with you, but the problem is that I had multiple support requests from Windows users with a Chinese charset who were struggling with the encoding. The problem had to be solved in such a way that:
  • CS-Script uses the most common encoding right out of the box.
  • No command line parameters are needed, so the parent process (e.g. Notepad++) doesn't have to know about the encoding considerations.
  • The appropriate encoding is set at the very first execution step, so CS-Script error messages can be rendered in Simplified Chinese.
UTF-8 was an obvious choice and so far it has worked well in Latin, Chinese, Greek and Cyrillic environments. So setting the console encoding as a feature is here to stay.

That said, I do understand that there is a need to switch the feature off completely. So I have implemented support for the CSSCRIPT_CONSOLE_ENCODING_OVERWRITE environment variable, which overrides the script engine's encoding behavior.

The following will force CS-Script not to adjust the encoding at all (what you are looking for): set CSSCRIPT_CONSOLE_ENCODING_OVERWRITE=default

The following will force CS-Script to set the encoding to a specific value: set CSSCRIPT_CONSOLE_ENCODING_OVERWRITE=<encoding name>
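
To make the two cases concrete, here is a rough sketch of how a host could interpret such a variable. It is not the actual CS-Script implementation: the class, the Resolve helper and the "utf-8" stand-in for the ConsoleEncoding setting are all assumptions made for illustration. Only the variable name and the meaning of "default" come from the description above.

```csharp
using System;
using System.Text;

static class ConsoleEncodingOverride
{
    // Hypothetical helper (not CS-Script code).
    // Returns the encoding to apply, or null when the console must be left alone.
    public static Encoding Resolve(string overrideValue, string configuredName)
    {
        // A non-empty override wins over the configured setting.
        string name = string.IsNullOrEmpty(overrideValue) ? configuredName : overrideValue;

        if (string.IsNullOrEmpty(name) ||
            string.Equals(name, "default", StringComparison.OrdinalIgnoreCase))
            return null;

        // Throws ArgumentException when the name is not a known encoding name.
        return Encoding.GetEncoding(name);
    }

    static void Main()
    {
        string overrideValue =
            Environment.GetEnvironmentVariable("CSSCRIPT_CONSOLE_ENCODING_OVERWRITE");

        // "utf-8" stands in for whatever the ConsoleEncoding setting holds.
        Encoding target = Resolve(overrideValue, "utf-8");

        Console.WriteLine(target == null
            ? "console encoding left unchanged"
            : "would switch to " + target.WebName);
    }
}
```

Returning null for "default" keeps the "do nothing" case explicit, so the caller never touches Console.OutputEncoding unless an encoding was actually requested.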



If you wish, you can test/use the feature without waiting for the release. Here you can find both the test script and the script engine executable. Alternatively, you can build it from the source.
Regards,
Oleg

bog...@gmail.com

Apr 19, 2015, 8:26:57 PM
to cs-s...@googlegroups.com, osh...@gmail.com, bog...@gmail.com
>>I don't know what exactly went wrong in your test
I figured out what was going on. cscs.exe ignores the ConsoleEncoding setting if it does not actually run a script. To confirm, run:

powershell -command [Console]::OutputEncoding.WebName
cscs /s
powershell -command [Console]::OutputEncoding.WebName

The second powershell command outputs: utf-8


>>Saying that, I do understand the there is a need to switch off the feature completely. Thus I have implemented support for CSSCRIPT_CONSOLE_ENCODING_OVERWRITE environment variable, which overwrites the script engine encoding behavior.
Why not make this setting available in css_config.exe? Or in cscs.exe command line? In my opinion it would be more convenient.

On Sunday, April 19, 2015 at 3:12:55 AM UTC+3, Oleg Shilo wrote:

Oleg Shilo

Apr 26, 2015, 1:27:01 AM
to cs-s...@googlegroups.com, osh...@gmail.com, bog...@gmail.com
After the reports about cmd.exe exhibiting a defect triggered by UTF-8, and after careful consideration, I have decided to disable the encoding change by default.
In the latest release, v3.9.8.0, the encoding isn't changed unless it is configured to do so.


bog...@gmail.com

Apr 27, 2015, 9:22:08 AM
to cs-s...@googlegroups.com, bog...@gmail.com, osh...@gmail.com
Glad to hear it. I think it's the most appropriate behavior for cscs.
But I see some differences between theory and practice. For now, when ConsoleEncoding is set to "default" or "utf-8", cscs.exe actually changes the console output encoding to utf-8 and then restores the system default afterwards.

So, the following commands:
powershell -command [Console]::OutputEncoding.WebName
cscs Script.cs
powershell -command [Console]::OutputEncoding.WebName

output:
cp866

C# Script execution engine. Version 3.9.8.0.
Copyright (C) 2004-2014 Oleg Shilo.
System.Text.UTF8Encoding

cp866

Content of Script.cs:
using System;

class Script
{
    static void Main()
    {
        Console.WriteLine(Console.OutputEncoding);
    }
}


On Sunday, April 26, 2015 at 8:27:01 AM UTC+3, Oleg Shilo wrote:

Oleg Shilo

Apr 27, 2015, 10:03:58 PM
to cs-s...@googlegroups.com, osh...@gmail.com, bog...@gmail.com
Well it's not so much the "difference between theory and practice" :)

The solution to the encoding challenge consists of two different use cases: defaulting the encoding to "do not change", and restoring the encoding on exit if it was changed. These two features are completely independent, and you have just confirmed that restoring works fine. However, the defaulting is indeed broken. Because some testing code was accidentally left in the codebase, my release testing was compromised by the environment variables present on my system, so the defect went unnoticed. I apologize for the inconvenience.

I have streamlined the delivery of the update and you can find the corrected release at the CodePlex: https://csscriptsource.codeplex.com/releases/view/614649  

I will delay publishing the fix on Chocolatey so you have the opportunity to test the fix.

BTW you can check the Console encoding by using the "/verbose" switch:

> ----------------                                                                             
  TragetFramework: v4.0                                                                        
  Console Encoding: ibm850 (Western European (DOS))                                            
  CurrentDirectory: E:\...\Projects\CS-Script\Releases\v3.9.8.2\bin                          
  NuGet manager: C:\ProgramData\chocolatey\lib\cs-script.3.9.8.0\tools\cs-script\lib\nuget.exe 
  NuGet cache: C:\ProgramData\CS-Script\nuget                                                  

bog...@gmail.com

Apr 28, 2015, 6:44:27 AM
to cs-s...@googlegroups.com, bog...@gmail.com, osh...@gmail.com
I have checked the 3.9.8.2 release and everything is fine with it. Thanks a lot, Oleg!

On Tuesday, April 28, 2015 at 5:03:58 AM UTC+3, Oleg Shilo wrote:

bog...@gmail.com

Apr 28, 2015, 9:08:28 AM
to cs-s...@googlegroups.com, bog...@gmail.com, osh...@gmail.com
I have found a potential problem. The latest cscs (v3.9.8.2) resets the console encoding to its original value after every script execution, even if the ConsoleEncoding setting is "default". So for now it's impossible to write a script that changes the console output encoding to some desired value, because cscs restores the original encoding after the script has finished.

To confirm, run:
>powershell -command [Console]::OutputEncoding.WebName
cp866

>cscs Script.cs
C# Script execution engine. Version 3.9.8.2.
Copyright (C) 2004-2014 Oleg Shilo.

cp866
utf-8

>powershell -command [Console]::OutputEncoding.WebName
cp866

Contents of Script.cs:
using System;
using System.Text;

class Script
{
    static void Main()
    {
        Console.WriteLine(Console.OutputEncoding.WebName);
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Console.OutputEncoding.WebName);
    }
}

On Tuesday, April 28, 2015 at 1:44:27 PM UTC+3, bog...@gmail.com wrote:

Oleg Shilo

Apr 29, 2015, 1:57:53 AM
to cs-s...@googlegroups.com
Well, you are right. While the impact of the current behavior is very low, I agree it would be more appropriate to restore the encoding only when a non-default setting is used. I will update the code accordingly and keep you posted.
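
The intended rule ("restore only what the engine itself changed") could look roughly like the sketch below. It is a hypothetical illustration, not the CS-Script source: EncodingScope, Run and the settingName parameter (standing in for ConsoleEncoding) are made-up names.

```csharp
using System;
using System.Text;

static class EncodingScope
{
    // Hypothetical host wrapper; settingName plays the role of the ConsoleEncoding setting.
    public static void Run(string settingName, Action script)
    {
        bool overrideRequested =
            !string.Equals(settingName, "default", StringComparison.OrdinalIgnoreCase);

        Encoding original = Console.OutputEncoding;

        if (overrideRequested)
            Console.OutputEncoding = Encoding.GetEncoding(settingName);

        try
        {
            script();
        }
        finally
        {
            // Restore only when the host changed the encoding itself.
            // With "default", whatever the script set is left in place.
            if (overrideRequested)
                Console.OutputEncoding = original;
        }
    }

    static void Main()
    {
        // Same idea as Script.cs above: the script switches the console to UTF-8.
        Run("default", () => Console.OutputEncoding = Encoding.UTF8);
        Console.WriteLine(Console.OutputEncoding.WebName);
    }
}
```

With "default" the wrapper never writes to Console.OutputEncoding, so a script that sets its own encoding keeps it after the run, while a non-default setting is still undone on exit.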

bog...@gmail.com

May 14, 2015, 6:46:54 AM
to cs-s...@googlegroups.com, osh...@gmail.com
I have checked the latest v3.9.10.0 and the problem is gone.

Oleg Shilo

May 14, 2015, 8:21:28 AM
to cs-s...@googlegroups.com
Great, thanks