Utf8 in the Lua standalone interpreter on Windows


Stephen Hewitt

Apr 24, 2026, 1:06:52 PM
to lua-l
Currently on Windows you can't open a file that contains UTF-8 characters in the standalone interpreter. Similarly, the console chokes on UTF-8. For me this fixes it (lua.c):


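#ifndef LUA_NO_UTF8_IN_STANDALONE_INTERPRETER
#include <fcntl.h>    /* _O_BINARY */
#include <io.h>       /* _setmode */
#include <locale.h>   /* setlocale */
#include <windows.h>  /* SetConsoleCP, SetConsoleOutputCP */

static void setup_utf8(void) {
   setlocale(LC_ALL, ".UTF-8");           /* UTF-8 locale for the C runtime */
   SetConsoleCP(CP_UTF8);                 /* console input code page */
   SetConsoleOutputCP(CP_UTF8);           /* console output code page */
   _setmode(_fileno(stdout), _O_BINARY);  /* no newline translation */
   _setmode(_fileno(stdin), _O_BINARY);
}
#else
static void setup_utf8(void) {}
#endif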


And in main:


int main (int argc, char **argv) {
   int status, result;
   lua_State *L = luaL_newstate(); /* create state */
   setup_utf8();

See here for more context:
https://github.com/lua/lua/compare/master...shewitt-au:lua:windows
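
The `_setmode(..., _O_BINARY)` calls in the patch switch off the newline translation that the Windows C runtime applies to text-mode streams. A minimal sketch of what that translation does to byte counts, using an ordinary file instead of stdout; the helper name `size_after_newline_write` is made up for illustration and is not part of Lua:

```c
#include <stdio.h>

/* Hypothetical helper: write "a\n" to `path` using the given fopen mode,
   then return the resulting file size in bytes (or -1 on error). */
static long size_after_newline_write(const char *path, const char *mode) {
  FILE *f = fopen(path, mode);
  long size;
  if (f == NULL) return -1;
  fputs("a\n", f);
  fclose(f);
  f = fopen(path, "rb");   /* reopen in binary mode to count the raw bytes */
  if (f == NULL) return -1;
  fseek(f, 0, SEEK_END);
  size = ftell(f);
  fclose(f);
  return size;
}
```

With mode "wb" the file is 2 bytes on every platform. With mode "w" the Windows runtime writes "a\r\n", so the file is 3 bytes there and 2 bytes on POSIX systems. Putting stdin and stdout in binary mode removes that difference for the interpreter's standard streams.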

Thijs Schreijer

Apr 29, 2026, 9:03:32 AM
to lu...@googlegroups.com


On Fri, 24 Apr 2026, at 18:55, Stephen Hewitt wrote:
Currently on Windows you can't open a file that contains UTF-8 characters in the standalone interpreter. Similarly, the console chokes on UTF-8. For me this fixes it (lua.c):


I haven't tried reading data from files, but using LuaSystem we have had no issues reading UTF-8 input in Terminal.lua. LuaSystem contains the functions for setting the console code pages. I've never had to set the locale or set the mode to binary.

Thijs
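
For comparison with the patch, here is a rough C sketch of the "code pages only" approach described above: switch the console to UTF-8 and leave the locale and stream modes alone. The function names `console_utf8_enable` and `console_utf8_restore` are invented for this example; they are not LuaSystem's or Lua's API. The non-Windows branch is a no-op that assumes the terminal is already UTF-8.

```c
#ifdef _WIN32
#include <windows.h>

/* Code pages that were active before the switch (0 means "unknown"). */
static UINT saved_in_cp, saved_out_cp;
#endif

/* Hypothetical helper: switch the console to UTF-8.
   Returns 1 on success, 0 on failure. */
static int console_utf8_enable(void) {
#ifdef _WIN32
  saved_in_cp = GetConsoleCP();
  saved_out_cp = GetConsoleOutputCP();
  return SetConsoleCP(CP_UTF8) && SetConsoleOutputCP(CP_UTF8);
#else
  return 1;  /* assumption: the terminal already speaks UTF-8 */
#endif
}

/* Hypothetical helper: put back whatever code pages were saved. */
static void console_utf8_restore(void) {
#ifdef _WIN32
  if (saved_in_cp != 0) SetConsoleCP(saved_in_cp);
  if (saved_out_cp != 0) SetConsoleOutputCP(saved_out_cp);
#endif
}
```

The restore step is there because, as far as I know, the code page belongs to the console window and not to the process, so a change can outlive the program that made it.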

Thijs Schreijer

Apr 30, 2026, 9:36:54 AM
to lu...@googlegroups.com


On Wed, 29 Apr 2026, at 19:26, Stephen Hewitt wrote:
When I try to open a file with characters that are not in the current console code page, it fails without my changes. I'm using this folder name: "C:\有問題的". Without the changes, for me at least, the console does not even accept the Chinese characters and the open fails. I cannot find where Lua sets the console code pages you say it contains. Can you point me to it? Setting the mode to binary, while not strictly necessary, makes the behaviour more consistent with other systems, unless I am mistaken.

OK, so that is something else: the UTF-8 characters are in the filename, not its contents. This could indeed be more problematic. I'm no expert on the subject, but IIRC lfs (the luafilesystem module) also had some issues with Unicode characters on Windows.

As for setting the console code pages, those are not standard Lua calls, but they are available in LuaSystem, see: https://lunarmodules.github.io/luasystem/modules/system.html#Terminal_UTF_8

Thijs
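
For the filename case discussed above, one alternative that does not depend on the process locale is to convert the UTF-8 path to UTF-16 and call the wide-character open function directly. A minimal sketch, assuming the path arrives as UTF-8 bytes; `fopen_utf8` is a made-up helper, not something lua.c or lfs provides, and the fixed `MAX_PATH` buffer is a simplification:

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#include <wchar.h>
#endif

/* Hypothetical helper: open a file whose name is given as UTF-8.
   Returns NULL if the name cannot be converted or the open fails. */
static FILE *fopen_utf8(const char *path, const char *mode) {
#ifdef _WIN32
  wchar_t wpath[MAX_PATH];
  wchar_t wmode[16];
  /* Convert UTF-8 to UTF-16; a return value of 0 signals failure
     (including a result that does not fit in the buffer). */
  if (MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, MAX_PATH) == 0)
    return NULL;
  if (MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16) == 0)
    return NULL;
  return _wfopen(wpath, wmode);
#else
  /* POSIX systems pass the bytes through unchanged. */
  return fopen(path, mode);
#endif
}
```

This only covers opening the file. Getting a name like "C:\有問題的" from the command line or the console into the program as UTF-8 is a separate step, which is what the locale and code page changes in the patch address.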