Urgent: how to give a utf8 sys command

Lars Doucet

Jun 15, 2016, 12:22:00 PM
to Haxe
Hey, sorry for the urgency, but I just ran into a live bug in production that I don't know how to solve.

Basically, if a user directory contains Unicode characters, the standard Haxe sys File/FileSystem APIs have a really hard time dealing with it.

I've isolated it down to this reproduction case in a simple Neko app:
Sys.command("mkdir",["C:\\Денис"]);
File.saveContent("C:\\Денис\\test.txt", "Денис");

Doing that will create a directory,
C:\<garbage>

But inside will be test.txt with the correct text content, "Денис".

If I can figure out how to get this simple test case to work maybe I can unspool the entire ball of craziness.

Andy Li

Jun 15, 2016, 12:33:18 PM
to haxe...@googlegroups.com
Sadly, the sys API isn't really Unicode-ready. See https://github.com/HaxeFoundation/haxe/issues/4773

You may try to issue "raw" commands (requires Haxe 3.3 + Neko 2.1) in the form of `Sys.command("mkdir C:\\Денис");`. See https://github.com/HaxeFoundation/haxe/wiki/Breaking-changes-in-Haxe-3.3.0#syscommand-and-sysioprocess

Lars Doucet

Jun 15, 2016, 12:44:19 PM
to Haxe
Thanks for the quick reply! Unfortunately I'm stuck on Haxe 3.2 for now, but I'll try a few workarounds and let you know if I think of something.

Lars Doucet

Jun 15, 2016, 1:05:03 PM
to Haxe
Okay, no luck. The issue seems to be that I simply can't get a string out of Haxe and into a native system command without it getting mangled.

Even calling this, which is about as direct as it gets:
untyped __cpp__("system(\"mkdir C:\\\\Денис\")");

results in:
C:\ДениÑ
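
A likely explanation, stated here as an assumption rather than something confirmed in this thread: `system()` takes a narrow `char*` string, and Windows interprets those bytes in the active ANSI code page instead of as UTF-8, so every UTF-8 byte turns into its own character. The sketch below imitates that effect by treating each byte as one Latin-1 code point. `misread_as_latin1` is a hypothetical helper written for illustration, and the code page on a real user's machine may well differ.

```cpp
#include <string>

// Hypothetical helper: reinterpret every byte of a UTF-8 string as a single
// Latin-1 code point and re-encode that code point as UTF-8. This mimics what
// a single-byte code page does to multi-byte UTF-8 input.
std::string misread_as_latin1(const std::string& utf8) {
    std::string out;
    for (unsigned char b : utf8) {
        if (b < 0x80) {
            // ASCII bytes are unaffected, which is why "C:\" survives intact.
            out += static_cast<char>(b);
        } else {
            // Bytes 0x80..0xFF become code points U+0080..U+00FF (two bytes each).
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

Feeding it the 10-byte UTF-8 encoding of "Денис" produces 20 bytes of unrelated characters, while the ASCII prefix of the path passes through unchanged.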

So unless someone has a brilliant idea, I need to consider workarounds that address what I need this *FOR* rather than the issue itself.

What I need to do in my game is provide a safe & reliable root location for save files. I let the player pick their own save file location, but the problem is that this custom save file location must itself also be stored somewhere. Right now I've been using the user's application storage directory. However, that contains their username, which can contain Unicode. I could default to something else, but I'm not sure about that either: I can't even guarantee that C:\ exists.

What would one of you do in this situation?

Lars Doucet

Jun 15, 2016, 2:57:13 PM
to Haxe
Alright, MauveCow pointed this out:

var path = "c:\\Денис";
untyped __global__._wmkdir(path.__WCStr());

That lets me get it working. I'm posting this here for the record in case it comes up in search.

Now I have to figure out how to go from here to fixing up my system calls in general.

Philippe Elsass

Jun 15, 2016, 2:59:04 PM
to Haxe

How are Haxe strings encoded on the cpp side? You may have to convert the filename's String to UTF-16 and use Unicode-capable C++ I/O functions.
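
For reference, here is a minimal sketch of the UTF-8 to UTF-16 conversion step in portable C++, assuming the Haxe string reaches the native side as UTF-8 bytes (the generated C++ later in the thread suggests it does). `utf8_to_utf16` is a hypothetical helper, not an hxcpp function. It rejects truncated or malformed sequences but does not check for overlong encodings, so it is an illustration and not a validator.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("code point outside the UTF-16 range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP are split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, `wchar_t` is 16 bits wide and holds UTF-16 code units, so the result maps directly onto what the wide-character file functions expect.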

Cauê Waneck

Jun 15, 2016, 3:01:28 PM
to haxe...@googlegroups.com
Hey Lars, does that happen only on Windows?

You might want to actually send an hxcpp PR that changes the filesystem system calls to call their wchar versions instead.
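
A rough sketch of what "calling the wchar version" could look like for opening a file, assuming the path arrives as UTF-8. `fopen_utf8` is a hypothetical name used only for illustration and is not part of hxcpp; `MultiByteToWideChar` and `_wfopen` are the existing Win32 and CRT calls. Only the non-Windows branch was exercised when writing this, so treat the Windows branch as untested.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: open a file whose name and mode are UTF-8 strings.
std::FILE* fopen_utf8(const std::string& path, const std::string& mode) {
#ifdef _WIN32
    // Windows: widen both arguments to UTF-16 and call the wide CRT function.
    auto widen = [](const std::string& s) -> std::wstring {
        if (s.empty()) return std::wstring();
        const int len = static_cast<int>(s.size());
        const int n = MultiByteToWideChar(CP_UTF8, 0, s.data(), len, nullptr, 0);
        if (n <= 0) return std::wstring();  // conversion failed
        std::wstring w(static_cast<std::size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, s.data(), len, &w[0], n);
        return w;
    };
    return _wfopen(widen(path).c_str(), widen(mode).c_str());
#else
    // Elsewhere the narrow API takes the bytes as-is; file names are
    // commonly UTF-8 on Linux and macOS.
    return std::fopen(path.c_str(), mode.c_str());
#endif
}
```

Keeping the `#ifdef` inside one small wrapper means callers never touch `wchar_t` directly, which would make it easier to apply the same change across the other filesystem calls.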

Lars Doucet

Jun 15, 2016, 3:06:51 PM
to Haxe
So MauveCow's last solution ultimately resulted in this in generated C++, which does in fact work:

::String path = HX_HCSTRING("c:\\\xd0""\x94""\xd0""\xb5""\xd0""\xbd""\xd0""\xb8""\xd1""\x81""","\x93","\x1a","\x79","\xc5");
HX_STACK_VAR(path,"path");
HX_STACK_LINE(99)
::_wmkdir(path.__WCStr());

Lars Doucet

Jun 15, 2016, 3:07:24 PM
to Haxe
I've only gotten bug reports from Windows users so far, but I'm looking into monkey-patching my version of hxcpp, and thinking it might make a good PR, yes.

Nicolas Cannasse

Jun 15, 2016, 3:17:06 PM
to haxe...@googlegroups.com
On 15/06/2016 at 18:22, Lars Doucet wrote:
> Hey, sorry for the urgency, but I just ran into a live bug in production
> that I don't know how to solve.
>
> Basically, if a user directory contains unicode characters, the standard
> Haxe sys File/Filesystem api's have a really hard time dealing with it.

I've worked on a Unicode sys API for HL and it's actually quite hard to
get right, in particular because Windows uses UTF-16 and Linux uses UTF-8
for file names in their respective native APIs.

Best,
Nicolas

Juraj Kirchheim

Jun 15, 2016, 3:41:39 PM
to haxe...@googlegroups.com
On Wed, Jun 15, 2016 at 7:05 PM, Lars Doucet <lars....@gmail.com> wrote:
> [...]
> What would one of you do in this situation?

I'd try nodejs or java or python (in that order, I guess). Not sure it's an option for you.

Best,
Juraj

Lars Doucet

Jun 15, 2016, 4:36:13 PM
to Haxe
Great news!

We've nearly cracked it. Joshua has some fixes here:
https://github.com/HaxeFoundation/hxcpp/commit/854ff818695ecc6c17fc820d974e39a31889531d

and then these two patches of my own:
static value file_open( value name, value r ) {
    val_check(name,string);
    val_check(r,string);
    fio *f = new fio(val_filename(name));
    #ifdef NEKO_WINDOWS
        const wchar_t *fname = val_wstring(name);
        const wchar_t *mode = val_wstring(r);
    #else
        const char *fname = val_string(name);
        const char *mode = val_string(r);
    #endif
    gc_enter_blocking();
    #ifdef NEKO_WINDOWS
        f->io = _wfopen(fname,mode);
    #else
        f->io = fopen(fname,mode);
    #endif
    if( f->io == NULL ) {
        file_error("file_open",f,true);
    }
    gc_exit_blocking();
    value result = alloc_abstract(k_file,f);
    val_gc(result,free_file);
    return result;
}

static value file_contents( value name ) {
    buffer s=0;
    int len;
    int p;
    val_check(name,string);
    fio f(val_filename(name));
    #ifdef NEKO_WINDOWS
        const wchar_t *fname = val_wstring(name);
    #else
        const char *fname = val_string(name);
    #endif
    gc_enter_blocking();
    #ifdef NEKO_WINDOWS
        f.io = _wfopen(fname,L"rb");
    #else
        f.io = fopen(fname,"rb");
    #endif
    if( f.io == NULL )
        file_error("file_contents",&f);
    fseek(f.io,0,SEEK_END);
    len = ftell(f.io);
    fseek(f.io,0,SEEK_SET);
    gc_exit_blocking();
    s = alloc_buffer_len(len);
    p = 0;
    gc_enter_blocking();
    while( len > 0 ) {
        int d;
        POSIX_LABEL(file_contents);
        d = (int)fread((char*)buffer_data(s)+p,1,len,f.io);
        if( d <= 0 ) {
            HANDLE_FINTR(f.io,file_contents);
            fclose(f.io);
            file_error("file_contents",&f);
        }
        p += d;
        len -= d;
    }    
    fclose(f.io);
    gc_exit_blocking();
    return buffer_val(s);
}

A bit ugly to be sure, but it gets the job done and lets me ship a patch today. Hopefully someone smart can clean up and apply to hxcpp after it's been properly tested for general use.