How to get a filename as UTF-8, compatible with fopen()?


ardi

Jul 4, 2012, 2:56:45 PM
to wx-u...@googlegroups.com
Hi,

From what I've read, although fopen() accepts UTF-8 C strings on all (most?) systems, some systems encode filenames in their own flavour, so constructing the UTF-8 string yourself doesn't always work with fopen().

I have an ISO-8859-1 encoded filename string with international characters. I can successfully open that file via wxFileName.

However, if I use the wxString facilities to convert the filename to UTF-8 and then open it with fopen(), it fails on MSW (but works fine on Cocoa, IIRC).

So, is there some way of converting this string to a UTF-8 encoding that will work with fopen() on MSW?

I'm asking this because I'm using a third-party library that opens files expecting a char* filename (otherwise, if I could pass a FILE* pointer, I could use the wxFileName approach, which works fine for me).

TIA

ardi

Eric Jensen

Jul 4, 2012, 3:27:29 PM
to ardi
Hello ardi,

Wednesday, July 4, 2012, 8:56:45 PM, you wrote:

a> So, is there someway of converting this string to an utf8 encoding which
a> will work with fopen() on MSW?

a> I'm asking this because I'm using a third-party library that opens files
a> expecting a char* filename (otherwise, if I could pass a FILE* pointer, I
a> could use the wxFileName approach, which works fine for me).

Windows doesn't use UTF-8, ever.

wxString::fn_str() is probably what you're looking for:
http://docs.wxwidgets.org/stable/wx_wxstring.html#wxstringfnstr

However, if the library you're using only supports "char *", it just
won't be able to load files whose names cannot be represented in the
current locale encoding.

E.g., if you're on a German system, you'll never be able to load a file
with Japanese chars in the filename if you're stuck with an 8-bit encoding.
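
To make the limitation concrete, here is a minimal sketch of a check for whether a UTF-8 filename would survive conversion to an 8-bit encoding. It assumes ISO-8859-1 as the target (the encoding mentioned in the question); a real German Windows system would more likely use its ANSI code page, so treat this as an illustration only. The function name fits_in_latin1 is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if every character of a UTF-8 string
// lies in U+0000..U+00FF, i.e. could be stored in ISO-8859-1.
bool fits_in_latin1(const std::string& utf8)
{
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (c < 0x80)
            continue;                      // plain ASCII
        if (c == 0xC2 || c == 0xC3) {      // lead bytes for U+0080..U+00FF
            if (i + 1 >= utf8.size())
                return false;              // truncated sequence
            unsigned char d = static_cast<unsigned char>(utf8[i + 1]);
            if ((d & 0xC0) != 0x80)
                return false;              // not a continuation byte
            ++i;                           // consume the continuation byte
            continue;
        }
        return false;                      // code point above U+00FF, or invalid
    }
    return true;
}
```

A name such as "Straße.txt" passes this check, while a name containing Japanese characters does not, which is the case where a "char *"-only library has no way to name the file in an 8-bit encoding.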

HTH
Eric


David Connet

Jul 4, 2012, 4:15:09 PM
to wx-u...@googlegroups.com
Or you could use the Windows APIs to get the 8.3 short name from the
Unicode name. I would think that would work... You would need to use the
wide string version (even if compiling as MBCS).
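
A rough sketch of that idea, assuming the filename is available as UTF-8 and using the Win32 call GetShortPathNameW. The helper name short_path_for_fopen is hypothetical. Two caveats apply: the file must already exist for a short name to be returned, and short-name generation can be disabled on a volume, in which case the result may still contain characters that don't fit the ANSI code page (the sketch returns an empty string then). On non-Windows builds it simply returns the input unchanged.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns a narrow path that plain fopen() can take,
// or an empty string on failure.
std::string short_path_for_fopen(const std::string& utf8_path)
{
#ifdef _WIN32
    // UTF-8 -> UTF-16 (length includes the terminating NUL).
    int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8_path.c_str(), -1, nullptr, 0);
    if (wlen <= 0) return std::string();
    std::wstring wide(static_cast<std::size_t>(wlen), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8_path.c_str(), -1, &wide[0], wlen);

    // First call asks for the required buffer size (including the NUL).
    DWORD need = GetShortPathNameW(wide.c_str(), nullptr, 0);
    if (need == 0) return std::string();   // e.g. file does not exist
    std::wstring shortw(need, L'\0');
    DWORD got = GetShortPathNameW(wide.c_str(), &shortw[0], need);
    if (got == 0 || got >= need) return std::string();
    shortw.resize(got);

    // UTF-16 -> ANSI code page; reject the result if any character was lost.
    BOOL lossy = FALSE;
    int alen = WideCharToMultiByte(CP_ACP, 0, shortw.c_str(), -1,
                                   nullptr, 0, nullptr, nullptr);
    if (alen <= 0) return std::string();
    std::string out(static_cast<std::size_t>(alen), '\0');
    WideCharToMultiByte(CP_ACP, 0, shortw.c_str(), -1,
                        &out[0], alen, nullptr, &lossy);
    out.resize(static_cast<std::size_t>(alen) - 1);  // drop the NUL
    return lossy ? std::string() : out;
#else
    // Assumption: elsewhere fopen() takes the UTF-8 bytes as they are.
    return utf8_path;
#endif
}
```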

Dave

ardi

Jul 4, 2012, 5:46:34 PM
to wx-users


On Jul 4, 10:15 pm, David Connet <d...@agilityrecordbook.com> wrote:
> On 7/4/2012 12:27 PM, Eric Jensen wrote:
>
[...]
>
> > Windows doesn't use UTF-8, ever.
>
> > wxString::fn_str() is probably what you're looking for:
> >http://docs.wxwidgets.org/stable/wx_wxstring.html#wxstringfnstr
>
[...]
>
> Or you could use the windows apis to get the 8.3 short name from the
> unicode name. I would think that would work... You would need to use the
> wide string version (even if compiling as MBCS).

Thanks a lot, Eric and Dave. The third-party lib in question is open
source, so I think the best option for me is to replace its calls to
fopen() with calls to my custom fopen (implemented on top of wxFileName
and wxFFile). I would have preferred to leave the lib in its original
state, but after reading your replies I believe it's best to patch its
fopen() calls.
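
For illustration, a minimal sketch of such a replacement that does not depend on wxWidgets at all. This is not the wxFileName/wxFFile version described above; it swaps in the Win32 conversion MultiByteToWideChar plus the CRT's _wfopen instead, and assumes the library is handed UTF-8 filenames. The name fopen_utf8 is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical drop-in for fopen(): the path is expected to be UTF-8.
std::FILE* fopen_utf8(const char* utf8_path, const char* mode)
{
    assert(utf8_path != nullptr && mode != nullptr);
#ifdef _WIN32
    // Convert a UTF-8 C string to UTF-16 for the wide-character CRT call.
    auto widen = [](const char* s) -> std::wstring {
        int n = MultiByteToWideChar(CP_UTF8, 0, s, -1, nullptr, 0);
        if (n <= 0) return std::wstring();
        std::wstring w(static_cast<std::size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, s, -1, &w[0], n);
        w.resize(static_cast<std::size_t>(n) - 1);  // drop the terminating NUL
        return w;
    };
    std::wstring wpath = widen(utf8_path);
    std::wstring wmode = widen(mode);
    if (wpath.empty() || wmode.empty()) return nullptr;
    return _wfopen(wpath.c_str(), wmode.c_str());
#else
    // Assumption: on other systems fopen() takes the UTF-8 bytes directly.
    return std::fopen(utf8_path, mode);
#endif
}
```

Patching the library would then come down to replacing each fopen(name, mode) call with fopen_utf8(name, mode) and converting the filename to UTF-8 on the application side before passing it in.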

Thanks a lot!

ardi

Eric Jensen

Jul 4, 2012, 6:02:34 PM
to ardi
Hello ardi,

Wednesday, July 4, 2012, 11:46:34 PM, you wrote:

>> > Windows doesn't use UTF-8, ever.
>>
>> > wxString::fn_str() is probably what you're looking for:
>> >http://docs.wxwidgets.org/stable/wx_wxstring.html#wxstringfnstr
>>

a> Thanks a lot, Eric and Dave. The thirdparty lib in question is open
a> source, so I think the best option for me is to change its calls to
a> fopen() by calls to my custom fopen (implemented on top of wxFileName
a> and wxFFile). I preferred to leave such lib in its original state, but
a> after reading your replies I believe it's best to patch its fopen()
a> calls.

I need to correct myself. After reading up a bit, it seems Windows
does indeed support an additional encoding parameter to fopen, even
when not using wide chars:
http://msdn.microsoft.com/en-us/library/yeby3zcb.aspx

This may be VS-specific; I don't know how it works with MinGW.

Eric


Luc Bloom

Jun 15, 2015, 7:58:46 AM
to wx-u...@googlegroups.com, m...@j-dev.de
> http://msdn.microsoft.com/en-us/library/yeby3zcb.aspx

Looks like that only opens the FILE* stream in Unicode mode.

> Or you could use the windows apis to get the 8.3 short name from the unicode name

I don't know how well that holds up with Chinese characters...

I have the same problem for my Lua library. I already made an adjustment to it so it will skip the UTF8 BOM characters that Notepad++ adds (our level designers should not have to take care of that). So now I'm going to add a custom fopen function callback...
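
For reference, the BOM-skipping adjustment could look roughly like the sketch below. It is not the actual change made to the Lua library; the helper name skip_utf8_bom is hypothetical, and it assumes a freshly opened, seekable stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: if the stream starts with the UTF-8 BOM (EF BB BF),
// leave the read position just after it; otherwise rewind to the start.
// Returns true if a BOM was skipped.
bool skip_utf8_bom(std::FILE* f)
{
    assert(f != nullptr);
    unsigned char head[3];
    std::size_t n = std::fread(head, 1, 3, f);
    if (n == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
        return true;
    std::rewind(f);  // no BOM: go back so the caller sees the whole file
    return false;
}
```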