Fluid generated \303\251 instead of é

Jean-Yves Garneau

Nov 18, 2016, 8:06:06 AM
to fltk.general
Is it possible to generate a true UTF-8 file from FLUID instead of backslash sequences?

Albrecht Schlosser

Nov 18, 2016, 10:45:40 AM
to fltkg...@googlegroups.com
On 18.11.2016 14:06 Jean-Yves Garneau wrote:
> Is it possible to generate a true UTF-8 file from FLUID instead of
> backslash sequences?

Which FLTK version are you using, and what is the context?

Please post a short (minimal) fluid (.fl) file that shows the issue, and
if you created it interactively, please tell us what you did, what is
"wrong", and what you expect it to do.

Jean-Yves Garneau

Nov 18, 2016, 11:32:55 AM
to fltk.general, Albrech...@online.de
These are the minimal test files made with FLUID. The generated test.cxx file contains the string "R\303\251server" instead of "Réserver". See below.

// generated by Fast Light User Interface Designer (fluid) version 1.0303

#include "test.h"
UserInterface::UserInterface(int X, int Y, int W, int H, const char *L)
  : Fl_Group(X, Y, W, H, L) {
{ new Fl_Button(55, 35, 205, 55, "R\303\251server");
} // Fl_Button* o
end();
}
test.fl
test.cxx
test.h

Albrecht Schlosser

Nov 18, 2016, 1:55:43 PM
to fltkg...@googlegroups.com
I see...

Why exactly are you asking, i.e. what is your problem with this encoding?
AFAICT it works, but you may not be able to read and edit the generated
source files. Are there other reasons?


Changing this is not as simple as it might look at first glance. There
are some points to consider:

(1) The chosen encoding with \nnn ensures that the generated file
contains only ASCII characters. This will always work with all
compilers, independent of locale settings etc.

We'd have to check consequences if we did not do that encoding. Some
compilers might have issues (particularly I'm thinking of MS Visual
Studio...). I'm not sure if we should do that. After all we might need
an option so the developer can choose if s/he needs the encoding or
prefers clear, readable text.

(2) The written text is limited to a certain number of characters per
line. This doesn't always work as expected, but basically it works.

(3) This line-breaking mechanism is encoding-agnostic, i.e. it would
break lines in the middle of UTF-8 characters if we just wrote bytes w/o
taking character boundaries into account. That would still work, but the
issue would be similar to before (unreadable characters, but only at
line breaks).

(4) Line breaking (if implemented "as expected") with UTF-8 characters
would have to count *characters* (not bytes).
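
Points (3) and (4) can be illustrated with a minimal sketch, assuming C++11 and valid UTF-8 input. The names is_utf8_continuation(), utf8_length() and split_utf8() are hypothetical helpers for illustration only; they are not part of fluid or FLTK. The idea is that every byte of the form 10xxxxxx continues a character, so a line break is only allowed before a byte that is not a continuation byte:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_utf8_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Count characters (code points) rather than bytes:
// every byte that is not a continuation byte starts a new character.
inline std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if (!is_utf8_continuation(b)) ++n;
    }
    return n;
}

// Split s into chunks of at most max_chars characters without ever
// cutting inside a multi-byte sequence. Assumes s is valid UTF-8.
inline std::vector<std::string> split_utf8(const std::string& s,
                                           std::size_t max_chars) {
    assert(max_chars > 0);
    std::vector<std::string> chunks;
    std::string current;
    std::size_t count = 0;  // characters in the current chunk
    for (unsigned char b : s) {
        if (!is_utf8_continuation(b)) {
            // A new character starts here; break first if the chunk is full.
            if (count == max_chars) {
                chunks.push_back(current);
                current.clear();
                count = 0;
            }
            ++count;
        }
        current.push_back(static_cast<char>(b));
    }
    if (!current.empty()) chunks.push_back(current);
    return chunks;
}
```

With the string from the test case, "R\303\251server" is 9 bytes but 8 characters, and split_utf8() with a limit of 2 yields the four chunks "Ré", "se", "rv", "er", keeping the two bytes of é together.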

That said, AFAICT the fluid (.fl) file does not limit the string length
and writes raw (UTF-8 encoded) strings. I tested with some longer lines,
but didn't check the code yet.

FYI: The code that writes the strings to the .cxx file is in
fluid/code.cxx, function write_cstring(const char *s, int length).

I experimented with the function so it didn't "octal-encode" the string,
but I found another, maybe unrelated, issue. Looking into that issue now.

I just wanted to let you know...

PS: Question to others: what is your feeling about potential compiler
issues with un-encoded (i.e. UTF-8) strings in the generated .cxx and .h
files? Is there anybody who expects problems with _his_ compiler?

Jean-Yves Garneau

Nov 18, 2016, 2:41:11 PM
to fltk.general, Albrech...@online.de
At present I have no need to translate, so I use French in FLUID, but the generated .cxx file contains UTF-8 encoded as ASCII escape sequences, which is visually annoying and difficult to search. A French character can be held in a single byte (e.g. é = ASCII 230). I'm using Visual Studio 2015 and am looking for the best way to build a French GUI with FLTK and FLUID.

The best solution for now is to use English in FLUID, then use Poedit and gettext to translate into French. But then I must translate the whole GUI from French to English, finding good translations, and our customers want only French. More work for less.

If anyone knows a better solution, please tell me.

Albrecht Schlosser

Nov 18, 2016, 3:11:10 PM
to fltkg...@googlegroups.com
On 18.11.2016 20:41 Jean-Yves Garneau wrote:
> At present I have no need to translate, so I use French in FLUID

It's absolutely okay to use French in fluid. Other users use Russian,
and you can use every language you like.

> but the
> generated .cxx file contains UTF-8 encoded as ASCII escape sequences,
> which is visually annoying and difficult to search.

Yes, this is the point, I know. I didn't say it's impossible to write
out "readable" text, but it is not something that can be done easily in
a minute.

> A French character can be held in a single byte (e.g. é = ASCII 230).

That is no longer true with UTF-8 encoding. FLTK uses UTF-8 since FLTK
1.3.0, and FLTK expects its input (e.g. strings in .cxx files) to be in
UTF-8 encoding.

http://www.fileformat.info/info/unicode/char/e9/index.htm

The UTF-8 encoding of "Unicode Character 'LATIN SMALL LETTER E WITH
ACUTE' (U+00E9)" is two bytes: 0xC3 0xA9 (octal: 303 251), which is what
you see in the source file.
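
The two bytes can be derived by hand. The following is a minimal sketch of the standard UTF-8 encoding rules, assuming C++11; utf8_encode() is a hypothetical helper written for this illustration, not an FLTK function, and it does not reject surrogate code points:

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 (illustrative helper only).
inline std::string utf8_encode(char32_t cp) {
    assert(cp <= 0x10FFFF);  // highest valid Unicode code point
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx (plain ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+00E9 the second branch applies: 0xE9 >> 6 is 3, giving 0xC0 | 3 = 0xC3, and 0xE9 & 0x3F is 0x29, giving 0x80 | 0x29 = 0xA9. So utf8_encode(0xE9) returns the same two bytes as the literal "\303\251".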

> I'm using Visual Studio 2015 and
> am looking for the best way to build a French GUI with FLTK and FLUID.
>
> The best solution for now is to use English in FLUID, then use Poedit
> and gettext to translate into French. But then I must translate the
> whole GUI from French to English, finding good translations, and our
> customers want only French. More work for less.

If you only need French you should stay with your French GUI. Using
gettext with fluid isn't easy and to translate your text to English
wouldn't be useful.

> If anyone knows a better solution, please tell me.

Would you be willing to patch your FLTK sources? I _might_ be able to
provide you with a patch that would enable you to write out "readable"
text in generated files (no promise, just a question if this would help
you).

Jean-Yves Garneau

Nov 19, 2016, 4:59:10 PM
to fltk.general, Albrech...@online.de
Tell me: if FLUID uses UTF-8 internally, isn't it easy to add a general option to generate UTF-8 files from FLUID, with or without a BOM? It's just an fwrite(), no?

A patch would be fine for me, but everybody today wants to use a universal character encoding, without code pages and without 7-bit ASCII.

VS2015 supports UTF-8 and compiles it well.

Thank you for your support!

Albrecht Schlosser

Nov 20, 2016, 8:47:41 AM
to fltkg...@googlegroups.com
On 19.11.2016 22:59 Jean-Yves Garneau wrote:
> Tell me: if FLUID uses UTF-8 internally, isn't it easy to add a general
> option to generate UTF-8 files from FLUID, with or without a BOM? It's
> just an fwrite(), no?

No, it's not just a fwrite(). There can always be characters inside a
string that must be quoted (decimal 0-31, e.g. 10 = 0x0a = <LF> = '\n')
or DEL (decimal 127). The current fluid code does also quote all
character values in the range 128 to 255.
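
The quoting rule described above can be sketched as follows. This is an illustration under stated assumptions, not the actual write_cstring() code in fluid/code.cxx: the function name escape_for_c_literal() and the literal_utf8 flag are hypothetical, C++11 is assumed, and the handling of '"' and '\' is added here because a C string literal needs it. Always emitting three octal digits avoids ambiguity when the next character happens to be a digit:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Escape a byte string so it can be written between double quotes in C/C++.
// Control characters (0-31) and DEL (127) are always octal-quoted; bytes
// 128-255 are octal-quoted unless literal_utf8 is true (hypothetical option).
inline std::string escape_for_c_literal(const std::string& s,
                                        bool literal_utf8 = false) {
    std::string out;
    for (unsigned char c : s) {
        if (c == '"' || c == '\\') {
            // These two must be backslash-escaped inside a string literal.
            out += '\\';
            out += static_cast<char>(c);
        } else if (c < 32 || c == 127 || (c >= 128 && !literal_utf8)) {
            // Fixed-width octal escape: backslash plus exactly three digits.
            char buf[8];
            int n = std::snprintf(buf, sizeof buf, "\\%03o",
                                  static_cast<unsigned>(c));
            assert(n == 4);
            (void)n;  // silence unused-variable warning when NDEBUG is set
            out += buf;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

With the default setting, the UTF-8 bytes of "Réserver" come out as R\303\251server, which matches what the generated test.cxx contains; with literal_utf8 set to true the bytes pass through unchanged.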

I did not write the code, but I can only assume that this is always safe
for all compilers, as I wrote before.

> A patch would be fine for me, but everybody today wants to use a
> universal character encoding, without code pages and without 7-bit ASCII.

I agree, but it's still problematic. The patch I append should work for
all Unicode characters if the compiler you use interprets strings as UTF-8.

> VS2015 supports UTF-8 and compiles it well.

I don't know the Visual Studio compilers very well, but I know of their
option to define UNICODE (not sure, maybe something similar?) and use
the TEXT macro for strings to distinguish ASCII and "Unicode"
compilations. In case of "Unicode" they expect "their own" wide
character encoding (UTF-16), AFAICT. I'm not sure about the
implications, but if you don't define Unicode then it should just work.

> Thank you for your support!

Welcome.

Now to the patch: I attach three files to this post for later reference:

(1) test.fl: a modified version of your fluid file with all ISO-8859-1
characters encoded as UTF-8 (only extended range, not ASCII part). This
is also a subset of Microsoft's Windows Codepage 1252 ("Western"),
Unicode range U+00a0 to U+00ff.

(2) main.cxx: a main program to compile test.cxx. This #include's
test.cxx and indirectly test.h generated by fluid from test.fl (I didn't
want to add a main program to your test.fl file).

(3) fluid_write_code_utf8.patch: the patch against FLTK 1.3.4 (stable
release).

This patch basically does three things:

- Fix reading character string bytes "unsigned", i.e. in range 0-255.
- Don't limit line length to avoid breaking lines inside UTF-8 characters.
- Write all ASCII and UTF-8 characters literally, i.e. without quoting.

You may use this patch if it works for you. Note that this is tested
with your and my modified test cases, resp., but I'm not sure if this
will be okay for all users and compilers. Please report if it works for you.

Note: this will not be integrated in FLTK 1.3.x because FLTK 1.3 is
closed for new features. If you want this to be in FLTK 1.4 please file
an STR with status RFE (Request for enhancement) for FLTK 1.4
("1.4-feature") on our "Bugs & Features" page:

http://www.fltk.org/str.php

Note 2: A "complete" solution would split strings (limit line length)
w/o breaking inside UTF-8 characters and would presumably have an option
to switch literal UTF-8 output on and off (on: literal/new vs. off:
octal-quoted/old behavior).

test.fl
main.cxx
fluid_write_code_utf8.patch

Jean-Yves Garneau

Nov 21, 2016, 8:20:58 AM
to fltk.general, Albrech...@online.de
Thank you. I currently use FLTK 1.3.3, but I will switch to 1.3.4 in the next few days. I will try the patch and report back.

Maybe you could use the string literal prefixes available in C++11, like u8, u and U?

Albrecht Schlosser

Nov 21, 2016, 9:16:32 AM
to fltkg...@googlegroups.com
On 21.11.2016 14:20 Jean-Yves Garneau wrote:

> Maybe you could use the string literal prefixes available in C++11, like u8, u and U?

No, we can't because we don't use C++11.

However, this could be another option for generating user code in FLTK
1.4.0.
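
For reference, here is a minimal sketch of what such generated code might rely on, assuming a C++11 (or later) compiler. The names kLabel and byte_at() are hypothetical and only serve the illustration. The é is written as the universal character name \u00e9 so the source file itself stays pure ASCII, while the u8 prefix asks the compiler to store the string as UTF-8:

```cpp
#include <cassert>
#include <cstddef>

// A u8 string literal is stored as UTF-8 regardless of the compiler's
// execution character set. Its element type is char in C++11/14/17 and
// char8_t in C++20; both are one byte wide.
static const auto& kLabel = u8"R\u00e9server";

// Return byte i of the literal (including the terminating NUL) as an
// unsigned value.
inline unsigned byte_at(std::size_t i) {
    assert(i < sizeof(kLabel));
    return static_cast<unsigned char>(kLabel[i]);
}
```

The array holds 10 bytes (9 bytes of text plus the terminating NUL), and bytes 1 and 2 are 0xC3 and 0xA9. One caveat: from C++20 on, the literal has type const char8_t[], so passing it to an API that takes const char* (such as a widget label) would need a cast.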

As I wrote before, please file an STR (RFE) for FLTK 1.4 and ask for
this feature...
http://www.fltk.org/str.php

Note: fltkgeneral is good for discussion, but there is no permanent
to-do-list or such, hence discussions here may be forgotten. The STR
system is a permanent storage for bug reports and RFE's.

Jean-Yves Garneau

Nov 21, 2016, 11:57:59 AM
to fltk.general, Albrech...@online.de
I have submitted an RFE for that. Thank you.