I would like to drop this support in KDE Frameworks 5.0. There would be a
fully automatic conversion script for sources to resolve KUIT tags in
existing i18n() calls into appropriate target formats. The reasoning is as
follows.
Firstly, in the past 4 years, KUIT tags didn't get to be used very much.
Only 0.56% of all messages (1144 out of 200,000) contain any. Only 5 out of
24 KUIT tags were used more than 100 times (<filename> being the most used
with 333 appearances). This means that both original strategic goals were
not accomplished: text elements still have different formatting across most
of KDE applications (such as whether filenames are singly or doubly quoted,
bold, etc.), and translators still have little additional semantic
indication of what text placeholders are substituted with.
Secondly, XML processing in strings was made somewhat lax, as a compromise
between ease of use, mixing with existing markup (Qt rich text), and not
changing programming habits. Most conspicuously, string arguments
substituted for placeholders are not automatically escaped, e.g. < into &lt;,
which causes silent non-well-formedness behind the scenes. In the other
direction, people also complained about &lt; unexpectedly becoming <, etc.
(i.e. the programmer didn't know about the XML nature of i18n() and doesn't
want this at all).
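
To make the escaping problem concrete, here is a minimal sketch in standard C++. It is not the KDE i18n() implementation: escapeXml() and substitute() are hypothetical helpers that only mimic the substitution of an argument for the %1 placeholder, with and without escaping.

```cpp
#include <cassert>
#include <string>

// Escape the characters that are special in XML text content.
std::string escapeXml(const std::string &text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '&': out += "&amp;"; break;
        default:  out += c; break;
        }
    }
    return out;
}

// Replace the first "%1" in tmpl with arg, optionally escaping arg first.
// Hypothetical helper for illustration only.
std::string substitute(const std::string &tmpl, const std::string &arg,
                       bool escape)
{
    const std::string::size_type pos = tmpl.find("%1");
    assert(pos != std::string::npos); // the template must have a placeholder
    std::string result = tmpl;
    result.replace(pos, 2, escape ? escapeXml(arg) : arg);
    return result;
}
```

With escape set to false, an argument such as "a<b" lands in the markup verbatim and the result is no longer well-formed XML; with escape set to true, the same argument is inserted as "a&lt;b".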
Based on these two observations, I myself would drop KUIT and that's it. But
there are a few heavy users, and I'd like to know if they would "strongly
object" to this. Among them: KAlarm, Partition Manager, DrKonqi, libkcdraw...
One automatic question could be: can we have KUIT as an option, default off? In
KDE 4 this was not even technically possible, due to one ugly design problem
of i18n(), but I plan to deal with this problem in KDE 5; so it should be
technically possible. But, given the usage statistics above, I'm not sure if
it makes sense spending time on this. (There would also have to be some
redesign, making everything stricter, e.g. automatic escaping on
substitution and no mixing with Qt rich text. This means that current KUIT
users who would like to continue to use it, would have to do some manual
checking and modification in existing code.)
--
Chusslove Illich (Часлав Илић)
I confirm. They are used much more than tags, and have no problems on their
own; they are simply useful whenever present. They would only have no
functional effect any more (this means dropping /format modifier too).
Personally, I have put a lot of time and effort into adding KUIT into my projects
over the years and think it is a great help, even if just for the developers to understand
how the strings are being used.
True, the semantic tags are harder to use and understand for me in the more complex cases.
Sometimes I'm afraid to touch them since I'm not sure of the implications of my change.
I'm really surprised at this proposal.
I'm not getting what's broken nor what's causing problems.
I hope we had a small misunderstanding here. David's earlier message was
precisely to clear that up.
What I want to remove are only in-text tags (like <filename>, <emphasis>,
etc). In-context markers (like @action:button, @option:check, etc) would
certainly remain. There is no technical reason to remove them, and they are
used much more than tags. E.g. in kdepim and kdepimlibs, 16.7% of all
messages have context markers, whereas 1.7% have text tags (6.4%/0.6% for
whole of "trunk"). In fact, context markers can be used as-is in any i18n
system with Gettext-like lookup key semantics.
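
As an illustration of that last point, here is a minimal sketch in standard C++ of a lookup keyed on (context, source text), in the style of Gettext's message context. The Catalog type and lookup() function are hypothetical and do not belong to any real library; the sketch only shows that a marker such as @action:button can simply be part of the context string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Lookup key in the Gettext style: (context, source text).
using Key = std::pair<std::string, std::string>;
using Catalog = std::map<Key, std::string>;

// Hypothetical lookup helper: returns the translation for (context, text),
// or the untranslated source text when the catalog has no such entry.
std::string lookup(const Catalog &catalog, const std::string &context,
                   const std::string &text)
{
    // Gettext reserves the empty source text for the catalog header entry.
    assert(!text.empty());
    const auto it = catalog.find(Key{context, text});
    return it != catalog.end() ? it->second : text;
}
```

The same source text can then map to different translations depending only on its context marker, and a missing entry falls back to the source text.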
Is it sufficiently less bad now, or should I address your other points? :)
The original intention of enabling consistent formatting of displayed text
via semantic tags seems a very desirable one. Removing the tags seems to
imply that KDE would abandon the aim of presenting a consistent interface
for such items. If an inconsistent interface is generally considered
acceptable, then I can live with it. But if we really want to try to make
these interface elements consistent, we shouldn't drop the existing scheme
without first considering what might replace it.
Removing the functional effects which context markers have, including the
/format modifiers, might have a significant effect if this makes
everything plain text rather than rich text, so at first sight I'm not too
keen on this idea.
--
David Jarvie.
KDE developer.
KAlarm author - http://www.astrojar.org.uk/kalarm
Based on the (lack of) usage so far, I would say that inconsistent UI text
markup is considered acceptable. Or at least too small an issue to be worth
bothering with.
It occurred to me that I could examine usage-over-time statistics, since KDE
4.0. Here is the percentage of strings in core (SC) modules containing KUIT
markup, in 6-month steps:
2008-01-01 0.28%
2008-06-01 0.32%
2009-01-01 0.36%
2009-06-01 0.41%
2010-01-01 0.42%
2010-06-01 0.41%
2011-01-01 0.49%
2011-06-01 0.49%
2012-01-01 0.60%
While there is some rise in usage, I would consider a rise of 0.32
percentage points in 4 years to support the "tolerable inconsistency"
conclusion above.
> Removing the functional effects which context markers have, including
> the /format modifiers, might have a significant effect if this makes
> everything plain text rather than rich text, so at first sight I'm not too
> keen on this idea.
When KUIT tags are removed on conversion, target formats would be heeded,
since they are statically resolvable. So one would end up with some strings
converting to plain text, and other Qt rich text. In other words, it would
become as if these visual formats were used carefully and consistently from
the start.
> [...] if we really want to try to make these interface elements
> consistent, we shouldn't drop the existing scheme without first
> considering what might replace it.
Even if the majority of programmers would rather not bother, I agree that it
would be nice to provide for those who would. So, actually, I have
considered a lot what the replacement might be, one which would avoid
technical issues I observed so far, and provide extra flexibility that I've
seen to be needed. I wrote it up in a proposal for Gettext itself, but there
was little enthusiasm. The proposal is here:
http://nedohodnik.net/gettextbis/. Chapter 4 and section 5.1 deal with
markup, and it is easy to extrapolate back to KDE i18n (revert to %1, %2...
placeholders, and consider ggettext() = ki18n() and igettext() = i18n()).
However, I don't propose implementing this now, for two reasons. First is
that it would be some work in absence of significant number of interested
people (which, admittedly, usually does not stop me...), and the second is
that I have a small hope that in the future we could actually push the full
system as proposed :)
Looking at the numbers I'm not sure your optimism is warranted; this feature
has been around for many years and it's documented on techbase, yet it's being
used in very, very low numbers. (333 times in all of KDE for the filename tag.)
Sure, it may be ignorance. Frankly, I didn't know about this feature.
The fact that developers didn't know about this feature is as much a matter
of education as of the fact that they never needed it and never asked how to do it.
I think it's nice to be optimistic and think that we can get people to fix their
UIs and suddenly get people to care.
But can we be certain enough of succeeding now where we clearly failed before
that this is actually worth stopping the innovations that Chusslove is working
on?
Read those numbers again; it's kinda depressing really;
> Only 5 out of
> 24 KUIT tags were used more than 100 times (<filename> being the most used
> with 333 appearances).
--
Thomas Zander
That's only because we are geeks and don't care if half the time a filename
appears as '/home/tsdgeos/foo.txt' or "/home/tsdgeos/foo.txt" or
BOLD/home/tsdgeos/foo.txtBOLD or whatever.
In a polished environment this is important.
IMHO this is something similar to i18n, needs someone that goes after people
and nags them to fix it.
> But can we be certain enough of succeeding now where we clearly failed
> before that this is actually worth stopping the innovations that Chusslove
> is working on?
I did not understand that it was stopping any innovation, Chusslove can you
clarify if you want to remove them for the sake of simpler code (which I don't
say it's unimportant) or because they create problems with other features you
are developing?
>
> Read those numbers again; it's kinda depressing really;
Yes, they are, but to be honest no one pushed for them; what did you expect?
Cheers,
Albert
> I have a small hope that in the future we could actually push the full
> system as proposed :)
>
i wouldn't set the hopes too high.
while the system is certainly well thought out, it isn't such a
spectacular improvement (as far as the average dev is concerned) that
you'd have much of a chance to stand against the momentum of the
solutions the various communities have. it's way more likely that you
gain traction when you optimize for minimal migration pain in a
community which is actually in search of a solution. the next qt
contributor summit is in only two months. how about another trip to
berlin? ;)
p.s.: i still have your epic mails in my inbox, and they perfectly serve
the purpose of giving me a bad conscience about still not having
answered them properly. let alone your paper. :}
I think that this feature, as Albert said is something that we should promote
and try to get people to use them.
What we can do though is break compatibility and implement them in some other
way since their usage is so low.
The difference here is that there is a way to get consistent look and feel
without using this feature. Whereas with a11y there is not.
Specifically; of the 24 tags how many can you get people to care about the look
and feel sufficiently to make a difference. If history is any guide, just some,
and just a little bit.
> I think that this feature, as Albert said is something that we should
> promote and try to get people to use them.
This defending of "Don't take my feature away, I promise to use it from now on"
just sounds hollow to me.
In reality it will be really hard to actually show significant improvements in
message display to a user over plain HTML usage, and it certainly is infinitely
harder to learn.
For reference; how many of these are really showing something different on
screen that app-developers care about?
http://techbase.kde.org/Development/Tutorials/Localization/i18n_Semantics#Semantic_Tags
In short; the deck is stacked against you, and short of proposing to do the
work, I hope you can take the last 4 years as a guide to how big an uptake
things got.
I personally think we should not tell Chusslove to back out of his plan just
because we *hope* some people other than us will start using this feature.
> What we can do though is break compatibility and implement them in some
> other way since their usage is so low.
Good point.
--
Thomas Zander
>> [: Thomas Zander :]
>> But can we be certain enough of succeeding now where we clearly failed
>> before that this is actually worth stopping the innovations that
>> Chusslove is working on?
> [: Albert Astals Cid :]
> I did not understand that it was stopping any innovation, Chusslove can
> you clarify if you want to remove them for the sake of simpler code (which
> I don't say it's unimportant) or because they create problems with other
> features you are developing?
It's not stopping any innovation as such, since I just want to drop it and
add nothing new. But the system cannot remain as it is, because of too many
quirks. To remain, it would have to be fixed, and to be made optional. Both
these aspects are problematic.
"Fixed" would make it require more discipline. For example, one could no
longer do:
QString problem = i18n("Blah blah <emphasis>foom</emphasis>.");
...
QString report = i18n("Blah blah: <note>%1</note>", problem);
because substitution would cause autoescaping of any target format tags
(e.g. if <emphasis> was turned into <i>), and show them verbatim. Instead,
one would have to do:
KLocalizedString problem = ki18n("Blah blah <emphasis>foom</emphasis>.");
...
QString report = i18n("Blah blah: <note>%1</note>", problem);
as only KLocalizedString as argument would not be autoescaped (it would be
enforced to be valid wrt. markup).
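
The type-based rule described here can be sketched in standard C++ as follows. This is only an assumption about how such a rule might look, not the KDE API: Markup is a hypothetical stand-in for KLocalizedString (text already known to be valid markup), and escapeForMarkup(), renderArg() and substituteTyped() are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for KLocalizedString: text already valid as markup.
struct Markup {
    std::string xml;
};

// Escape the characters that are special in XML text content.
std::string escapeForMarkup(const std::string &text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '&': out += "&amp;"; break;
        default:  out += c; break;
        }
    }
    return out;
}

// Plain strings are escaped on substitution; Markup values pass through.
std::string renderArg(const std::string &plain) { return escapeForMarkup(plain); }
std::string renderArg(const Markup &markup) { return markup.xml; }

// Replace the first "%1" in tmpl with the rendered argument.
template <typename Arg>
std::string substituteTyped(const std::string &tmpl, const Arg &arg)
{
    const std::string::size_type pos = tmpl.find("%1");
    assert(pos != std::string::npos); // the template must have a placeholder
    std::string result = tmpl;
    result.replace(pos, 2, renderArg(arg));
    return result;
}
```

Passing already-resolved markup as a plain string shows its tags verbatim after escaping, which is the breakage in the first snippet; passing it as Markup keeps the tags intact, which corresponds to the second snippet.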
"Optional" would cause uncertainty. One could not count on KUIT being
available in a particular section of code, but would have to check 1) which
catalog the messages are looked up in, and 2) whether that catalog has KUIT
enabled (optionality would be by-catalog). The one in doubt need not be a
human; it could also be a source code or translation validation tool.
These two implications, combined with low usage as it is, make me conclude
it is not worth investing the work in fixing the system. Higher discipline
and more uncertainty would mean even fewer people would use it than they do
now.
(The stakes are somewhat different for the more radically new system that I
describe in that proposal for extending Gettext. The higher discipline
requirement would remain, but is (supposed to be) offset by the fact that
you could use the exact same i18n in any programming language and toolkit,
providing availability of bindings, and use arbitrary target visual formats
transparently for translators; i.e. translators would no longer see the
underlying programming framework. The uncertainty aspect would be mostly
removed, because new option to xgettext would be used on extraction, and all
messages in PO file would get appropriate *-format flag, whether they have
any placeholder or not.)
No need to address my other points especially since they are already being discussed.
How?
> Whereas with a11y there is not.
> Specifically; of the 24 tags how many can you get people to care about the
> look and feel sufficiently to make a difference.
None, because the bunch of geek developers [mostly] don't care about look and
feel, that's why we need to expand our community to people that care about
polish.
> If history is any guide,
> just some, and just a little bit.
>
> > I think that this feature, as Albert said is something that we should
> > promote and try to get people to use them.
>
> This defending of "Don't take my feature away, I promise to use it from now
> on" just sounds hollow to me.
That's nonsense, i'm not defending "my feature" since I as a geek don't care
about look and feel and I've never used this feature, but i recognise the fact
that we *should* be caring about it and finding someone in the greater
community to make sure our messaging to the user is consistent.
Albert
Discipline is not a problem, we are used to the compiler complaining when we
use . instead of -> even if it is obviously what we meant. In fact one of the
problems with the current system is that if you do i18n("Foo %1").arg("LALA")
it still works (depending on the type of kdelibs build you have). It should
totally break and then the developer will realize he's doing something wrong.
> "Optional" would cause uncertainty. One could not count on KUIT being
> available in a particular section of code, but would have to check 1) which
> catalog the messages are looked up in, and 2) whether that catalog has KUIT
> enabled (optionality would be by-catalog). The one in doubt need not be a
> human; it could also be a source code or translation validation tool.
I agree optional is a bad idea.
> These two implications, combined with low usage as it is, make me conclude
> it is not worth investing the work in fixing the system. Higher discipline
> and more uncertainty would mean even fewer people would use it than they do
> now.
That's fine you're the one doing the work and I'm not going to do it nor try
to force you to do it.
OTOH it's another hurdle for adoption of current code from KDE 4 to KF5, that
originally was said to be "transparent" for developers and each day is getting
to look more like a bigger change.
Cheers,
Albert
I would find it even more interesting (but probably more difficult/fuzzy to
compute) to have the ratio of messages with KUIT markup over messages with
Qt markup or using quotes.
I like the idea of KUIT markup and would be sad to see it go away.
Aurélien