Why with seconds?

ZB

Jul 8, 2009, 11:34:13 AM

#v+
% clock format [clock seconds] -format %R
17:20
% clock format [clock seconds] -format %r
05:20:46 pm
#v-

I think this can fairly be described as a bug, because the "%r" time format
needs further processing: cutting out the seconds (who usually needs the
seconds, anyway?).
--
Zbigniew

Óscar Fuentes

Jul 8, 2009, 11:28:05 AM

ZB <zbWITHOUT_THIS@AND_THATjabster.pl> writes:

From the fine manual:

%r
Time in a locale-specific "meridian" format. The "meridian" format in
the default "C" locale is "%I:%M:%S %p".

So you are getting what you asked for.

--
Óscar

ZB

Jul 8, 2009, 11:48:21 AM

On 08.07.2009 Óscar Fuentes <o...@wanadoo.es> wrote:

> From the fine manual:
>
> %r
> Time in a locale-specific "meridian" format. The "meridian" format in
> the default "C" locale is "%I:%M:%S %p".
>
> So you are getting what you asked for.

From _my_ fine manual (8.6b1) I'm getting:

#v+
%r On output, produces a locale-dependent time of day representation
on a 12-hour clock. On input, accepts whatever %r produces.
#v-

And even if that "meridian" format is defined as you wrote,
IMHO "%r" should be changed to be "seconds-less", just to make the "%r" format
available for direct use, exactly as "%R" can be used as-is (no seconds).

For more sophisticated formatting one can compose one's own format string.
--
Zbigniew

Koyama

Jul 8, 2009, 12:35:03 PM

On Jul 8, 5:48 pm, ZB <zbWITHOUT_THIS@AND_THATjabster.pl> wrote:
> On 08.07.2009 Óscar Fuentes <o...@wanadoo.es> wrote:
>
> > From the fine manual:
>
> > %r
> > Time in a locale-specific "meridian" format. The "meridian" format in
> > the default "C" locale is "%I:%M:%S %p".
>
> > So you are getting what you asked for.
>
> From _my_ fine manual (8.6b1) I'm getting:
>
> #v+
> %r On output, produces a locale-dependent time of day representation
> on a 12-hour clock. On input, accepts whatever %r produces.
> #v-
>
> And even, if that "meridian-dependent" format is defined like you wrote,

See "man n clock"; Óscar is right.

> IMHO "%r" should be changed to be "seconds-less",

Tcl's backwards-compatibility policy will say no to this.

> just to make "%r" format
> available for direct use. Exactly as "%R" can be used instantly (no seconds).
>
> For more sophisticated formatting one can complete his own format string.

%r is locale-dependent and could be something completely different.

If you do want "%R %p", why don't you just use that? Are 3 more letters
so much of a burden?

cheers,
mark

ZB

Jul 8, 2009, 1:34:26 PM

On 08.07.2009 Koyama <koya...@gmail.com> wrote:

>> IMHO "%r" should be changed to be "seconds-less",
>
> backwards comp policies of tcl will say no to this

Oh, yeah: that's what I was afraid of.

> if you do want "%R %p", why dont you just use that? is 3 more letters
> so much a burden?

No, I don't want "%R %p", because a 24h format with an AM/PM suffix doesn't
make much sense.

Yes, it's possible to use a workaround; probably the simplest one will
be, as I wrote, just to cut out the seconds. It's a pity, though, that
someone missed such a basic thing.
--
Zbigniew

Gerald W. Lester

Jul 8, 2009, 1:59:41 PM

ZB wrote:

> On 08.07.2009 Koyama <koya...@gmail.com> wrote:
>
>>> IMHO "%r" should be changed to be "seconds-less",
>> backwards comp policies of tcl will say no to this
>
> Oh, yeah: that's I was afraid of.
>
>> if you do want "%R %p", why dont you just use that? is 3 more letters
>> so much a burden?
>
> No, I don't want "%R %p", because 24h format with AM/PM suffix doesn't make
> much sense.
>
> Yes, it's possible to use workaround; probably the simplest one will
> be, as I wrote just to cut out the seconds - it's a pity though, that
> someone missed such basic thing.

No, for what you want the correct format is:

{%I:%M %p} or {%I:%M %P} depending on the case you prefer.

It is all documented on the clock man/help page:

%I
On output, produces a two-digit number giving the hour of the day (12-11) on
a 12-hour clock. On input, accepts such a number.

%M
On output, produces the number of the minute of the hour (00-59) with
exactly two digits. On input, accepts two digits and interprets them as the
number of the minute of the hour.

%p
On output, produces an indicator for the part of the day, AM or PM,
appropriate to the given locale. If the script of the given locale supports
multiple letterforms, lowercase is preferred. On input, matches the
representation AM or PM in the given locale, in either case.

%P
On output, produces an indicator for the part of the day, am or pm,
appropriate to the given locale. If the script of the given locale supports
multiple letterforms, uppercase is preferred. On input, matches the
representation AM or PM in the given locale, in either case.
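
A minimal sketch of the two variants, using a fixed timestamp (62446 seconds after the epoch, i.e. 17:20:46 UTC) and -gmt 1 so that the result does not depend on the local time zone. The variable names are illustrative only, and the letter case in the comments is what the default (root) locale is expected to produce:

```tcl
set t 62446

# 12-hour clock without seconds, upper-case and lower-case indicator.
set upper [clock format $t -gmt 1 -format {%I:%M %p}]
set lower [clock format $t -gmt 1 -format {%I:%M %P}]

puts $upper   ;# 05:20 PM
puts $lower   ;# 05:20 pm
```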

--
+------------------------------------------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+

ZB

Jul 8, 2009, 2:27:18 PM

On 08.07.2009 Gerald W. Lester <Gerald...@cox.net> wrote:

> No, for what you want the correct format is:
>
> {%I:%M %p} or {%I:%M %P} depending on the case you prefer.
>
> It is all documented on the clock man/help page:

Thanks, I fully agree - as I wrote earlier, I'm aware that by playing with the
options one can produce even the most sophisticated output. My assumption was
that the most basic things - like the usual, quite common time output (European
24h, US 12h AM/PM; both without seconds) - should be available in a simpler way.
Actually, I can't see a reason for adding the seconds to %r other than, most
probably, just by mistake.
--
Zbigniew

Jonathan Bromley

Jul 8, 2009, 2:16:50 PM

On Wed, 8 Jul 2009 17:34:26 +0000 (UTC), ZB wrote:

>Yes, it's possible to use workaround; probably the simplest one will
>be, as I wrote just to cut out the seconds - it's a pity though, that
>someone missed such basic thing.

Oh, puh-leeze.... how hard is "%I:%M %p"?
Your complaint sounds very much like this
sort of conversation:

Diner: Your salad bar doesn't serve
lettuce salad with croutons,
tomatoes and skinny blue cheese
dressing.
Waiter: No, sir, but you will find plenty
of lettuce, croutons, tomatoes and
a wide choice of dressings including
skinny blue cheese, all in separate
containers. Please feel free to mix
your own salad the way you like it.
Diner: But surely EVERYONE wants lettuce salad
with croutons, tomatoes and skinny blue
cheese dressing! It's ridiculous that
you don't provide it, ready-made.

Sheesh, Tcl even allows you to REDEFINE [clock format]
if you wish to do so - just a little fooling around with
the format string to map "%r" into "%I:%M %p", then
a call to the original [clock format]. Enjoy what
you have, stop whingeing that it doesn't serve up
precisely what you want at the push of a button. Sure
there's a slight asymmetry between %r and %R - that sort
of thing happens with any toolkit that evolves over time
and must preserve back-compatibility throughout. That's
the way it is. If Tcl had no users, and no production
applications, then you could change anything any way you
wanted any time you wanted. Happily, that's NOT the way
it is.
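
One way to read that suggestion, as a rough sketch rather than a true redefinition: a separate helper (the name clockFormatShort is made up here) rewrites %r in the -format option and then hands everything to the real [clock format]. It assumes the options arrive as -name value pairs, and it maps %% to itself first so that a literal "%%r" is not misread:

```tcl
# Hypothetical helper, not part of Tcl: like [clock format], but %r means
# "%I:%M %p" (12-hour time without seconds).
proc clockFormatShort {seconds args} {
    # $args is a list of -option value pairs, so it can be read as a dict.
    if {[dict exists $args -format]} {
        set fmt [dict get $args -format]
        # The {%% %%} pair comes first so a literal "%%r" is left alone.
        dict set args -format [string map {%% %% %r {%I:%M %p}} $fmt]
    }
    return [clock format $seconds {*}$args]
}

# 62446 seconds after the epoch is 17:20:46 UTC.
puts [clockFormatShort 62446 -gmt 1 -format %r]   ;# 05:20 PM
puts [clockFormatShort 62446 -gmt 1 -format %R]   ;# 17:20
```

Keeping the helper under its own name leaves the stock [clock format] untouched for any code that relies on the documented %r.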
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan...@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.

ZB

Jul 8, 2009, 2:36:37 PM

On 08.07.2009 Jonathan Bromley <jonathan...@MYCOMPANY.com> wrote:

> Sheesh, Tcl even allows you to REDEFINE [clock format]

Sorry: I can't understand such an approach at all.

If so: why, for example, were operators like "in" and "ni" added (to name
just two) - instead of relying just on "lsearch"? I think I know the answer
- do you?

Besides: I'm deeply convinced, that if something is spoiled, there's nothing
wrong in calling it "spoiled". Maybe someone will fix it then (maybe not).
--
Zbigniew

Robert Heller

Jul 8, 2009, 2:25:56 PM

At Wed, 8 Jul 2009 17:34:26 +0000 (UTC) ZB <zbWITHOUT_THIS@AND_THATjabster.pl> wrote:

> On 08.07.2009 Koyama <koya...@gmail.com> wrote:
>
> >> IMHO "%r" should be changed to be "seconds-less",
> >
> > backwards comp policies of tcl will say no to this
>
> Oh, yeah: that's I was afraid of.
>
> > if you do want "%R %p", why dont you just use that? is 3 more letters
> > so much a burden?
>
> No, I don't want "%R %p", because 24h format with AM/PM suffix doesn't make
> much sense.
>
> Yes, it's possible to use workaround; probably the simplest one will
> be, as I wrote just to cut out the seconds - it's a pity though, that
> someone missed such basic thing.

The "%r" format has no relation to "%R". "%R" is just a shorthand for
"%H:%M".

And since "%H" vs "%I %p" correspond to 24 hour time vs 12 hour time +
am/pm, it makes sense that "%H:%M" (aka "%R") and "%I:%M %p" are the proper
matching "set" (24 hour vs 12 hour).

--
Robert Heller -- 978-544-6933
Deepwoods Software -- Download the Model Railroad System
http://www.deepsoft.com/ -- Binaries for Linux and MS-Windows
hel...@deepsoft.com -- http://www.deepsoft.com/ModelRailroadSystem/

ZB

Jul 8, 2009, 2:43:18 PM

On 08.07.2009 Robert Heller <hel...@deepsoft.com> wrote:

> The "%r" format has no relation to "%R". "%R" is just a shorthand for
> "%H:%M".

Same letter, just a case difference, so I supposed it was there specifically
to make the two most common time formats easily available.
--
Zbigniew

Donal K. Fellows

Jul 8, 2009, 2:39:06 PM

ZB wrote:
> If so: why, for example, there are operators like "in", "ni" added (to name
> just two) - instead of relying just on "lsearch"? I think, I know the answer
> - do you?

I know I have the answer in my backed-up mail archives, plus in at least
one location online. But what has that got to do with anything? [expr]
operators are entirely distinct from [clock format]...

Donal.

Jonathan Bromley

Jul 8, 2009, 2:57:53 PM

There's a really interesting "human" question about any toolkit.
How far do you go in providing utility functions to do things
that can easily be expressed using more basic constructs, but
are so commonly needed that it's worth making them available
even though they are strictly redundant?

If you don't provide enough of these utilities, then the toolkit
is perceived to be incomplete and academic (Pascal, anyone?).

If you provide too many, then the majority of users will be
unable to remember what's in the toolkit, and will end up
rewriting most things from scratch anyhow (C++ STL, anyone?).

The boundary between "too little" and "too much" is fuzzy,
and varies dramatically from user to user. My own preference
is emphatically for lean and simple languages/toolkits, but
I freely admit that richer toolkits offer better productivity
for those who have the skill and time to get really familiar
with them. Tcl seems to me to have done an amazing job of
satisfying both ends of that spectrum of demand, with a tiny
kernel that is astonishingly malleable and a repertoire of
commands and options that are a boon for those who can
remember them (or, at least, can remember that they exist
and then look them up in the docs). [clock format] is
one of the few places where there just might be Too Much
Useful Stuff that no-one can be bothered to remember...

tom.rmadilo

Jul 8, 2009, 3:01:37 PM

On Jul 8, 11:36 am, ZB <zbWITHOUT_THIS@AND_THATjabster.pl> wrote:
> On 08.07.2009 Jonathan Bromley <jonathan.brom...@MYCOMPANY.com> wrote:
>
> > Sheesh, Tcl even allows you to REDEFINE [clock format]
>
> Sorry: I can't understand such approach at all.

This is how clock formatting/scanning works...even in C.

There are certain primitives defined and these are used to create more
complex patterns, which are represented as one letter keys. During a
scan, the derived shortcuts are expanded into the original complex
pattern. The primitives usually address locale translations, so adding
new derived patterns is pretty easy.

So all you have to do is to create a new letter to represent your
complex pattern, and then do the substitution yourself, using [string
map], or edit the C code which handles the details...or redefine
[clock] to hide these details and avoid editing the C code.

When you have lots of options you have to deal with two realities:
there can only be one default and there are only so many one letter
keys. It is probably impossible to represent all or even most possible
patterns within these limits.

slebetman

Jul 8, 2009, 9:49:30 PM

On Jul 9, 2:43 am, ZB <zbWITHOUT_THIS@AND_THATjabster.pl> wrote:
> On 08.07.2009 Robert Heller <hel...@deepsoft.com> wrote:
>
> > The "%r" format has no relation to "%R". "%R" is just a shorthand for
> > "%H:%M".
>
> Same letter - just case difference, so I was supposing, it is just
> especially for simplest availability of two most common time formats.

Not so at all. As stated previously, %r is the time in a locale-specific
format and %R is shorthand for a commonly used time format.

If you happen to have your locale-specific time format defined as
"%I:%M %p" then you'll get exactly what you want (though I'm not sure
how one goes about doing this).

Same letter, different case does not necessarily mean they're related,
even though some format specifiers are like this. Consider %m=month
number, %M=minute or %h=month name, %H=hour.

dkf

Jul 9, 2009, 5:26:21 AM

On 8 July, 19:57, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com>
wrote:

> [clock format] is one of the few places where there just might be
> Too Much Useful Stuff that no-one can be bothered to remember...

That's what documentation is for; to remember that stuff for you. You
can't remember everything, as there is too much useful stuff out
there. And yes, you also have to think about what you want to do, but
that's programming for you...

Donal.

Kevin Kenny

Jul 9, 2009, 10:09:44 AM

All the locale definitions for %r were taken from ICU.
Since I couldn't possibly know what's right in all locales,
I used a publicly-available third-party library and wrote a Tcl
script to transcribe its definitions to Tcl's. ICU's preference
seems to be to display seconds always - at least, that's what
a quick troll through $tcl_library/msgs seems to indicate.

By the way, the 24-hour counterpart to %r isn't %R, but rather %T.
I didn't choose the letters. They're copied from the C strftime()
and strptime() functions - with some attempt to unify various
implementations into a portable whole. The decision to ape
C's format codes predates me, but fundamentally stems from the
fact that the C library was once used to do all the work. (It isn't
any more, because it's not quite portable and most implementations
don't quite work.)
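
A quick check of the 24-hour shorthands, sketched with a fixed timestamp (62446 seconds after the epoch, 17:20:46 UTC) and -gmt 1 so the results are reproducible. In the default (root) locale, %R and %T are expected to match the spelled-out groups; only %T carries seconds, as %r does:

```tcl
set t 62446

puts [clock format $t -gmt 1 -format %R]          ;# 17:20
puts [clock format $t -gmt 1 -format {%H:%M}]     ;# 17:20
puts [clock format $t -gmt 1 -format %T]          ;# 17:20:46
puts [clock format $t -gmt 1 -format {%H:%M:%S}]  ;# 17:20:46
```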

If you don't like it, the easiest way to override it (in the root
locale, which I presume is what you're using) is to execute
some [clock format] command (to make sure that clock is loaded)
and then do:

namespace eval ::tcl::clock {
    # mcset takes the locale first; {} is the root locale
    ::msgcat::mcset {} TIME_FORMAT_12 {%I:%M %P}
    ClearCaches
}

I don't think that ought to break anything, because %r is used
only for localizing dates/times, and TIME_FORMAT_12 is
used for that in both directions. But, as always with open source,
if it breaks, you own both pieces.

--
73 de ke9tv/2, Kevin

ZB

Jul 9, 2009, 10:53:38 AM

On 09.07.2009 Kevin Kenny <ken...@acm.org> wrote:

> If you don't like it, the easiest way to override it (in the root
> locale, which I presume is what you're using) [..]

Thanks, I'm aware of this - as I wrote already - but it just looked like
a kind of oversight. I haven't used time-related functions directly from
C before, so at first glance the seconds suffix seemed to be
unnecessarily added.

To be more precise: it still looks so :] - but now I'm aware it's "legacy"
from another library.
--
Zbigniew

Andreas Leitgeb

Jul 9, 2009, 4:22:33 PM

Now that the Tcl clock guru has had his word, I guess this thread is
mostly done. :-)

One afterword, since some suggestions made throughout the whole
thread were about [string map]ing the format-string:

By using {%% %%} as the first map-pair before any {%r {%I:%M %p}}
you can prevent wrong "%%r" transformations.

I.e., if you want the said redefinition of %r (and want to do it
with [string map] independent of any clock-innards), then use:
[string map {%% %% %r {%I:%M %p}} $origfmt]
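
The effect of the guard pair can be seen with [string map] alone, independent of any [clock] version. A small sketch (the variable names are illustrative):

```tcl
set fmt {%%r is literal, %r is the time}

# Without the guard, the second "%" of "%%" pairs up with the "r".
set unguarded [string map {%r {%I:%M %p}} $fmt]

# With {%% %%} first, "%%" is consumed as a unit before %r is tried.
set guarded [string map {%% %% %r {%I:%M %p}} $fmt]

puts $unguarded   ;# %%I:%M %p is literal, %I:%M %p is the time
puts $guarded     ;# %%r is literal, %I:%M %p is the time
```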

One idle test later...

So much for theory, but there's something wrong in my tcl 8.5.2:
clock format [clock seconds] -format "%%r"
should: "%r" (literally)
does: "%I:16:19 pm" (literally, at 10:16:19 pm)
Unless it's already a known bug fixed in a newer version, I'll
report it on sourceforge... I guess, the fix would be along
the previous lines of my post ...

tom.rmadilo

Jul 9, 2009, 8:44:26 PM

On Jul 9, 1:22 pm, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
wrote:

> So much for theory, but there's something wrong in my tcl 8.5.2:
> clock format [clock seconds] -format "%%r"
> should: "%r" (literally)
> does: "%I:16:19 pm" (literally, at 10:16:19 pm)
> Unless it's already a known bug fixed in a newer version, I'll
> report it on sourceforge... I guess, the fix would be along
> the previous lines of my post ...

Yikes! Looks like the %r is substituted first. I get the correct value
with linux date. When %r is substituted first, the result is
"%%I:%M:%S %p".

Looks like the first round substitution is incorrect, but after that
it works.

Andreas Leitgeb

Jul 9, 2009, 9:52:26 PM

Reported @SF: #2819334.

Kevin Kenny

Jul 10, 2009, 8:33:32 AM

Yup. It's a bug.

Kevin

tom.rmadilo

Jul 10, 2009, 11:13:51 AM

On Jul 10, 5:33 am, Kevin Kenny <kenn...@acm.org> wrote:

> Andreas Leitgeb wrote:
> > Reported @SF: #2819334.
>
> Yup. It's a bug.

There seems to be a significant change from tcl 8.4 to tcl 8.5. The
new version does not respect the locale setting, LC_TIME.

On linux, you can see the locale specific difference for %r.

Create a tcl script file date.tcl:

puts stdout [clock format [clock seconds] -format "%r"]

Then using tclsh, with different LC_TIME settings:

tom@boron:$ LC_TIME=en_US.utf8 tclsh date.tcl
08:11:08 AM
tom@boron:$ LC_TIME=en_GB.utf8 tclsh date.tcl
8:11:18 am PDT
tom@boron:$ LC_TIME=en_HK.utf8 tclsh date.tcl
AM08:11:25 PDT
tom@boron:$ LC_TIME=C tclsh date.tcl
08:11:38 AM

Using tclsh from 8.5.5:

tom@boron:$ LC_TIME=en_US.utf8 ./tcl8.5.5/unix/tclsh date.tcl
08:14:03 am
tom@boron:$ LC_TIME=C ./tcl/tcl8.5.5/unix/tclsh date.tcl
08:14:12 am
tom@boron:$ LC_TIME=en_HK.utf8 ./tcl/tcl8.5.5/unix/tclsh date.tcl
08:03:31 am
tom@boron:$ LC_TIME=en_GB.utf8 ./tcl/tcl8.5.5/unix/tclsh date.tcl
08:04:23 am

Andreas Leitgeb

Jul 10, 2009, 6:11:16 PM

Kevin Kenny <ken...@acm.org> wrote:

> Andreas Leitgeb wrote:
>> Reported @SF: #2819334.
> Yup. It's a bug.

Since I do not see any changes yet at sourceforge's public CVS,
here's the fix:
in library/clock.tcl proc ::tcl::clock::LocalizeFormat
for each of the respective 4 lines like:
set format [string map [list %...
make that:
set format [string map [list %% %% %...

PS: I find this case really peculiar, because I thought of this fix
*before* I even noticed that that bug was in tcl :-)
A fix searching for its bug, so to speak.

tom.rmadilo

Jul 10, 2009, 9:19:53 PM

Okay, I figured out how to get a different locale, but found a problem
related to the [string map] bug.

% clock format [clock seconds] -locale en_hk -format %r
6:09:07 PM
% clock format [clock seconds] -locale en_gb -format %r
%T
% clock format [clock seconds] -format %r
06:10:01 pm

So, the en_hk locale for %r is the same as the unix en_GB format. The
en_gb %r is %T, an obvious error. The command without the -locale
option seems okay. Not sure why it uses lowercase. The unix %r format
uses upper case for PM. The new [clock] command appears to present a
difficult challenge for unit testing.

BTW, it is very cool to be able to change the locale/timezone/etc. on
a per command basis, the main problem is that programs must still get
time and timezone data from the operating system and it seems like any
bugs in this database/code would still show up in the answers
provided.

tom.rmadilo

Jul 11, 2009, 2:52:07 AM

On Jul 10, 6:19 pm, "tom.rmadilo" <tom.rmad...@gmail.com> wrote:
> % clock format [clock seconds] -locale en_hk -format %r
> 6:09:07 PM
> % clock format [clock seconds] -locale en_gb -format %r
> %T

The problem with en_gb has to do with this [string map] (from
clock.tcl):

set format [string map [list %r [mc TIME_FORMAT_12] \
                             %R [mc TIME_FORMAT_24] \
                             %T [mc TIME_FORMAT_24_SECS]] $format]

The above line maps %r to [mc TIME_FORMAT_12], but en_gb maps
TIME_FORMAT_12 to %T. Unfortunately it is too late to help because the
%T map never sees the previous substitution.

Probably %T needs to be moved to the next [string map].
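
The single-pass behaviour is easy to reproduce in isolation. In this sketch the literal strings stand in for the message-catalog lookups: %T for [mc TIME_FORMAT_12], as reported for en_gb above, and %H:%M:%S assumed for the 24-hour-with-seconds entry:

```tcl
# [string map] never rescans text it has just substituted, so the %T
# produced for %r is left as-is even though %T is also a key.
set onePass [string map {%r %T %T %H:%M:%S} %r]
puts $onePass   ;# %T

# A second, separate pass picks up the leftover %T.
set twoPass [string map {%T %H:%M:%S} $onePass]
puts $twoPass   ;# %H:%M:%S
```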

Kevin Kenny

Jul 12, 2009, 10:02:23 AM

Log a bug, please?

Kevin Kenny

Jul 12, 2009, 10:44:16 AM

tom.rmadilo wrote:
> On Jul 10, 5:33 am, Kevin Kenny <kenn...@acm.org> wrote:
>> Andreas Leitgeb wrote:
>>> Reported @SF: #2819334.
>> Yup. It's a bug.
>
> There seems to be a significant change from tcl 8.4 to tcl 8.5. The
> new version does not respect the locale setting, LC_TIME.

The old behaviour was inconsistent (it affected [clock format] but
not [clock scan]), and didn't actually function on all platforms.
(Nothing took into account the Windows idea of the current
time format, for instance.) The result was that [clock format] would
disgorge strings that [clock scan] could not process.

Inspection of a fairly large body of Tcl code showed that most
uses of [clock format] produced strings that were for the consumption
of programs, not people, so the conscious decision was made to
ignore localisation by default (it hadn't completely worked up
through 8.4, in any case). This was discussed thoroughly in the
deliberations leading up to the TIP, and the general consensus
was that repairing the format/scan inconsistencies would fix
more programs than removing default l10n would break.

To make the locale available to applications, two special settings
for '-locale' were added: '-locale current' (whatever [mclocale]
says) and '-locale system' (the current platform's idea of
the time locale).

This is the first time that anyone's noticed the lack of LC_TIME.
That showed up as a side effect of using the 'msgcat' system, which
examines only LANG, LC_ALL and LC_MESSAGES. That's not quite right
for [clock], which ought to replace LC_MESSAGES with LC_TIME.
Fortunately, there appears to be only one place that would need
to be changed to support that. ::tcl::clock::EnterLocale
would have to, on non-Windows platforms, look for LC_TIME, LC_ALL
and LANG and set the system locale accordingly (using C if none
of these is set).

So, log a bug (and attach this discussion, lest I forget).
And as a workaround, set LANG or LC_ALL (and then if you need
something else localised differently, change LC_NUMERIC,
LC_MESSAGES, LC_CURRENCY appropriately).

Kevin Kenny

unread,

Jul 12, 2009, 10:48:47 AM7/12/09

to

tom.rmadilo wrote:
> The problem with en_gb has to do with this [string map] (from
> clock.tcl):
>
> set format [string map [list %r [mc TIME_FORMAT_12] \
> %R [mc TIME_FORMAT_24] \
> %T [mc TIME_FORMAT_24_SECS]] $format]

The planned fix for the logged bug removes this stanza of code
altogether, and processes these format groups in line with
all the others in the [clock scan] and [clock format] compilers.
That seems to be the easiest way to get %% processing right, and
handles the transitive dependencies as a felicitous side effect.

tom.rmadilo

Jul 12, 2009, 1:12:03 PM

I think that is more consistent with the way C library code handles
the situation: it divides "format groups" into primitive and derived,
%r being derived and also locale-dependent. The first pass produces a
format string filled only with primitives.

This structure makes it easy to add new derived formats and avoids
brittle code. Obviously the derived formats are recursive and
primitive formats terminate recursion.
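
A rough sketch of that primitive/derived split, not the actual clock.tcl or C library code. The proc name expandFormat and the table of derived groups are made up for illustration (the table mimics the en_gb case from this thread, where %r is defined as %T); anything not in the table, including %%, is treated as a primitive and copied through:

```tcl
# Hypothetical table: derived group letter -> its definition.
set derived {
    r {%T}
    T {%H:%M:%S}
    R {%H:%M}
}

# Expand derived groups until only primitives remain.
proc expandFormat {fmt derived} {
    set out ""
    set n [string length $fmt]
    for {set i 0} {$i < $n} {incr i} {
        set c [string index $fmt $i]
        if {$c ne "%" || $i == $n - 1} {
            # Ordinary character (or a lone trailing %): copy as-is.
            append out $c
            continue
        }
        set key [string index $fmt [incr i]]
        if {[dict exists $derived $key]} {
            # Derived group: recurse into its definition.
            append out [expandFormat [dict get $derived $key] $derived]
        } else {
            # Primitive group, including %%: terminates the recursion.
            append out % $key
        }
    }
    return $out
}

puts [expandFormat %r $derived]            ;# %H:%M:%S
puts [expandFormat {%%r at %R} $derived]   ;# %%r at %H:%M
```

Because the format is consumed one group at a time, "%%" can never be split, and a derived group that expands to another derived group is handled by the recursion. A cyclic table would recurse forever, so a real implementation would need to guard against that.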

tom.rmadilo

Jul 12, 2009, 1:28:39 PM

Is the bug in msgcat or in [clock]?

I figured out how to use the -locale setting on [clock], but each
process should be able to choose a locale. The question is whether this
choice will mesh with the msgcat system; the two databases would
have to use the same names for each locale, somewhat defeating the
purpose of maintaining a locale database independent of the system.

I think it is very innovative to have per-command locale
configuration, but I'm unsure how to change the process wide locale
( I always use en_US, so I never have to deal with this).

Kevin Kenny

Jul 12, 2009, 9:38:04 PM

tom.rmadilo wrote:
> Is the bug in msgcat or in [clock]?

[clock], if you please.

> I figured out how to use the -locale setting on [clock], but each
> process should be able to choose a locale. The question is if this
> choice will mesh up with the msgcat system, the two databases would
> have to use the same names for each locale, somewhat defeating the
> purpose of maintaining a locale database independent of the system.

The settings of $env(LANG), $env(LC_ALL) and $env(LC_MESSAGES) are
used to determine the locale setting for msgcat. The locale names
are standardized; the ISO language and country codes aren't going
to change, so the only thing that's likely to be at all different
is down at the variant level, where few programmers or users ever
venture.

You have the process-wide locale if you want it - with '-locale system'.
But in an application (such as a Web server) that wishes to support
multiple locales, the process-wide locale is far too difficult to
work with; nobody's ever sorted out how to manage mutual exclusion
on the environment variables of a multi-threaded process. (One of
the many botches of Posix-style threading!) The idea of maintaining
a locale per-interpreter (-locale current, and set with the [mclocale]
command) or even per-namespace within an interpreter, is attractive
because it avoids fighting over process-global settings.

> I think it is very innovative to have per-command locale
> configuration, but I'm unsure how to change the process wide locale
> ( I always use en_US, so I never have to deal with this).

You change the environment, that's how. You may also need to call
setlocale() to get the C library to follow your settings. But
don't do that, because it's never going to be thread-safe.

tom.rmadilo

Jul 12, 2009, 11:50:57 PM

On Jul 12, 6:38 pm, Kevin Kenny <kenn...@acm.org> wrote:
> tom.rmadilo wrote:

> > I think it is very innovative to have per-command locale
> > configuration, but I'm unsure how to change the process wide locale
> > ( I always use en_US, so I never have to deal with this).
>
> You change the environment, that's how. You may also need to call
> setlocale() to get the C library to follow your settings. But
> don't do that, because it's never going to be thread-safe.

Yet, changing the environment doesn't work with tcl 8.5. If I need to
call setlocale(), that means that I can't change the environment
unless I write C code.

This seems like a strange requirement, since the point of localization
is to produce code which needs no modification for different locales.
For instance, the -locale option to [clock] should be unnecessary if
the programming code has been localized, at least as a default. Of
course if you want to create a virtual localization, it is very
helpful to change the locale on a per command basis. This feature is
totally cool.

Here is what worries me. TIP 173 addressed i18n and forgot L10n. This
choice seems strange to me because without the ability to localize,
there is no need to internationalize. What worries me more is that
none of this really concerns me. I work on linux/unix, I work in en_US
and utf8, so even i18n is unimportant to me.

But i18n and L10n are hard, they defy logic and require lots of data
to overcome these problems. Why would the Tcl community adopt these
difficult issues, which nobody has solved?

However much anyone loves Tcl as a programming environment they must
admit that the Tcl environment is dependent on the operating system
and the local configuration.

Localization is the ultimate goal. Everyone wants to see dates in a
familiar format...and language. But localization is something which
should be inherited from the environment. That means that the system
(operating system as installed) is the ultimate ancestor. Processes
inherit from the OS. I have never heard of a thread level environment,
so I'll skip that. With Tcl, there is the possibility of an interp
level environment, but I don't think this has been realized yet.
Instead we have access to a unique database of locales and locale
inheritance is not well documented.

But why should I care? I use en_US and utf8. Maybe we should ask
someone who has uncommon needs?

Kevin Kenny

Jul 13, 2009, 12:32:03 AM

Tom,

One of us is misunderstanding something.

If you use '-locale system', you will get the same locale
a C program will get, inherited from the environment.
So if you're presenting dates to a local user who has set
the environment, [clock format] them with '-locale system'
and the user will see familiar formats. The rule *should* be
LC_TIME in preference to LC_ALL in preference to LANG, but
there's a bug - a bug, plain and simple - that is causing
'-locale system' to use LC_MESSAGES in place of LC_TIME.

It's more and more common nowadays, though, for a process
to be doing work on behalf of a remote user who is in a different
locale. For instance, the user may be a client of a Web
server, the owner of a remote login session, or a user of
an application server. Such a user will have his own locale.
It's *not* appropriate to set the process-global locale for
the sake of that user, because doing so founders on thread
safety. For that reason, there's a specific locale that's
set by [mclocale], that [clock] will use with '-locale current.'
So an interpreter that conducts a sequence of operations on
behalf of a remote user can install the remote user's locale
with [mclocale], and format dates with '-locale current'.
The default for [mclocale] and for '-locale current' is, in
order, $env(LANG), $env(LC_ALL) and $env(LC_MESSAGES). Since
the [mclocale] system doesn't do LC_CURRENCY, LC_NUMERIC,
or LC_TIME, they're ignored. (Having [mc] respect all of those
variables would add a tremendous amount of complexity to
localisation with very little gain from it.)

Transaction-oriented systems may want to format a
date or two without changing the current locale, either at
process or interpreter level. For these systems, [clock]
provides '-locale language_country_variant'. You seem to
be labouring under the misconception that this last - which
is perhaps the least useful - is the only version available
to you.

Finally, the vast majority of the uses of [clock] are for
consumption by programs instead of people. For these routine
cases, simply don't specify a locale at all, and all the
dates will be exchanged in the C locale, which means that they'll
be the same wherever the user is.
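
The four cases above can be put side by side in a short sketch. The timestamp 0 (1970-01-01 00:00:00 UTC, a Thursday) and the locale names de and fr are only examples; what the localised calls print depends on the message catalogs installed with Tcl and on the environment, so no output is claimed for them:

```tcl
package require msgcat

set t 0

# 1. No -locale: root ("C") locale, stable output meant for programs.
set machine [clock format $t -gmt 1 -format %A]
puts $machine   ;# Thursday

# 2. -locale system: follow the platform's idea of the time locale.
set local [clock format $t -gmt 1 -format %A -locale system]

# 3. -locale current: follow this interpreter's [mclocale] setting.
msgcat::mclocale de
set perInterp [clock format $t -gmt 1 -format %A -locale current]

# 4. An explicit locale for a single call, leaving [mclocale] untouched.
set oneOff [clock format $t -gmt 1 -format %A -locale fr]

puts [list $local $perInterp $oneOff]
```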

Now, it has turned out in practice that the POSIX locale,
for all that it is standard, is just not very useful. Changing
it is not thread-safe. It also tends not to be library-safe.
Libraries tend to assume, for instance, that using sprintf
with a %f format will render the decimal point as '.', not
as ',' or some other locale-specific character. Tcl itself makes
that assumption internally.
The only way to avoid making that assumption is either to
avoid using the C library to format and scan numbers altogether,
or to bracket every call to a function like sprintf, sscanf,
or strtod with code to seize a process-global lock, save the
current LC_NUMERIC setting, set LC_NUMERIC to "C", scan or format
the number, restore LC_NUMERIC, and release the lock. Moreover,
every library that the process uses must do the same thing, and
agree on what global lock protects the locale. Since POSIX
does not mandate such a lock, nobody agrees on how to do it,
and in fact the folklore is pretty much that it isn't safe to
change LC_NUMERIC.

So Tcl mostly sidesteps the issue by maintaining a locale
that is not shared among interpreters and ignoring the one
that the C library uses (except that it initialises a local
copy from it). Since interpreters are not shared among threads,
this eliminates all the locking that would otherwise be
required.
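
One way to see the per-interpreter behaviour, sketched with two
arbitrary locale names: each interpreter loads its own copy of
msgcat, so changing the locale in a child interpreter leaves the
parent's setting alone.

```tcl
package require msgcat

msgcat::mclocale en_us

# The child interpreter gets its own msgcat state.
set child [interp create]
$child eval {
    package require msgcat
    msgcat::mclocale fr_ca
}

set parentLocale [msgcat::mclocale]
set childLocale  [$child eval {msgcat::mclocale}]
interp delete $child

puts "parent: $parentLocale, child: $childLocale"
```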

For the Tcl locale, it also seemed that the multiplicity of
LC_ constants was a needless complexity. How often is it that
an application will want to display its messages in Swiss German,
but format times for French Canada while using the Turkish
piastre as the unit of currency? Splitting the locale in that
way seems just weird. So we didn't do it.

We're not ignorant of the POSIX standard for internationalisation.
We simply understand its flaws - and don't want to replicate them.

The exchanges that we've had about TDBC make me realize that
you believe that standards must be respected - no matter how
little technical sense their political compromises make. So I'm
expecting you'll disagree violently with every sentence of this
long post. So be it.

tom.rmadilo

Jul 13, 2009, 1:31:14 AM

On Jul 12, 9:32 pm, Kevin Kenny <kenn...@acm.org> wrote:
> Tom,
>
> One of us is misunderstanding something.
>
> If you use '-locale system', you will get the same locale
> a C program will get, inherited from the environment.

The definition of localization means that you don't need to change the
source code, so -locale system is a change, therefore localization has
not occurred if you have to use that option.

Localization means that the program code does not need to change with
changing locales. At the very least, you should explain how
localization occurs so that programmers can write code which adapts to
the locale. Obviously this has changed with Tcl 8.5. It doesn't matter
how much better your code is compared to the previous version, change
is what screws everyone up...big time.

But I'm really more concerned about the responsibility assumed by the
Tcl community. We must now maintain the timezone and locale data.
Timezone data is political and locale data is "personal". Why should a
minor programming language take on such responsibility when no other
language has?

dkf

Jul 13, 2009, 5:01:08 AM

On 13 July, 06:31, "tom.rmadilo" <tom.rmad...@gmail.com> wrote:
> The definition of localization means that you don't need to change the
> source code, so -locale system is a change, therefore localization has
> not occurred if you have to use that option.

That change is internationalization, not localization. In particular,
it is the explicit decision that at that point in the program you want
to respect the system rules for localization. It was decided (after
much soul searching) that the breakage due to having to say this for
dealing with user-oriented stuff was less than making computer-to-
computer communications work by default; the basis for this decision
was examination of existing deployed code.

> Localization means that the program code does not need to change with
> changing locales. At the very least, you should explain how
> localization occurs so that programmers can write code which adapts to
> the locale. Obviously this has changed with Tcl 8.5. It doesn't matter
> how much better your code is compared to the previous version, change
> is what screws everyone up...big time.

Before, we were thoroughly screwed up anyway. Maybe your tiny little
piece was working, but overall we were not. (We weren't even able to
parse the times that we were generating, even when they were ISO-8601
formatted, for goodness' sake!)

> But I'm really more concerned about the responsibility assumed by the
> Tcl community. We must now maintain the timezone and locale data.
> Timezone data is political and locale data is "personal". Why should a
> minor programming language take on such responsibility when no other
> language has?

We use the standard timezone files in the standard locations. We only
redistribute them for installation on broken platforms that don't
provide the data already (hello, Windows!) and then we don't modify
them in any way. The only locale data that we are distributing (apart
from for our dialogs and demos) is the data so that we can format and
parse times better. (FWIW, we do a better job of that than OSX does,
distributing more different time locales than they do.)

Donal.

Kevin Kenny

Jul 13, 2009, 9:44:02 AM

tom.rmadilo wrote:
> The definition of localization means that you don't need to change the
> source code, so -locale system is a change, therefore localization has
> not occurred if you have to use that option.
>
> Localization means that the program code does not need to change with
> changing locales. At the very least, you should explain how
> localization occurs so that programmers can write code which adapts to
> the locale. Obviously this has changed with Tcl 8.5. It doesn't matter
> how much better your code is compared to the previous version, change
> is what screws everyone up...big time.

Prior to 8.5, [clock] was localised only by accident, only on some
platforms, and only for formatting, not scanning. Any attempt at
a fix was going to cause some change. Given that the least disruptive
change was going to be making things correct for the majority use
case - which didn't want things localized in the first place -
we went that way. Now we assert that [clock] is internationalised.
We didn't before. And we assert that using -locale is the way
you access the internationalization.

So yes. It changed. It's not likely to change again, because now
it works. (Well, ok, there are a couple of fiddly things, like handling
the corner case where LC_TIME is different from LC_MESSAGES.)
It didn't work before. You're really starting to sound
as if you're complaining that [puts {Hello, world!}] doesn't
access msgcat automatically: you have to do [puts [mc {Hello, world!}]]
to get your messages localized.
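
For comparison, a minimal sketch of the msgcat side of that analogy.
The French translation is made up for the example; a real
application would normally load it from a .msg file.

```tcl
package require msgcat
namespace import msgcat::mc

# Register one translation and select the locale.
msgcat::mcset fr {Hello, world!} {Bonjour, le monde !}
msgcat::mclocale fr

set plain     {Hello, world!}        ;# a bare string is never translated
set localized [mc {Hello, world!}]   ;# looked up in the message catalog

puts $plain
puts $localized
```

The localisation happens only where the code asks for it with [mc],
just as [clock] localises only where the code asks with -locale.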

> But I'm really more concerned about the responsibility assumed by the
> Tcl community. We must now maintain the timezone and locale data.
> Timezone data is political and locale data is "personal". Why should a
> minor programming language take on such responsibility when no other
> language has?

The choice of locale is personal - "I speak English, live in the US,
and prefer the US International keyboard layout to the one that's
on the keycaps." But the localised strings - the locale itself, rather
than the choice - are provided by an application, generally because
a programmer or translator put them in place.

With Tcl, the timezone data is used only on systems that don't
provide any other good way of having a process access foreign
time zones. In other words, Windows and a few Unix systems like
HP-UX that still don't have zoneinfo. If you don't specify
'-timezone', you are likely to get a result that agrees 100%
with what the C library gives you, because it comes from the
same file! Moreover, Tcl's timezone data is derived from the
exact same source that Linux, the BSDs, Mac OS X, the JRE,
and many lesser systems use: the curated time zone information
maintained at ftp://elsie.nci.nih.gov/. When they announce a new
version (a little more than once a month on average), it's
a five-minute job for a maintainer to run tools/tclZIC.tcl
and commit the result, and then it goes in the next patch release.
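
As an illustration of accessing a foreign time zone from one
process, here is a short sketch. The zone name and format string are
picked arbitrarily, and the data comes either from the system's
zoneinfo or from the copy shipped with Tcl, whichever is available.

```tcl
set t 0   ;# the epoch, so that the results are fixed
set fmt {%Y-%m-%d %H:%M}

# The same instant rendered in two zones, no environment changes needed.
set utc [clock format $t -format $fmt -timezone :UTC]
set ny  [clock format $t -format $fmt -timezone :America/New_York]

puts "UTC:      $utc"
puts "New York: $ny"
```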

Locales aren't quite as automated, but they similarly come
from a curated source that everyone uses - http://site.icu-project.org/.
They don't have to be quite as automated, because they're stable.
Governments don't suddenly turn around and say, "from henceforth,
the month and day on bank statements must be changed from
dd,mm,yy to mm/dd/yy". (They *do* make sudden changes with
time zones.)

Yes, we've had a few bugs along the way. But they've been
much less of a headache for us than dealing with the localization
bugs of the underlying platforms. At least, they're our bugs
and we can fix them. Before, we've faced issues like having
arithmetic suddenly quit working because a printer driver on Windows
set the C library locale to one where the decimal point is a
comma - in another thread - and our internal calls to sprintf()
and strtod() to format and scan numbers suddenly got commas
for decimal points. That's the level of brokenness you encounter
when dealing with vendor-supplied i18n. It appears to work for
a while, and then craps out spectacularly and without warning.
And it's hard to see how that brokenness can be avoided without
fairly pervasive changes to the POSIX specification.

It really is that bad. And our users don't appreciate the answer,
"it's your C library, go out and get the vendor to fix it."
Particularly when the vendor is a behemoth like Microsoft, whose
answer is, "you shouldn't be using C anyway, go out and redevelop
your application in a language that only we support. And do it
again every two years from now to doomsday, because we'll come
out with new componentry with every operating system release and
sometimes in between."

As you would know if you had been around when http://tip.tcl.tk/173
was discussed. It's much more effective if you criticise these
decisions *before* the vote is taken and the implementation is
done.