Is there no such thing? Or have I just searched the wrong places or
for the wrong things?
--
MartinS
I haven't really thought about it, but do you think that this would
be a good thing?
The application should be localised, so the user can easily understand
the error message. But for the programming language itself, I find it
more important to be able to search the web
(or any other source) for the literal message to get further help.
Besides, I would consider English nowadays as the lingua franca for
science and technology.
Sorry for not being really helpful.
Oliver
Which messages are you talking about? Those generated by perl itself
cannot be localised. There has been some discussion of the possibility
in the past, but (among other things) there's just too much code out
there that does regex matching on $@ for this to be very practical
(start with diagnostics.pm, for instance). Most modules don't provide
localized error messages either.
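To make the compatibility worry concrete, here is a minimal sketch of the kind of code that matches on the text of $@. The classify_error helper and the Swedish wording are made up for illustration; perl itself emits only the English message.

```perl
use strict;
use warnings;

# Classify a failure by matching the text of $@, as much existing code does.
sub classify_error {
    my ($err) = @_;
    return 'div_by_zero' if $err =~ /^Illegal division by zero/;
    return 'unknown';
}

my $divisor = 0;
my $ok      = eval { my $r = 1 / $divisor; 1 };
my $english = $@;    # perl's own message, starting "Illegal division by zero"

# Hypothetical Swedish rendering of the same message (perl does not emit this).
(my $swedish = $english) =~ s/^Illegal division by zero/Ogiltig division med noll/;

print classify_error($english), "\n";
print classify_error($swedish), "\n";
```

The second call no longer recognises the error, which is the breakage a translated $@ would cause in code written this way.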
The string value of $!, OTOH, should be localised, though you may need
to call POSIX::setlocale.
Ben
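A small sketch of the $! behaviour mentioned above, assuming a Unix-like system where the made-up path below does not exist. $! holds both an errno number and a message; whether the message is actually translated depends on the installed locales and the perl version (newer perls may additionally want "use locale" in scope, so treat that part as an assumption).

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);
use Errno qw(ENOENT);

setlocale(LC_ALL, '');    # adopt the locale settings from the environment

my $path = '/nonexistent/hypothetical/file';    # assumed not to exist
my ($errno, $text);
unless (open my $fh, '<', $path) {
    $errno = $! + 0;      # numeric side: stable across languages
    $text  = "$!";        # string side: for humans, possibly localised
}

print "errno=$errno text=$text\n";
print "file is missing\n" if $errno == ENOENT;
```

Comparing against ENOENT rather than the message text is what keeps this code independent of the user's language.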
> Quoth Martin Strömberg <a...@sister.ludd.luth.se>:
>> I've been looking for i18n for perl so that I can get error messages
>> in Swedish. I haven't found anything, not for _any_ other language.
> Which messages are you talking about? Those generated by perl itself
> cannot be localised. There has been some discussion of the possibility
> in the past, but (among other things) there's just too much code out
> there that does regex matching on $@ for this to be very practical
> (start with diagnostics.pm, for instance). Most modules don't provide
> localized error messages either.
Yes, I mean those generated by perl itself. It would help my son, whom
I'm trying to teach perl, understand what's wrong.
--
MartinS
> http://search.cpan.org/~audreyt/i18n-0.10/lib/i18n.pm
Hmm, I think so. I don't remember what I searched for, but either I
got 0 results or too many and the top ones weren't anything I looked
for. I think I looked at that one but dismissed it (then).
Anyway, that is almost what I was looking for. Now all we need is
those ~~ in all perl and the .po files...
Thank you.
--
MartinS
That is for module authors to use to internationalise their modules. It
doesn't help with the messages perl generates itself.
Ben
Thinking about it more:
A) I find the argument that Perl is a "lingua franca of science and
technology" very misplaced. *A Perl application* may be designed
for use by children and/or music lovers.
B) We need something which works as English during parsing, but is
printed out in Bengali. This looks like overloading.
Unfortunately, when I designed overloading, I did not think *at
all* about overloading objects with string semantic. There was
no provision to treat REx matching specially during overloading,
treat print() specially, treat substr() specially, treat index()
specially etc... AFAIK, the current overloading behaves the same
way.
Is there code which analyses $@ any other way than doing =~? If
not, then making overloaded-stringify inspect whether it is
called during REx-matching might be possible (at least with some
modification of perl C code to make the latter condition easier
recognizable...).
Basically, what I think of is making $@ into a 2-headed beast, with 2
different STRING values. What do people think?
Ilya
Um, yes? Who are you arguing with? :)
> B) We need something which works as English during parsing, but is
> printed out in Bengali. This looks like overloading.
>
> Unfortunately, when I designed overloading, I did not think *at
> all* about overloading objects with string semantic. There was
> no provision to treat REx matching specially during overloading,
There is now. I submitted a patch to add 'qr' overloading to 5.12, which
is called when an overloaded object is used on the RHS of =~ or is
interpolated into a regex. Since 5.12 has true REGEX SVs, it seemed
silly not to have a corresponding 'type-cast' overload.
> Basically, what I think of is making $@ into a 2-headed beast, with 2
> different STRING values. What do people think?
There was some discussion of this issue on p5p a while ago, though that
was before qr-overload was different from string-overload. IIRC the
general feeling was that trying to get people to move away from
string-matching $@ by defining a set of numeric error codes for the core
perl errors was probably the best way forward. I don't think any firm
decisions were reached, though, and the question hasn't been revisited
recently.
Ben
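For readers who have not seen the 'qr' overload key, a minimal sketch, assuming perl 5.12 or later. The ErrorPattern class and its fields are invented for the example. Note that this covers an object on the right-hand side of =~ only.

```perl
use 5.012;
use strict;
use warnings;

package ErrorPattern;    # hypothetical class for illustration

use overload
    # Used when the object appears as the pattern of =~ (perl 5.12+).
    'qr' => sub { my ($self) = @_; my $src = $self->{source}; return qr/$src/ },
    # Used when the object is printed or otherwise stringified.
    '""' => sub { my ($self) = @_; return $self->{label} },
    fallback => 1;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my $pat = ErrorPattern->new(
    source => '^Illegal division',
    label  => 'division errors',
);
my $msg   = "Illegal division by zero at demo.pl line 1.\n";
my $hit   = ($msg =~ $pat) ? 1 : 0;    # goes through the 'qr' overload
my $label = "$pat";                    # goes through the '""' overload

print "hit=$hit label=$label\n";
```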
>> A) I find the argument that Perl is a "lingua franca of science and
>> technology" very misplaced. *A Perl application* may be designed
>> for use by children and/or music lovers.
> Um, yes? Who are you arguing with? :)
Whoever wrote this (earlier in the thread).
>> B) We need something which works as English during parsing, but is
>> printed out in Bengali. This looks like overloading.
>>
>> Unfortunately, when I designed overloading, I did not think *at
>> all* about overloading objects with string semantic. There was
>> no provision to treat REx matching specially during overloading,
>
> There is now. I submitted a patch to add 'qr' overloading to 5.12, which
> is called when an overloaded object is used on the RHS of =~ or is
> interpolated into a regex. Since 5.12 has true REGEX SVs, it seemed
> silly not to have a corresponding 'type-cast' overload.
Good. But what I was "hinting at" was the LHS of "=~" ...; which is a
string. And "a string" != "a REx" (after appropriate overloading of != ;-)
>> Basically, what I think of is making $@ into a 2-headed beast, with 2
>> different STRING values. What do people think?
> There was some discussion of this issue on p5p a while ago, though that
> was before qr-overload was different from string-overload. IIRC the
> general feeling was that trying to get people to move away from
> string-matching $@ by defining a set of numeric error codes for the core
> perl errors was probably the best way forward.
I would hate numeric errors.
(When my system boots, there is a chance
of getting an error message which essentially says:
!!! SYS2025
!!! SYS2027
Very enjoyable. (Explanation: the system knows many different ways to
expand these to much more human-readable form. But
the system did not even start booting yet, so has no
idea in which language to bash you...))
I would very much prefer "short descriptive english strings" approach.
ERROR_DISK_READ, ERROR_DISK_NONBOOT would have similar (un)convenience
for paper manual lookup, and would have some chance for the meaning to
be guessed without the manual. [*]
Anyway, this is a pipe dream. The code DOES do $@ =~ /foo/.
Ilya
[*] P.S. On the other hand, one of my friends worked as
"non-customer support" in a certain establishment of
more or less technical nature. Non-customers were kinda
engineers. So a call about the error above might sound like:
The 1st and 3rd symbols look like snakes; the second forks;
then comes number 2025. What to do?
(I did not believe this at first, but this did not sound
like a joke.) In such situation, reducing number of
distinct non-digits to two DOES help...
Of course you were. Sorry.
I think the most immediate response from p5p would be 'that's what the
new ~~ operator is for, which is already overloadable', and in general I
would agree that having two 'stringifications' was seriously confusing.
However, since this is (just) for back-compat hacks, it's possible a
case could be made. Maybe I should just do up a patch...
> >> Basically, what I think of is making $@ into a 2-headed beast, with 2
> >> different STRING values. What do people think?
>
> > There was some discussion of this issue on p5p a while ago, though that
> > was before qr-overload was different from string-overload. IIRC the
> > general feeling was that trying to get people to move away from
> > string-matching $@ by defining a set of numeric error codes for the core
> > perl errors was probably the best way forward.
>
> I would hate numeric errors.
>
> (When my system boots, there is a chance
> of getting an error message which essentially says:
>
> !!! SYS2025
> !!! SYS2027
>
> Very enjoyable. (Explanation: the system knows many different ways to
> expand these to much more human-readable form. But
> the system did not even start booting yet, so has no
> idea in which language to bash you...))
The idea was most certainly not to remove the string errors. It was more
along the lines of setting $@ to a dualvar (just like $!), though I
think everyone was assuming it would be worth making this actually an
overloaded object. Obviously the English error messages should be
compiled into the binary, so there is some fallback if whatever i18n
files are needed can't be loaded.
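A dualvar of the kind described can be sketched with Scalar::Util. The error number and the Swedish text here are invented; perl defines no numeric codes for its own errors.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# Hypothetical numbering scheme, purely for illustration.
my %CODE = (DIVISION_BY_ZERO => 1001);

# One scalar with two faces: a number for programs, a string for people.
my $err = dualvar($CODE{DIVISION_BY_ZERO}, 'Ogiltig division med noll');

my $is_div = ($err == $CODE{DIVISION_BY_ZERO}) ? 1 : 0;    # numeric comparison
my $shown  = "$err";                                       # human-readable text

print "is_div=$is_div shown=$shown\n";
```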
> I would very much prefer "short descriptive english strings" approach.
> ERROR_DISK_READ, ERROR_DISK_NONBOOT would have similar (un)convenience
> for paper manual lookup, and would have some chance for the meaning to
> be guessed without the manual. [*]
The trouble is distinguishing between 'stringify for human consumption'
and 'stringify for unambiguous matching'. Stringify and numify are
already clearly different operations, and there is precedent in the way
$! is used (anyone doing string matching on $! is insane :) ).
Ignoring back-compat, smartmatch would be a good fit here. It would be
trivial to make an object that stringifies to some localized message but
passes $@ ~~ "UNDECLARED_VARIABLE".
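The same separation can be sketched without smartmatch (which was later marked experimental), using an exception object that carries a stable code and stringifies to a localised message. This swaps in a plain method call for ~~; the Demo::Error class, the message table and the Swedish text are all hypothetical.

```perl
use strict;
use warnings;

package Demo::Error;    # hypothetical class, not part of perl or CPAN

use overload '""' => sub { $_[0]->as_string }, fallback => 1;

# Illustrative message catalogue keyed by a stable English identifier.
my %MESSAGES = (
    UNDECLARED_VARIABLE => {
        en => 'Undeclared variable',
        sv => 'Odeklarerad variabel',
    },
);

sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub code { return $_[0]{code} }

# Human-facing text: the requested language, falling back to English.
sub as_string {
    my ($self) = @_;
    my $by_lang = $MESSAGES{ $self->{code} } || {};
    return $by_lang->{ $self->{lang} } // $by_lang->{en} // $self->{code};
}

package main;

my $ok  = eval { die Demo::Error->new(code => 'UNDECLARED_VARIABLE', lang => 'sv'); 1 };
my $err = $@;

my $matched = ($err->code eq 'UNDECLARED_VARIABLE') ? 1 : 0;    # for programs
my $shown   = "$err";                                           # for people

print "matched=$matched shown=$shown\n";
```

This does nothing for existing code that runs a regex against $@, which is the back-compat problem under discussion.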
> Anyway, this is a pipe dream. The code DOES do $@ =~ /foo/.
That is the problem, yes.
> [*] P.S. On the other hand, one of my friends worked as
> "non-customer support" in a certain establishment of
> more or less technical nature. Non-customers were kinda
> engineers. So a call about the error above might sound like:
>
> The 1st and 3rd symbols look like snakes; the second forks;
> then comes number 2025. What to do?
>
> (I did not believe this at first, but this did not sound
> like a joke.) In such situation, reducing number of
> distinct non-digits to two DOES help...
It sounds entirely believable to me, assuming the caller spoke a
language that doesn't use the Latin alphabet but does use Arabic
numerals. For example, although Chinese has its own numerals, it seems
to be quite common to use Arabic instead.
Ben
Of course. What made it hard to believe is that the country in
question is (at least up to some extent) a part of "extended Europe"
nowadays (IIRC, it takes part in some European sport competitions etc).
And AFAIU the story does not make sense applied to, e.g., Russian-speakers...
Ilya
> That is the problem, yes.
I'm not a perl implementation hacker, but I have problems seeing the
problem.
We take that i18n module and make it a pragma (if it isn't
already). Or invent a new one. In my example below I've called this
pragma "i18n".
Then if "no i18n" is in effect, which is the default, "$@" will be what
it always has been, i.e. in English.
However, if a program/module uses the pragma "use i18n Swedish" or "use
i18n svenska" or whatever the syntax should be, then every "$@" in
this block/context (or whatever the term is) will suddenly be in
Swedish. And as the programmer added the "use i18n Swedish" he will be
aware of this so he will match "$@" on Swedish strings.
Perhaps this is already what that i18n module on CPAN does.
And the problem is in the perl implementation code there is a lot of
matching "$@" on English strings, which are the strings I want to be
Swedish.
Hmm. Perhaps we should just make a perl wrapper (perlint?) that
translates the output from a perl program if it matches a perl error
message, like "perlint Swedish my_perl_script.pl"? Or perhaps only if
the exit status indicates failure?
MartinS
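The wrapper idea boils down to a translation table applied to message text. A rough sketch of that core, leaving out the part that would run another script and capture its output; the rule table, the translate_message name and the Swedish renderings are assumptions, and a real table would need an entry for every perl diagnostic.

```perl
use strict;
use warnings;

# Each rule: a pattern for the English message and a renderer for the
# translated text. Both entries are illustrative only.
my @RULES = (
    [ qr/^Illegal division by zero/,
      sub { 'Ogiltig division med noll' } ],
    [ qr/^Undefined subroutine &(\S+) called/,
      sub { "Odefinierad subrutin &$_[0] anropad" } ],
);

# Translate the leading part of a message; keep " at FILE line N." as is.
sub translate_message {
    my ($msg) = @_;
    for my $rule (@RULES) {
        my ($re, $render) = @$rule;
        next unless $msg =~ $re;
        my $rest = substr($msg, $+[0]);    # text after the matched prefix
        return $render->($1) . $rest;
    }
    return $msg;                           # unknown messages pass through
}

print translate_message("Illegal division by zero at x.pl line 3.\n");
print translate_message("Undefined subroutine &main::foo called at x.pl line 2.\n");
```

Because the translation happens outside the failing program, code inside that program still sees the English $@.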
> I'm not a perl implementation hacker, but I have problems seeing the
> problem.
>
> We take that i18n module and make it a pragma (if it isn't
> already). Or invent a new one. In my example below I've called this
> pragma "i18n".
First question: do you understand difference between lexical and
dynamic scope? If you do, which one is your pragma implementing, and
why would this help?
> However is a program/module uses the pragma "use i18n Swedish" or "use
> i18n svenska" or whatever the syntax should be, then every "$@" in
> this block/context (or whatever the term is) will suddenly be in
> Swedish.
What do you mean by "in this block/context"? Created in this context?
Read in this context?
Do you understand that some $@ are created by Perl executable, and
some by scripts? Which do you mean?
> And the problem is in the perl implementation code there is a lot
> matching "$@" on English strings, which are the strings I want to be
> Swedish.
Can't parse what you wanted to say...
Yours,
Ilya
Ah... This is probably the reason why FreezeThaw is failing its tests
on 5.11... Whoever added "true REx" SV did not fix the modules broken
by this change...
Ilya
> I think the most immediate response from p5p would be 'that's what the
> new ~~ operator is for, which is already overloadable', and in general I
> would agree that having two 'stringifications' was seriously confusing.
> However, since this is (just) for back-compat hacks, it's possible a
> case could be made. Maybe I should just do up a patch...
Myself, I would not be so quick. As I see it, the problem with
designing a reasonable string-overloading framework is that I do not
have many SIGNIFICANTLY different models to serve as example
applications of this framework.
I agree with Larry's estimate that "to implement a feature, I must
want it 3 times first". Usually, three "orthogonal" applications
provide enough insight to design a pilot semantic. However, I have
only 2.3 examples in mind:
a) potentially infinite streams: consider
$Pi = infinite_precision_Pi;
print OK if $Pi =~ /123456789/;
or consider
$/ = qr/[1-9][0-9]{50}/; # or something more vicious
$in = <STDIN>; # assume a pipe
system $external_program;
(in second example, we may want to gobble as few characters as
possible while achieving the match for $/).
b1) Strings with out-of-bound markup. E.g., colored output to TTY;
or *parsed* HTML streams (the string value is what you get by
cut&paste, but all the formatting info is there, just out-of-bound).
You want to look for certain "features" (e.g., match RExes on
in-bound content + some restrictions on out-of-bound - as in:
find "foo|bar" at start of a "subdivision" [table cell, or div,
or whatsit]).
b2) Same, only not for read-only access, but for modification (as in
my interview to Perl Journal). E.g., suppose you want to
translate a chunk of data from HTML to LaTeX *inplace* (i.e., as
s/// is doing); the translation rules are very non-local; one
must either
re-gather all the non-local information again and again at
each point, or
gather it once, put it in markup, and use "local structure of
markup at every point" instead of re-gathering.
After this, to do actual translation, one wants to do needed
s/// without ruining the gathered non-local information.
(This is essentially what I do in cperl-mode to facify RExes;
the only difference is that in CPerl, I only touch out-of-bound
part of content. Consider the case when I need to use these
markups to convert RExes from Perl syntax to Emacs syntax...)
(As I said in the interview, Emacs has much better facilities
for string processing than Perl. One of the purposes of string
overloading should be to narrow the gap.)
It would be wonderful if one could use "2-headed 2-language strings"
as the third application of string overloading. The problem is that I
have no idea how one would like to EDIT such strings.
For example: on English strings, one could do something like
s/each/every/g; would one want to do s/$each/$every/ (with suitably
constructed $each and $every) on 2-language strings? So far this
looks too silly to be a help in semantical design...
Hope this helps,
Ilya
Yes. Nick Clark did a lot of rejigging of the SvTYPE assignments. As of
5.10, PVBMs are now implemented as SVt_PVGVs (IMHO this was a mistake,
but it's done now); as of 5.12, RVs are implemented as SVt_IVs (mostly;
some are PVMGs, as they always were) with SvROK set and there is a new
SVt_REGEX used for qr//. There is also a new SVt_BIND, which is
currently unused but is intended for read-only aliases.
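From the Perl side, the change to qr// can be observed roughly as follows. This is a sketch: the 'REGEXP' reftype is what I would expect on 5.12 and later, while earlier perls are assumed to report 'SCALAR' for the same object.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);
use re qw(is_regexp regexp_pattern);

my $re = qr/ab+c/i;

my $class = blessed($re);              # the class qr// objects are blessed into
my $type  = reftype($re);              # underlying type; differs across versions
my $is_re = is_regexp($re) ? 1 : 0;    # version-independent check (5.10+)
my ($pattern, $mods) = regexp_pattern($re);

print "class=$class type=$type is_re=$is_re pattern=$pattern mods=$mods\n";
```

Code that inspects reftype directly is the kind that has to be updated when the underlying SV type changes; is_regexp avoids depending on it.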
As for fixing the modules: with the number of modules on CPAN, this is
impossible. People writing XS modules that grovel around in perl's guts
are expected to keep up with p5p. I believe some effort is made to smoke
CPAN and fix breakages (certainly there was a big push to get the CPAN
smokes clean before 5.10) but in the end p5p can't fix everything.
Ben
IMO, serialization is not "everything". But I'm, of course, biased...
Ilya
(Regardless of the relative merits) FreezeThaw is not the 'standard'
serialization module. That would be Storable, which is core and thus
*will* have been fixed by p5p. AIUI, though, part of the reasoning
behind the current push to move modules out of core is to reduce the
amount of code the pumpking has to keep up to date.
Ben
> (Regardless of the relative merits) FreezeThaw is not the 'standard'
> serialization module. That would be Storable, which is core and thus
> *will* have been fixed by p5p. AIUI, though, part of the reasoning
> behind the current push to move modules out of core is to reduce the
> amount of code the pumpking has to keep up to date.
Note that there is a kinda contradiction in what you wrote.
"FreezeThaw is not the 'standard'" ONLY because Storable was pushed TO
the core.
Yours,
Ilya
> First question: do you understand difference between lexical and
> dynamic scope? If you do, which one is your pragma implementing, and
Sort of. I know lexical scope. That's what makes closures possible, e. g.
Not sure exactly what dynamic scope is. My memory is fuzzy, but was
that what local did?
> why would this help?
I was thinking lexically. The code is written to match English or
e. g. Swedish.
However I suppose if whatever is put in $@ is done in an "English"
module and then will be printed on screen for the user I surely would
like that to be in Swedish, which (I think) implies some dynamic
stuff.
>> However is a program/module uses the pragma "use i18n Swedish" or "use
>> i18n svenska" or whatever the syntax should be, then every "$@" in
>> this block/context (or whatever the term is) will suddenly be in
>> Swedish.
> What do you mean by "in this block/context"? Created in this context?
Just that you can do this with pragmas:
use strict;
# Strictly coded code here.
{
    no strict;
    # Nasty code here.
}
# Strictly coded code here.
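For reference, a user-defined pragma can get the same block scoping through the %^H hint hash (perl 5.10 or later). The sketch below only records and reports a language name per lexical scope; ErrLang is a made-up stand-in for the proposed "i18n" pragma, and nothing here translates $@.

```perl
use strict;
use warnings;

package ErrLang;    # hypothetical pragma name

# Pretend ErrLang.pm is already loaded so "use ErrLang" works in one file.
BEGIN { $INC{'ErrLang.pm'} = __FILE__ }

# "use ErrLang 'Swedish'" stores the language in the compile-time hint hash.
sub import {
    my ($class, $lang) = @_;
    $^H{'ErrLang/lang'} = $lang if defined $lang;
    return;
}

# "no ErrLang" removes the hint for the rest of the enclosing block.
sub unimport {
    delete $^H{'ErrLang/lang'};
    return;
}

# Report the language in effect in the code that calls this function.
sub current {
    my $hints = (caller(0))[10];
    my $lang  = $hints ? $hints->{'ErrLang/lang'} : undef;
    return defined $lang ? $lang : 'English';
}

package main;

my @seen;
{
    use ErrLang 'Swedish';
    push @seen, ErrLang::current();
    {
        no ErrLang;
        push @seen, ErrLang::current();
    }
    push @seen, ErrLang::current();
}
push @seen, ErrLang::current();

print join(', ', @seen), "\n";
```

The hint is attached to where code was compiled, not to where an error was raised, which is why the "created here or read here?" question matters.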
> Read in this context?
I thought read in this context.
> Do you understand that some $@ are created by Perl executable, and
> some by scripts? Which do you mean?
Doesn't matter. When $@ is read while "use i18n Swedish" is in effect
it should be in Swedish, which for me implies some dynamic translation
>> And the problem is in the perl implementation code there is a lot
>> matching "$@" on English strings, which are the strings I want to be
>> Swedish.
> Can't parse what you wanted to say...
Sorry. I was trying to ask if the perl implementation has code
something like this
if ( $@ =~ m/Magic string/ ) {
    # Do this.
} elsif ( $@ =~ m/Another magic string/ ) {
    # Do that.
}
Thanks!
--
MartinS
Yes. (Or rather, it *is* what local *does*.)
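A short sketch of the difference, using a package variable $lang invented for the example: "local" changes what called code sees until the enclosing block exits, while "my" creates a new variable visible only in its own block.

```perl
use strict;
use warnings;

our $lang = 'English';    # package variable, shared by all the subs below

sub current_lang { return $lang }

sub with_local {
    local $lang = 'Swedish';    # dynamic scope: callees see it until we return
    return current_lang();
}

sub with_my {
    my $lang = 'Swedish';       # lexical scope: a different variable, this block only
    return current_lang();
}

my @results = (with_local(), with_my(), current_lang());
print "@results\n";    # Swedish English English
```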
> > why would this help?
>
> I was thinking lexically. The code is written to match English or
> e. g. Swedish.
>
> However I suppose if whatever is put in $@ is done in an "English"
> module and then will be printed on screen for the user I surely would
> like that to be in Swedish, which (I think) implies some dynamic
> stuff.
Or rather, some situations where the expected contents of $@ are
ambiguous. This is why this is a dead end... (except for the possible =~
hacks Ilya was suggesting.)
> > Do you understand that some $@ are created by Perl executable, and
> > some by scripts? Which do you mean?
>
> Doesn't matter. When $@ is read while "use i18n Swedish" is in effect
> it should be in Swedish, which for me implies some dynamic translation
*What* should be in Swedish? Messages from the perl C code? Messages
from modules?
> >> And the problem is in the perl implementation code there is a lot
> >> matching "$@" on English strings, which are the strings I want to be
> >> Swedish.
>
> > Can't parse what you wanted to say...
>
> Sorry. I was trying to ask if the perl implementation has code
> something like this
>
> if( $@ =~ m/Magic string/ ) {
> # Do this.
> } else if( $@ =~ m/Another magic string/ ) {
> # Do that.
> }
No, perl itself doesn't do any parsing of that sort. Lots of modules do,
though, starting with diagnostics.
Ben
> Quoth Martin Strömberg <a...@sister.ludd.luth.se>:
>> Ilya Zakharevich <nospam...@ilyaz.org> wrote:
>> > Do you understand that some $@ are created by Perl executable, and
>> > some by scripts? Which do you mean?
>>
>> Doesn't matter. When $@ is read while "use i18n Swedish" is in effect
>> it should be in Swedish, which for me implies some dynamic translation
> *What* should be in Swedish? Messages from the perl C code? Messages
> from modules?
Perhaps I misunderstand something? This is how it should work:
$@ = "the water";
use i18n Swedish;
print "$@\n"; # Prints "vattnet".
use i18n German;
print "$@\n"; # Prints "der Wasser". (Sorry for any genus maltreatment.)
use i18n French;
print "$@\n"; # Prints "l'oeau". (Sorry for any misspelling.)
Except (by necessity) limited to whatever strings are in perl (and
modules when completed).
--
MartinS