We work with a limited subset of Unicode and have been using our own list of
unique names for those characters for over 12 years now, so I wanted to be
able to alias these names to the official Unicode names.
The proposed change allows several new features to charnames:
Addition 1:
--8<---
use charnames ":full", {
my_name => "FULL UNICODE OFFICIAL NAME",
e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE",
};
-->8---
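As a rough illustration of what such a hash amounts to at run time, here is a
minimal stand-alone sketch. It does not touch charnames.pm itself; the %alias
table and the alias_to_ord() helper are hypothetical names made up for this
example, and only charnames::vianame() is an existing function.

```perl
use strict;
use warnings;
use charnames ();   # load the module only; no \N{} handling is enabled

# Hypothetical alias table, in the same shape as the proposed hash argument.
my %alias = (
    e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE",
    A_TILDE => "LATIN CAPITAL LETTER A WITH TILDE",
);

# Translate an alias to its official name (or keep the name as given),
# then ask charnames for the code point of that official name.
sub alias_to_ord {
    my ($name) = @_;
    my $official = exists $alias{$name} ? $alias{$name} : $name;
    return charnames::vianame($official);
}

printf "%s is U+%04X\n", $_, alias_to_ord($_) for sort keys %alias;
```

The proposed patch does the same translation step inside charnames itself, so
that the alias can be used directly in \N{...}.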
Addition 2:
Given a file named "unicore/pro_alias.pl", findable via @INC, with content like:
--8<--- /pro/lib/perl5/site_perl/5.8.0/unicore/pro_alias.pl
#!/usr/bin/perl
(
A_GRAVE => "LATIN CAPITAL LETTER A WITH GRAVE",
A_CIRCUM => "LATIN CAPITAL LETTER A WITH CIRCUMFLEX",
A_DIAERES => "LATIN CAPITAL LETTER A WITH DIAERESIS",
A_TILDE => "LATIN CAPITAL LETTER A WITH TILDE",
A_BREVE => "LATIN CAPITAL LETTER A WITH BREVE",
A_RING => "LATIN CAPITAL LETTER A WITH RING ABOVE",
A_MACRON => "LATIN CAPITAL LETTER A WITH MACRON",
:
:
lMDOT_IDX => "LATIN SMALL LETTER L WITH MIDDLE DOT",
lSTROKE_IDX => "LATIN SMALL LETTER L WITH STROKE",
oSLASH_IDX => "LATIN SMALL LETTER O WITH STROKE",
SMALL_OE_IDX => "LATIN SMALL LIGATURE OE",
RINGELS_IDX => "LATIN SMALL LETTER SHARP S",
SMALL_THORN_IDX => "LATIN SMALL LETTER THORN",
tSTROKE_IDX => "LATIN SMALL LETTER T WITH STROKE",
SMALL_ENG_IDX => "LATIN SMALL LETTER ENG",
IEM_IDX => "INVERTED EXCLAMATION MARK",
)
-->8---
I can now do
use charnames ":pro";
If a ":name" is the *only* argument, ":full" is automatically assumed
after the aliases have been read. Otherwise, you have to supply it yourself:
use charnames ":short", ":pro";
The anonymous hash is only supported as the last argument.
Once it has been taken off the list, the :name is likewise only supported as
the last argument.
use charnames ":short", ":pro",
{ A_TILDE => "LATIN CAPITAL LETTER A WITH TILDE" };
IMHO *very* useful. Opinions?
--- lib/charnames.pm 2002-05-31 13:07:59.000000000 +0200
+++ lib/charnames.pm 2002-09-26 11:20:46.000000000 +0200
@@ -38,8 +38,18 @@ my %alias2 = (
'PARTIAL LINE UP' => 'PARTIAL LINE BACKWARD',
);
+my %alias3 = (
+ # User defined aliasses. Even more convenient :)
+ );
my $txt;
+sub alias (@)
+{
+ @_ or return %alias3;
+ my %alias = ref $_[0] ? %{$_[0]} : @_;
+ @alias3{keys %alias} = values %alias;
+ } # alias
+
# This is not optimized in any way yet
sub charnames
{
@@ -48,11 +58,14 @@ sub charnames
if (exists $alias1{$name}) {
$name = $alias1{$name};
}
- if (exists $alias2{$name}) {
+ elsif (exists $alias2{$name}) {
require warnings;
warnings::warnif('deprecated', qq{Unicode character name "$name" is deprecated, use "$alias2{$name}" instead});
$name = $alias2{$name};
}
+ elsif (exists $alias3{$name}) {
+ $name = $alias3{$name};
+ }
my $ord;
my @off;
@@ -156,6 +169,14 @@ sub import
## fill %h keys with our @_ args.
##
my %h;
+ if (@_ and ref $_[-1] eq "HASH") {
+ alias (pop);
+ }
+ if (@_ and $_[-1] =~ m{:(?!full|short)\w+$}) {
+ (my $file = pop) =~ s{:(.*)}{unicore/$1_alias.pl};
+ alias (do $file);
+ @_ == 0 and @_ = (":full");
+ }
@h{@_} = (1) x @_;
$^H{charnames_full} = delete $h{':full'};
--
H.Merijn Brand Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using perl-5.6.1, 5.8.0 & 633 on HP-UX 10.20 & 11.00, AIX 4.2, AIX 4.3,
WinNT 4, Win2K pro & WinCE 2.11. Smoking perl CORE: smo...@perl.org
http://archives.develooper.com/daily...@perl.org/ per...@perl.org
send smoke reports to: smokers...@perl.org, QA: http://qa.perl.org
> Given a file named "unicore/pro_alias.pl", findable in @INC filled like:
> ...
> use charnames ":pro";
>
> if a ":name" is the *only* argument, it is automatically promoted to ":full"
> after the aliasses have been read. Otherwise, you have to support it yourself:
> use charnames ":short", ":pro";
Good idea but I have small problems with the implementation... going
just by the name for the @INCable file feels ... unsafe. How about a
keyword-value pair:
use charnames load => "pro";
And then you could have just "pro.pl" in your @INC.
> The anonymous HASH is only supported as last argument.
> Once that is done, the :name is only supported as last argument.
--
Jarkko Hietaniemi <j...@iki.fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen
And what would make that safer than having pro_alias.pl in your @INC path,
while still having an interface that feels - errr - familiar?
I agree that the unicore/ part could be dropped but, OTOH, it makes it even
clearer what we are doing.
> > The anonymous HASH is only supported as last argument.
> > Once that is done, the :name is only supported as last argument.
--
Okay, forget the safeness argument ... but having a tag named by the
file being loaded just feels ... wrong. It feels like open(FH, ":pro")
to me.... I would leave the tags for the charnames internal definitions,
and have a keyword-value pair for the externals.
> I agree that the unicore/ part could be dropped, but OTOH, it is just even
> more clear what we are doing.
--
I don't understand your last sentence, but I think I'll wait for the doc patch ;-)
> IMHO *very* useful. Opinions?
I like it.
> + if (@_ and $_[-1] =~ m{:(?!full|short)\w+$}) {
> + (my $file = pop) =~ s{:(.*)}{unicore/$1_alias.pl};
> + alias (do $file);
You should check the return value of "do" here. Or use "require".
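A minimal sketch of what such a check could look like, outside of charnames.pm.
The helper name load_alias_list() and its error messages are made up for this
example, and it assumes it is handed a full path rather than a name to be
searched for in @INC.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of charnames): load an alias file given by
# full path and return its key/value list, dying with a reason on failure.
sub load_alias_list {
    my ($file) = @_;
    -r $file or die "Cannot read alias file $file\n";

    my ($err, @alias);
    {
        local $@;            # do not clobber the caller's $@
        @alias = do $file;   # list context: the file's last expression
        $err   = $@;         # set when the file did not compile
    }
    die "Cannot compile alias file $file: $err" if $err;
    @alias && defined $alias[0]
        or die "Alias file $file returned nothing\n";
    @alias % 2 == 0
        or die "Alias file $file returned an odd number of items\n";
    return @alias;
}

# Write a small alias file to a temporary location and load it.
my ($fh, $path) = tempfile(SUFFIX => ".pl", UNLINK => 1);
print {$fh} qq{( e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE" )\n};
close $fh or die "close: $!";

my %alias = load_alias_list($path);
print "$_ => $alias{$_}\n" for sort keys %alias;
```

Whether a broken alias file should die or merely warn and be skipped is a
design decision for the real patch; the sketch dies so that the failure is
hard to miss.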
It was just a proof of concept that just *works* here. If we agree to take it
in, doc patches *will* follow (I guess)
> > IMHO *very* useful. Opinions?
>
> I like it.
Good!
> > + if (@_ and $_[-1] =~ m{:(?!full|short)\w+$}) {
> > + (my $file = pop) =~ s{:(.*)}{unicore/$1_alias.pl};
> > + alias (do $file);
>
> You should check the return value of "do" here. Or use "require".
Of course. Again. Just proof of concept
Not at all. Just like :full and :short, it defines a way to name your
Unicode characters.
> I would leave the tags for the charnames internal definitions,
> and have a keyword-value pair for the externals.
Why does that not sound convincing?
> > I agree that the unicore/ part could be dropped, but OTOH, it is just even
> > more clear what we are doing.
--
Scan from the front, and keep the rest of the proposal the same?
That is, if it is the only argument, promote to the default :full?
I can live with that. I guess.
Ahh, one more argument *in favour* of :pro:
# perl -Mcharnames\ q#:pro# -le'print "\N{e_ACUTE}"'
is much easier to type than the alias convention.
I'm sorry but they are not "your Unicode characters", they are your
aliases for Unicode characters. (I'm very picky before my second cup
of coffee...) The :full and :short are not making up any aliases,
they are using the existing official names. But, I'll defer the
decision on this to Hugo.
Take 2. All of the above and docs. No tests (yet).
--- /pro/lib/perl5/5.8.0/charnames.pm 2002-05-31 13:07:59.000000000 +0200
+++ charnames.pm 2002-09-26 17:32:32.000000000 +0200
@@ -38,8 +38,18 @@ my %alias2 = (
'PARTIAL LINE UP' => 'PARTIAL LINE BACKWARD',
);
+my %alias3 = (
+ # User defined aliasses. Even more convenient :)
+ );
my $txt;
+sub alias (@)
+{
+ @_ or return %alias3;
+ my %alias = ref $_[0] ? %{$_[0]} : @_;
+ @alias3{keys %alias} = values %alias;
+ } # alias
+
# This is not optimized in any way yet
sub charnames
{
@@ -48,11 +58,14 @@ sub charnames
if (exists $alias1{$name}) {
$name = $alias1{$name};
}
- if (exists $alias2{$name}) {
+ elsif (exists $alias2{$name}) {
require warnings;
warnings::warnif('deprecated', qq{Unicode character name "$name" is deprecated, use "$alias2{$name}" instead});
$name = $alias2{$name};
}
+ elsif (exists $alias3{$name}) {
+ $name = $alias3{$name};
+ }
my $ord;
my @off;
@@ -155,8 +168,32 @@ sub import
##
## fill %h keys with our @_ args.
##
- my %h;
- @h{@_} = (1) x @_;
+ my ($promote, %h, @args) = (0);
+ my @args;
+ while (@_ and $_ = shift) {
+ if (ref $_ eq "HASH") {
+ alias ($_);
+ next;
+ }
+ if ($_ =~ m{:(?!full|short)\w+$}) {
+ (my $file = $_) =~ s{:(.*)}{unicore/$1_alias.pl};
+ if (my @alias = do $file) {
+ alias (@alias);
+ $promote++;
+ next;
+ }
+ }
+ if ($_ eq "alias" && @_) {
+ (my $file = shift) =~ s{:(.*)}{unicore/$1_alias.pl};
+ if (my @alias = do $file) {
+ alias (@alias);
+ next;
+ }
+ }
+ push @args, $_;
+ }
+ @args == 0 && $promote and @args = (":full");
+ @h{@args} = (1) x @args;
$^H{charnames_full} = delete $h{':full'};
$^H{charnames_short} = delete $h{':short'};
@@ -343,6 +380,44 @@ state of C<bytes>-flag as in:
}
}
+=head1 Custom Aliases
+
+This version of charnames supports three mechanisms of adding local
+or customized aliases to standard Unicode naming conventions (:full)
+
+=head2 Anonymous hashes
+
+ use charnames ":full", {
+ e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE",
+ };
+ my $str = "\N{e_ACUTE}";
+
+=head2 Alias pairs
+
+ use charnames ":full", alias => "pro";
+
+ will try to read "unicore/pro_alias.pl" from the @INC path. This
+ file should return a list:
+
+ #!/usr/bin/perl
+ (
+ A_GRAVE => "LATIN CAPITAL LETTER A WITH GRAVE",
+ A_CIRCUM => "LATIN CAPITAL LETTER A WITH CIRCUMFLEX",
+ A_DIAERES => "LATIN CAPITAL LETTER A WITH DIAERESIS",
+ A_TILDE => "LATIN CAPITAL LETTER A WITH TILDE",
+ A_BREVE => "LATIN CAPITAL LETTER A WITH BREVE",
+ A_RING => "LATIN CAPITAL LETTER A WITH RING ABOVE",
+ A_MACRON => "LATIN CAPITAL LETTER A WITH MACRON",
+ );
+
+=head2 Alias shortcut
+
+ use charnames ":pro";
+
+ works exactly the same as the alias pairs, only this time,
+ ":full" is inserted automatically as first argument (if no
+ other argument is given).
+
=head1 charnames::viacode(code)
Returns the full name of the character indicated by the numeric code.
I'd appreciate a voice from Hugo in the matter, since I have to *use* it
shortly, and knowing that things won't break in the near future would help :)
Take 3. Now really tested :)
bev a5:/pro/tu/bev/3gl/ars 135 > head -9 ZPS.pl
#!/pro/bin/perl
use strict;
use warnings;
use charnames ":full", { u_TILDE => "LATIN SMALL LETTER U WITH TILDE" };
print "\N{u_TILDE}\n";
__END__
bev a5:/pro/tu/bev/3gl/ars 136 > ZPS.pl
Wide character in print at ZPS.pl line 7.
?
bev a5:/pro/tu/bev/3gl/ars 137 > perl -Mcharnames=:pro -le'print"\N{u_TILDE}"'
Wide character in print at -e line 1.
?
bev a5:/pro/tu/bev/3gl/ars 138 > perl -Mcharnames=:full,alias,pro -le'print"\N{u_TILDE}"'
Wide character in print at -e line 1.
?
bev a5:/pro/tu/bev/3gl/ars 139 >
The question marks actually show a u-TILDE in Unicode :)
--- /pro/lib/perl5/5.8.0/charnames.pm 2002-05-31 13:07:59.000000000 +0200
+++ charnames.pm 2002-09-27 10:59:47.000000000 +0200
@@ -38,8 +38,18 @@
@@ -155,8 +168,31 @@ sub import
##
## fill %h keys with our @_ args.
##
- my %h;
- @h{@_} = (1) x @_;
+ my ($promote, %h, @args) = (0);
@@ -343,6 +379,44 @@ sub vianame
do() updates %INC. You may want to take advantage of this.
Or you can use require() -- when it actually loads the file, it
returns the file's return value; when the file has already been
loaded, it returns 1.
$ cat foo.pl
42;
$ perl -le 'print do "foo.pl";print do "foo.pl"'
42
42
$ perl -le 'print require "foo.pl";print require "foo.pl"'
42
1
This as an addition, to prevent unneeded do's:
--- /pro/lib/perl5/5.8.0/charnames.pm 2002-09-27 11:04:32.000000000 +0200
+++ charnames.pm 2002-09-27 13:17:23.000000000 +0200
@@ -41,7 +41,7 @@
my %alias3 = (
# User defined aliasses. Even more convenient :)
);
-my $txt;
+my ($txt, %aliased);
sub alias (@)
{
@@ -176,6 +176,7 @@ sub import
}
if ($_ =~ m{:(?!full|short)\w+$}) {
(my $file = $_) =~ s{:(.*)}{unicore/$1_alias.pl};
+ $aliased{$file}++ and next;
if (my @alias = do $file) {
alias (@alias);
$promote++;
@@ -184,6 +185,7 @@ sub import
}
if ($_ eq "alias" && @_) {
(my $file = shift) =~ s{(.*)}{unicore/$1_alias.pl};
+ $aliased{$file}++ and next;
if (my @alias = do $file) {
alias (@alias);
next;
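One thing to keep in mind with "$aliased{$file}++ and next" is that the file is
marked as seen before the do is attempted, so a file that failed to load is
not retried either. A minimal stand-alone sketch of a guard that only
remembers successful loads follows; load_once() and the stub loader are
hypothetical names used for illustration, with the stub standing in for
"do $file".

```perl
use strict;
use warnings;

my %aliased;   # alias files that were loaded successfully

# Hypothetical helper: run $loader->($file) unless $file was loaded before.
# A load that returns nothing is not recorded, so it can be retried later.
sub load_once {
    my ($file, $loader) = @_;
    return if $aliased{$file};
    my @alias = $loader->($file) or return;
    $aliased{$file} = 1;
    return @alias;
}

# Stub loader standing in for "do $file"; it counts how often it is called.
my $calls = 0;
my $stub  = sub {
    $calls++;
    return (u_TILDE => "LATIN SMALL LETTER U WITH TILDE");
};

load_once("unicore/pro_alias.pl", $stub) for 1 .. 3;
print "loader ran $calls time(s)\n";
```

Whether retrying a failed file is actually desirable is open; skipping it
silently, as the patch does, is at least cheap.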
> On Thu 26 Sep 2002 15:16, Jarkko Hietaniemi <j...@iki.fi> wrote:
> > Make it
> >
> > use charname alias => "pro";
> >
> > and the pro_alias.pl naming and I'm happy.
>
> Scan from the front, and keep the rest of the proposal the same?
>
> t.i. if it is the only argument, promote to default :full?
>
> I can live with that. I guess.
> Ahh, one more argument *in favour* of :pro will be
>
> # perl -Mcharnames\ q#:pro# -le'print "\N{e_ACUTE}"
>
> to be much easier to type than the alias convention.
Which would be:
perl -Mcharnames=alias,pro -le'print "\N{e_ACUTE}"
Not hard at all.
Regards,
Slaven
--
Slaven Rezic - slaven...@berlin.de
babybike - routeplanner for cyclists in Berlin
handheld (e.g. Compaq iPAQ with Linux) version of bbbike
http://bbbike.sourceforge.net
Well, I'm happy to adopt it in principle, but as always in the development
track there are no guarantees that we won't have thrown it out again by
the time we get to an actual release.
When you've got that far, please submit a patch with the new code as
well as tests and docs. Please ensure that the tests cater for compile
errors in the alias file, among other things.
One additional point: the "alias => pro" format is definitely the right
direction to go, else we'll be introducing nasty subtle cross-version
breakage any time we introduce a new export tag in the future.
Hugo
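To make the point about the keyword-value form concrete, here is a minimal
stand-alone sketch of an argument scan in which only "alias => NAME" can name
an external file, so that every ":tag" stays reserved for charnames itself.
The function scan_import_args() is a hypothetical name for this example and is
not the code from the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the argument scan: split an import
# list into built-in tags, alias file names, and inline alias hashes.
sub scan_import_args {
    my @in = @_;
    my (@tags, @files, %inline);
    while (@in) {
        my $arg = shift @in;
        if (ref $arg eq "HASH") {
            # Inline aliases, as in: use charnames ":full", { ... };
            @inline{keys %$arg} = values %$arg;
        }
        elsif ($arg eq "alias" && @in) {
            # Keyword-value pair: the next argument names the alias file.
            push @files, "unicore/" . shift(@in) . "_alias.pl";
        }
        else {
            # Everything else is left for charnames' own tag handling.
            push @tags, $arg;
        }
    }
    return (\@tags, \@files, \%inline);
}

my ($tags, $files, $inline) = scan_import_args(
    ":short", alias => "pro",
    { A_TILDE => "LATIN CAPITAL LETTER A WITH TILDE" },
);
print "tags:   @$tags\n";
print "files:  @$files\n";
print "inline: $_\n" for sort keys %$inline;
```

With this split, a tag that is added to charnames later can never collide with
a user's alias file name, because file names never appear in tag position.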
N'ah :) We don't throw out things that are *that* useful
> When you've got that far, please submit a patch with the new code as
> well as tests and docs. Please ensure that the tests cater for compile
> errors in the alias file, among other things.
Will craft it together. Promise. (I will even try to use the native layout,
promise)
> One additional point: the "alias => pro" format is definitely the right
> direction to go, else we'll be introducing nasty subtle cross-version
> breakage any time we introduce a new export tag in the future.
As long as we /document/ the taken ones, well ...
OK, I admit. Alias it'll be :/
Anonymous hashes can stay?
> Hugo