.sub '' :main
    load_bytecode 'PGE.pbc'
    load_bytecode 'PGE/Glob.pbc'
    $P1 = find_global 'PGE', 'glob'   # the glob compiler
    $S1 = unicode:"\u03b1"            # lowercase Greek alpha
    $S1 = downcase $S1
    $S2 = unicode:"\u0391"            # capital Greek alpha
    $S2 = downcase $S2
    $P2 = $P1($S1)                    # create a glob rule
    $P3 = $P2($S2)                    # does it match?
    $I0 = $P3."__get_bool"()
    print $I0
    print "\n"
.end
This should compare the capital and lowercase Greek alpha
case-insensitively. The downcase has the desired effect and generates
two identical strings. However, when we use PGE to compare them, we
end up with a malformed string error.
A few notes:
1) using ascii:"A" and ascii:"a", this prints 1.
2) using unicode:"A" and ascii:"a", this prints 1.
3) If I 'escape' both the pattern and the string, this still prints 0.
4) If I escape the pattern twice (once for unicode, once for
backslashes), and the string once, this prints 1! Yay!*
5) However, if I turn on #4, then I get four test failures in Tcl's
[string] tests. You can simulate this here by combining #1 and #4,
which gives a bus error on my OS X box.
So, are all these ways of preparing arguments to Glob incorrect (and
if so, what's the right way?), or does this behavior point to a bug?
Regards.
*Wait for it....
It's a bug, now fixed in r10020. When switching to the new escape
opcode I forgot that we sometimes have "\u" sequences in the escaped
string (Data::Escape didn't produce these).
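For illustration, here is a minimal sketch (not taken from PGE itself)
of what the escape opcode does with the lowercase alpha from the test
case. It assumes a Parrot build that has the escape opcode; the sub
name and register names are arbitrary.

```pir
.sub 'main' :main
    # Lowercase Greek alpha, as in the test case above.
    $S0 = unicode:"\u03b1"
    # Escape it; non-ASCII characters come back as backslash sequences.
    $S1 = escape $S0
    print $S1
    print "\n"
.end
```

Any code that consumes the escaped form has to decode such "\u"
sequences as well as the ordinary backslash escapes.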
The answer to your other question ("how should arguments be prepared
for Glob") is that PGE is intended to be able to process strings and
patterns of any charset and encoding; one shouldn't need to do any
escaping or processing of arguments in order for them to work.
Pm