Howdy all.
Apologies for the delay; some family issues came up, so I've been tangled up
in that.
Not much happened last week. I organized the commits, caught a couple of
warnings that I had missed, reviewed things a bit (a handful of things I
cleaned up in toke.c weren't paying attention to XIDC, only XIDS), and kept
trying to figure out that do FILE failure.
About the latter: Preloading the swash breaks t/comp/utf.t - I'm not sure
why. There's a way to get both cases working, somewhat, but it's basically
just piling more hacks on top of the originally not-too-pleasing solution,
so nothing close to a resolution yet. I guess we'll deal with it afterwards.
I might be able to tie it in with the swallow_bom part now, so it could be
for the best eventually, even if somewhat discouraging right now.
About swallow_bom(): I've been giving some thought to Nicholas' suggestion
of changing the custom filter to an encoding layer. That seems like a winner
to me, but I seem to recall reading that PerlIO & Encode don't handle BOM'd
streams all too well. Admittedly, it was a 5-6 year old post which I can't
seem to track down right now, but would that be a worry here?
I also did some tinkering with normalization; with lexicals, as expected,
it's really rather trivial to implement (it only requires two calls to
Unicode::Normalize::Etc(), one when storing and one when fetching), but
that's low-hanging fruit.
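
To make the "two calls" idea concrete, here is a rough pure-Perl sketch. It is not the actual patch: the pad is modeled as a plain hash, pad_store() and pad_fetch() are hypothetical names, and NFC is only assumed as the form standing in for the Etc() placeholder above.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical stand-in for a pad: a plain hash keyed by variable name.
my %pad;

# First call: normalize the name when it is stored.
sub pad_store {
    my ($name, $value) = @_;
    $pad{ NFC($name) } = $value;
    return;
}

# Second call: normalize the name again when it is fetched.
sub pad_fetch {
    my ($name) = @_;
    return $pad{ NFC($name) };
}

my $composed   = "ni\x{F1}o";    # U+00F1 LATIN SMALL LETTER N WITH TILDE
my $decomposed = "nin\x{303}o";  # "n" followed by U+0303 COMBINING TILDE

pad_store($composed, 42);

# Either spelling reaches the same entry, since both normalize to one key.
print pad_fetch($decomposed), "\n";
```

This only works because every name goes through the same two choke points, which is exactly what lexicals give you and globals don't.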
Having given GVs and stashes a thought, I've come to realize that Nicholas'
original assessment was spot-on, my original optimism be damned. Not only do
you have to normalize strings passed in for lookup, but since we don't
enforce a normalization form by default, you can't rely on the stash keys
being properly normalized either!
And what do you do about, say, labels? Or package names? What if a package
has a different normalization form?
Plus, if I type in $::ni\x{F1}o and later do keys %::, I want that to come
out, not "nin\x{303}o" - So it's WASUTF8 all over again there.
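
Both halves of the problem can be shown with ordinary hashes. This is only an illustration under assumptions: %stash and %nfd_stash are hypothetical plain hashes, not real symbol tables, and NFC/NFD are picked arbitrarily as the competing forms.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $typed = "ni\x{F1}o";     # what the user wrote (already NFC)
my $other = "nin\x{303}o";   # the same text in NFD

# Hypothetical stash that was filled without any normalization.
my %stash = ($other => 'from NFD source');

# Normalizing only the lookup key misses the unnormalized entry.
my $found = exists $stash{ NFC($typed) } ? 1 : 0;

# Normalizing on store fixes lookups, but changes what keys() hands back.
my %nfd_stash;
$nfd_stash{ NFD($typed) } = 1;
my ($key) = keys %nfd_stash;
my $round_trips = $key eq $typed ? 1 : 0;

printf "lookup found: %d, key round-trips: %d\n", $found, $round_trips;
```

Both flags come out 0: the first because the stored key was never normalized, the second because the stored key is no longer the spelling that was typed.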
Is there a way out of this that doesn't require the core to normalize by
default?
..And that's about it for the report, unfortunately. Back to rebasing.