open my $fh, ">:raw:encoding(UTF-16LE):crlf:utf8", $filename or die "Cannot open $filename: $!";
print $fh "\x{feff}";
But this isn't very intuitive. I wonder if either PerlIOCrlf_pushed()
should "inherit" the flag from the lower layer, or if PerlIO_isutf8()
should walk the layer stack?
Cheers,
-Jan
>I wonder if either PerlIOCrlf_pushed()
>should "inherit" the flag from the lower layer,
That would be my preference, I think.
Perhaps that should be the default behaviour for a layer?
>or if PerlIO_isutf8()
>should walk the layer stack?
The snag with that (apart from the overhead) is that something would have to
"know" which layers did or did not affect UTF8-ness. So it is
better for the layer code to do it, since it should know.
>
>Cheers,
>-Jan
Does the attached patch look right to you?
> Perhaps that should be the default behaviour for a layer?
Probably, except for the :encoding, :raw and :utf8 layers. Any other
exceptions?
If you think the patch below is the right way to do it, then I can
try to add it to all the other PerlIOXxxx_pushed() functions too.
Or is there anything else that needs to be done?
Cheers,
-Jan
--- perlio.c.orig Wed Apr 05 07:47:13 2006
+++ perlio.c Thu Apr 06 18:06:24 2006
@@ -4192,6 +4192,21 @@
* buffer */
} PerlIOCrlf;
+/* Inherit the PERLIO_F_UTF8 flag from previous layer.
+ * Otherwise the :crlf layer would always revert back to
+ * raw mode.
+ */
+static void
+S_inherit_utf8_flag(PerlIO *f)
+{
+    PerlIO *g = PerlIONext(f);
+    if (PerlIOValid(g)) {
+        if (PerlIOBase(g)->flags & PERLIO_F_UTF8) {
+            PerlIOBase(f)->flags |= PERLIO_F_UTF8;
+        }
+    }
+}
+
IV
PerlIOCrlf_pushed(pTHX_ PerlIO *f, const char *mode, SV *arg, PerlIO_funcs *tab)
{
@@ -4209,17 +4224,19 @@
* any given moment at most one CRLF-capable layer being enabled
* in the whole layer stack. */
PerlIO *g = PerlIONext(f);
- while (g && *g) {
+ while (PerlIOValid(g)) {
PerlIOl *b = PerlIOBase(g);
if (b && b->tab == &PerlIO_crlf) {
if (!(b->flags & PERLIO_F_CRLF))
b->flags |= PERLIO_F_CRLF;
+ S_inherit_utf8_flag(g);
PerlIO_pop(aTHX_ f);
return code;
}
g = PerlIONext(g);
}
}
+ S_inherit_utf8_flag(f);
return code;
}
End of Patch.
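For readers without a perl source tree at hand, the idea behind the patch can be sketched as a standalone toy model. Everything here (`toy_layer`, `TOY_F_UTF8`, `toy_push`, `toy_isutf8`) is hypothetical and is not the real perlio.c API. It assumes, as the discussion above implies, that the "is this handle UTF-8?" check looks only at the top layer's flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the PerlIO types; names and values are hypothetical
 * and are not the real perlio.c definitions. */
#define TOY_F_UTF8 0x01u

typedef struct toy_layer {
    struct toy_layer *next;   /* layer below this one, NULL at the bottom */
    unsigned flags;
    const char *name;
} toy_layer;

/* Copy the UTF-8 flag up from the layer directly below, if there is one. */
static void
toy_inherit_utf8_flag(toy_layer *f)
{
    assert(f != NULL);
    if (f->next != NULL && (f->next->flags & TOY_F_UTF8))
        f->flags |= TOY_F_UTF8;
}

/* Push a freshly zeroed layer on top of `below` and let it inherit. */
static toy_layer *
toy_push(toy_layer *layer, toy_layer *below, const char *name)
{
    assert(layer != NULL);
    layer->next = below;
    layer->flags = 0;
    layer->name = name;
    toy_inherit_utf8_flag(layer);
    return layer;
}

/* Assumed behaviour of the UTF-8 query: only the top layer's flag counts. */
static int
toy_isutf8(const toy_layer *top)
{
    return top != NULL && (top->flags & TOY_F_UTF8) != 0;
}
```

In this model, pushing a "crlf" layer on top of a layer that already carries the flag leaves the handle reporting UTF-8, which is the effect the patch aims for without the trailing :utf8 workaround.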
>Probably, except for the :encoding,
Certainly. That always sets UTF-8 on the perl side - that is its job.
>:raw and :utf8 layers.
Don't really exist ;-) Trying to push them results in manipulation
of other layers. So yes, they are exceptions too.
>Any other
>exceptions?
Well something like the gzip layer (not core) is a bit different.
As the zipped side is octets it should probably complain if downstream
was expecting UTF-8.
So I am now wondering if it is only "buffering layers" that should do this,
and whether a global default is premature.
So perhaps PerlIO_buf should have this added, leaving the others alone?
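One possible reading of that distinction, as a standalone toy model: a buffering layer inherits the flag from below, while an octet-transforming layer (gzip-like) refuses to be pushed when the layer below is marked UTF-8. All names here (`pol_layer`, `pol_kind`, `pol_push`) are hypothetical, and the "refuse" policy is only an assumption about what "complain" might mean. None of this is existing perlio.c behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in perlio.c. */
#define POL_F_UTF8 0x01u

typedef enum {
    POL_BUFFERING,        /* passes data through unchanged, e.g. a buffer */
    POL_OCTET_TRANSFORM   /* lower side is raw octets, e.g. a compressor */
} pol_kind;

typedef struct pol_layer {
    struct pol_layer *next;  /* layer below, NULL at the bottom */
    unsigned flags;
    pol_kind kind;
} pol_layer;

/* Push `layer` on top of `below`. Returns 0 on success, or -1 if an
 * octet-transforming layer is refused because the layer below is marked
 * UTF-8 but would be handed octets. */
static int
pol_push(pol_layer *layer, pol_layer *below, pol_kind kind)
{
    int below_utf8;

    assert(layer != NULL);
    layer->next = NULL;
    layer->flags = 0;
    layer->kind = kind;

    below_utf8 = below != NULL && (below->flags & POL_F_UTF8) != 0;
    if (kind == POL_OCTET_TRANSFORM && below_utf8)
        return -1;                     /* "complain" instead of inheriting */

    layer->next = below;
    if (kind == POL_BUFFERING && below_utf8)
        layer->flags |= POL_F_UTF8;    /* buffering layers inherit */
    return 0;
}
```

Keeping the decision inside each layer's push routine matches the earlier point that the layer code itself is best placed to know whether it affects UTF8-ness.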
>
>If you think the patch below is the right way to do it,
I was a little worried that S_inherit_utf8_flag only ever set and never
cleared the flag. Then I realized that at the point of call nothing else would
have set it.
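The difference between "set only" and "set or clear" can be sketched on bare flag words. The two helpers below are hypothetical and only illustrate the reasoning above: with the layer's own flag still clear, both variants agree, and they differ only if the flag was already set before the call.

```c
#include <assert.h>

/* Hypothetical flag value, standing in for PERLIO_F_UTF8. */
#define DEMO_F_UTF8 0x01u

/* Set-only variant: copies the flag up but never clears it. */
static unsigned
demo_inherit_set_only(unsigned own, unsigned below)
{
    if (below & DEMO_F_UTF8)
        own |= DEMO_F_UTF8;
    return own;
}

/* Set-or-clear variant: makes the flag match the layer below. */
static unsigned
demo_inherit_sync(unsigned own, unsigned below)
{
    if (below & DEMO_F_UTF8)
        own |= DEMO_F_UTF8;
    else
        own &= ~DEMO_F_UTF8;
    return own;
}
```

If some later sequence of binmode calls can reach the helper with the flag already set, the set-or-clear form would be the one that keeps the two layers in step.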
>then I can
>try to add it to all the other PerlIOXxxx_pushed() functions too.
>Or is there anything else that needs to be done?
I would not be surprised if some sequence of binmode-ing on layers
needs some more. But this seems like a good start.
>
>Cheers,
>-Jan
>
>--- perlio.c.orig Wed Apr 05 07:47:13 2006
>+++ perlio.c Thu Apr 06 18:06:24 2006
>@@ -4192,6 +4192,21 @@
> * buffer */
> } PerlIOCrlf;
>
>+/* Inherit the PERLIO_F_UTF8 flag from previous layer.
>+ * Otherwise the :crlf layer would always revert back to
>+ * raw mode.
>+ */
>+static void
>+S_inherit_utf8_flag(PerlIO *f)
>+{
>+ PerlIO *g = PerlIONext(f);
>+ if (PerlIOValid(g)) {
>+ if (PerlIOBase(g)->flags & PERLIO_F_UTF8) {
>+ PerlIOBase(f)->flags |= PERLIO_F_UTF8;
>+ } else {
Thanks, applied as change #28879.