[PATCH?] ref() returns more information in list context

John Peacock

Aug 14, 2002, 4:48:18 AM
to perl5-...@perl.org
Patch after sig. Count me frustrated, since my code works just fine except for
the fact that ref() is never evaluated in a list context! This doesn't seem to
do anything:

if ( GIMME_V != G_ARRAY )

even though I thought I had done what was needed to change OP_REF from a UNI to
a LOP. E.g.

$ ./perl -e '$v=1000; bless \$v, "my_class";print join "\t", ref \$v;'
my_class

If I comment out that first branch of the if and just run the else branch, then
it looks fine:

$ ./perl -e '$v=1000; bless \$v, "my_class";print join "\t", ref \$v;'
my_class SCALAR 0x8177314

I hope that someone wiser than me in the ways of OPs can see the flaw in my
ointment... ;~(

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747


--- perl.test/sv.c.orig Sat Aug 10 21:40:37 2002
+++ perl.test/sv.c Wed Aug 14 04:10:29 2002
@@ -2932,7 +2932,7 @@ Perl_sv_2pv_flags(pTHX_ register SV *sv,
     if (SvTHINKFIRST(sv)) {
         if (SvROK(sv)) {
             SV* tmpstr;
-            if (SvAMAGIC(sv) && (tmpstr=AMG_CALLun(sv,string)) &&
+            if (flags && SvAMAGIC(sv) && (tmpstr=AMG_CALLun(sv,string)) &&
                 (SvTYPE(tmpstr) != SVt_RV || (SvRV(tmpstr) != SvRV(sv))))
                 return SvPV(tmpstr,*lp);
             sv = (SV*)SvRV(sv);
--- perl.test/toke.c.orig Tue Jul 9 12:08:32 2002
+++ perl.test/toke.c Wed Aug 14 03:59:49 2002
@@ -4811,7 +4811,7 @@ Perl_yylex(pTHX)
         UNI(OP_READLINK);

     case KEY_ref:
-        UNI(OP_REF);
+        LOP(OP_REF,XTERM);

     case KEY_s:
         s = scan_subst(s);
--- perl.test/pp.c.orig Tue Jun 18 16:26:49 2002
+++ perl.test/pp.c Wed Aug 14 04:45:02 2002
@@ -487,9 +487,11 @@ S_refto(pTHX_ SV *sv)

 PP(pp_ref)
 {
-    dSP; dTARGET;
+    dSP; dMARK;
     SV *sv;
-    char *pv;
+    char *start;
+    char *end;
+    STRLEN len;

     sv = POPs;

@@ -499,9 +501,40 @@ PP(pp_ref)
     if (!sv || !SvROK(sv))
         RETPUSHNO;

-    sv = SvRV(sv);
-    pv = sv_reftype(sv,TRUE);
-    PUSHp(pv, strlen(pv));
+    start = sv_2pv_flags(sv,&len,0);
+
+    for ( end = start ; *end != '=' && *end != '(' ; end++ );
+
+    if ( GIMME_V != G_ARRAY )
+    {
+        PUSHs( sv_2mortal( /* CLASS or TYPE */
+            newSVpvn( (const char*)start, end-start) )
+        );
+    }
+    else
+    {
+        MEXTEND(MARK,3);
+
+        if ( *end == '=' )
+        {
+            PUSHs( sv_2mortal( /* CLASS */
+                newSVpvn( (const char*)start, end-start) )
+            );
+
+            for ( start = ++end; *end != '(' ; end++ );
+        }
+        else
+            PUSHs(&PL_sv_undef); /* no CLASS */
+
+        PUSHs( sv_2mortal( /* TYPE */
+            newSVpvn( (const char*)start, end-start) )
+        );
+
+        for ( start = ++end; *end != ')' ; end++ );
+        PUSHs( sv_2mortal( /* ADDR */
+            newSVpvn( (const char*)start, end-start) )
+        );
+    }
     RETURN;
 }


h...@crypt.org

Aug 14, 2002, 10:23:53 AM
to John Peacock, perl5-...@perl.org
John Peacock <jpea...@rowman.com> wrote:
:Patch after sig. Count me frustrated [...]
:I thought I had done what was needed to change OP_REF from a UNI to
:a LOP.
[...]
:--- perl.test/toke.c.orig Tue Jul 9 12:08:32 2002
:+++ perl.test/toke.c Wed Aug 14 03:59:49 2002
:@@ -4811,7 +4811,7 @@ Perl_yylex(pTHX)
:         UNI(OP_READLINK);
:
:     case KEY_ref:
:-        UNI(OP_REF);
:+        LOP(OP_REF,XTERM);
:

That doesn't look like the intended patch.

Hugo

John Peacock

Aug 14, 2002, 10:32:54 AM
to h...@crypt.org, perl5-...@perl.org

No, that's wrong; I was up most of the night after my son woke up, and frankly I
was thrashing around. I realize now that that change is the wrong side of the
equation; ref() should still only take a single parameter.

However, I still don't know what is preventing me from testing whether ref() is
in a list or scalar context, so I can branch accordingly...

Help???

Graham Barr

Aug 14, 2002, 10:41:43 AM
to John Peacock, perl5-...@perl.org
On Wed, Aug 14, 2002 at 04:48:18AM -0400, John Peacock wrote:
> Patch after sig. Count me frustrated, since my code works just fine except for
> the fact that ref() is never evaluated in a list context! This doesn't seem to
> do anything:
>
> if ( GIMME_V != G_ARRAY )
>
> even though I thought I had done what was needed to change OP_REF from a UNI to
> a LOP. E.g.
>
> $ ./perl -e '$v=1000; bless \$v, "my_class";print join "\t", ref \$v;'
> my_class
>
> If I comment out that first branch of the if and just run the else branch, then
> it looks fine:
>
> $ ./perl -e '$v=1000; bless \$v, "my_class";print join "\t", ref \$v;'
> my_class SCALAR 0x8177314

This will have serious backward compatibility problems. Anyone who is using
the result of ref() as a sub argument, or the result of a map, will have their
code broken if you change ref() to return a list in a list context.
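
To make the compatibility concern concrete, here is a minimal sketch. The sub
ref_list() is a hypothetical stand-in for the patched ref(), written with
wantarray; it is not a real builtin, and it assumes a Scalar::Util that
exports refaddr(). It only shows how an existing key/value list would change
shape.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype refaddr);

# Hypothetical stand-in for the patched ref(): one value in scalar
# context, (class, type, address) in list context.
sub ref_list {
    my ($r) = @_;
    return ref $r unless wantarray;
    return (blessed($r), reftype($r), sprintf('0x%x', refaddr($r)));
}

my $obj = bless [], 'Foo';

# Existing idiom: ref() supplies exactly one element of the list.
my @old = (class => ref($obj), size => 3);

# Same idiom with a list-returning ref(): two extra elements appear
# and every later key/value pair shifts.
my @new = (class => ref_list($obj), size => 3);

print scalar(@old), " elements with ref()\n";
print scalar(@new), " elements with ref_list()\n";
```

Assigning either list to a hash shows the damage: with the second one, "size"
is no longer a key.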

Can you explain why you want this ? Of course it is already possible to get
this info

$ perl -MScalar::Util=reftype -l
$r = bless [], "Foo";
print ref($r);
print reftype($r);
printf "0x%08X\n", $r;
__END__
Foo
ARRAY
0x0804E054


Graham.

Yves Orton

Aug 14, 2002, 11:07:16 AM
to Graham Barr, John Peacock, perl5-...@perl.org
> Can you explain why you want this ? Of course it is already
> possible to get
> this info
>
> $ perl -MScalar::Util=reftype -l
> $r = bless [], "Foo";
> print ref($r);
> print reftype($r);
> printf "0x%08X\n", $r;
> __END__
> Foo
> ARRAY
> 0x0804E054

Well, the third one will only work if the reference is not blessed into a
package that overloads numeric operations (and, unlike overload::StrVal for
the stringification case, I know of no way to bypass that).

I believe the idea is to provide a single way to extract this information.
That is one that would live in core and would not require parsing the output
of overload::StrVal. Although personally I would like it to return four
elements: what ref() would return if the reference was not blessed
(i.e. SCALAR, REF, etc.), what reftype() returns, what class the object is
blessed into, and the id. And I'd like to see qr// constructs treated as a
base type and not as a class.

Ie : uberref(\\$foo) #REF SCALAR '' '0xDeadBeef'
: uberref(qr/foo/) #REGEX SCALAR '' '0xDeadBeef'
: uberref($z=bless qr/foo/,'foo') #REGEX SCALAR 'foo' '0xDeadBeef'
: uberref(sub{}) #CODE SCALAR '' '0xDeadBeef'
: uberref(\$z) #SCALAR SCALAR '' '0xDeadBeef'
: uberref(\*foo) #GLOB SCALAR '' '0xDeadBeef'
: uberref("foo") #'' SCALAR '' ''
: uberref(*foo) #'' GLOB '' ''
: uberref([]) #ARRAY ARRAY '' '0xDeadBeef'
: uberref({}) #HASH HASH '' '0xDeadBeef'

...ykwim..

But I agree that changing the behaviour of ref() isn't the way to go. A new
keyword that provided reliable and robust type introspection for all cases
would be very useful, I believe. (Also, finding the name of a glob is
annoying, as is finding out what pattern a reblessed qr// contains; the
latter is, AFAIK, impossible.)

just my $0.02

:-)

yves

John Peacock

Aug 14, 2002, 11:32:38 AM
to Graham Barr, perl5-...@perl.org
Graham Barr wrote:
> This will have serious backward compatability problems. Anyone who is using
> the result of ref() as a sub argument, or the result of a map, will have thier
> code broken if you change ref() to return a list in a list context.

No doubt. This was more a proof of concept patch than something intended to be
applied. A small change to sv_2pv_flags permits most of the desired information
to be generated even for objects with AMAGIC.

>
> Can you explain why you want this ? Of course it is already possible to get
> this info
>
> $ perl -MScalar::Util=reftype -l
> $r = bless [], "Foo";
> print ref($r);
> print reftype($r);
> printf "0x%08X\n", $r;
> __END__
> Foo
> ARRAY
> 0x0804E054
>

Not completely:

$ perl -MScalar::Util=reftype -MMath::BigInt -l
$v = new Math::BigInt "12000";
print ref($v);
print reftype($v);
printf "0x%08X\n", $v;
__END__
Math::BigInt
HASH
0x00002EE0

h...@crypt.org

Aug 14, 2002, 11:40:59 AM
to Orton, Yves, perl5-...@perl.org
"Orton, Yves" <yves....@mciworldcom.de> wrote:
:I believe the idea is to provide a single way to extract this information.
:That is one that would live in core and would not require parsing the output
:of overload::StrVal. Although personally I would like it to return four
:elements, that of what ref() would return if the reference was not blessed
:(ie SCALAR REF etc), what reftype() returns, what class the object is
:blessed into, and the id. And id like to see qr// constructs treated as a
:base type and not as a class.
:
:Ie : uberref(\\$foo) #REF SCALAR '' '0xDeadBeef'

[...]

Why does it need to be in the core? We already have some stuff in
Scalar::Util, and that seems the best place to put such a new function.

Other than that, I think the '' in your examples should all be C<undef>
instead; dunno whether we can handle regexes the way you suggest, but
I guess we'll find out if someone tries it.

Hugo

Graham Barr

Aug 14, 2002, 11:33:30 AM
to Orton, Yves, John Peacock, perl5-...@perl.org
On Wed, Aug 14, 2002 at 04:07:16PM +0100, Orton, Yves wrote:
> > Can you explain why you want this ? Of course it is already
> > possible to get
> > this info
> >
> > $ perl -MScalar::Util=reftype -l
> > $r = bless [], "Foo";
> > print ref($r);
> > print reftype($r);
> > printf "0x%08X\n", $r;
> > __END__
> > Foo
> > ARRAY
> > 0x0804E054
>
> Well, the third one will only work if the reference is not blessed into a
> package that overloads numeric operations. (for which there is no way to
> bypass that I know of, unlike overload::StrVal for the stringification case)

Very true.

> I believe the idea is to provide a single way to extract this information.
> That is one that would live in core and would not require parsing the output
> of overload::StrVal. Although personally I would like it to return four
> elements, that of what ref() would return if the reference was not blessed
> (ie SCALAR REF etc), what reftype() returns, what class the object is
> blessed into, and the id. And id like to see qr// constructs treated as a
> base type and not as a class.

I don't see the use of 4; your 2nd column below is wrong, as it
should be the same as the first. reftype() always returns what ref()
would have returned if the reference was not blessed.

So we already have 2 of the three needed (reftype() and blessed()).
All we need is a way to get the reference as an integer as if the
reference was not overloaded. If people want that, a sub to do it
could quite easily be added to Scalar::Util.
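
As a rough sketch of how the three pieces combine, assuming a Scalar::Util
that exports refaddr() (the sub proposed later in this thread): ref_info() is
a hypothetical helper name used only for illustration, not part of any module.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype refaddr);

# Hypothetical helper: returns the class (undef if unblessed), the
# underlying type, and the address; empty list for a non-reference.
sub ref_info {
    my ($r) = @_;
    return unless ref $r;
    return (blessed($r), reftype($r), refaddr($r));
}

my $object = bless {}, 'Foo';
my ($class, $type, $addr) = ref_info($object);
printf "%s %s 0x%x\n", $class, $type, $addr;

my ($class2, $type2) = ref_info([]);
print defined $class2 ? $class2 : 'undef', " $type2\n";

my @none = ref_info("plain string");
print scalar(@none), " values for a non-reference\n";
```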

Graham.

Graham Barr

Aug 14, 2002, 11:34:52 AM
to John Peacock, perl5-...@perl.org
On Wed, Aug 14, 2002 at 11:32:38AM -0400, John Peacock wrote:
> Graham Barr wrote:
> > This will have serious backward compatability problems. Anyone who is using
> > the result of ref() as a sub argument, or the result of a map, will have thier
> > code broken if you change ref() to return a list in a list context.
>
> No doubt. This was more a proof of concept patch than something intended to be
> applied. A small change to sv_2pv_flags permits most of the desired information
> to be generated even for objects with AMAGIC.
>
> >
> > Can you explain why you want this ? Of course it is already possible to get
> > this info
> >
> > $ perl -MScalar::Util=reftype -l
> > $r = bless [], "Foo";
> > print ref($r);
> > print reftype($r);
> > printf "0x%08X\n", $r;
> > __END__
> > Foo
> > ARRAY
> > 0x0804E054
> >
>
> Not completely:

Right. All we need is a new sub in Scalar::Util that overcomes any overloadness
for the numeric conversion

Graham.

John Peacock

Aug 14, 2002, 11:38:45 AM
to Orton, Yves, Graham Barr, perl5-...@perl.org
Orton, Yves wrote:
> Well, the third one will only work if the reference is not blessed into a
> package that overloads numeric operations. (for which there is no way to
> bypass that I know of, unlike overload::StrVal for the stringification case)

Actually, if you read the source for overload::StrVal, you'll see that the way
it works is to bless the object into a class that does not have overloading,
then get the stringified object ref, then rebless into the original class. If
there are any other references to that same object, you may cause irreparable
harm.
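
For reference, a small sketch of the "parse the output of overload::StrVal"
approach mentioned above. The Celsius class is made up for the example, and
the regex assumes the usual Class=TYPE(0xADDR) shape of an unoverloaded
stringified reference. Nothing here depends on how StrVal is implemented
internally.

```perl
use strict;
use warnings;
use overload ();

{
    package Celsius;    # hypothetical class with overloaded stringification
    use overload '""' => sub { ${ $_[0] } . ' C' };
    sub new { my ($class, $v) = @_; return bless \$v, $class }
}

my $t = Celsius->new(21);
print "$t\n";                       # goes through the overloaded ""

# StrVal gives the plain reference string, bypassing the overload.
my $raw = overload::StrVal($t);
my ($class, $type, $addr) =
    $raw =~ /^(?:(.+)=)?([A-Z]+)\((0x[0-9a-fA-F]+)\)$/;
print "$class $type $addr\n";
```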

>
> I believe the idea is to provide a single way to extract this information.
> That is one that would live in core and would not require parsing the output
> of overload::StrVal. Although personally I would like it to return four
> elements, that of what ref() would return if the reference was not blessed
> (ie SCALAR REF etc), what reftype() returns, what class the object is
> blessed into, and the id. And id like to see qr// constructs treated as a
> base type and not as a class.
>

I need to look at what reftype() does, but the other three I can already do. As
for changing what qr// constructs are treated as, I think that is impossible.
We are already out of flags to signify base types.

Graham Barr

Aug 14, 2002, 11:47:20 AM
to John Peacock, perl5-...@perl.org
On Wed, Aug 14, 2002 at 04:34:52PM +0100, Graham Barr wrote:
> Right. All we need is a new sub in Scalar::Util that overcomes any overloadness
> for the numeric conversion

And here is such a patch. If people like it I will do a CPAN release
and update the repository.

Graham.

refaddr.pat

Nicholas Clark

Aug 14, 2002, 11:59:46 AM
to Graham Barr, John Peacock, perl5-...@perl.org


> +int
> +refaddr(sv)
> + SV * sv
> +PROTOTYPE: $
> +CODE:
> +{
> + if(!SvROK(sv)) {
> + XSRETURN_UNDEF;
> + }
> + RETVAL = (int)SvRV(sv);
> +}

IV not int, surely? [64 bit addresses on 64 bit platforms with 32 bit ints]

Do the PTR2IV macros work for perl versions as far back as Scalar::Util
supports? I have this gut feeling that they avoid some compilers' warnings
where a plain cast does not, but I could be utterly wrong here.
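
The size mismatch can be checked from Perl for any given build. This is only
a sketch using the Config module's intsize, ptrsize and ivsize entries; it
reports on the perl that runs it and says nothing about other platforms.

```perl
use strict;
use warnings;
use Config;

# Sizes in bytes for this particular perl build.
my ($int, $ptr, $iv) = @Config{qw(intsize ptrsize ivsize)};
print "int=$int ptr=$ptr IV=$iv\n";

# A pointer cast to int is truncated when int is narrower than a pointer.
print $int >= $ptr ? "a pointer fits in an int\n"
                   : "a pointer does NOT fit in an int\n";
print $iv >= $ptr  ? "a pointer fits in an IV\n"
                   : "a pointer does NOT fit in an IV\n";
```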

Nicholas Clark

Yves Orton

Aug 14, 2002, 12:06:04 PM
to h...@crypt.org, perl5-...@perl.org
> Why does it need to be in the core? We already have some stuff in
> Scalar::Util, and that seems the best place to put such a new
> function.

Well, personally, for information this useful I don't think a module should
be required.
But considering Scalar::Util is part of the standard distro, it's not a point
I'll argue too much.



> Other than that, I think the '' in your examples should all
> be C<undef>

Yeah, agreed. I just didn't want to confuse things in my example.

> instead; dunno whether we can handle regexes the way you suggest, but
> I guess we'll find out if someone tries it.

Well, the current way hides the fact that an object also happens to be a
regex. And there is precedent: ref() returns what type of scalar ref we
are dealing with (CODE, GLOB, SCALAR, REF), so what's one more?
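
For readers following along, a short sketch of what ref() reports for the
kinds of reference mentioned here. Note that an unblessed qr// shows up as the
class name "Regexp" rather than as a base type, which is the behaviour being
questioned.

```perl
use strict;
use warnings;

my $x = 1;
my $aref = [];
my $href = {};

my %seen = (
    scalar_ref => ref(\$x),
    ref_ref    => ref(\\$x),
    code_ref   => ref(sub { 1 }),
    glob_ref   => ref(\*STDOUT),
    array_ref  => ref($aref),
    hash_ref   => ref($href),
    regex      => ref(qr/foo/),
    plain      => ref("foo"),       # not a reference: empty string
);

for my $kind (sort keys %seen) {
    printf "%-10s %s\n", $kind,
        $seen{$kind} eq '' ? '(empty string)' : $seen{$kind};
}
```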

yves

Graham Barr

Aug 14, 2002, 12:01:38 PM
to Nicholas Clark, John Peacock, perl5-...@perl.org
On Wed, Aug 14, 2002 at 04:59:46PM +0100, Nicholas Clark wrote:
> On Wed, Aug 14, 2002 at 04:47:20PM +0100, Graham Barr wrote:
> > On Wed, Aug 14, 2002 at 04:34:52PM +0100, Graham Barr wrote:
> > > Right. All we need is a new sub in Scalar::Util that overcomes any overloadness
> > > for the numeric conversion
> >
> > And here is such a patch. If people like it I will do a CPAN release
> > and update the repository.
>
>
> > +int
> > +refaddr(sv)
> > + SV * sv
> > +PROTOTYPE: $
> > +CODE:
> > +{
> > + if(!SvROK(sv)) {
> > + XSRETURN_UNDEF;
> > + }
> > + RETVAL = (int)SvRV(sv);
> > +}
>
> IV not int, surely? [64 bit addresses on 64 bit platforms with 32 bit ints]

Yes, you are right.

> Do the PTR2IV macros work for perl versions as far back as Scalar::Util
> supports? I have this gut feeling that they avoid some compilers' warnings
> where a plain cast does not, but I could be utterly wrong here.

I don't think they do, but we can define them when needed.

Graham.

Yves Orton

Aug 14, 2002, 12:14:54 PM
to Graham Barr, perl5-...@perl.org
Graham Barr on 14 August 2002 17:34 wrote

> On Wed, Aug 14, 2002 at 04:07:16PM +0100, Orton, Yves wrote:
> I dont see the use of 4, your 2nd column below is wrong as it
> should be the same as the first. reftype() always returns what ref()
> would have returned if the reference was not blessed.

Which is my error _and_ my point. :-) The issue here is that ref() returns
a strange mix of class, reference nature, and variable type (and accordingly
so does reftype(), but without the class complexities). This is confusing,
especially, I think, to beginners. What I meant is that the 2nd column should
return only the variable type, which means it's either a SCALAR, GLOB, ARRAY
or HASH (and I might argue that CONST should be added as well). The first
column should be the reference type (so it's undef for a non-ref like a glob
or plain scalar), the third is the class (and IMO, in the case of globs (not
globrefs!), their name), with the fourth being the id.



> So we already have 2 of the three needed (reftype() and blessed()).
> All we need is a way to get the reference as an integer as if the
> reference was not overloaded. If people want that a sub to do it
> could quite easily be added to Scalar::Util

Actually, I disagree. I believe that even with the additional sub you still
have problems deciding what you can do with a variable, and that's why I
think it needs to be beefed up. Since perl provides many ways to be flexible
about what it receives in terms of parameters etc., the more information that
perl can provide to the program about what a given item is, the better and
smarter our code can be.

Yves


Yves Orton

Aug 14, 2002, 12:18:02 PM
to John Peacock, perl5-...@perl.org
> Actually, if you read the source for overload::StrVal, you'll
> see that it works
> is to bless the object into a class that does not have
> overloading, then get the
> stringified object ref, then rebless into the original class.
> If there is any
> other references to that same object, you will cause possibly
> irreparable harm.

If so, then there's a _lot_ of code that uses it. Data::Dumper for one. :-)



> I need to look at what reftype() does, but the other three I
> can already do. As
> for changing what qr// constructs are treated as, I think
> that is impossible.
> We are already out of flags to signify base types.

So putting in a special exception for regexes isn't possible?

Yves

John Peacock

Aug 14, 2002, 12:39:42 PM
to Orton, Yves, perl5-...@perl.org
Orton, Yves wrote:
>> If there is any
>>other references to that same object, you will cause possibly
>>irreparable harm.
>
>
> If so then theres a _lot_ of code that uses it. Data::Dumper for one. :-)

I'm not the one who raised the objection; Larry did here:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/9609/msg00596.html

>
>
>>We are already out of flags to signify base types.
>
>
> So putting in a special exception for regexes isnt possible?
>

Actually, from sv.h:

typedef enum {
    SVt_NULL,   /* 0 */
    SVt_IV,     /* 1 */
    SVt_NV,     /* 2 */
    SVt_RV,     /* 3 */
    SVt_PV,     /* 4 */
    SVt_PVIV,   /* 5 */
    SVt_PVNV,   /* 6 */
    SVt_PVMG,   /* 7 */
    SVt_PVBM,   /* 8 */
    SVt_PVLV,   /* 9 */
    SVt_PVAV,   /* 10 */
    SVt_PVHV,   /* 11 */
    SVt_PVCV,   /* 12 */
    SVt_PVGV,   /* 13 */
    SVt_PVFM,   /* 14 */
    SVt_PVIO    /* 15 */
} svtype;

struct STRUCT_SV {          /* struct sv { */
    void*   sv_any;         /* pointer to something */
    U32     sv_refcnt;      /* how many references to us */
    U32     sv_flags;       /* what we are */
};

#define SVTYPEMASK 0xff
#define SvTYPE(sv) ((sv)->sv_flags & SVTYPEMASK)

which would seem to suggest that we have 240 more values left in the lowest two
bytes of sv_flags, right???
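
The figure of 240 follows directly from the mask and the enum quoted above. A
quick arithmetic sketch, using only the numbers from that excerpt:

```perl
use strict;
use warnings;

my $mask       = 0xff;          # SVTYPEMASK from the sv.h excerpt
my $slots      = $mask + 1;     # distinct values the masked field can hold
my $types_used = 16;            # SVt_NULL (0) through SVt_PVIO (15)
my $free       = $slots - $types_used;

print "$slots possible values, $types_used in use, $free unused\n";
```

This counts every value under the mask as available; whether all of them are
really free to use is a separate question.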

Tels

Aug 14, 2002, 12:57:26 PM
to gb...@pobox.com, jpea...@rowman.com, perl5-...@perl.org
-----BEGIN PGP SIGNED MESSAGE-----

Moin,

John writes:

>$ perl -MScalar::Util=reftype -MMath::BigInt -l
>> $v = new Math::BigInt "12000";
>> print ref($v);
>> print reftype($v);
>> printf "0x%08X\n", $v;
>> __END__
>> Math::BigInt
>> HASH
>> 0x00002EE0

Graham writes:

>Right. All we need is a new sub in Scalar::Util that overcomes any
>overloadness for the numeric conversion

But how would that change the printing of

printf "0x%08X\n", $v;

and why should it? The overload in MBI is there on purpose :)

I mean, printf "0x%08X",$v prints the same w/ or w/o Scalar::Util, for
non-overloaded and overloaded objects, respectively. So why would you
change the overloaded case of printf %X back to a non-overloaded one just
because someone loads Scalar::Util?

Sorry, I am just a bit confused here :)

Thanx for bearing with me.

Tels

- --
perl -MDev::Bollocks -e'print Dev::Bollocks->rand(),"\n"'
vitalistically e-enable cutting-edge services

http://bloodgate.com/perl My current Perl projects
PGP key available on http://bloodgate.com/tels.asc or via email

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.

iQEVAwUBPVqL53cLPEOTuEwVAQFhIQf9HajlY+qZkYVIzws20rM8QRNHol2q+WRY
Id4Jot/Wi1jydccEsG9x5Y/ISSwJOYNQrTDy+cDohmKnCxt1OGfYKpHwwDT3yjZ6
JnbDZo9V0fM8LOZudf9AyZ0Tk9SAqSXSJn2WBfjXHCJBWSyWGlwotclWN9sYTHxj
Udfi0tHekbc8GFSUZ+W2Qpp9yQb7hhvPlIxJguBzyIr0Q+7YhUzP/OUgETjoQ4gC
PBxqYffVu4wbsQzt0bDAxCLxBFeUZ5QCBKndYhAQJfX1CgBHC3MXTIGwehMz3JXO
mCGF/kdzGChOUWzWvfVPWrh7OCCcNiE8+btOrz+HTv2Qj9+nuvyoAA==
=Hlgn
-----END PGP SIGNATURE-----

Graham Barr

Aug 14, 2002, 12:56:39 PM
to Tels, jpea...@rowman.com, perl5-...@perl.org

How can you tell if two MBI objects are in fact the same object ?

People often use the stringification of a reference to do that, as it
contains the address of the referenced SV. But you cannot do that
if the object uses overload. refaddr() will always return
the address of the referenced SV, so it can be used to compare
two references and *know* that they are the same object.
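
A minimal sketch of that distinction, assuming a Scalar::Util that exports
refaddr(). The Num class is a made-up stand-in for something like
Math::BigInt: it overloads numeric and string conversion, so == compares
values rather than identities.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

{
    package Num;    # hypothetical class with overloaded conversions
    use overload
        '0+'     => sub { $_[0]{value} },
        '""'     => sub { $_[0]{value} },
        fallback => 1;
    sub new { my ($class, $v) = @_; return bless { value => $v }, $class }
}

my $x     = Num->new(42);
my $y     = Num->new(42);   # equal value, different object
my $alias = $x;             # second reference to the same object

# Overloaded comparison: says the two values are equal.
my $values_equal = ($x == $y) ? 1 : 0;

# Address comparison: not affected by overloading.
my $same_xy    = (refaddr($x) == refaddr($y))     ? 1 : 0;
my $same_alias = (refaddr($x) == refaddr($alias)) ? 1 : 0;

print "values equal: $values_equal\n";
print "x and y same object: $same_xy\n";
print "x and alias same object: $same_alias\n";
```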

Graham.

Tels

Aug 14, 2002, 1:04:10 PM
to Graham Barr, perl5-...@perl.org, jpea...@rowman.com
-----BEGIN PGP SIGNED MESSAGE-----

Moin,

On 14-Aug-02 Graham Barr carved into stone:

So let me get that straight:

When using Scalar::Util, printf will still work with BigInt as usual, but
when you call reftype() it does *internally* bypass the overload to print
the address of the reference?

If so, please just forgot that I did ask :)

Cheers,

Tels

- --
perl -MDev::Bollocks -e'print Dev::Bollocks->rand(),"\n"'

preemptively market B2C architectures

http://bloodgate.com/perl My current Perl projects
PGP key available on http://bloodgate.com/tels.asc or via email

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.

iQEVAwUBPVqNgncLPEOTuEwVAQGhbAf9H0IzzdneNWJ2R24uFue7ZDP+me4Si05l
UtzeVHd66zdYtPixrJeYBDqyk5So+sN+Yf2cb8dj4QvbzhSdTh80lZnf46tUf0W6
GGA8HEq5jAe416KXy4yrcOea+za8dxjubfNhuJkeO0xTK/Sd2GmlCTnDJb3QQsaP
9eZPPDUQ2vQ9bzJEWffNiLTIx4E0Jtlc4h2TR9YI1oG9fvxam/6sK4SQQCGZeoHu
HM3ZKf6tOosZhePM4s6IGcV974bi73IJlcuK6kv5bKCbHkdKXzRR+O5xOgCUgVQ7
iWN7TwlwCtPPjB3nUSjT3uYt5KMQKJw666uaZ5EMXAKYypUMyy2bow==
=beTT
-----END PGP SIGNATURE-----

Graham Barr

Aug 14, 2002, 1:01:49 PM
to Tels, perl5-...@perl.org, jpea...@rowman.com
On Wed, Aug 14, 2002 at 07:04:10PM +0200, Tels wrote:
> > How can you tell if two MBI objects are in fact the same object ?
> >
> > People often use the stringification of a reference to do that as it
> > contains the address of the referenced SV. But you cannot do that
> > of the object uses overload. refaddr() will always return
> > the addressed of the referenced SV, so it can be used to compare
> > two erferences and *know* that they are the same object.
>
> So let me get that straight:
>
> When using Scalar::Util, printf will still work with BigInt as usual, but
> when you call reftype() it does *internally* bypass the overload to print
^^^^^^^
refaddr()

> the address of the reference?

yes.

> If so, please just forgot that I did ask :)

Time for coffee :)

Graham.

Nick Ing-Simmons

Aug 14, 2002, 1:10:41 PM
to jpea...@rowman.com, perl5-...@perl.org
John Peacock <jpea...@rowman.com> writes:
>Patch after sig. Count me frustrated, since my code works just fine except for
>the fact that ref() is never evaluated in a list context! This doesn't seem to
>do anything:
>
> if ( GIMME_V != G_ARRAY )
>
>even though I thought I had done what was needed to change OP_REF from a UNI to
>a LOP. E.g.

UNI vs LOP is how many args it takes, not how many values it returns.
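
A Perl-level analogue may help, though it is only an analogy and not a
description of how the ref op is compiled. The made-up sub context_of() always
takes one argument, yet the context it sees depends entirely on its caller.

```perl
use strict;
use warnings;

# Takes exactly one argument and reports the context the caller imposed.
sub context_of {
    my ($arg) = @_;
    return wantarray          ? 'list'
         : defined(wantarray) ? 'scalar'
         :                      'void';
}

my $in_scalar = context_of(1);              # scalar assignment
my ($in_list) = context_of(1);              # list assignment
my $in_join   = join ',', context_of(1);    # argument list of join

print "$in_scalar $in_list $in_join\n";
```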

--
Nick Ing-Simmons
http://www.ni-s.u-net.com/

Nick Ing-Simmons

Aug 14, 2002, 1:21:58 PM
to jpea...@rowman.com, Graham Barr, Orton, Yves, perl5-...@perl.org
John Peacock <jpea...@rowman.com> writes:
>Orton, Yves wrote:
>> Well, the third one will only work if the reference is not blessed into a
>> package that overloads numeric operations. (for which there is no way to
>> bypass that I know of, unlike overload::StrVal for the stringification case)
>
>Actually, if you read the source for overload::StrVal, you'll see that it works
>is to bless the object into a class that does not have overloading, then get the
>stringified object ref, then rebless into the original class. If there is any
>other references to that same object, you will cause possibly irreparable harm.

Only if those references are used while the object is the-other-type.
(It is the _object_ that is blessed not the reference.)
This can only really be a problem in the case of threads::shared objects
which are _truly_ shared rather than cloned.
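
The point that the blessing belongs to the object, not to the reference, can
be seen in a few lines. The class names are arbitrary.

```perl
use strict;
use warnings;

my $first  = bless {}, 'Original';
my $second = $first;                # another reference to the same hash

bless $second, 'Temporary';         # rebless through one reference...
my $during = ref($first);           # ...and the other reference sees it

bless $second, 'Original';          # restore the original class
my $after = ref($first);

print "during: $during, after: $after\n";
```

Any code that looks at the object through either reference between the two
bless calls sees the temporary class.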

Nick Ing-Simmons

Aug 14, 2002, 1:31:34 PM
to jpea...@rowman.com, Orton, Yves, perl5-...@perl.org
John Peacock <jpea...@rowman.com> writes:
>
>typedef enum {
> SVt_NULL, /* 0 */
> SVt_PVIO /* 15 */
>} svtype;
>
>struct STRUCT_SV { /* struct sv { */
> void* sv_any; /* pointer to something */
> U32 sv_refcnt; /* how many references to us */
> U32 sv_flags; /* what we are */
>};
>
>#define SVTYPEMASK 0xff
>#define SvTYPE(sv) ((sv)->sv_flags & SVTYPEMASK)
>
>which would seem to suggest that we have 240 more values left in the lowest two
>bytes of sv_flags, right???

Lowest _byte_ but yes. The 0xff case is sort-of-used for (limited)
"use of free-d value" checks.

Which is why if we are really serious about vstrings (and personally
if they went away and never came back I would be delighted) -
I suggested we could have a SVt_VV.

The _main_ snag with SvTYPE is the implicit assumption that
in the objects-in-C scheme perl uses any higher numbered type
has all the fields of any lower numbered type. The logic in sv_upgrade()
assumes this, as do various other places. So your new type has to either
be "large" or slotted into the list and the others shuffled up.
Latter is going to cause binary compatibility problems.


>
>John

Tels

Aug 14, 2002, 1:28:23 PM
to Graham Barr, jpea...@rowman.com, perl5-...@perl.org
-----BEGIN PGP SIGNED MESSAGE-----

Moin,

On 14-Aug-02 Graham Barr carved into stone:

> On Wed, Aug 14, 2002 at 07:04:10PM +0200, Tels wrote:
>> > How can you tell if two MBI objects are in fact the same object ?
>> >
>> > People often use the stringification of a reference to do that as it
>> > contains the address of the referenced SV. But you cannot do that
>> > of the object uses overload. refaddr() will always return
>> > the addressed of the referenced SV, so it can be used to compare
>> > two erferences and *know* that they are the same object.
>>
>> So let me get that straight:
>>
>> When using Scalar::Util, printf will still work with BigInt as usual,
>> but
>> when you call reftype() it does *internally* bypass the overload to
>> print
> ^^^^^^^
> refaddr()

Ag, I did it again....where is my...what wos id wot I wanded?

>> the address of the reference?
> yes.
>> If so, please just forgot that I did ask :)
> Time for coffee :)

Ah, yes, coffee :)

Cheers,

Tels

- --
perl -MDev::Bollocks -e'print Dev::Bollocks->rand(),"\n"'

autoschediastically promote guinine patterns

http://bloodgate.com/perl My current Perl projects
PGP key available on http://bloodgate.com/tels.asc or via email

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.

iQEVAwUBPVqTMXcLPEOTuEwVAQFhPAf6AzcQ1SXFSQgijK2L1GpyAoi5NPF1JZf7
g4BDzm2la7/YkgIEMJbwBaGi4anR4cUmFqknznUVogW/RI2u+bHIVQWR+tlg1aVr
copVK3qh3gc6xbzQM4ZdUh+fjrd/9ZbKdYbM8UPSSd6fhw2udOHBdTXKMJcHpmW8
INHSh2v7luetLJ9bA4aR+MvT4T36ZuB93nd1EqUez5CWaLB/gJGr6AP0ueUvL0gJ
Gtfx2Fa8E+wUr04uGMChgBxF3XUmOJftxOhdbZ4D1KjGHqZONw1EGBlAGqLoCCmx
HzElckGMMCIqZJ4d4HkOHI7tEV0WR9FWQUo/tTkHfj/6AyA/W36jVA==
=oowJ
-----END PGP SIGNATURE-----

John Peacock

Aug 14, 2002, 1:42:56 PM
to Nick Ing-Simmons, Orton, Yves, perl5-...@perl.org
Nick Ing-Simmons wrote:
> Which is why if we are really serious about vstrings (and personally
> if they went away and never came back I would be delighted) -
> I suggested we could have a SVt_VV.

Oh, I'm happy with the magic v-strings patch I already submitted. ;~) With it,
v-strings are ordinary scalars for all intents and purposes, yet I can also tell
they are v-strings and retrieve the original string. I was really speculating
whether there would be room to make regex's into a base type.

>
> The _main_ snag with SvTYPE is the implicit assumption that
> in the objects-in-C scheme perl uses any higher numbered type
> has all the fields of any lower numbered type. The logic in sv_upgrade()
> assumes this and various other places. So your new type has to either
> be "large" or slotted into the list and the others shuffled up.
> Latter is going to cause binary compatibility problems.

Hmmm, that could be a drawback, though.

John Peacock

Aug 14, 2002, 1:34:04 PM
to Nick Ing-Simmons, Graham Barr, Orton, Yves, perl5-...@perl.org
Nick Ing-Simmons wrote:
> Only if those references are used while the object is the-other-type.
> (It is the _object_ that is blessed not the reference.)
> This can only really be a problem in the case of threads::shared objects
> which are _truely_ shared rather than cloned.
>

So there is a non-zero chance that the bless/rebless cycle could be a problem in
certain applications (enough weasel-words there?). Probably worth noting in the
docs, but not worth doing anything about.

However, to be absolutely sure that there is no effect, the portion of my patch
to sv_2pv_flags that made it possible to not trigger the AMAGIC could still be
considered:

--- perl.test/sv.c.orig Sat Aug 10 21:40:37 2002
+++ perl.test/sv.c Wed Aug 14 04:10:29 2002
@@ -2932,7 +2932,7 @@ Perl_sv_2pv_flags(pTHX_ register SV *sv,
     if (SvTHINKFIRST(sv)) {
         if (SvROK(sv)) {
             SV* tmpstr;
-            if (SvAMAGIC(sv) && (tmpstr=AMG_CALLun(sv,string)) &&
+            if (flags && SvAMAGIC(sv) && (tmpstr=AMG_CALLun(sv,string)) &&
                 (SvTYPE(tmpstr) != SVt_RV || (SvRV(tmpstr) != SvRV(sv))))
                 return SvPV(tmpstr,*lp);
             sv = (SV*)SvRV(sv);

If flags==0, the AMAGIC does not fire and the standard "class=TYPE(0xADDRESS)"
is returned instead of the stringified contents.

Benjamin Goldberg

Aug 15, 2002, 11:15:38 PM
to p5p
[All CCs changed to BCCs -- send me an email if you were in the CC
list, and aren't subscribed to p5p]

This *would* be true, except that blessings aren't (AFAIK) passed
from one thread to another.

--
tr/`4/ /d, print "@{[map --$| ? ucfirst lc : lc, split]},\n" for
pack 'u', pack 'H*', 'ab5cf4021bafd28972030972b00a218eb9720000';

Arthur Bergman

Aug 16, 2002, 4:21:04 AM
to Elizabeth Mattijsen, p5p

On fredag, augusti 16, 2002, at 10:15 , Elizabeth Mattijsen wrote:

>
> Blessings _are_ cloned. So if you have a blessed object that exists
> before a thread starts, that object gets cloned to the thread and then
> it should be able to re-bless the object (with the usual caveats)
> without interfering with any of the other (cloned) objects. Don't know
> whether that is a bug or a feature. It is a fact in 5.8.0.
>
>

More like it isn't possible with the current magic mechanism: there is no way
to catch a bless. I tried adding one, but it was vetoed.

One could write a CORE::GLOBAL::bless that does it, but it would incur
performance slowdowns on non-threaded code.

Arthur

Elizabeth Mattijsen

Aug 16, 2002, 4:15:14 AM
to p5p
At 11:15 PM 8/15/02 -0400, Benjamin Goldberg wrote:
> > Only if those references are used while the object is the-other-type.
> > (It is the _object_ that is blessed not the reference.)
> > This can only really be a problem in the case of threads::shared
> > objects which are _truly_ shared rather than cloned.
>This *would* be true, except that blessings aren't (AFAIK) passed
>from one thread to another.

Blessings _are_ cloned. So if you have a blessed object that exists before
a thread starts, that object gets cloned to the thread and then it should
be able to re-bless the object (with the usual caveats) without interfering
with any of the other (cloned) objects. Don't know whether that is a bug
or a feature. It is a fact in 5.8.0.

To my knowledge, there are no shared _objects_. Sharing is implemented as
a tie() currently, which implements a system of passing values around
between the threads for shared variables. You cannot pass around
references between threads. And since objects in perl are implemented as
blessed references, you cannot have truly shared objects (currently). You
_can_ have cloned blessed references to a shared hash e.g., but they cannot
be made after the thread has started.

Arthur or Nick, please correct me if I'm wrong here... ;-)


Liz

h...@crypt.org

Aug 21, 2002, 9:37:22 PM
to Graham Barr, perl5-...@perl.org
Graham Barr <gb...@pobox.com> wrote:

Subject to Nicholas's reservations, I like it.

Hugo

Yitzchak Scott-Thoennes

Aug 28, 2002, 3:27:11 PM
to perl5-...@perl.org, yves....@mciworldcom.de
On Wed, 14 Aug 2002 16:07:16 +0100, yves....@mciworldcom.de wrote:
>
>But I agree that changing the behaviour of ref() isn't the way to go. A new
>keyword that provided reliable and robust type introspection for all cases
>would be very useful I believe. (Also finding the name of a glob is
>annoying, as is finding out what pattern a reblessed qr// contains (the
>latter is AFAIK impossible))

If I understand what you want, find the name of a glob like so:

sub globname { '*' . *{$_[0]}{PACKAGE} . '::' . *{$_[0]}{NAME} }
$foo = *FOO::BAR;
print globname($foo)

For detecting reblessed qr//, if the new class has @ISA = 'Regexp'
checking $foo->isa('Regexp') works fine. Otherwise try:

sub is_re { eval { use B; B::svref_2object($_[0])->MAGIC->TYPE eq 'r' } }
$foo = qr/\bfoo\b/;
bless $foo,'classname';
print is_re($foo) ? "Yup its a regexp" : "nope it aint a regexp";

and temporarily rebless it into 'Regexp' to get the original pattern.

(You could just look at B::svref_2object($qr)->MAGIC->precomp also,
but that needs some massaging to be usable...search sv.c for "msix".)

Though a reblessed qr// really should stringize as the pattern (unless
overloaded). Removing the unnecessary Regexp check should do that:

--- sv.c.orig Wed Aug 28 12:12:30 2002
+++ sv.c Wed Aug 28 12:13:00 2002
@@ -2946,7 +2946,6 @@
if ( ((SvFLAGS(sv) &
(SVs_OBJECT|SVf_OK|SVs_GMG|SVs_SMG|SVs_RMG))
== (SVs_OBJECT|SVs_RMG))
- && strEQ(s=HvNAME(SvSTASH(sv)), "Regexp")
&& (mg = mg_find(sv, PERL_MAGIC_qr))) {
regexp *re = (regexp *)mg->mg_obj;

End of Patch.

Also, adding a built-in stringize method to the Regexp class would be easy.
If that's desired, what should it be called?
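
For readers on perl 5.10 or later (well after this thread), the core re module offers is_regexp and regexp_pattern, which are documented as not being confused by blessing. A minimal sketch, assuming such a perl and reusing the 'classname' package name from the example above:

```perl
use strict;
use warnings;
use re qw(is_regexp regexp_pattern);

# Rebless a compiled pattern into an unrelated class.
my $re = bless qr/\bfoo\b/i, 'classname';

# is_regexp looks at the underlying value, not at the package name.
my $is_re = is_regexp($re) ? 1 : 0;

# In list context regexp_pattern returns the pattern text and its modifiers.
my ($pattern, $mods) = regexp_pattern($re);

print "is regexp: $is_re\n";
print "pattern:   $pattern\n";
print "modifiers: $mods\n";
```

This avoids both the B-based probe and the temporary rebless into 'Regexp', at the cost of requiring a newer perl than the one under discussion.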

Yitzchak Scott-Thoennes

Aug 29, 2002, 8:31:31 PM
to perl5-...@perl.org
On Wed, 28 Aug 2002 12:27:11 -0700, stho...@efn.org wrote:
>
>Though a reblessed qr// really should stringize as the pattern (unless
>overloaded). Removing the unnecessary Regexp check should do that:

Here's a patch with a new test.

--- perl/sv.c.orig Fri Aug 23 03:58:16 2002
+++ perl/sv.c Thu Aug 29 13:45:26 2002
@@ -2948,7 +2948,6 @@
if ( ((SvFLAGS(sv) &
(SVs_OBJECT|SVf_OK|SVs_GMG|SVs_SMG|SVs_RMG))
== (SVs_OBJECT|SVs_RMG))
- && strEQ(s=HvNAME(SvSTASH(sv)), "Regexp")
&& (mg = mg_find(sv, PERL_MAGIC_qr))) {
regexp *re = (regexp *)mg->mg_obj;

--- perl/t/op/pat.t.orig Thu Aug 29 16:11:04 2002
+++ perl/t/op/pat.t Thu Aug 29 16:23:26 2002
@@ -6,7 +6,7 @@

$| = 1;

-print "1..922\n";
+print "1..924\n";

BEGIN {
chdir 't' if -d 't';
@@ -2902,3 +2902,11 @@
}

$test = 923;
+
+$a = bless qr/foo/;
+print(('goodfood' =~ $a ? '' : 'not '), "ok $test - reblessed qr// matches\n");
+++$test;
+
+print(($a eq '(?-xism:foo)' ? '' : 'not '),
+ "ok $test - reblessed qr// stringizes\n");
+++$test;
End of Patch.

h...@crypt.org

Aug 30, 2002, 9:29:44 AM
to Yitzchak Scott-Thoennes, perl5-...@perl.org
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
:>Though a reblessed qr// really should stringize as the pattern (unless
:>overloaded). Removing the unnecessary Regexp check should do that:
:
:Here's a patch with a new test.

Thanks, applied as #17813.

I don't think the middle of sv_2pv_flags() is the best place for the
logic of stringifying a qr// object; most of this code should probably
be hived off to a separate function, though I'm not sure whether it
should be in reg* or mg.c.

Hugo

Yitzchak Scott-Thoennes

Sep 5, 2002, 12:09:01 AM
to perl5-...@perl.org

I've looked at doing this, but ran into a problem. As far as I can
see sv_2pv_flags shouldn't ever return utf8 without setting the flag
on the passed sv. Right now, it does in two places: "" overloaded
objects and qr//'s with literal utf8.

Is there any reason we can't set the UTF8 flag on RVs whose SvPV has
UTF8? This is easy for the overload case; for qr//, is there a flag
that says precomp contains literal utf8?

For starters, here are some todo tests (though I couldn't get the last
op/pat.t test to succeed even changing sv_2pv_flags to always set UTF8
for a qr// so the test may be wrong):

--- perl/t/op/pat.t.orig Thu Sep 5 11:06:44 2002
+++ perl/t/op/pat.t Thu Sep 5 11:53:44 2002
@@ -6,7 +6,7 @@

$| = 1;

-print "1..924\n";
+print "1..928\n";

BEGIN {
chdir 't' if -d 't';
@@ -2910,3 +2910,22 @@
print(($a eq '(?-xism:foo)' ? '' : 'not '),
"ok $test - reblessed qr// stringizes\n");
++$test;
+
+$x = "\x{3fe}";
+$a = qr/$x/;
+print(($x =~ $a ? '' : 'not '), "ok $test - utf8 interpolation in qr//\n");
+++$test;
+
+print(("a$a" =~ $x ? '' : 'not '),
+ "ok $test - stringifed qr// preserves utf8 # TODO\n");
+++$test;
+
+print(("a$x" =~ qr/a$a/ ? '' : 'not '),
+ "ok $test - interpolated qr// preserves utf8 # TODO\n");
+++$test;
+
+print(("a$x" =~ qr/a(??{$a})/ ? '' : 'not '),
+ "ok $test - postponed interpolation of qr// preserves utf8 # TODO\n");
+++$test;
+
+# last test 928
--- perl/lib/overload.t.orig Wed Sep 4 22:22:26 2002
+++ perl/lib/overload.t Thu Sep 5 11:02:50 2002
@@ -41,7 +41,7 @@

package main;

-$test = 0;
+our $test = 0;
$| = 1;
print "1..",&last,"\n";

@@ -1064,9 +1064,12 @@


my $utfvar = new utf8_o 200.2.1;
-test("$utfvar" eq 200.2.1); # 223
+test("$utfvar" eq 200.2.1); # 223 - overload via sv_copypv
+++$test;
+print(("a$utfvar" eq "a".200.2.1 ? "ok " : "not ok "),
+ $test, " - overload via sv_2pv_flags # TODO\n");

-# 224..226 -- more %{} tests. Hangs in 5.6.0, okay in later releases.
+# 225..227 -- more %{} tests. Hangs in 5.6.0, okay in later releases.
# Basically this example implements strong encapsulation: if Hderef::import()
# were to eval the overload code in the caller's namespace, the privatisation
# would be quite transparent.
@@ -1080,9 +1083,9 @@
package main;
my $a = Foo->new;
$a->xet('b', 42);
-print $a->xet('b') == 42 ? "ok 224\n" : "not ok 224\n";
-print defined eval { $a->{b} } ? "not ok 225\n" : "ok 225\n";
-print $@ =~ /zap/ ? "ok 226\n" : "not ok 226\n";
+print $a->xet('b') == 42 ? "ok 225\n" : "not ok 225\n";
+print defined eval { $a->{b} } ? "not ok 226\n" : "ok 226\n";
+print $@ =~ /zap/ ? "ok 227\n" : "not ok 227\n";

# Last test is:
-sub last {226}
+sub last {227}
End of Patch.

Yitzchak Scott-Thoennes

Sep 6, 2002, 12:23:03 PM
to perl5-...@perl.org
On Wed, 04 Sep 2002 21:09:01 -0700, stho...@efn.org wrote:
>On Fri, 30 Aug 2002 14:29:44 +0100, h...@crypt.org wrote:
>>I don't think the middle of sv_2pv_flags() is the best place for the
>>logic of stringifying a qr// object; most of this code should probably
>>be hived off to a separate function, though I'm not sure whether it
>>should be in reg* or mg.c.
>
>I've looked at doing this, but ran into a problem. As far as I can
>see sv_2pv_flags shouldn't ever return utf8 without setting the flag
>on the passed sv. Right now, it does in two places: "" overloaded
>objects and qr//'s with literal utf8.
>
>Is there any reason we can't set the UTF8 flag on RVs whose SvPV has
>UTF8? This is easy for the overload case; for qr//, is there a flag
>that says precomp contains literal utf8?

Here's the fix for overload. Note that it simplifies sv_copypv.
Applies on top of previous TODO test patch.

--- perl/sv.h.orig Thu Aug 22 16:01:04 2002
+++ perl/sv.h Thu Sep 5 19:06:50 2002
@@ -207,7 +207,7 @@
#define SVp_POK 0x04000000 /* has valid non-public pointer value */
#define SVp_SCREAM 0x08000000 /* has been studied? */

-#define SVf_UTF8 0x20000000 /* SvPVX is UTF-8 encoded */
+#define SVf_UTF8 0x20000000 /* SvPV is UTF-8 encoded */

#define SVf_THINKFIRST (SVf_READONLY|SVf_ROK|SVf_FAKE)

--- perl/lib/overload.t.orig Fri Sep 6 09:17:50 2002
+++ perl/lib/overload.t Fri Sep 6 09:20:16 2002
@@ -1064,10 +1064,8 @@


my $utfvar = new utf8_o 200.2.1;
-test("$utfvar" eq 200.2.1); # 223 - overload via copypv
-++$test;
-print(("a$utfvar" eq "a".200.2.1 ? "ok " : "not ok "),
- $test, " - overload via sv_2pv_flags # TODO\n");
+test("$utfvar" eq 200.2.1); # 223 - stringify
+test("a$utfvar" eq "a".200.2.1); # 224 - overload via sv_2pv_flags

# 225..227 -- more %{} tests. Hangs in 5.6.0, okay in later releases.
# Basically this example implements strong encapsulation: if Hderef::import()

--- perl/sv.c.orig Thu Sep 5 15:30:22 2002
+++ perl/sv.c Thu Sep 5 20:32:54 2002
@@ -2935,8 +2935,14 @@
if (SvROK(sv)) {
SV* tmpstr;
if (SvAMAGIC(sv) && (tmpstr=AMG_CALLun(sv,string)) &&
- (SvTYPE(tmpstr) != SVt_RV || (SvRV(tmpstr) != SvRV(sv))))
- return SvPV(tmpstr,*lp);
+ (SvTYPE(tmpstr) != SVt_RV || (SvRV(tmpstr) != SvRV(sv)))) {
+ char *pv = SvPV(tmpstr, *lp);
+ if (SvUTF8(tmpstr))
+ SvUTF8_on(sv);
+ else
+ SvUTF8_off(sv);
+ return pv;
+ }
sv = (SV*)SvRV(sv);
if (!sv)
s = "NULLREF";
@@ -3193,28 +3199,16 @@
void
Perl_sv_copypv(pTHX_ SV *dsv, register SV *ssv)
{
- SV *tmpsv;
-
- if ( SvTHINKFIRST(ssv) && SvROK(ssv) && SvAMAGIC(ssv) &&
- (tmpsv = AMG_CALLun(ssv,string))) {
- if (SvTYPE(tmpsv) != SVt_RV || (SvRV(tmpsv) != SvRV(ssv))) {
- SvSetSV(dsv,tmpsv);
- return;
- }
- } else {
- tmpsv = sv_newmortal();
- }
- {
- STRLEN len;
- char *s;
- s = SvPV(ssv,len);
- sv_setpvn(tmpsv,s,len);
- if (SvUTF8(ssv))
- SvUTF8_on(tmpsv);
- else
- SvUTF8_off(tmpsv);
- SvSetSV(dsv,tmpsv);
- }
+ SV *tmpsv = sv_newmortal();
+ STRLEN len;
+ char *s;
+ s = SvPV(ssv,len);
+ sv_setpvn(tmpsv,s,len);
+ if (SvUTF8(ssv))
+ SvUTF8_on(tmpsv);
+ else
+ SvUTF8_off(tmpsv);
+ SvSetSV(dsv,tmpsv);
}

/*
End of Patch.

h...@crypt.org

Sep 8, 2002, 12:06:24 PM
to Yitzchak Scott-Thoennes, perl5-...@perl.org
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
:For starters, here are some todo tests (though I couldn't get the last
:op/pat.t test to succeed even changing sv_2pv_flags to always set UTF8
:for a qr// so the test may be wrong):
[...]

And in <nZNe9gzk...@efn.org>:


Here's the fix for overload. Note that it simplifies sv_copypv.
Applies on top of previous TODO test patch.

Thanks, both applied as #17864.

Hugo

Yitzchak Scott-Thoennes

Sep 12, 2002, 1:22:45 AM
to perl5-...@perl.org

And here's the regex part.

Patch does the following:
1. Clean up some reused magic flags (mg.h, dump.c)
This isn't really related, but I ran across it while doing the other thing.
Not sure where to put a test for this, since to test TAINTEDDIR, I'd need
Devel::Peek under -T (maybe a fresh_perl_like() call in Peek.t?) and I
don't even know how to get the MINMATCH flag set.
2. Simplify sv_copypv further (last patch didn't do this as much as possible)
3. have sv_2pv_flags set the UTF8 flag if appropriate when given a qr//
4. set PMdf_DYN_UTF8 when compiling a utf8 regex in (??{$utf8})
Can someone explain the difference between PMdf_DYN_UTF8 and PMdf_UTF8?
And tell me if this should be dependent on DO_UTF8 or just SvUTF8?
5. switch regexec into (or out of) utf mode as appropriate for (??{}).
Don't know if there is a better way to save/restore PL_reg_flags & RF_UTF8.
Or if there are other places that should also do this. Or if everything
even works when UTF goes on and off within a regexp. Someone else's eyes
would be greatly appreciated.
6. de-TODO the op/pat.t tests I added and add more tests (some that failed
before and some to test that I didn't break things).

I'd still like to hear someone say that it is a workable idea to set the
UTF8 flag even when there is no PVX.

--- perl/mg.h.orig Thu Aug 22 16:00:38 2002
+++ perl/mg.h Wed Sep 11 11:43:16 2002
@@ -33,13 +33,12 @@
I32 mg_len;
};

-#define MGf_TAINTEDDIR 1
+#define MGf_TAINTEDDIR 1 /* PERL_MAGIC_envelem only */
+#define MGf_MINMATCH 1 /* PERL_MAGIC_regex_global only */
#define MGf_REFCOUNTED 2
#define MGf_GSKIP 4
#define MGf_COPY 8
#define MGf_DUP 16
-
-#define MGf_MINMATCH 1

#define MgTAINTEDDIR(mg) (mg->mg_flags & MGf_TAINTEDDIR)
#define MgTAINTEDDIR_on(mg) (mg->mg_flags |= MGf_TAINTEDDIR)
--- perl/dump.c.orig Thu Aug 22 15:59:20 2002
+++ perl/dump.c Wed Sep 11 11:39:44 2002
@@ -768,7 +768,7 @@
{ PERL_MAGIC_taint, "taint(t)" },
{ PERL_MAGIC_uvar_elem, "uvar_elem(v)" },
{ PERL_MAGIC_vec, "vec(v)" },
- { PERL_MAGIC_vstring, "v-string(V)" },
+ { PERL_MAGIC_vstring, "vstring(V)" },
{ PERL_MAGIC_substr, "substr(x)" },
{ PERL_MAGIC_defelem, "defelem(y)" },
{ PERL_MAGIC_ext, "ext(~)" },
@@ -842,13 +842,15 @@

if (mg->mg_flags) {
Perl_dump_indent(aTHX_ level, file, " MG_FLAGS = 0x%02X\n", mg->mg_flags);
- if (mg->mg_flags & MGf_TAINTEDDIR)
+ if (mg->mg_type == PERL_MAGIC_envelem &&
+ mg->mg_flags & MGf_TAINTEDDIR)
Perl_dump_indent(aTHX_ level, file, " TAINTEDDIR\n");
if (mg->mg_flags & MGf_REFCOUNTED)
Perl_dump_indent(aTHX_ level, file, " REFCOUNTED\n");
if (mg->mg_flags & MGf_GSKIP)
Perl_dump_indent(aTHX_ level, file, " GSKIP\n");
- if (mg->mg_flags & MGf_MINMATCH)
+ if (mg->mg_type == PERL_MAGIC_regex_global &&
+ mg->mg_flags & MGf_MINMATCH)
Perl_dump_indent(aTHX_ level, file, " MINMATCH\n");
}
if (mg->mg_obj) {
--- perl/sv.c.orig Mon Sep 9 16:18:12 2002
+++ perl/sv.c Wed Sep 11 11:41:18 2002
@@ -2894,7 +2894,7 @@
{
register char *s;
int olderrno;
- SV *tsv;
+ SV *tsv, *origsv;
char tbuf[64]; /* Must fit sprintf/Gconvert of longest IV/NV */
char *tmpbuf = tbuf;

@@ -2943,6 +2943,7 @@
SvUTF8_off(sv);
return pv;
}
+ origsv = sv;
sv = (SV*)SvRV(sv);
if (!sv)
s = "NULLREF";
@@ -3024,6 +3025,11 @@
mg->mg_ptr[mg->mg_len] = 0;
}
PL_reginterp_cnt += re->program[0].next_off;
+
+ if (re->reganch & ROPT_UTF8)
+ SvUTF8_on(origsv);
+ else
+ SvUTF8_off(origsv);
*lp = mg->mg_len;
return mg->mg_ptr;
}
@@ -3199,16 +3205,14 @@
void
Perl_sv_copypv(pTHX_ SV *dsv, register SV *ssv)
{
- SV *tmpsv = sv_newmortal();
STRLEN len;
char *s;
s = SvPV(ssv,len);
- sv_setpvn(tmpsv,s,len);
+ sv_setpvn(dsv,s,len);
if (SvUTF8(ssv))
- SvUTF8_on(tmpsv);
+ SvUTF8_on(dsv);
else
- SvUTF8_off(tmpsv);
- SvSetSV(dsv,tmpsv);
+ SvUTF8_off(dsv);
}

/*
--- perl/regexec.c.orig Tue Sep 10 10:16:40 2002
+++ perl/regexec.c Wed Sep 11 12:02:36 2002
@@ -2821,6 +2821,7 @@
MAGIC *mg = Null(MAGIC*);
re_cc_state state;
CHECKPOINT cp, lastcp;
+ int toggleutf;

if(SvROK(ret) || SvRMAGICAL(ret)) {
SV *sv = SvROK(ret) ? SvRV(ret) : ret;
@@ -2841,6 +2842,7 @@
I32 onpar = PL_regnpar;

Zero(&pm, 1, PMOP);
+ if (DO_UTF8(ret)) pm.op_pmdynflags |= PMdf_DYN_UTF8;
re = CALLREGCOMP(aTHX_ t, t + len, &pm);
if (!(SvFLAGS(ret)
& (SVs_TEMP | SVs_PADTMP | SVf_READONLY)))
@@ -2873,6 +2875,9 @@
*PL_reglastcloseparen = 0;
PL_reg_call_cc = &state;
PL_reginput = locinput;
+ toggleutf = ((PL_reg_flags & RF_utf8) != 0) ^
+ ((re->reganch & ROPT_UTF8) != 0);
+ if (toggleutf) PL_reg_flags ^= RF_utf8;

/* XXXX This is too dramatic a measure... */
PL_reg_maxiter = 0;
@@ -2887,6 +2892,7 @@
PL_regcc = state.cc;
PL_reg_re = state.re;
cache_re(PL_reg_re);
+ if (toggleutf) PL_reg_flags ^= RF_utf8;

/* XXXX This is too dramatic a measure... */
PL_reg_maxiter = 0;
@@ -2903,6 +2909,7 @@
PL_regcc = state.cc;
PL_reg_re = state.re;
cache_re(PL_reg_re);
+ if (toggleutf) PL_reg_flags ^= RF_utf8;

/* XXXX This is too dramatic a measure... */
PL_reg_maxiter = 0;
--- perl/t/op/pat.t.orig Thu Aug 22 16:01:08 2002
+++ perl/t/op/pat.t Wed Sep 11 22:02:04 2002
@@ -6,7 +6,7 @@

$| = 1;

-print "1..928\n";
+print "1..936\n";

BEGIN {
chdir 't' if -d 't';
@@ -2913,20 +2913,60 @@
++$test;

$x = "\x{3fe}";
+$z=$y = "\317\276"; # $y is byte representation of $x
+
$a = qr/$x/;
print(($x =~ $a ? '' : 'not '), "ok $test - utf8 interpolation in qr//\n");
++$test;

print(("a$a" =~ $x ? '' : 'not '),
- "ok $test - stringifed qr// preserves utf8 # TODO\n");
+ "ok $test - stringifed qr// preserves utf8\n");
+++$test;
+
+print(("a$x" =~ /^a$a\z/ ? '' : 'not '),
+ "ok $test - interpolated qr// preserves utf8\n");
+++$test;
+
+print(("a$x" =~ /^a(??{$a})\z/ ? '' : 'not '),
+ "ok $test - postponed interpolation of qr// preserves utf8\n");
+++$test;
+
+{ use re 'eval';
+
+print(("$x$x" =~ /^$x(??{$x})\z/ ? '' : 'not '),
+ "ok $test - postponed utf8 string in utf8 re matches utf8\n");
+++$test;
+
+print(("$y$x" =~ /^$y(??{$x})\z/ ? '' : 'not '),
+ "ok $test - postponed utf8 string in non-utf8 re matches utf8\n");
++$test;

-print(("a$x" =~ qr/a$a/ ? '' : 'not '),
- "ok $test - interpolated qr// preserves utf8 # TODO\n");
+print(("$y$x" !~ /^$y(??{$y})\z/ ? '' : 'not '),
+ "ok $test - postponed non-utf8 string in non-utf8 re doesn't match utf8\n");
++$test;

-print(("a$x" =~ qr/a(??{$a})/ ? '' : 'not '),
- "ok $test - postponed interpolation of qr// preserves utf8 # TODO\n");
+print(("$x$x" !~ /^$x(??{$y})\z/ ? '' : 'not '),
+ "ok $test - postponed non-utf8 string in utf8 re doesn't match utf8\n");
++$test;

-# last test 928
+print(("$y$y" =~ /^$y(??{$y})\z/ ? '' : 'not '),
+ "ok $test - postponed non-utf8 string in non-utf8 re matches non-utf8\n");
+++$test;
+
+print(("$x$y" =~ /^$x(??{$y})\z/ ? '' : 'not '),
+ "ok $test - postponed non-utf8 string in utf8 re matches non-utf8\n");
+++$test;
+$y = $z; # reset $y after upgrade
+
+print(("$x$y" !~ /^$x(??{$x})\z/ ? '' : 'not '),
+ "ok $test - postponed utf8 string in utf8 re doesn't match non-utf8\n");
+++$test;
+$y = $z; # reset $y after upgrade
+
+print(("$y$y" !~ /^$y(??{$x})\z/ ? '' : 'not '),
+ "ok $test - postponed utf8 string in non-utf8 re doesn't match non-utf8\n");
+++$test;
+
+} # no re 'eval'
+
+# last test 936
End of Patch.

h...@crypt.org

Sep 23, 2002, 12:14:40 PM
to Yitzchak Scott-Thoennes, perl5-...@perl.org, j...@iki.fi
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
:And here's the regex part.

Filed to apply later, once I've caught up some more.

:Patch does the following:
:1. Clean up some reused magic flags (mg.h, dump.c)
: This isn't really related, but I ran across it while doing the other thing.
: Not sure where to put a test for this, since to test TAINTEDDIR, I'd need
: Devel::Peek under -T (maybe a fresh_perl_like() call in Peek.t?) and I
: don't even know how to get the MINMATCH flag set.

MINMATCH is set when we've matched zero width, to ensure we don't repeat
the same match; C< perl -wle 'print "$-[0]:$+[0]" while "banana" =~ /a*?/g' >
will set it several times.
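
The one-liner above can be expanded into a small script that records each match span. This is a sketch only: it shows the effect visible from Perl (a zero-width match is never repeated at the same position, so the loop terminates) and does not inspect the MINMATCH flag itself, which would need Devel::Peek. The subject string is copied into a variable, which is a choice made for this example.

```perl
use strict;
use warnings;

my $subject = "banana";
my @spans;

# /a*?/ prefers to match nothing, so many of these matches are zero-width.
# After each zero-width match the next match is not allowed to be
# zero-width at the same place, which is what keeps the loop finite.
while ($subject =~ /a*?/g) {
    push @spans, "$-[0]:$+[0]";
}

print "$_\n" for @spans;
```

Each printed line is a start:end pair, so a zero-width match shows up as two equal offsets.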

:4. set PMdf_DYN_UTF8 when compiling a utf8 regex in (??{$utf8})
: Can someone explain the difference between PMdf_DYN_UTF8 and PMdf_UTF8?
: And tell me if this should be dependent on DO_UTF8 or just SvUTF8?

Not sure; I'll look into it.

:5. switch regexec into (or out of) utf mode as appropriate for (??{}).
: Don't know if there is a better way to save/restore PL_reg_flags & RF_UTF8.
: Or if there are other places that should also do this. Or if everything
: even works when UTF goes on and off within a regexp. Someone else's eyes
: would be greatly appreciated.

Not sure what you mean by "UTF goes on and off" - other than (??{...})
when else can it switch?

:I'd still like to hear someone say that it is a workable idea to set the
:UTF8 flag even when there is no PVX.

No idea on this one; perhaps Jarkko can advise.

Hugo

Jarkko Hietaniemi

Sep 23, 2002, 1:09:23 PM
to h...@crypt.org, Yitzchak Scott-Thoennes, perl5-...@perl.org
> :4. set PMdf_DYN_UTF8 when compiling a utf8 regex in (??{$utf8})
> : Can someone explain the difference between PMdf_DYN_UTF8 and PMdf_UTF8?
> : And tell me if this should be dependent on DO_UTF8 or just SvUTF8?
>
> Not sure; I'll look into it.

I can't say that I'm in possession of full understanding of the
difference between those two, either. I *think* the difference is
that PMdf_DYN_UTF8 is more runtime and used if the regex pattern string
has UTF-8 in it, while PMdf_UTF8 is more compiletime. I once long ago
tried unifying them, but that attempt blew up. Maybe now that Unicode
and regexes are more robust a combination, a new attempt at unifying
would be easier. (Note: be sure to run both "make test" *and* "make
utest" after diddling with those. t/op/regexp.t and t/op/pat.t run
both without -Mutf8 and with will quickly tell one whether one got
it right.)

> :5. switch regexec into (or out of) utf mode as appropriate for (??{}).
> : Don't know if there is a better way to save/restore PL_reg_flags & RF_UTF8.
> : Or if there are other places that should also do this. Or if everything
> : even works when UTF goes on and off within a regexp. Someone else's eyes
> : would be greatly appreciated.
>
> Not sure what you mean by "UTF goes on and off" - other than (??{...})
> when else can it switch?
>
> :I'd still like to hear someone say that it is a workable idea to set the
> :UTF8 flag even when there is no PVX.

I don't understand this suggestion, sorry... need more data. Off-hand I'd
say setting the UTF8 flag when there's no PVX sounds like a bad idea. Why?
(How about setting SvIOK when there's no IVX?) The UTF-8 tells additional
information about the PVX; not sure what the flag without a PVX should mean.

> No idea on this one; perhaps Jarkko can advise.

--
Jarkko Hietaniemi <j...@iki.fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

Yitzchak Scott-Thoennes

Sep 23, 2002, 4:07:08 PM
to j...@iki.fi, h...@crypt.org, perl5-...@perl.org
On Mon, 23 Sep 2002 20:09:23 +0300, j...@iki.fi wrote:
>> :5. switch regexec into (or out of) utf mode as appropriate for (??{}).
>> : Don't know if there is a better way to save/restore PL_reg_flags & RF_UTF8.
>> : Or if there are other places that should also do this. Or if everything
>> : even works when UTF goes on and off within a regexp. Someone else's eyes
>> : would be greatly appreciated.
>>
>> Not sure what you mean by "UTF goes on and off" - other than (??{...})
>> when else can it switch?

I meant when (??{...}) switches into or out of UTF8 matching I don't know
if $&, capturing parens, etc. will always set the UTF8 flag properly.

>> :I'd still like to hear someone say that it is a workable idea to set the
>> :UTF8 flag even when there is no PVX.
>
>I don't understand this suggestion, sorry... need more data. Off-hand I'd
>say setting the UTF8 flag when there's no PVX sounds like a bad idea. Why?
>(How about setting SvIOK when there's no IVX?) The UTF-8 tells additional
>information about the PVX; not sure what the flag without a PVX should mean.

To indicate whether the SvPV (aka sv_2pv_flags) is UTF8. This can
happen with RVs in two cases: overloaded stringify, and stringified
qr//. This patch and the one before (applied as 17864) fix these two
cases to set the UTF8 flag on the RV.

This means SvPV should always be called before checking UTF8, but I
think that is already so in case of magic.
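
The overload case can be illustrated from Perl space. This is a rough sketch with a hypothetical Utf8Name class; it checks the string that results from interpolating the object, since the flag on the RV itself is only visible from C or via Devel::Peek.

```perl
use strict;
use warnings;

package Utf8Name;   # hypothetical class, for illustration only
use overload '""' => sub { ${ $_[0] } };

sub new {
    my ($class, $text) = @_;
    return bless \$text, $class;
}

package main;

# U+03FE is above 0xFF, so the stored string is UTF-8 encoded internally.
my $obj = Utf8Name->new("\x{3fe}");

# Interpolation calls the "" overload; the result should be two characters
# long and carry the UTF8 flag, not a run of raw bytes.
my $str = "a$obj";

printf "length: %d\n", length $str;
printf "utf8 flag: %s\n", utf8::is_utf8($str) ? "on" : "off";
```

This corresponds to the "a$utfvar" test added to lib/overload.t earlier in the thread.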

Jarkko Hietaniemi

Sep 23, 2002, 6:05:52 PM
to Yitzchak Scott-Thoennes, h...@crypt.org, perl5-...@perl.org
> I meant when (??{...}) switches into or out of UTF8 matching I don't
> know if $&, capturing parens, etc. will always set the UTF8 flag
> properly.

Yechhh. I don't know that either.

> >> :I'd still like to hear someone say that it is a workable idea to set
> >> :the UTF8 flag even when there is no PVX.

> >I don't understand this suggestion, sorry... need more data.
> >Off-hand I'd say setting the UTF8 flag when there's no PVX sounds
> >like a bad idea. Why? (How about setting SvIOK when there's no
> >IVX?) The UTF-8 tells additional information about the PVX; not
> >sure what the flag without a PVX should mean.
>
> To indicate whether the SvPV (aka sv_2pv_flags) is UTF8. This can
> happen with RVs in two cases: overloaded stringify, and stringified
> qr//. This patch and the one before (applied as 17864) fix these two
> cases to set the UTF8 flag on the RV.

Ahhh, okay. That makes sense. But don't be surprised if having UTF-8
on RVs will smoke out some bugs from the woodwork (e.g. I'm not
certain whether dump.c/Devel::Peek will grok that completely).
Remember to test with "make utest" (there are some known failures
with that, so test "utest" also without your patch).

Yitzchak Scott-Thoennes

Sep 27, 2002, 1:16:39 AM
to perl5-...@perl.org
On Mon, 23 Sep 2002 17:14:40 +0100, h...@crypt.org wrote:
>stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
>:And here's the regex part.
>
>Filed to apply later, once I've caught up some more.
>
>:Patch does the following:
>:1. Clean up some reused magic flags (mg.h, dump.c)
>: This isn't really related, but I ran across it while doing the other thing.
>: Not sure where to put a test for this, since to test TAINTEDDIR, I'd need
>: Devel::Peek under -T (maybe a fresh_perl_like() call in Peek.t?) and I
>: don't even know how to get the MINMATCH flag set.
>
>MINMATCH is set when we've matched zero width, to ensure we don't repeat
>the same match; C< perl -wle 'print "$-[0]:$+[0]" while "banana" =~ /a*?/g' >
>will set it several times.

Thanks. Here's just this unrelated part of the previous patch, plus tests.

--- perl/ext/Devel/Peek/Peek.t.orig Mon Sep 23 22:04:32 2002
+++ perl/ext/Devel/Peek/Peek.t Mon Sep 23 22:56:18 2002
@@ -1,4 +1,4 @@
-#!./perl
+#!./perl -T

BEGIN {
chdir 't' if -d 't';
@@ -12,7 +12,7 @@

use Devel::Peek;

-print "1..19\n";
+print "1..21\n";

our $DEBUG = 0;
open(SAVERR, ">&STDERR") or die "Can't dup STDERR: $!";
@@ -31,7 +31,7 @@
print $pattern, "\n" if $DEBUG;
my $dump = <IN>;
print $dump, "\n" if $DEBUG;
- print "[$dump] vs [$pattern]\nnot " unless $dump =~ /$pattern/ms;
+ print "[$dump] vs [$pattern]\nnot " unless $dump =~ /\A$pattern\Z/ms;
print "ok $_[0]\n";
close(IN);
return $1;
@@ -391,6 +391,45 @@
CUR = 2
LEN = \\d+');
}
+
+my $x="";
+$x=~/.??/g;
+do_test(20,
+ $x,
+'SV = PVMG\\($ADDR\\) at $ADDR
+ REFCNT = 1
+ FLAGS = \\(PADBUSY,PADMY,SMG,POK,pPOK\\)
+ IV = 0
+ NV = 0
+ PV = $ADDR ""\\\0
+ CUR = 0
+ LEN = 1
+ MAGIC = $ADDR
+ MG_VIRTUAL = &PL_vtbl_mglob
+ MG_TYPE = PERL_MAGIC_regex_global\\(g\\)
+ MG_FLAGS = 0x01
+ MINMATCH');
+
+do_test(21,
+ $ENV{PATH}=@ARGV, # scalar(@ARGV) is a handy known tainted value
+'SV = PVMG\\($ADDR\\) at $ADDR
+ REFCNT = 1
+ FLAGS = \\(GMG,SMG,RMG,pIOK,pPOK\\)
+ IV = 0
+ NV = 0
+ PV = $ADDR "0"\\\0
+ CUR = 1
+ LEN = 175
+ MAGIC = $ADDR
+ MG_VIRTUAL = &PL_vtbl_envelem
+ MG_TYPE = PERL_MAGIC_envelem\\(e\\)
+ MG_FLAGS = 0x01
+ TAINTEDDIR
+ MG_LEN = 4
+ MG_PTR = $ADDR "PATH"
+ MAGIC = $ADDR
+ MG_VIRTUAL = &PL_vtbl_taint
+ MG_TYPE = PERL_MAGIC_taint\\(t\\)');

END {
1 while unlink("peek$$");
End of Patch.

h...@crypt.org

Oct 1, 2002, 5:21:33 AM
to Yitzchak Scott-Thoennes, perl5-...@perl.org, h...@crypt.org
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
:And here's the regex part.

Thanks, applied as #17947.

Hugo

h...@crypt.org

Oct 2, 2002, 11:02:18 AM
to Yitzchak Scott-Thoennes, perl5-...@perl.org
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:
:Thanks. Here's just this unrelated part of the previous patch, plus tests.

Thanks; the unrelated parts were already in, but I added the tests with
one small fix: I changed the expected "LEN = 175" to "LEN = \d+". (Not
sure if this value is supposed to be more closely constrained: when I
ran it, I had 57.) Applied as #17956.

Hugo

Yitzchak Scott-Thoennes

Oct 7, 2002, 9:05:09 PM
to perl5-...@perl.org
On Wed, 11 Sep 2002 22:22:45 -0700, stho...@efn.org wrote:
> Zero(&pm, 1, PMOP);
>+ if (DO_UTF8(ret)) pm.op_pmdynflags |= PMdf_DYN_UTF8;
> re = CALLREGCOMP(aTHX_ t, t + len, &pm);

I meant to say so at the time, but forgot: This is really a kludge
(having to fake up a PMOP just to pass to pregcomp). pregcomp
should probably just look like this:

regexp *Perl_pregcomp(pTHX_ SV *re, U32 flags)

where flags are the PMf_COMPILETIME flags. precomp also should
probably be a copy of the sv, not a char *.

Just something to keep in mind for the Great Regexp Rewrite.

Arthur Bergman

Oct 8, 2002, 2:33:28 AM
to Yitzchak Scott-Thoennes, perl5-...@perl.org

On Tuesday, Oct 8, 2002, at 03:05 Europe/Stockholm, Yitzchak
Scott-Thoennes wrote:

>
> I meant to say so at the time, but forgot: This is really a kludge
> (having to fake up a PMOP just to pass to pregcomp). pregcomp
> should probably just look like this:
>
> regexp *Perl_pregcomp(pTHX_ SV *re, U32 flags)
>
> where flags are the PMf_COMPILETIME flags. precomp also should
> probably be a copy of the sv, not a char *.
>
> Just something to keep in mind for the Great Regexp Rewrite.

Even better, we should introduce a RE * structure instead of hanging it
off an SV*.

Arthur

h...@crypt.org

Oct 17, 2002, 8:50:15 AM
to Yitzchak Scott-Thoennes, perl5-...@perl.org
stho...@efn.org (Yitzchak Scott-Thoennes) wrote:

:On Wed, 11 Sep 2002 22:22:45 -0700, stho...@efn.org wrote:
:> Zero(&pm, 1, PMOP);
:>+ if (DO_UTF8(ret)) pm.op_pmdynflags |= PMdf_DYN_UTF8;
:> re = CALLREGCOMP(aTHX_ t, t + len, &pm);
:
:I meant to say so at the time, but forgot: This is really a kludge
:(having to fake up a PMOP just to pass to pregcomp). pregcomp
:should probably just look like this:
:
:regexp *Perl_pregcomp(pTHX_ SV *re, U32 flags)
:
:where flags are the PMf_COMPILETIME flags. precomp also should
:probably be a copy of the sv, not a char *.

Yes, this has irritated me for a while; we'd probably have to
support the existing signature at least through a deprecation
cycle, by faking up an SV instead.

Hugo
