submatch scoping in while

Julian Bradfield

Sep 23, 2006, 4:14:28 PM

Consider the following:

@x = ( 'aaa','bbb');
while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
print "\$1 is *$1*, i is $i\n";

The loop terminates at $i == 1 when 'bbb' matches ^(.)b
The enclosing block for the match construct is the whole file.
Therefore $1 should be 'b'.

But it isn't (in Perl 5.8.5).

What am I missing?

Compare

@x = ( 'aaa','bbb');
if ( $x[$i] !~ /^(.)a/ && $i <= $#x ) { $i++; }
print "\$1 is *$1*, i is $i\n";

which behaves as expected.

Mumia W. (reading news)

Sep 23, 2006, 6:09:43 PM

It seems that the match variables are localized to the while block. The
"if" statement does not do this.

If you want to use while, store the value ($1) elsewhere when you get a
match:

use strict;
use warnings;
my ($i, $c, @x) = (0, 0);

@x = ('aaa', 'bbb');

while (! ($x[$i] =~ /^(.)b/ && ($c = $1))) {
    $i++;
}
print "\$c is *$c*, i is $i\n";

__END__

TMTOWTDI

If you don't need to know which element matched, this should work:

use strict;
use warnings;
my ($i, $c, @x) = (0, 0);

@x = ('aaa', 'rbb', 'bbb', 'cbb', 'xxy');
if ((join ':', @x) =~ /(^|:)(.)b/) {
    print "\$2 is $2\n";
}

__END__

Here is some gratuitous use of s/// :-)

use strict;
use warnings;
my ($i, $c, @x) = (0);

@x = ('aaa', 'rbb', 'bbb', 'cbb', 'xxy');
s/^(.)b/$c ||= $1, $&/e for (@x);
print "\$c is $c\n";


__KEWL__


--
paduille.4...@earthlink.net


Gunnar Hjalmarsson

Sep 23, 2006, 6:25:25 PM

Julian Bradfield wrote:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.

No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
test that. You test whether they *do not* match, so the test fails, and
$1 is not set.

> What am I missing?

Combining capturing parentheses and the !~ operator is not a good idea.
In addition to that, as Mumia pointed out, the dollar-digit variables
(when set) seem to be scoped to the while block.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Mirco Wahab

Sep 23, 2006, 6:20:57 PM

Thus spoke Julian Bradfield (on 2006-09-23 22:14):

> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.

Right. As Mumia already noted, there is a block involved,
namely { $i++; }, and $1 is localized to it.

see:

$i = 0;


@x = ('aaa','bbb');

$i++ while ( $x[$i] !~/^(.)b/ && $i<@x ) ;

print "\$1 is *$1*, i is $i\n";


Regards

Mirco

Dr.Ruud

Sep 23, 2006, 6:20:27 PM

Julian Bradfield schreef:

> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.

(Perl 5.8.6)
It seems that $1 is localized inside a while(...){...} or for(...){...}.

#!/usr/bin/perl
use warnings ;
use strict ;

'X' =~ /(.)/ ; # sets $1 to 'X'

my @x = ('aaa', 'bbb') ;

for (@x) {
    /^(.)a/ and print "1:$1:\n" and last ;
}
print "2:$1:\n";

--
Affijn, Ruud

"Gewoon is een tijger."


Brian McCauley

Sep 23, 2006, 7:17:48 PM

Julian Bradfield wrote:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> Compare
>
> @x = ( 'aaa','bbb');
> if ( $x[$i] !~ /^(.)a/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> which behaves as expected.

Actually, for what it's worth, I find while's behaviour expected and
if's not.

With while(), lexical variables, dynamic (package) variables and the
match capture variables behave consistently. With if(), the lexical
variable is inconsistent with the other two. But of course lexical
variables are an order of magnitude more prevalent in Perl programming,
so my intuitive expectation is based on their behaviour.

use strict;
use warnings;

'unchanged' =~ /(.*)/; # Assign $1
our $pkg = 'unchanged';
my $lex = 'unchanged';

while ( 'x'=~/(.*)/ and my $lex='y' and local $pkg='z' and 0 ) { die }
print "$1 $lex $pkg\n"; # unchanged unchanged unchanged

if ( 'x'=~/(.*)/ and my $lex='y' and local $pkg='z' and 0 ) { die }
print "$1 $lex $pkg\n"; # x unchanged z

__END__

Mumia W. (reading news)

Sep 23, 2006, 9:22:47 PM

On 09/23/2006 05:20 PM, Mirco Wahab wrote:
> [...]

> $i = 0;
> @x = ('aaa','bbb');
>
> $i++ while ( $x[$i] !~/^(.)b/ && $i<@x ) ;
>
> print "\$1 is *$1*, i is $i\n";
>
>
> Regards
>
> Mirco

That's pretty good. It's just an inverted version of the OP's code; I
wish I'd tried it. This idea crossed my mind for about ¼ a second, but I
assumed that the same blocking problem would be there.

Oh well, at least I got to use the s/// operator where it isn't needed :-)


--
paduille.4...@earthlink.net

Julian Bradfield

Sep 24, 2006, 3:08:10 AM

In article <4nlqetF...@individual.net>,
Gunnar Hjalmarsson <nor...@gunnar.cc> wrote:

>> @x = ( 'aaa','bbb');
>> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
>> print "\$1 is *$1*, i is $i\n";

...


>No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
>test that. You test whether they *do not* match, so the test fails, and
>$1 is not set.

Wrong. As demonstrated by the if example later in my post, match variables
are set by a !~ . (Otherwise $a !~ /foo/ would not be equivalent to
! ($a =~ /foo/) !)
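
The equivalence can be checked outside any loop. The following is only a minimal sketch, and capture_via_negated_match is a hypothetical helper name introduced for illustration. It applies !~ in straight-line code and then looks at $1, which is only meaningful when the underlying match succeeded (that is, when !~ returned false).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): applies !~ and reports
# what $1 holds afterwards.
sub capture_via_negated_match {
    my ($string) = @_;

    # !~ runs the same match as =~ and merely negates the result.
    my $did_not_match = $string !~ /^(.)b/;

    # Only trust $1 when the underlying match succeeded.
    return $did_not_match ? undef : $1;
}

for my $s ('bbb', 'aaa') {
    my $capt = capture_via_negated_match($s);
    print "$s: ", ( defined $capt ? "\$1 is *$capt*" : "pattern did not match" ), "\n";
}
```

Here there is no loop block between the match and the use of $1, so the scoping question from the while example does not arise.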

>In addition to that, as Mumia pointed out, the dollar-digit variables
>(when set) seem to be scoped to the while block.

This seems to be the case, but it's not what the manual says.
So there's a bug either in Perl or in the manual.

Charles DeRykus

Sep 24, 2006, 7:29:31 AM

Mumia W. (reading news) wrote:
> On 09/23/2006 05:20 PM, Mirco Wahab wrote:
>> [...]
>> $i = 0;
>> @x = ('aaa','bbb');
>>
>> $i++ while ( $x[$i] !~/^(.)b/ && $i<@x ) ;
>>
>> print "\$1 is *$1*, i is $i\n";
>>
>>
>> Regards
>>
>> Mirco
>
> That's pretty good. It's just an inverted version of the OP's code; I
> wish I'd tried it. This idea crossed my mind for about ¼ a second, but I
> assumed that the same blocking problem would be there.
>

The inverted version is a statement modifier which doesn't create a
local block scope for $1.


$ perl -le 'while ("aa" =~ /^(.)a/) {last;};print $1' # undef
$ perl -le 'print $1 and exit while ("aa" =~ /^(.)a/)' # a


I didn't find this explained in perlsyn although maybe it's hiding
elsewhere.

--
Charles DeRykus

Gunnar Hjalmarsson

Sep 24, 2006, 9:02:16 AM

Julian Bradfield wrote:
> In article <4nlqetF...@individual.net>,
> Gunnar Hjalmarsson <nor...@gunnar.cc> wrote:
>>
>>>
>>>@x = ( 'aaa','bbb');
>>>while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
>>>print "\$1 is *$1*, i is $i\n";
>>
>>No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
>>test that. You test whether they *do not* match, so the test fails, and
>>$1 is not set.
>
> Wrong. As demonstrated by the if example later in my post, match variables
> are set by a !~ . (Otherwise $a !~ /foo/ would not be equivalent to
> ! ($a =~ /foo/) !)

Hmm.. Yes, apparently I was wrong. Don't remember how I reached that
faulty conclusion. Sorry for the confusion. :(

anno...@radom.zrz.tu-berlin.de

Sep 25, 2006, 3:22:00 PM

Julian Bradfield <j...@inf.ed.ac.uk> wrote in comp.lang.perl.misc:

My advice is to avoid the match variables whenever possible. It is safer
and saner to match in list context and catch the results in normal Perl
variables with no surprises, and meaningful names to boot.

To do so, first rewrite the loop control to use =~ instead of !~

while ( ! ( $x[$i] =~ /^(.)b/) && $i <= $#x ) { $i++ }

That doesn't change the behavior. Now catch the match:

while ( ! ( ( $capt) = $x[$i] =~ /^(.)b/) && $i <= $#x ) { $i++ }
print "\$capt is *$capt*, i is $i\n";

That gives you the expected capture of "b" without fuss.

BTW, your loop control is slightly off. If no match occurs, you'll
increase the index beyond the array and try that element.

Check the index first.

while ( $i <= $#x && ! ( ( $capt) = $x[$i] =~ /^(.)b/)) { $i++ }

Now the access is protected by the condition. That's the beauty of
short-circuiting booleans.
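
One detail worth spelling out: the loop test does not depend on the captured value being true. A list assignment evaluated in boolean context yields the number of values on its right-hand side, so it is 1 for a successful match and 0 for a failed one, even if the capture itself is "0". A minimal sketch, assuming a hypothetical helper name first_b_capture and sample data chosen for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns (index, capture) for
# the first element of @$list matching /^(.)b/, or an empty list if no
# element matches.
sub first_b_capture {
    my ($list) = @_;
    my ($i, $capt) = (0);

    # The index test comes first, so $list->[$i] is never read past the
    # end. The list assignment, negated by !, is evaluated in scalar
    # context and counts the values returned by the match: 1 on success,
    # 0 on failure.
    while ( $i <= $#$list && !( ($capt) = $list->[$i] =~ /^(.)b/ ) ) {
        $i++;
    }
    return $i <= $#$list ? ($i, $capt) : ();
}

my ($i, $capt) = first_b_capture([ 'aaa', '0bb', 'bbb' ]);
print "\$capt is *$capt*, i is $i\n";    # $capt is *0*, i is 1
```

A test written as ($c = $1) would treat the capture "0" as false and skip that element; the list-assignment form does not.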

Anno

Michele Dondi

Sep 26, 2006, 10:27:56 AM

On 25 Sep 2006 19:22:00 GMT, anno...@radom.zrz.tu-berlin.de wrote:

>That doesn't change the behavior. Now catch the match:
>
> while ( ! ( ( $capt) = $x[$i] =~ /^(.)b/) && $i <= $#x ) { $i++ }
> print "\$capt is *$capt*, i is $i\n";

[snip]


>Check the index first.
>
> while ( $i <= $#x && ! ( ( $capt) = $x[$i] =~ /^(.)b/)) { $i++ }

Whatever, I understand that the OP's focus was on an unexpected
behaviour -which generated an interesting discussion- rather than on the
actual technique, but AIUI all this is about getting the first element
of an array that matches something, in which case a canonical C<for>
loop (either around 0..$#x or @x depending on whether the index was
really needed or not) with a C<last> would have been just fine.
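
For illustration only, such a loop might look like the sketch below. The helper name first_match and the sample data are assumptions, not code from the thread. It iterates over 0..$#x because the index is wanted, captures into a lexical through a list-context match, and leaves the loop with last at the first hit.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns (index, capture) for
# the first element of @$list matching /^(.)b/, or an empty list.
sub first_match {
    my ($list) = @_;
    my ($found, $capt);

    for my $i (0 .. $#$list) {
        # List-context match: $c receives the capture, and the
        # assignment is true only when the match succeeded.
        if ( my ($c) = $list->[$i] =~ /^(.)b/ ) {
            ($found, $capt) = ($i, $c);
            last;    # stop at the first matching element
        }
    }
    return defined $found ? ($found, $capt) : ();
}

my @x = ('aaa', 'rbb', 'bbb');
if ( my ($i, $capt) = first_match(\@x) ) {
    print "capture is *$capt*, i is $i\n";    # capture is *r*, i is 1
}
else {
    print "no match\n";
}
```

Since the capture lives in an ordinary lexical, nothing here depends on how $1 is scoped.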


Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

Ben Morrow

Sep 25, 2006, 5:12:50 PM

Quoth anno...@radom.zrz.tu-berlin.de:

>
> while ( $i <= $#x && ! ( ( $capt) = $x[$i] =~ /^(.)b/)) { $i++ }

or

until ( $i > $#x || ( ($capt) = $x[$i] =~ /^(.)b/ ) ) { $i++ }

or (cleaner IMHO)

use List::MoreUtils qw/firstidx/;

my $capt;
my $i = firstidx { ($capt) = /^(.)b/ } @x;

Ben

--
"Awww, I'm going to miss her."
"Don't you hate her?"
"Yes, with a fiery vengeance."
[benm...@tiscali.co.uk]
