my $re = qr/ (?<ALT1>pattern) | (?<ALT2>pattern) | ...
One of the alternations happens to be nested:
my $foo = qr{
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                (?> [^{}\n]+ )  # Non-parens without backtracking
              |
                (?&CODEBEGIN)   # Recurse to start of pattern
            )*
        )
        \}
    )
}x;
However, when I ask for the keys of %+, I only get back CODEBEGIN yet
the CODE capture is there when I ask for it. My hope was to use the
keys to determine what I matched so I didn't have to do a series of
tests on %+, but apparently I will have to continue doing this since
this method won't work.
This is Perl 5.10.0.
Thanks,
-Clint
>Maybe this is a bug, maybe not. I am using the named capture buffers
>to reduce bugs as I change grouping of my regular expressions over
>time. In a lexical analysis application, I'm using it over a series
>of alternations.
>
>my $re = qr/ (?<ALT1>pattern) | (?<ALT2>pattern) | ...
>
>One of the alternations happens to be nested:
>
>my $foo = qr{
> (?<CODEBEGIN>
> \{
> (?<CODE>
> (?:
> (?> [^{}\n]+ ) # Non-parens without backtracking
          ^^
The "\n" is not good here: nothing in the pattern can consume a
newline, so a block that spans more than one line will most likely
not match.
The atomic group can also be written more compactly with a possessive
quantifier: [^{}]++
> |
> (?&CODEBEGIN) # Recurse to start of pattern
> )*
> )
> \}
> )
> }x;
>
>However, when I ask for the keys of %+, I only get back CODEBEGIN yet
>the CODE capture is there when I ask for it. My hope was to use the
>keys to determine what I matched so I didn't have to do a series of
>tests on %+, but apparently I will have to continue doing this since
>this method won't work.
>
>This is Perl 5.10.0.
>
>Thanks,
>
>-Clint
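The remark about "\n" in the quoted pattern can be checked directly. The following is only a minimal sketch: the variable names, the \A and \z anchors, and the test strings are made up for illustration, and the anchors are there so that an inner one-line block cannot match on its own.

```perl
use strict;
use warnings;

# The class from the original post: excludes newlines as well as braces.
my $no_newline = qr/
    \A
    (?<CODEBEGIN>
        \{
        (?: (?> [^{}\n]+ ) | (?&CODEBEGIN) )*
        \}
    )
    \z
/x;

# Same structure, with "\n" removed from the class and a possessive quantifier.
my $with_newline = qr/
    \A
    (?<CODEBEGIN>
        \{
        (?: [^{}]++ | (?&CODEBEGIN) )*
        \}
    )
    \z
/x;

my $one_line  = '{ a { b } c }';
my $two_lines = "{ a\n  { b } c }";

my %result;
for my $case ( [ one_line => $one_line ], [ two_lines => $two_lines ] ) {
    my ($label, $text) = @$case;
    $result{"no_newline/$label"}   = $text =~ $no_newline   ? 'match' : 'no match';
    $result{"with_newline/$label"} = $text =~ $with_newline ? 'match' : 'no match';
}
print "$_: $result{$_}\n" for sort keys %result;
```

Only the combination of the newline-excluding class and the two-line input should fail here, because once the atomic group stops at the newline nothing else in the alternation can step over it.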
You are right, it probably is a bug. However, %+ seems to be private
within the recursion the way you have it, because according to the docs
CODEBEGIN can't know about CODE and vice versa.
That $+{CODE} can be tested and contains a value outside of CODEBEGIN
is a mystery, and worrisome. You can of course maintain your own private
hash to store the results.
The code below shows this behavior in more detail. Let me know if you
find a satisfactory answer to this.
-sln
---------
use strict;
use warnings;
use Devel::Peek;
use Data::Dumper;
my %CodeAll = ();
my $container = '';
my $string = " func { subfunc { some {code }; more code } {last block}";
my $foo = qr/
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                [^{}]++         # Non-parens without backtracking
              |
                (?&CODEBEGIN)   # Recurse to start of pattern
            )*
        )
        (?{ print " * ",Dumper(\%+);
            $container = $+{CODE};
        })
        \}
    )
    (?{ print ">>* ",Dumper(\%+);
        $CodeAll{CODEBEGIN} = $+{CODEBEGIN};
        $CodeAll{CODE} = $+{CODE};
    })
/x;
print "______________________\n\n";
while ($string =~ /$foo/g)
{
    print "\n\n====================\n";
    Dump \%+;
    print "\n( \%+ )\n",Dumper(\%+);
    print "( \%CodeAll )\n",Dumper(\%CodeAll),"\n";
    print "______________________\n\n";
}
__END__
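For the lexer use case in the original post, one way to avoid depending on keys %+ is to keep an explicit list of the top-level alternative names and test each one with defined. This is only a sketch under assumptions: the WORD and NUMBER alternatives, the @token_names list, and the sample input are hypothetical stand-ins for the elided ALT1, ALT2, ... patterns.

```perl
use strict;
use warnings;

# Hypothetical token names in priority order. Only top-level alternatives
# are listed, not nested helper groups such as CODE.
my @token_names = qw(WORD NUMBER CODEBEGIN);

my $lexer = qr/
    (?<WORD>   [A-Za-z_]\w* )
  | (?<NUMBER> \d+ )
  | (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                [^{}]++          # run of non-brace characters, no backtracking
              |
                (?&CODEBEGIN)    # recurse into a nested block
            )*
        )
        \}
    )
/x;

my @tokens;
my $input = 'func 42 { sub { x } y }';
while ($input =~ /$lexer/g) {
    # The first listed name whose group took part in this match.
    my ($name) = grep { defined $+{$_} } @token_names;
    push @tokens, [ $name, $+{$name} ];
}

printf "%-10s %s\n", @$_ for @tokens;
```

This is still a series of tests on %+, but it is driven by one list, so regrouping the regex only means keeping @token_names in step with the alternation.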
Yes, I ended up simplifying my life and using this before I saw your
post:
my $code = qr{
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                (?> [^{}]+ )    # Non-curly without backtracking
              |
                (?&CODEBEGIN)   # Recurse to start of pattern
            )*
        )
        \}
    )
}x;
Then I go back and split the token on '\\\n' to weed out the escaped
newlines. My hope was to avoid re-scanning any string, but the RE and
concatenation rules just became unmanageable at some point and I
decided to cut my losses. I'm not familiar with the '++', but I will
look that up as an alternative to using (?> ). So far you are the
only person that has responded to this post, so I'm not hopeful that
I'll get a satisfactory answer from anyone as to what's happening
here.
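
On the '++' mentioned above: as far as the Perl 5.10 regex documentation describes it, a possessive quantifier such as X++ behaves like the atomic group (?>X+), so the two spellings should be interchangeable here. A small sketch with made-up sample strings:

```perl
use strict;
use warnings;

my $atomic     = qr/ \{ (?> [^{}]+ ) \} /x;   # atomic group
my $possessive = qr/ \{ [^{}]++ \} /x;        # possessive quantifier

# Both forms should accept and reject exactly the same strings.
my @samples = ('{abc}', "{two\nlines}", '{}', '{unclosed');
my @agree   = grep {
    ($_ =~ $atomic ? 1 : 0) == ($_ =~ $possessive ? 1 : 0)
} @samples;

# Neither form gives characters back once it has matched them.
my $plain_ok      = 'aaa' =~ /^a+a/     ? 1 : 0;   # backtracks one "a", matches
my $atomic_ok     = 'aaa' =~ /^(?>a+)a/ ? 1 : 0;   # keeps all three, fails
my $possessive_ok = 'aaa' =~ /^a++a/    ? 1 : 0;   # keeps all three, fails

print "agree on ", scalar(@agree), " of ", scalar(@samples), " samples\n";
print "plain=$plain_ok atomic=$atomic_ok possessive=$possessive_ok\n";
```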
Thanks,
-Clint