
Dynamic regexp


Winston Smith

Nov 13, 2003, 11:54:23 PM
Hi everybody,

I'm looking for a way to make a batch of s/// substitutions. As a code
sample is worth a thousand words, here is what my code currently looks like:

---

my @rules = (
['^HELLO (.*)$', 'BONJOUR $1'],
# ... lots of other rules
);

foreach my $rule (@rules) {
if ($string =~ s/$rule->[0]/$rule->[1]/ei) {
last;
}
}

---

With this code, 'HELLO WINSTON' becomes 'BONJOUR $1' and not 'BONJOUR
WINSTON' as I'd like.

I tried to put double quotes in the table of strings but it makes no
difference.

I also tried adding a second e option to the s/// operator so that the
string 'BONJOUR $1' is reinterpolated. Then I get the message 'Use of
uninitialized value in substitution iterator ...' as if $1 were not
defined. But if I put a
print $1;
instruction just before the
last;
instruction, it works and actually prints 'WINSTON' with the same example
as before.

Thank you in advance for your help.

Klaus Johannes Rusch

Nov 14, 2003, 11:12:39 AM
Winston Smith wrote:

> I'm looking for a way to make a batch of s/// substitutions. As a code
> sample is worth a thousand words, let see what me code is presently :
>
> ---
>
> my @rules = (
> ['^HELLO (.*)$', 'BONJOUR $1'],
> # ... lots of other rules
> );
>
> foreach my $rule (@rules) {
> if ($string =~ s/$rule->[0]/$rule->[1]/ei) {
> last;
> }
> }
>
> ---
>
> With this code, 'HELLO WINSTON' becomes 'BONJOUR $1' and not 'BONJOUR
> WINSTON' as I'd like.

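# Store each replacement with its own double quotes, so that it is a
# complete Perl string expression.
my @rules = (
    ['^HELLO (.*)$', '"BONJOUR $1"'],
);


foreach my $rule (@rules) {
    # eval() compiles the stored expression, which interpolates $1.
    if ($string =~ s/$rule->[0]/eval($rule->[1])/ei) {
        last;
    }
}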


--
Klaus Johannes Rusch
Klaus...@atmedia.net
http://www.atmedia.net/KlausRusch/

Greg Bacon

Nov 14, 2003, 8:54:42 AM
[Newsgroups field trimmed to remove comp.lang.perl.]

In article <2eZsb.44803$xI2.9...@news20.bellglobal.com>,
Winston Smith <winsto...@linuxmail.org> wrote:

: my @rules = (


: ['^HELLO (.*)$', 'BONJOUR $1'],
: # ... lots of other rules
: );
:
: foreach my $rule (@rules) {
: if ($string =~ s/$rule->[0]/$rule->[1]/ei) {
: last;
: }
: }
:
: ---
:
: With this code, 'HELLO WINSTON' becomes 'BONJOUR $1' and not 'BONJOUR
: WINSTON' as I'd like.

It requires a little chicanery because you can't use the FAQ answer
for "How can I expand variables in text strings?" -- you'd zap the
value of $1 you want to interpolate. Note how I had to modify your
rule and make s/// do a double-eval:

$ cat try
#! /usr/local/bin/perl

use warnings;
use strict;

my @rules = (
    ['^HELLO (.*)$', 'qq{BONJOUR $1}'],
    # ... lots of other rules
);

my $string = "HELLO WINSTON";

print "before: [$string]\n";

foreach my $rule (@rules) {
    if ($string =~ s/$rule->[0]/$rule->[1]/eei) {
        last;
    }
}

print "after: [$string]\n";

$ ./try
before: [HELLO WINSTON]
after: [BONJOUR WINSTON]

Hope this helps,
Greg
--
This is the great illusion of our age, the idea that a certain class of
people [i.e., government] is exempt from the moral judgments that apply
to the rest of us.
-- Gene Callahan

Greg Bacon

Nov 14, 2003, 10:50:48 AM
In article <2eZsb.44803$xI2.9...@news20.bellglobal.com>,
Winston Smith <winsto...@linuxmail.org> wrote:

: my @rules = (


: ['^HELLO (.*)$', 'BONJOUR $1'],
: # ... lots of other rules
: );
:
: foreach my $rule (@rules) {
: if ($string =~ s/$rule->[0]/$rule->[1]/ei) {
: last;

: }
: }

Here's a cleaner version than that in my other followup:

#! /usr/local/bin/perl

use warnings;
use strict;

my @rules = (
    ['^HELLO (.*)$', 'BONJOUR $1'],
    # ... lots of other rules
);

my $string = "HELLO WINSTON";

print "before: [$string]\n";

foreach my $rule (@rules) {
    if ($string =~ s/$rule->[0]/'qq{' . $rule->[1] . '}'/eei) {
        last;
    }
}

print "after: [$string]\n";

Hope this helps,
Greg
--
Sufficiently advanced political correctness is indistinguishable
from irony.
-- unknown

Eric Joanis

Nov 14, 2003, 5:28:07 AM
Dear Winston,

Winston Smith <winsto...@linuxmail.org> wrote:
>I'm looking for a way to make a batch of s/// substitutions. As a code
>sample is worth a thousand words, let see what me code is presently :
>

> my @rules = ( ['^HELLO (.*)$', 'BONJOUR $1'], ... );


> foreach my $rule (@rules) {
> if ($string =~ s/$rule->[0]/$rule->[1]/ei) {

The problem with this code is that $rule->[1] is only interpolated once,
so the "$1" it contains is not itself interpolated to the contents of the
$1 variable. To fix this, you need to get Perl to evaluate the replacement
string twice. After some trial and error, I found that this works:

if ($string =~ s/$rule->[0]/eval qq("$rule->[1]") /ei) {

If $rule->[1] is 'BONJOUR $1', then qq("$rule->[1]") yields "BONJOUR $1".
When this is evaluated again using eval, $1 is interpolated as you want it
to be.

Note that
eval '"' . $rule->[1] . '"'
would be equivalent to
eval qq("$rule->[1]")

Warning: I expect this code to be fairly slow, because Perl has to
recompile the expression at every iteration. I'd be happy to see a more
elegant solution to force perl to perform interpolation twice on a string,
if anyone has one, but I couldn't come up with one myself.
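For anyone who wants to try the double evaluation in isolation, here is a
minimal sketch. The apply_rules helper and its argument layout are made up
for illustration; only the s///e with eval qq("...") comes from the code
above, and it assumes the rule text is trusted and contains no double quotes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply the first matching rule.
sub apply_rules {
    my ($string, @rules) = @_;
    for my $rule (@rules) {
        my ($pattern, $template) = @$rule;
        # First pass: qq("$template") builds the Perl source text "BONJOUR $1".
        # Second pass: eval compiles that text, which interpolates $1.
        last if $string =~ s/$pattern/eval qq("$template")/ei;
    }
    return $string;
}

my $result = apply_rules('HELLO WINSTON', ['^HELLO (.*)$', 'BONJOUR $1']);
print "$result\n";    # prints: BONJOUR WINSTON
```

A string that matches no rule comes back unchanged.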

Cheers,

Eric

Malcolm Dew-Jones

Nov 14, 2003, 2:50:14 PM
Winston Smith (winsto...@linuxmail.org) wrote:
: Hi everybody,

: I'm looking for a way to make a batch of s/// substitutions. As a code
: sample is worth a thousand words, let see what me code is presently :

: ---

: my @rules = (
: ['^HELLO (.*)$', 'BONJOUR $1'],
: # ... lots of other rules
: );

: foreach my $rule (@rules) {
: if ($string =~ s/$rule->[0]/$rule->[1]/ei) {

Two ways to do this (well there's more than two but this is enough)

1

($string =~ s/$rule->[0]/"\"$rule->[1]\""/eei)
                         ^^^          ^^^ ^^

OR

2

'"BONJOUR $1"']
 ^          ^

($string =~ s/$rule->[0]/$rule->[1]/eei)
                                     ^

Brian McCauley

Nov 14, 2003, 3:03:52 PM
gba...@hiwaay.net (Greg Bacon) writes:

> It requires a little chicanery because you can't use the FAQ answer
> for "How can I expand variables in text strings?"

This, of course, is because the FAQ answer is $EXPLETIVE!

If the FAQ gave the true answer, rather than pretending that a
different question was asked, then you could use the FAQ answer.

I've tried several times to get the FAQ amended but the maintainers
are more concerned that the FAQ should not expose readers to
potentially dangerous techniques than that they actually answer the
questions honestly.

The honest answer to "How can I expand variables in text strings?" is:

chop( $string = eval "<<__EOS__\n$string\n__EOS__\n" );

There are very good reasons why often the above is a bad idea. (Let's
not discuss them here - we all know what they are).

However there is no good reason not to mention it in the FAQ. When
someone wants to learn how to fell trees you tell them about chainsaws
and you tell them about the dangers of chainsaws. You don't just tell
them some much less effective but less dangerous way and hope that
they won't discover chainsaws. Of course they will discover
chainsaws and then:

1) they'll not have had any training in their safe use.
2) they'll never trust you again as a mentor.

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\

Brian McCauley

Nov 14, 2003, 3:13:11 PM
gba...@hiwaay.net (Greg Bacon) writes:

> my @rules = (
> ['^HELLO (.*)$', 'BONJOUR $1'],

> );

It is better to use the natural representation of things.

The 1st element of each rule is naturally a regex.

The 2nd element is naturally code.

So the natural way to express this is:

my @rules = (
[ qr/^HELLO (.*)$/i, sub { "BONJOUR $1" } ],
);


> if ($string =~ s/$rule->[0]/'qq{' . $rule->[1] . '}'/eei) {

Using the natural representation this becomes:

> if ($string =~ s/$rule->[0]/$rule->[1]->()/e) {

Of course you could argue that the natural representation of the whole
rule is simply CODE.

my @rules = (
sub{ s/^HELLO (.*)$/BONJOUR $1/i },
);

for ( $string ) {
    foreach my $rule (@rules) {
        if ( &$rule ) {
            last;
        }
    }
}
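
Put together as a complete script, the regex-plus-code representation might
look like the sketch below. The translate helper and the second rule are
hypothetical additions for illustration; the rule layout and the s///e call
are the ones shown above.

```perl
use strict;
use warnings;

my @rules = (
    [ qr/^HELLO (.*)$/i,   sub { "BONJOUR $1" } ],
    [ qr/^GOODBYE (.*)$/i, sub { "AU REVOIR $1" } ],    # hypothetical extra rule
);

# Hypothetical helper: apply the first rule whose regex matches.
sub translate {
    my ($string) = @_;
    for my $rule (@rules) {
        my ($regex, $replacement) = @$rule;
        # The sub runs while the match is still current, so it sees $1.
        last if $string =~ s/$regex/$replacement->()/e;
    }
    return $string;
}

print translate('hello Winston'), "\n";    # prints: BONJOUR Winston
```

Nothing is compiled from a string at run time here, so there is no eval and
no quoting problem in the rule text.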

Brian McCauley

Nov 14, 2003, 3:22:23 PM
joa...@cs.toronto.edu (Eric Joanis) writes:

> if ($string =~ s/$rule->[0]/eval qq("$rule->[1]") /ei) {
>
> If $rule->[1] is 'BONJOUR $1', then qq("$rule->[1]") yields "BONJOUR $1".
> When this is evaluated again using eval, $1 is interpolate as you want it
> to be.

This falls apart if $rule->[1] contains double-quote characters.

There is a simple if rather ugly way around this:

> if ($string =~ s/$rule->[0]/my $r = eval "<<__EOS__\n$rule->[1]\n__EOS__\n"; chop $r; $r /ei) {

The problem is that just about everyone (myself included) who looks at
this problem comes up with the qq() solution first rather than the
here-doc solution.

This is just another reason why I think this should be in the FAQ.
But the FAQ maintainers won't have it. They consider it forbidden
knowledge that would harm the souls of lesser mortals.
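
A minimal sketch of the here-doc approach wrapped in a helper, with a rule
that contains double quotes. The expand_heredoc name is made up for
illustration, and the sketch assumes the rule text is fully trusted, since
anything in it is compiled as Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: expand variables in $template via a here-doc eval.
# DANGER: this compiles $template as Perl source; use trusted input only.
sub expand_heredoc {
    my ($template) = @_;
    my $expanded = eval "<<__EOS__\n$template\n__EOS__\n";
    die "expansion failed: $@" unless defined $expanded;
    chop $expanded;    # drop the trailing newline the here-doc adds
    return $expanded;
}

my ($pattern, $template) = ('^HELLO (.*)$', 'He said "BONJOUR $1"');
my $string = 'HELLO WINSTON';

# $1 is still set when the helper is called from the replacement part.
$string =~ s/$pattern/expand_heredoc($template)/ei;
print "$string\n";    # prints: He said "BONJOUR WINSTON"
```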

nadim

Nov 14, 2003, 2:02:50 AM
Hi,

use strict ;
use warnings ;

for my $input ('HELLO WINSTON', 'hello smallcase', 'HELLOX')
{
    if($input =~ qr/(HELLO)(.*)/)
    {
        for my $output_template ('match => $1', 'rest => $2', 'result => BONJOUR$2')
        {
            my $result;
            eval qq~ \$result = "$output_template" ;~ ;

            print "$input: $result\n" ;
        }
    }
}

Gaal Yahas

Nov 15, 2003, 5:45:42 AM
On Fri, Nov 14, 2003 at 08:22:23PM +0000, Brian McCauley wrote:
> > if ($string =~ s/$rule->[0]/eval qq("$rule->[1]") /ei) {
> >
> > If $rule->[1] is 'BONJOUR $1', then qq("$rule->[1]") yields "BONJOUR $1".
> > When this is evaluated again using eval, $1 is interpolate as you want it
> > to be.
>
> This falls appart if $rule->[1] contains double-quote characters.
>
> There is a simple if rather ugly way around this:
>
> > if ($string =~ s/$rule->[0]/my $r = eval "<<__EOS__\n$rule->[1]\n__EOS__\n"; chop $r; $r /ei) {
>
> The problem is that just about everyone (myself included) who looks at
> this problem comes up with the qq() solution first rather than the
> here-doc solution.

Doesn't this also fall apart in the (perverse) case where the string
contains a substring that matches /^__EOS__$/m (and, say, a ";BEGIN{ print
'Executed' }" immediately after that)? Or was that part of your point
about security implications?
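
If the rule text is not fully trusted, one way to sidestep both problems is
to skip eval and substitute the capture groups by hand. This is an
alternative that nobody proposed in the thread, and the apply_rule_safely
helper is a made-up name. It is only a sketch: it understands $1, $2, and so
on in the template and nothing else.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the matched text with $template, expanding
# only $1, $2, ... and never compiling the template as Perl.
sub apply_rule_safely {
    my ($string, $pattern, $template) = @_;
    return unless $string =~ /$pattern/i;

    # Copy offsets and captures before the next regex overwrites @- and @+.
    my ($start, $end) = ($-[0], $+[0]);
    my @groups;
    for my $n (1 .. $#+) {
        $groups[$n] = defined $-[$n]
            ? substr($string, $-[$n], $+[$n] - $-[$n])
            : '';
    }

    # Unknown group numbers expand to the empty string.
    (my $replacement = $template)
        =~ s{\$([1-9][0-9]*)}{ defined $groups[$1] ? $groups[$1] : '' }ge;

    substr($string, $start, $end - $start, $replacement);
    return $string;
}

# Code embedded in the template stays literal text.
my $hostile = 'BONJOUR $1 @{[ system("echo pwned") ]}';
print apply_rule_safely('HELLO WINSTON', '^HELLO (.*)$', $hostile), "\n";
# prints: BONJOUR WINSTON @{[ system("echo pwned") ]}
```

The helper returns undef when the pattern does not match, so the caller can
move on to the next rule.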

--
Gaal Yahas <ga...@forum2.org>
http://gaal.livejournal.com/
