
regexp problem


Kenny McCormack

Feb 8, 2024, 4:32:11 AM
For some reason, when I use "regexp -inline" and try to capture the output,
it fails. Observe:

First, run expect, capture the output - which is the digit string (this works):
expect 1.8> regexp -inline \[0-9]+ [time {sleep 5}]
Result: 5000267

Now, try to capture that result in variable foo:
expect 1.9> set foo [regexp -inline \[0-9]+ [time {sleep 5}]]
Error: wrong # args: should be "regexp ?-option ...? exp string ?matchVar? ?subMatchVar ...?"

Now, try to do the same thing, using lindex, but get same result:
expect 1.11> puts "Time: [lindex [regexp -inline \[0-9]+ [time {sleep 5}]] 0]"
Error: wrong # args: should be "regexp ?-option ...? exp string ?matchVar? ?subMatchVar ...?"

Why?


Harald Oehlmann

Feb 8, 2024, 7:16:34 AM
On 08.02.2024 at 10:32, Kenny McCormack wrote:
> For some reason, when I use "regexp -inline" and try to capture the output,
> it fails. Observe:
>
> First, run expect, capture the output - which is the digit string (this works):
> expect 1.8> regexp -inline \[0-9]+ [time {sleep 5}]
> Result: 5000267
>
> Now, try to capture that result in variable foo:
> expect 1.9> set foo [regexp -inline \[0-9]+ [time {sleep 5}]]
> Error: wrong # args: should be "regexp ?-option ...? exp string ?matchVar? ?subMatchVar ...?"
>
> Now, try to do the same thing, using lindex, but get same result:
> expect 1.11> puts "Time: [lindex [regexp -inline \[0-9]+ [time {sleep 5}]] 0]"
> Error: wrong # args: should be "regexp ?-option ...? exp string ?matchVar? ?subMatchVar ...?"
>
> Why?
>

Try:

set foo [regexp -inline {[0-9]+} [time {sleep 5}]]

or

set foo [regexp -inline \[0-9\]+ [time {sleep 5}]]

The unescaped "]" in your RE closes the "[" that opens the command
substitution. It makes no difference as long as no "[" is open.

Harald

Kenny McCormack

Feb 8, 2024, 8:35:25 AM
In article <uq2gmt$1v2vf$1...@dont-email.me>,
Harald Oehlmann <wort...@yahoo.com> wrote:
...
>> Why?
>>
>
>Try:
>
>set foo [regexp -inline {[0-9]+} [time {sleep 5}]]
>
>or
>
>set foo [regexp -inline \[0-9\]+ [time {sleep 5}]]
>
>The unescaped "]" in your RE closes the "[" that opens the command
>substitution. It makes no difference as long as no "[" is open.

I'm actually more interested in "why", than in alternatives.
(But note: Yes, both of your alternatives do work)

So, maybe an insight as to why?
The point is why does it work OK when you just print the result, but fails
when you wrap it inside a "set" to capture the result?

I suppose it is an instance of a common/general problem with TCL - that it
encourages "slop" - coding that isn't correct, but usually works. I've
gotten into the habit - with all kinds of Unix "regular expression"-using
programs - that you only have to escape the first [ (and not the closing ])
because if the first one is escaped, then the second one is sort of
'auto-escaped'. You'd be surprised how often this works OK (even though, I
suppose, it shouldn't).

This kind of "slop" is why some people say disparaging things about TCL as
a programming language. Well, one of the common reasons; there are others.
I'm not really a TCL programmer (I just fake it), but I've been programming
Expect for decades now.


Ralf Fassel

Feb 8, 2024, 8:53:37 AM
* gaz...@shell.xmission.com (Kenny McCormack)
| >set foo [regexp -inline {[0-9]+} [time {sleep 5}]]
| >
| >or
| >
| >set foo [regexp -inline \[0-9\]+ [time {sleep 5}]]
| >
| >The unescaped "]" in your RE closes the "[" that opens the command
| >substitution. It makes no difference as long as no "[" is open.
>
| I'm actually more interested in "why", than in alternatives.
| (But note: Yes, both of your alternatives do work)
>
| So, maybe an insight as to why?
>
| The point is why does it work OK when you just print the result, but fails
| when you wrap it inside a "set" to capture the result?

The point is not the 'set' itself, but the additional pair of [].

In
regexp -inline \[0-9]+ [time {sleep 5}]

the closing "]" is not matched by an opening "[", so TCL just 'uses' it,
as in

set x ]

However, in

set x [regexp -inline \[0-9]+ [time {sleep 5}]]

The "]" after the '9' closes the opening "[" of the 'set' command, so
regexp is actually called with

regexp -inline \[0-9

which is clearly not enough arguments :-). (Note that if it were to
succeed, then 'set' would complain about too many arguments:
the "+ [time {sleep 5}]]".)
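
The effect can be reproduced in plain tclsh with a rough sketch like the
one below. It assumes a fixed sample string in place of [time {sleep 5}]
(sleep is Expect-only), and the variable names are made up for
illustration. The top-level call uses a match variable rather than
-inline, because capturing with [ ] would itself open a bracket.

```tcl
# Sample string standing in for the output of [time {sleep 5}] (assumption).
set sample "5000267 microseconds per iteration"

# At top level no "[" is open, so the bare "]" is an ordinary character
# and regexp receives the pattern [0-9]+ . The match lands in variable top.
regexp \[0-9]+ $sample top

# Inside a command substitution the bare "]" ends the substitution early,
# so regexp is called with too few arguments. The script is braced and run
# through catch so the error can be inspected instead of aborting.
set rc [catch {set foo [regexp -inline \[0-9]+ $sample]} msg]

# Braces suppress all substitution, so this form is safe in both places.
set ok [regexp -inline {[0-9]+} $sample]

puts "top-level: $top"
puts "nested:    rc=$rc msg=$msg"
puts "braced:    $ok"
```

The exact wording of the error message may differ between Tcl versions,
but it should start with "wrong # args".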

HTH
R'

Kenny McCormack

Feb 8, 2024, 10:38:46 AM
In article <ygasf23...@panther.akutech-local.de>,
Ralf Fassel <ral...@gmx.de> wrote:
...
>The point is not the 'set' itself, but the additional pair of [].
>
>In
> regexp -inline \[0-9]+ [time {sleep 5}]
>
>the closing "]" is not matched by an opening "[", so TCL just 'uses' it,
>as in
>
> set x ]
>
>However, in
>
> set x [regexp -inline \[0-9]+ [time {sleep 5}]]
>
>The "]" after the '9' closes the opening "[" of the 'set' command, so
>regexp is actually called with
>
> regexp -inline \[0-9
>
>which is clearly not enough arguments :-). (Note that if it were to
>succeed, then 'set' would complain about too many arguments:
>the "+ [time {sleep 5}]]".)

Yes, interesting. Thanks.

I guess the moral is: Don't be sloppy!


Luc

Feb 8, 2024, 11:35:37 AM
On Thu, 8 Feb 2024 09:32:06 -0000 (UTC), Kenny McCormack wrote:

>For some reason, when I use "regexp -inline" and try to capture the output,
>it fails. Observe:
>
>First, run expect, capture the output - which is the digit string (this
>works):
> expect 1.8> regexp -inline \[0-9]+ [time {sleep 5}]
> Result: 5000267
>
>Now, try to capture that result in variable foo:
> expect 1.9> set foo [regexp -inline \[0-9]+ [time {sleep 5}]]
> Error: wrong # args: should be "regexp ?-option ...? exp
> string ?matchVar? ?subMatchVar ...?"
>
>Now, try to do the same thing, using lindex, but get same result:
>expect 1.11> puts "Time: [lindex [regexp -inline \[0-9]+ [time {sleep 5}]]
>0]" Error: wrong # args: should be "regexp ?-option ...? exp
>string ?matchVar? ?subMatchVar ...?"
>
>Why?
**************************


I don't know what the output of [time {sleep 5}] is. Can you please
provide some string I can work on? I would like to make a few tests.


--
Luc

et99

Feb 8, 2024, 4:22:36 PM
Based on his working example, the output is the output of [time] and his [sleep 5] must be 5 seconds. I have a wait proc that does that in ms, so,

% regexp -inline \[0-9\]+ [time {wait 5000}]
5002580
% time {wait 5000}
5012879 microseconds per iteration
%

Rich

Feb 8, 2024, 4:50:17 PM
et99 <et...@rocketship1.me> wrote:
> On 2/8/2024 8:35 AM, Luc wrote:
>> On Thu, 8 Feb 2024 09:32:06 -0000 (UTC), Kenny McCormack wrote:
>>
>>> For some reason, when I use "regexp -inline" and try to capture the
>>> output, it fails. Observe:
>>>
>>> First, run expect, capture the output - which is the digit string
>>
>> I don't know what the output of [time {sleep 5}] is. Can you please
>> provide some string I can work on? I would like to make a few
>> tests.
>>
> Based on his working example, the output is the output of [time] and
> his [sleep 5] must be 5 seconds. I have a wait proc that does that
> in ms, so,
>
> % regexp -inline \[0-9\]+ [time {wait 5000}]
> 5002580
> % time {wait 5000}
> 5012879 microseconds per iteration
> %

Kenny's programming Tcl via Expect.

Expect includes a 'sleep' proc as a built in, which works (per the
expect manpage) the same as the Unix/Linux 'sleep' command. The big
difference from Tcl's 'after' is Expect's 'sleep' is documented as
allowing Tk events to be processed while it sleeps.

et99

Feb 8, 2024, 10:14:06 PM
When I capture the microseconds of a time command, I just use [lindex [time ...] 0] rather than a regex. And if you use the repeat count > 1, that value can be a decimal so the regex above would stop before the decimal point.
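
To illustrate the difference, here is a small sketch. The sample string is a made-up stand-in for what [time $script 10] might return when the per-iteration average is fractional; it is not measured output, and the variable names are only for illustration.

```tcl
# Hypothetical result of a [time] call with a repeat count > 1 (assumption).
set sample "1234.5 microseconds per iteration"

# The regexp stops at the decimal point and silently truncates the value.
set viaRegexp [lindex [regexp -inline {[0-9]+} $sample] 0]

# Taking the first list element keeps the full number.
set viaLindex [lindex $sample 0]

# Convert microseconds to seconds.
set seconds [expr {$viaLindex / 1000000.0}]

puts "regexp:  $viaRegexp"
puts "lindex:  $viaLindex"
puts "seconds: $seconds"
```

Here the regexp version yields 1234 while lindex yields 1234.5.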

FWIW, my [wait] proc uses after and vwait with uniquely generated variable names to wait on. So, it can be used at script level or inside events; however, nested waits must unwind inside-out, even if an outer timer expires first. Unlike a plain [after $ms], though, it doesn't block the event loop.

However, because of that limitation, I've lately turned to using threads with fifo queuing, if I need in-line delays.

Here's my wait, a bit ugly, but it does work as described.

proc wait { ms } {
    set uniq [incr ::__sleep__tmp__counter]
    set ::__sleep__tmp__$uniq 0
    after $ms set ::__sleep__tmp__$uniq 1
    vwait ::__sleep__tmp__$uniq
    unset ::__sleep__tmp__$uniq
}


Luc

Feb 8, 2024, 11:39:45 PM
On Thu, 8 Feb 2024 15:38:41 -0000 (UTC), Kenny McCormack wrote:

>I guess the moral is: Don't be sloppy!

I no longer remember how but I specifically remember that I once
learned to always enclose regular expressions in curly brackets.
Always. Forget the whole escaping business. Just make sure to
enclose your regular expressions in curly brackets.
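
A quick sketch of why the braces matter, using a made-up test string
(the variable names are illustrative only):

```tcl
set text "abc 123"

# Braced: the pattern reaches regexp exactly as written.
set braced [regexp -inline {\d+} $text]

# Double-quoted: Tcl performs backslash substitution first, so "\d+"
# becomes the pattern d+ before regexp ever sees it, and nothing matches.
set quoted [regexp -inline "\d+" $text]

puts "braced: '$braced'"
puts "quoted: '$quoted'"
```

The braced pattern finds 123; the quoted one returns an empty list.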

And I never had a problem again.

--
Luc

Harald Oehlmann

Feb 9, 2024, 2:44:34 AM
+1

et99

Feb 9, 2024, 3:35:41 AM
I use a tool called regexbuddy4. It's a commercial product, but well worth the small price. The best part is that it can write code and it has templates for 20+ languages, including tcl. Here's a sample, with a comment tree:

# ([0-9]+)
#
# Options: Case sensitive; Exact spacing; Dot doesn’t match line breaks; ^$ match at line breaks
#
# Match the regex below and capture its match into backreference number 1 «([0-9]+)»
# Match a single character in the range between “0” and “9” «[0-9]+»
# Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»


regexp -linestop -lineanchor {([0-9]+)} $input -> number


I selected "get the part of a string matched by the first capturing group" as the coding goal.


Kenny McCormack

Feb 9, 2024, 7:56:58 AM
In article <uq459p$2etnh$1...@dont-email.me>, et99 <et...@rocketship1.me> wrote:
...
>When I capture the microseconds of a time command, I just use [lindex [time ...]
>0] rather than a regex.

Thanks for that. It never occurred to me that I could extract out the
number via "lindex". Six of one, I suppose.

But I still have to push it through "expr" to convert it to seconds (i.e.,
divide it by a million).


Kenny McCormack

Feb 9, 2024, 7:58:27 AM
In article <20240208133...@lud1.home>, Luc <l...@sep.invalid> wrote:
...
>I don't know what the output of [time {sleep 5}] is. Can you please
>provide some string I can work on? I would like to make a few tests.

Yes, as others have noted, "sleep" is a builtin in Expect.


Rich

Feb 9, 2024, 8:38:58 AM
Kenny McCormack <gaz...@shell.xmission.com> wrote:
> In article <uq459p$2etnh$1...@dont-email.me>, et99 <et...@rocketship1.me> wrote:
> ...
>>When I capture the microseconds of a time command, I just use [lindex [time ...]
>>0] rather than a regex.
>
> Thanks for that. It never occurred to me that I could extract out the
> number via "lindex". Six of one, I suppose.

'lindex' will work here (because the output from 'time' is both
consistent and a valid Tcl list). But be careful applying lindex to any
arbitrary string; doing so can become the source of 'weird' bugs that
only appear months/years later when "just the right string" happens
through.

I.e.:

$ rlwrap tclsh
% set s "string a b { c d e"
string a b { c d e
% lindex $s 0
unmatched open brace in list
%
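
One possible way to guard against that is sketched below. The helper name
first_word is made up for this example, and it assumes Tcl 8.5 or later
for 'string is list'. Note that the fallback splits on whitespace, so for
non-list input it does not follow list quoting rules.

```tcl
# Hypothetical helper: return the first word of any string, whether or
# not the string happens to be a well-formed Tcl list.
proc first_word {s} {
    if {[string is list $s]} {
        return [lindex $s 0]
    }
    # Not a valid list: fall back to the first run of non-whitespace.
    if {[regexp {\S+} $s word]} {
        return $word
    }
    return ""
}

puts [first_word "5012879 microseconds per iteration"]
puts [first_word "string a b { c d e"]
```

With this, the unbalanced string above yields "string" instead of raising
an error.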


> But I still have to push it through "expr" to convert it to seconds (i.e.,
> divide it by a million).

True, as it returns microseconds. But that is not too hard to
accommodate if you would prefer not having to write out the division at
every usage:

proc time_s {script {count 1}} {
    return [expr {[lindex [time $script $count] 0] / 1000000.0}]
}

% time_s {after 500}
0.500647
%

Or, if you really want to get 'fancy':

rename time ::tcl::time
proc time {script {count 1}} {
    return [expr {[lindex [::tcl::time $script $count] 0] / 1000000.0}]
}

% time {after 500}
0.500636
%

Although renaming the real time could mess with any package/modules
that also /might/ use [time]. So be careful with this 'fancy' method.

et99

Feb 10, 2024, 8:56:59 PM
Here's mine, because at 4 in the morning, my eyes need a break and neatness helps.

proc timems {args} {
    set result [uplevel 1 time $args]
    set number [format %.3f [expr {( [lindex $result 0] / 1000. )}]]
    set number [regsub -all {\d(?=(\d{3})+($|\.))} $number {\0,}]
    return "[format %12s $number ] milliseconds"
}


% timems {wait 1}
1.042 milliseconds
% timems {wait 5432}
5,432.084 milliseconds
% timems {math::fibonacci 1000000}
49,629.625 milliseconds

Luc

Feb 11, 2024, 11:15:31 PM
On Fri, 9 Feb 2024 00:35:35 -0800, et99 wrote:

>I use a tool called regexbuddy4. It's a commercial product, but well worth
>the small price. The best part is that it can write code and it has
>templates for 20+ languages, including tcl.

I've been using Visual Regexp for more than 20 years.

http://laurent.riesterer.free.fr/regexp/

It's written in Tcl and is mostly focused on Tcl, but it's usable for
other situations. I still use it for sed and used it for Perl and PHP
back in the day.


--
Luc
