regexp tcl help

SANKY

Jun 14, 2017, 12:46:33 AM
Hi,

I need help getting the correct regexp for the following requirement:

I have a string with the value "hi hello bye hello".

I want to get the first match for the string before "hello", which is "hi".

I wrote some code, but it is not producing the correct match.

Snippet:
=========

% set a "hi hello bye hello"
hi hello bye hello
% regexp "(.*)hello" $a d s
1
% puts $d
hi hello bye hello
% puts $s
hi hello bye
%


I need "hi" as the value of variable s, but I am getting "hi hello bye".

Appreciate your inputs.

Thanks,
Sankar.

Arjen Markus

Jun 14, 2017, 3:12:21 AM
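Your example is not quite in line with your requirement. If regexp stopped at the first occurrence of "hello", the given capturing subexpression would yield "hi " (with a trailing space). If that is what you want, you can start from a pattern anchored at the beginning of the string, though note that the anchor alone does not stop .* from matching greedily up to the last "hello":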

regexp {^(.*)hello} $string d s

However, then it would be simpler to:
- search for the first occurrence of "hello": set position [string first "hello" $string]
- then take the substring: set s [string range $string 0 $position-1]
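
The two steps above can be sketched as a small helper. This is only an illustration: the proc name textBefore and the choice to return an empty string when the word is not found are assumptions, not something from the original post.

```tcl
# Sketch: take everything before the first occurrence of a word, without regexp.
# textBefore is a hypothetical helper name used only for this example.
proc textBefore {haystack needle} {
    set position [string first $needle $haystack]
    if {$position < 0} {
        # Word not found: return an empty string (an assumed convention).
        return ""
    }
    return [string range $haystack 0 [expr {$position - 1}]]
}

set a "hi hello bye hello"
set s [textBefore $a "hello"]
puts "<$s>"                  ;# prints <hi >
puts "<[string trim $s]>"    ;# prints <hi>
```

The result keeps the space in front of "hello"; string trim removes it if only the word itself is wanted.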

Your regular expression fails because regexp is greedy by default: .* matches as much as possible, so the match extends to the last "hello". You might get by with the non-greedy variant,

regexp {^(.*?)hello} $string d s

but I am not sure it will always work (well, in this case it would, if I interpret the manual page correctly).
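
For comparison, here is a minimal sketch of the greedy and non-greedy patterns run side by side on the string from the question. The variable names greedy and lazy are made up for the example, and -> is just a throwaway variable name for the overall match.

```tcl
set a "hi hello bye hello"

# Greedy: .* extends as far as possible, so the match ends at the LAST "hello".
regexp {^(.*)hello} $a -> greedy
puts "<$greedy>"    ;# prints <hi hello bye >

# Non-greedy: .*? extends as little as possible, so the match ends at the FIRST "hello".
regexp {^(.*?)hello} $a -> lazy
puts "<$lazy>"      ;# prints <hi >
```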

Regards,

Arjen

Mike Griffiths

Jun 14, 2017, 4:55:48 PM
It should work fine. But since you only care about the string up to the first "hello", and not the "hello" itself, you can refine the regexp a little (and remove the need for submatches) using a positive lookahead:

% set a "hi hello bye hello"
% set r {^.+?(?=hello)}
% regexp -inline $r $a
{hi }

The .+? matches at least one character, but as few as possible while still making sure the overall regexp matches, and (?=hello) tells it that it should see (but not capture in any way) "hello" immediately afterwards. So the entire pattern just matches "hi ".
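
As a rough sketch of how the result might be used: regexp -inline returns a list, so the match has to be taken out with lindex, and the trailing space can be trimmed afterwards. The variable names matches and prefix are assumptions for this example.

```tcl
set a "hi hello bye hello"

# -inline returns the matches as a list; with no capturing groups the list
# holds only the overall match, which the lookahead keeps free of "hello".
set matches [regexp -inline {^.+?(?=hello)} $a]
set prefix  [lindex $matches 0]
puts "<$prefix>"                 ;# prints <hi >
puts "<[string trim $prefix]>"   ;# prints <hi>

# .+? needs at least one character, so a string that starts with "hello" does not match.
puts [regexp {^.+?(?=hello)} "hello world"]   ;# prints 0
```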

sled...@gmail.com

Jun 15, 2017, 12:26:18 AM
Seems like some are making it more difficult than it needs to be:
set theword [regexp -inline -- {\w+} $a]

Doesn't get much simpler than that.

Arjen Markus

Jun 15, 2017, 3:09:34 AM
Where is the "hello" in there?

Regards,

Arjen

sled...@gmail.com

Jun 15, 2017, 11:21:32 AM
> I want to get first match for string before "hello" which is "hi".

Sort of ambiguous...
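
The two readings only differ once the input changes. A small sketch comparing them, where the second input string "good morning" and the variable names first and before are made up for illustration:

```tcl
# Two readings of the requirement, compared on inputs with and without the word.
foreach a {"hi hello bye hello" "good morning"} {
    # Reading 1: the first word of the string, whatever follows it.
    set first [lindex [regexp -inline -- {\w+} $a] 0]
    # Reading 2: the text in front of the first hello; stays empty if there is none.
    set before ""
    regexp {^(.*?)hello} $a -> before
    puts "first=<$first> before=<$before>"
}
# prints first=<hi> before=<hi >
# prints first=<good> before=<>
```

On the original string both give "hi" (apart from the trailing space), which is why either answer looks right for the example as posted.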

BTW: really appreciate the contributions you have made to the tcl community...

I have developed some speech rec apps that I would like to use to 'give back' to those whose tcl code I have learned from...
However, I am not what you would call a programmer with a "P", so I'm afraid it would be a little embarrassing...