
My suggestion


Googie

Feb 13, 2003, 4:41:03 PM
Hi,
Here's my suggestion for Tcl in future releases:
Let's add a '%' char to glob-style matching. It should match exactly one word
(that is, one or more chars, none of which is whitespace). Some
people may say it's not needed, but I know it is; I've needed it many times.
Some time ago I was scripting in EPIC4, where it was available and very
helpful. For example, I'm now writing an application with IRC protocol
support, and matching messages from ircd is very difficult without '%'.
Don't tell me to use regexp. It's fu**ing hard to learn and (I think)
can't match some strings the way glob-style matching with '%' can.

I'm done :) Waiting for your opinions...
--
Googie

Glenn Jackman

Feb 13, 2003, 3:56:17 PM

I'm telling you to use regexp.
# match the first word following the colon
set str {this is a sentence: zebra is the first word after colon}
regexp {:\s*(\S+)} $str -> first ;# ==> $first eq "zebra"

--
Glenn Jackman
NCF Sysadmin
gle...@ncf.ca

Googie

Feb 13, 2003, 5:17:50 PM
Glenn Jackman wrote:

That glob-style matching is easier is reason enough to add it. I'm not
thinking only of my own use (that too, of course), but of users who
write scripts with 'ON' reactions to server input. For them glob-style
matching is more user friendly: a user can match an event with a
simple on {:* KICK * :*} {code} instead of the magic strings that regexp
needs. But the char described above is missing.

--
Googie

Googie

Feb 13, 2003, 5:21:07 PM
Glenn Jackman wrote:

...and one more thing: you don't understand what I meant. I don't want to search
for some word in a string. I want to check whether my mask matches the string.

--
Googie

Derk Gwen

Feb 13, 2003, 4:53:24 PM
Googie <do...@want.a.spam.org> wrote:
# Hi,
# Here's my suggestion for Tcl in future releases:
# Let's add a '%' char to glob-style matching. It should match exactly one word
# (that is, one or more chars, none of which is whitespace). Some
# people may say it's not needed, but I know it is; I've needed it many times.
# Some time ago I was scripting in EPIC4, where it was available and very
# helpful. For example, I'm now writing an application with IRC protocol
# support, and matching messages from ircd is very difficult without '%'.
# Don't tell me to use regexp. It's fu**ing hard to learn and (I think)
# can't match some strings the way glob-style matching with '%' can.

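You can also convert your globs into Tcl regexps with something like
proc glob% pattern {
    # The character class is braced so it stays a single [string map] list element
    return ^[string map {* .* ? . % {[^ \t\n]+}} $pattern]$
}
...
# [array names] accepts -regexp; [array get] only takes a glob pattern
array names a -regexp [glob% abc%*]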

--
Derk Gwen http://derkgwen.250free.com/html/index.html
Death is the worry of the living. The dead, like myself,
only worry about decay and necrophiliacs.

WL

Feb 13, 2003, 6:06:37 PM
In article <b2gvqd$bkf$1...@atlantis.news.tpi.pl>,

Hm...I've always thought glob style matching is sugar for regexp...

Anyways, I'm pretty sure any string sequence described by glob
style matching can be described by regular expressions. It does
seem awfully handy, but a simplistic translation into RE doesn't
seem that difficult.

proc slob {pattern str} {

    # super glob --> slob :)
    # returns 1 if pattern matches string; 0 otherwise
    # mimics glob style matching, with % to mean "words"
    # will probably have troubles with ranges...

    if {[string match *%* $pattern]} {
        set newpat "^[string map {* .* ? . % \\S+} $pattern]$"
        #puts $newpat
        return [regexp -- $newpat $str]
    } else {
        return [string match $pattern $str]
    }
}

% slob {hey, % there!} "hey, you there!"
1
% slob {*hey, % there!*} "blah hey, you there! blah"
1
% slob {hey, % there!} "hey, there!"
0
% slob {hey, % there!} "hey, 3-4 there!"
1
% slob {hey, % there!} "hey, 3(4) there!"
1
% slob {hey, * there!} "hey, 3(4) there!"
1
% slob {hey, * there!} "hey, 3(4) there!"

WL

>I'm done :) Waiting for your opinions...
>--
>Googie
>


--
real email: w l i a o @ s d f . l o n e s t a r . o r g

Glenn Jackman

Feb 13, 2003, 7:07:36 PM
WL <m...@privacy.net> wrote:
> proc slob {pattern str} {
>
> # super glob --> slob :)
> # returns 1 if pattern matches string; 0 otherwise
> # mimics glob style matching, with % to mean "words"
> # will probably have troubles with ranges...
>
> if {[string match *%* $pattern]} {
> set newpat "^[string map {* .* ? . % \\S+} $pattern]$"
> #puts $newpat
> return [regexp -- $newpat $str]
> } else {
> return [string match $pattern $str]
> }
> }

Nice!

the [string map] list could be enhanced to protect regexp special chars,
like '.', but that's left as an exercise for the reader.
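
One way that exercise might go is sketched below. The names mask2re and slobSafe are made up for illustration and are not from the thread. The idea is to backslash-escape every punctuation character first, then turn the escaped wildcards back into their regexp equivalents. As written, [...] ranges in the mask are treated literally and there is no way to escape a wildcard.

```tcl
# Hypothetical helper: translate a glob-style mask (with % for "one word")
# into an anchored regexp, protecting regexp special characters.
proc mask2re {mask} {
    # Put a backslash in front of everything that is not a word char or whitespace
    regsub -all {[^\w\s]} $mask {\\&} re
    # The wildcards are now "\*", "\?" and "\%"; map them to regexp syntax
    set re [string map {{\*} .* {\?} . {\%} {\S+}} $re]
    return "^$re\$"
}

# Hypothetical wrapper: returns 1 if the mask matches the whole string, else 0
proc slobSafe {mask str} {
    return [regexp -- [mask2re $mask] $str]
}

puts [slobSafe {hey, % there!} "hey, you there!"]   ;# 1
puts [slobSafe {%.c} "abc"]                         ;# 0, the "." is now literal
puts [slobSafe {%.c} "slob.c"]                      ;# 1
```

Because the mask always goes through regexp here, the fast path through [string match] in the original slob is given up in exchange for consistent behaviour.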

Donald Arseneau

Feb 13, 2003, 8:35:05 PM
xx...@freenet.carleton.ca (Glenn Jackman) writes:

> WL <m...@privacy.net> wrote:
> > proc slob {pattern str} {
> >
> > # super glob --> slob :)
> > # returns 1 if pattern matches string; 0 otherwise
> > # mimics glob style matching, with % to mean "words"
> > # will probably have troubles with ranges...
> >
> > if {[string match *%* $pattern]} {
> > set newpat "^[string map {* .* ? . % \\S+} $pattern]$"
> > #puts $newpat
> > return [regexp -- $newpat $str]
> > } else {
> > return [string match $pattern $str]
> > }
> > }
>
> Nice!

The other nice feature is in allowing [^ ] style matching,
which I have wanted to use in [glob] (specifically, in
[array names] to return all but those with certain tags).

Donald Arseneau as...@triumf.ca


bo...@aol.com

Feb 13, 2003, 11:14:49 PM
Googie <do...@want.a.spam.org> wrote in message news:<b2gvqd$bkf$1...@atlantis.news.tpi.pl>...

I recommend trying the 'regular expression visualizer', which is a
tool I found in this ng. It makes it easy to write an re. You drop a
sample search string into the top window. Then type in your re in the
middle window. As you type it in, you will see the matches in the
bottom window. Click the 'explore' button and when you move the
cursor around your re, the matching parts of the search string are
highlighted. When you're done, cut the re and paste it in your
program. It's a great way to proof test an re before putting it into
your code.

I don't have the url for this, but you can google for it and find it.

Hope this helps.

bob

Googie

Feb 14, 2003, 3:40:08 AM
Donald Arseneau wrote:

You still don't understand me :)
I'll explain with an example. Take 2 strings:
:server.irc MODE nick :craps
:server.irc 461 nick MODE +l :craps
Now I can use the following mask to match the first:
:* MODE * :*
It works, but it also matches the second string, which isn't wanted. And
if I put the mask ':* 461 * MODE +l :*' after the above mask, the first mask
still matches and no further masks are tried.
In EPIC4 (which supports %) it could look like:
:% MODE % :*
Here there is no way for one mask to match both messages. It's pretty easy and
useful.
Please, don't tell me to use regexp. I know that it can give really nice
results, but glob-style matching is much easier, much more user-friendly, and
known by more people, because it's used by bash, etc.

--
Googie

jul...@precisium.com.au

Feb 14, 2003, 3:47:32 AM

I like this.
With the addition of escapes for the glob chars, and _ to match
arbitrary whitespace - it works for me :)


set newpat "^[string map {\\* [*] * .* \\? [?] ? . \\% [%] % \\S+ \\_ [_] _ \\s+} $pattern]$"


I guess though that adding the % char etc to the built-in glob may be
nice from a standardization point of view - so that there aren't all
these roll-your-own globs/slobs with gratuitously different behaviour...
but is there any widely recognised standard behaviour in other systems
anyway? ...
I suspect the answer is that the standard is that *,?,[] matching is in,
and for anything more complex you use regular expressions - otherwise
people will keep finding things to add to glob until it's getting nearly
as complex as regexes anyway!

JMN

Steve Cassidy

Feb 14, 2003, 5:42:19 AM
Googie wrote:
> Donald Arseneau wrote:
>>xx...@freenet.carleton.ca (Glenn Jackman) writes:
>>>WL <m...@privacy.net> wrote:
>>>
>>>>proc slob {pattern str} {
>>>>

>

> You still don't understand me :)

Or do we? The suggestion is that you don't need tcl's glob
to change in order to achieve what you want, just write
something like slob shown here and use it in your application
instead.

In general, you don't need to settle for any matching mechanism
provided by Tcl (or whatever language you use), just write a matcher
that fits your task.
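
As an illustration of a task-specific matcher, here is a minimal sketch that compares mask and line word by word, so "*" can never run across a space except in the last mask word, which is matched against the rest of the line. The name wordmatch is invented for this example, and splitting on single spaces is an assumption about the input.

```tcl
# Hypothetical word-wise matcher: each mask word is glob-matched against
# exactly one word of the line; the final mask word gets the remainder.
proc wordmatch {mask line} {
    set mwords [split $mask " "]
    set lwords [split $line " "]
    set last [expr {[llength $mwords] - 1}]
    for {set i 0} {$i <= $last} {incr i} {
        set m [lindex $mwords $i]
        if {$i == $last} {
            # Last mask word: match it against everything that is left
            return [string match $m [join [lrange $lwords $i end] " "]]
        }
        if {$i >= [llength $lwords] || ![string match $m [lindex $lwords $i]]} {
            return 0
        }
    }
    return 1
}

puts [wordmatch {:* MODE * :*} ":server.irc MODE nick :craps"]          ;# 1
puts [wordmatch {:* MODE * :*} ":server.irc 461 nick MODE +l :craps"]   ;# 0
```

With this approach the ordinary ':* MODE * :*' mask already tells the two IRC lines apart, at the cost of changing what "*" means in the middle of a mask.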

Steve

Donal K. Fellows

Feb 14, 2003, 10:08:25 AM
WL wrote:
> Hm...I've always thought glob style matching is sugar for regexp...

They're similar but not the same. [glob]-style matching is much faster but far
less capable, and to the untrained eye the glob language looks similar to the RE
language. But:
"*" maps to ".*"
"?" maps to "."
"[...]" is much more limited in globs
"\" only serves one task in globs

They're done as separate matching engines (though some REs are compiled to glob
matches in some limited cases.)
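
A few of these differences can be checked directly at a tclsh prompt. The following is only a small sketch using the standard [string match] and [regexp] commands; the sample strings are invented.

```tcl
# "*" in a glob plays the role of ".*" in an anchored RE.
puts [string match {a*c} abbbc]   ;# 1
puts [regexp {^a.*c$} abbbc]      ;# 1

# "." is an ordinary character in a glob but "any character" in an RE.
puts [string match {a.c} abc]     ;# 0
puts [regexp {^a.c$} abc]         ;# 1

# string match always tests the whole string; regexp searches unless anchored.
puts [string match {b} abc]       ;# 0
puts [regexp {b} abc]             ;# 1

# "[...]" in a glob is a plain set: "^" is just another member, not a negation.
puts [string match {[^a]} b]      ;# 0
puts [string match {[^a]} ^]      ;# 1
puts [regexp {^[^a]$} b]          ;# 1
```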

Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ donal....@man.ac.uk
-- This may scare your cat into premature baldness, but Sun are not the only
sellers of Unix. -- Anthony Ord <n...@rollingthunder.clara.co.uk>

David N. Welton

Feb 14, 2003, 3:00:55 PM
Googie <do...@want.a.spam.org> writes:

> I'll explain with an example. Take 2 strings:

> :server.irc MODE nick :craps
> :server.irc 461 nick MODE +l :craps

> In EPIC4 (which supports %) it could look like: :% MODE % :*

> Here there is no way for one mask to match both messages. It's pretty
> easy and useful. Please, don't tell me to use regexp. I know that it can
> give really nice results, but glob-style matching is much easier,
> much more user-friendly, and known by more people, because it's used
> by bash, etc.

String matching on that is just going to lead you down a bad path, in
my opinion. If you want, you should use regexp. Taking the time to
learn it will benefit you in the future.
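
For the two sample lines quoted above, a regexp version could look roughly like this. It is a sketch only: the proc name parseMode is invented, and the pattern assumes single spaces between fields.

```tcl
# Hypothetical helper: return {server nick text} for a plain MODE line,
# or an empty list if the line is something else.
proc parseMode {line} {
    # \S+ matches exactly one whitespace-free word, which is what % would do
    if {[regexp {^:(\S+) MODE (\S+) :(.*)$} $line -> server nick text]} {
        return [list $server $nick $text]
    }
    return {}
}

puts [parseMode ":server.irc MODE nick :craps"]          ;# server.irc nick craps
puts [parseMode ":server.irc 461 nick MODE +l :craps"]   ;# (empty line)
```

Besides telling the two lines apart, the capture groups hand back the fields, which a glob-style mask cannot do.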

IRC is a messy protocol. If you want to process it with Tcl, you
ought to have a look at irc.tcl, in the Tcl standard library. Since
it's relatively new, suggestions and ideas for API changes are
welcome.

--
David N. Welton
Consulting: http://www.dedasys.com/
Personal: http://www.dedasys.com/davidw/
Free Software: http://www.dedasys.com/freesoftware/
Apache Tcl: http://tcl.apache.org/

David N. Welton

Feb 14, 2003, 3:01:45 PM
m...@privacy.net (WL) writes:

> Hm...I've always thought glob style matching is sugar for regexp...

No! [string match] is potentially a lot faster than regexp. See the
recent thread(s) on the tcllib mailing list, as well as this newsgroup
a few months back.

Donald Arseneau

Feb 14, 2003, 6:39:46 PM
Googie <do...@want.a.spam.org> writes:

> Donald Arseneau wrote:
>
> > xx...@freenet.carleton.ca (Glenn Jackman) writes:
> >
> >> WL <m...@privacy.net> wrote:
> >> > proc slob {pattern str} {

> > The other nice feature is in allowing [^ ] style matching,

> You still don't understand me :)

I did, and I think everybody else did too.

> I'll explain with an example. Take 2 strings:
> :server.irc MODE nick :craps
> :server.irc 461 nick MODE +l :craps
> Now I can use the following mask to match the first:
> :* MODE * :*
> It works, but it also matches the second string, which isn't wanted. And

This would work if glob-style matching allowed exclusionary matches
("[^ ]"). If it did, you could use

:[^ ]* MODE * :*

for the match.

Glob-style does *not* provide that extension (even though I would
also welcome the addition) because it is not hard to redefine
commands using more powerful regexp matching (as slob for glob).

> Please, don't tell me to use regexp.

Life's hard.

By the way, did you consider

:*.irc MODE * :*


Donald Arseneau as...@triumf.ca

Donald Arseneau

Feb 14, 2003, 7:10:18 PM
"jul...@precisium.com.au" <jul...@precisium.com.au> writes:

> I guess though that adding the % char etc to the built-in glob may be
> nice from a standardization point of view - so that there aren't all
> these roll-your-own globs/slobs with gratuitously different
> behaviour...

Hmmm... Not just roll-your-own, but there are three (more?)
matching styles in Tcl itself: regexp (regsub), string match
(glob, array names), and scan. It occurs to me that [^ ]
matches could be added to [string match] etc to make them
more compatible with [scan], which does allow such exclusion
matches.

I wonder how often glob patterns use [^] to match circumflex
characters.
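
To illustrate the [scan] exclusion sets mentioned above, here is a minimal sketch applied to the IRC lines from earlier in the thread. The proc name scanMode is invented. The format is written in double quotes so that the brackets can be backslash-escaped and \n becomes a real newline inside the last set.

```tcl
# Hypothetical helper: pull server, nick and text out of a plain MODE line
# using [scan] with %[^ ] ("everything up to the next space").
proc scanMode {line} {
    # After substitution the format is:  :%[^ ] MODE %[^ ] :%[^<newline>]
    set fmt ":%\[^ \] MODE %\[^ \] :%\[^\n\]"
    if {[scan $line $fmt server nick text] == 3} {
        return [list $server $nick $text]
    }
    return {}
}

puts [scanMode ":server.irc MODE nick :craps"]          ;# server.irc nick craps
puts [scanMode ":server.irc 461 nick MODE +l :craps"]   ;# (empty line)
```

The second line fails because [scan] stops at the first literal mismatch ("MODE" against "461") and reports only one conversion.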

Donald Arseneau as...@triumf.ca

lvi...@yahoo.com

Feb 15, 2003, 5:05:51 AM

According to Googie <do...@want.a.spam.org>:
:You still don't understand me :)

:Here there is no way for one mask to match both messages. It's pretty easy and
:useful.

:Please, don't tell me to use regexp. I know that it can give really nice
:results, but glob-style matching is much easier, much more user-friendly, and
:known by more people, because it's used by bash, etc.

The answer to your question is that the feature does not currently exist.
Your choices are to write a function or extension to provide that functionality,
hack the core to provide it, or find someone else who will do one
of these two things.

--
Tcl - The glue of a new generation. <URL: http://wiki.tcl.tk/ >
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
<URL: mailto:lvi...@yahoo.com > <URL: http://www.purl.org/NET/lvirden/ >

Donal K. Fellows

Feb 17, 2003, 8:12:00 AM
Donald Arseneau wrote:
> Hmmm... Not just roll-your-own, but there are three (more?)
> matching styles in Tcl itself: regexp (regsub), string match
> (glob, array names), and scan. It occurs to me that [^ ]
> matches could be added to [string match] etc to make them
> more compatible with [scan], which does allow such exclusion
> matches.

If you do this, you'll need to submit a patch that modifies two places. One is
Tcl_StringCaseMatch in generic/tclUtil.c and the other is Tcl_UniCharCaseMatch
in generic/tclUtf.c (one works with 'char *' strings and the other with
'Tcl_UniChar *' strings; for altering the language accepted, the changes
required should be recognisably similar in the two functions.)

> I wonder how often glob patterns use [^] to match circumflex
> characters.

I wonder how often glob patterns use [] to match sets of characters. :^)

-- There are worse futures that burning in hell. Imagine aeons filled with
rewriting of your apps as WinN**X API will mutate through eternity...
-- Alexander Nosenko <n...@titul.ru>

lvi...@yahoo.com

Feb 18, 2003, 12:21:29 PM

According to Donal K. Fellows <donal.k...@man.ac.uk>:
:I wonder how often glob patterns use [] to match sets of characters. :^)

you mean things like

glob *.[cho]

I use that kind of thing a lot...

Donal K. Fellows

Feb 19, 2003, 8:22:13 AM
lvi...@yahoo.com wrote:
> you mean things like
> glob *.[cho]

Yes.

> I use that kind of thing a lot...

I use that sort of thing a lot in shells (particularly *.[ch] in relation to
find and grep), but not in Tcl. Mind you, I'd really like Tcl's [glob] to
support the {thisword,thatword,theother} alternative selection syntax as well.
Indeed, I'd *really* welcome a TIP to add that, though it'd probably need to be
a Tcl9 change...

-- OK, there is the MFC, but it only makes the chaos object orientated.
-- Thomas Nellessen <nell...@gmx.de>

Jeffrey Hobbs

Feb 19, 2003, 12:35:21 PM
Donal K. Fellows wrote:
> I use that sort of thing a lot in shells (particularly *.[ch] in relation to
> find and grep), but not in Tcl. Mind you, I'd really like Tcl's [glob] to
> support the {thisword,thatword,theother} alternative selection syntax as well.
> Indeed, I'd *really* welcome a TIP to add that, though it'd probably need to be
> a Tcl9 change...

You mean:

() 50 % glob *.{txt,tcl}
sig.txt dirs.tcl

The {,} handling is done specially by glob at a higher level, then
the rest is passed to Tcl_StringCaseMatch.
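
A consequence is that [string match] on its own treats braces as ordinary characters. The sketch below shows this and one possible workaround, a made-up helper called matchAny that tries a list of plain patterns in turn.

```tcl
# In [string match] the braces are literal, so this does not match:
puts [string match {*.{txt,tcl}} sig.txt]         ;# 0
# ...but a string that really contains the braces does:
puts [string match {*.{txt,tcl}} {x.{txt,tcl}}]   ;# 1

# Hypothetical helper: succeed if any one of several glob patterns matches
proc matchAny {patterns str} {
    foreach p $patterns {
        if {[string match $p $str]} {
            return 1
        }
    }
    return 0
}

puts [matchAny {*.txt *.tcl} sig.txt]    ;# 1
puts [matchAny {*.txt *.tcl} main.c]     ;# 0
```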

--
Jeff Hobbs The Tcl Guy
Senior Developer http://www.ActiveState.com/
Tcl Support and Productivity Solutions
