I'm done :) Waiting for your opinions...
--
Googie
I'm telling you to use regexp.
# match the first word following the colon
set str {this is a sentence: zebra is the first word after colon}
regexp {:\s*(\S+)} $str -> first ;# ==> $first eq "zebra"
--
Glenn Jackman
NCF Sysadmin
gle...@ncf.ca
It's reason enough to add it that glob-style matching is easier. I'm not
thinking only of my own use (that too, of course), but of users who write
scripts with 'ON' reactions to server input. Glob-style matching is more
user-friendly for them: a user can match an event simply with
on {:* KICK * :*} {code}, not with the magical strings that regexp needs.
But glob is missing the character mentioned above.
--
Googie
...and one more thing - you don't understand what I meant. I don't want to
search for some word in a string. I want to check whether my mask matches a
string.
--
Googie
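You can also convert your globs into Tcl regexps with something like
proc glob% pattern {
    # the character class is braced so it stays a single list element
    return ^[string map {* .* ? . % {[^ \t\n]+}} $pattern]$
}
...
# [array get] accepts only a glob pattern; [array names] accepts -regexp
array names a -regexp [glob% abc%*]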
--
Derk Gwen http://derkgwen.250free.com/html/index.html
Death is the worry of the living. The dead, like myself,
only worry about decay and necrophiliacs.
Hm...I've always thought glob style matching is sugar for regexp...
Anyways, I'm pretty sure any string sequence described by glob
style matching can be described by regular expressions. It does
seem awfully handy, but a simplistic translation into RE doesn't
seem that difficult.
proc slob {pattern str} {
    # super glob --> slob :)
    # returns 1 if pattern matches string; 0 otherwise
    # mimics glob style matching, with % to mean "words"
    # will probably have troubles with ranges...
    if {[string match *%* $pattern]} {
        set newpat "^[string map {* .* ? . % \\S+} $pattern]$"
        #puts $newpat
        return [regexp -- $newpat $str]
    } else {
        return [string match $pattern $str]
    }
}
% slob {hey, % there!} "hey, you there!"
1
% slob {*hey, % there!*} "blah hey, you there! blah"
1
% slob {hey, % there!} "hey, there!"
0
% slob {hey, % there!} "hey, 3-4 there!"
1
% slob {hey, % there!} "hey, 3(4) there!"
1
% slob {hey, * there!} "hey, 3(4) there!"
1
% slob {hey, * there!} "hey, 3(4) there!"
WL
>I'm done :) Waiting for your opinions...
>--
>Googie
>
--
real email: w l i a o @ s d f . l o n e s t a r . o r g
Nice!
the [string map] list could be enhanced to protect regexp special chars,
like '.', but that's left as an exercise for the reader.
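A minimal sketch of that exercise, under some assumptions: the name slobSafe is made up here, every pattern goes through regexp (not only those containing %), and glob ranges and backslash escapes are treated as literal text rather than supported. It backslash-escapes the regexp metacharacters first, then applies the same [string map] as slob.

```tcl
# Hypothetical variant of slob; not part of Tcl or of the original post.
proc slobSafe {pattern str} {
    # Escape regexp metacharacters, leaving the wildcards * ? % alone
    # so that the next step can translate them.
    set escaped [regsub -all {[][\\.+^$(){}|]} $pattern {\\&}]
    # Same translation as slob: * -> .*   ? -> .   % -> \S+
    set newpat "^[string map {* .* ? . % \\S+} $escaped]\$"
    return [regexp -- $newpat $str]
}

puts [slobSafe {*.irc MODE % :*} ":server.irc MODE nick :craps"]
puts [slobSafe {*.irc MODE % :*} ":serverXirc MODE nick :craps"]
```

With the escaping in place, the '.' in the pattern should match only a literal dot, so the second call is expected to fail where plain slob would succeed.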
> WL <m...@privacy.net> wrote:
> > proc slob {pattern str} {
> >
> > # super glob --> slob :)
> > # returns 1 if pattern matches string; 0 otherwise
> > # mimics glob style matching, with % to mean "words"
> > # will probably have troubles with ranges...
> >
> > if {[string match *%* $pattern]} {
> > set newpat "^[string map {* .* ? . % \\S+} $pattern]$"
> > #puts $newpat
> > return [regexp -- $newpat $str]
> > } else {
> > return [string match $pattern $str]
> > }
> > }
>
> Nice!
The other nice feature is in allowing [^ ] style matching,
which I have wanted to use in [glob] (specifically, in
[array names] to return all but those with certain tags).
Donald Arseneau as...@triumf.ca
I recommend trying the 'regular expression visualizer', a tool I found
in this newsgroup. It makes it easy to write an RE. You drop a sample
search string into the top window, then type your RE into the middle
window. As you type, you will see the matches in the bottom window.
Click the 'explore' button, and as you move the cursor around your RE,
the matching parts of the search string are highlighted. When you're
done, cut the RE and paste it into your program. It's a great way to
test an RE before putting it into your code.
I don't have the url for this, but you can google for it and find it.
Hope this helps.
bob
You still don't understand me :)
I'll explain with an example. Take 2 strings:
:server.irc MODE nick :craps
:server.irc 461 nick MODE +l :craps
Now I can use the following mask to match the first:
:* MODE * :*
It works, but it also matches the second string, which isn't wanted. And
if I put the mask ':* 461 * MODE +l :*' after the above mask, the first mask
still matches and no further masks are tried.
In EPIC4 (which supports %) it could look like:
:% MODE % :*
Here there is no way for one mask to match both messages. It's pretty easy
and useful.
Please, don't tell me to use regexp. I know that it can give really nice
results, but glob-style matching is much easier, much more user-friendly and
known by more people, because bash etc. use it.
--
Googie
I like this.
With the addition of escapes for the glob chars, and _ to match
arbitrary whitespace - it works for me :)
set newpat "^[string map {\\* [*] * .* \\? [?] ? . \\% [%] % \\S+ \\_ [_] _ \\s+} $pattern]$"
I guess though that adding the % char etc to the built-in glob may be
nice from a standardization point of view - so that there aren't all
these roll-your-own globs/slobs with gratuitously different behaviour...
but is there any widely recognised standard behaviour in other systems
anyway? ...
I suspect the answer is that the standard is that *,?,[] matching is in,
and for anything more complex you use regular expressions - otherwise
people will keep finding things to add to glob until it's getting nearly
as complex as regexes anyway!
JMN
>
> You still don't understand me :)
Or do we? The suggestion is that you don't need tcl's glob
to change in order to achieve what you want, just write
something like slob shown here and use it in your application
instead.
In general, you don't need to settle for any matching mechanism
provided by Tcl (or whatever language you use), just write a matcher
that fits your task.
Steve
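To make that suggestion concrete, here is a small sketch that applies the slob proc from earlier in the thread to the two IRC lines under discussion. The variable names modeLine and errLine are made up for the example.

```tcl
# slob as posted earlier in the thread, repeated so the example runs alone
proc slob {pattern str} {
    if {[string match *%* $pattern]} {
        set newpat "^[string map {* .* ? . % \\S+} $pattern]\$"
        return [regexp -- $newpat $str]
    }
    return [string match $pattern $str]
}

set modeLine ":server.irc MODE nick :craps"
set errLine  ":server.irc 461 nick MODE +l :craps"

# Plain glob: * crosses spaces, so both lines match
puts [string match {:* MODE * :*} $modeLine]   ;# 1
puts [string match {:* MODE * :*} $errLine]    ;# 1

# % stands for one run of non-space characters, so only the MODE line matches
puts [slob {:% MODE % :*} $modeLine]           ;# 1
puts [slob {:% MODE % :*} $errLine]            ;# 0
```

Note that slob does not escape regexp metacharacters, so a mask containing '.' or '+' would need the extra escaping discussed elsewhere in the thread.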
They're similar but not the same. [glob]-style matching is much faster but far
less capable, and to the untrained eye the glob language looks similar to the
RE language. But:
"*" maps to ".*"
"?" maps to "."
"[...]" is much more limited in globs
"\" only serves one task in globs
They're done as separate matching engines (though some REs are compiled to glob
matches in some limited cases.)
Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ donal....@man.ac.uk
-- This may scare your cat into premature baldness, but Sun are not the only
sellers of Unix. -- Anthony Ord <n...@rollingthunder.clara.co.uk>
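A few of the differences listed above can be checked directly at the prompt. This is only a sketch using [string match] and [regexp]; the sample strings are made up for illustration.

```tcl
# string match must cover the whole string; regexp matches anywhere
puts [string match {b} abc]        ;# 0
puts [regexp {b} abc]              ;# 1

# "*" and "?" behave like ".*" and "." once the RE is anchored
puts [string match {a*c} abbbc]    ;# 1
puts [regexp {^a.*c$} abbbc]       ;# 1
puts [string match {a?c} abc]      ;# 1
puts [regexp {^a.c$} abc]          ;# 1

# In a glob set, ^ is an ordinary member; in an RE it negates the set
puts [string match {[^ ]} x]       ;# 0
puts [string match {[^ ]} ^]       ;# 1
puts [regexp {^[^ ]$} x]           ;# 1
```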
> I'll explain with an example. Take 2 strings:
> :server.irc MODE nick :craps
> :server.irc 461 nick MODE +l :craps
> In EPIC4 (which supports %) it could look like: :% MODE % :*
> Here there is no way for one mask to match both messages. It's pretty
> easy and useful. Please, don't tell me to use regexp. I know that it can
> give really nice results, but glob-style matching is much easier,
> much more user-friendly and known by more people, because bash etc.
> use it.
String matching on that is just going to lead you down a bad path, in
my opinion. If you want, you should use regexp. Taking the time to
learn it will benefit you in the future.
IRC is a messy protocol. If you want to process it with Tcl, you
ought to have a look at irc.tcl, in the Tcl standard library. Since
it's relatively new, suggestions and ideas for API changes are
welcome.
--
David N. Welton
Consulting: http://www.dedasys.com/
Personal: http://www.dedasys.com/davidw/
Free Software: http://www.dedasys.com/freesoftware/
Apache Tcl: http://tcl.apache.org/
> Hm...I've always thought glob style matching is sugar for regexp...
No! [string match] is potentially a lot faster than regexp. See the
recent thread(s) on the tcllib mailing list, as well as this newsgroup
a few months back.
> Donald Arseneau wrote:
>
> > xx...@freenet.carleton.ca (Glenn Jackman) writes:
> >
> >> WL <m...@privacy.net> wrote:
> >> > proc slob {pattern str} {
> > The other nice feature is in allowing [^ ] style matching,
> You still don't understand me :)
I did, and I think everybody else did too.
> I'll explain with an example. Take 2 strings:
> :server.irc MODE nick :craps
> :server.irc 461 nick MODE +l :craps
> Now I can use the following mask to match the first:
> :* MODE * :*
> It works, but it also matches the second string, which isn't wanted and
This would work if glob-style matching allowed exclusionary matches
("[^ ]"). If it did, you could use
:[^ ] MODE * :*
for the match.
Glob-style does *not* provide that extension (even though I would
also welcome the addition) because it is not hard to redefine
commands using more powerful regexp matching (as slob for glob).
> Please, don't tell me to use regexp.
Life's hard.
By the way, did you consider
:*.irc MODE * :*
Donald Arseneau as...@triumf.ca
> I guess though that adding the % char etc to the built-in glob may be
> nice from a standardization point of view - so that there aren't all
> these roll-your-own globs/slobs with gratuitously different
> behaviour...
Hmmm... Not just roll-your-own, but there are three (more?)
matching styles in Tcl itself: regexp (regsub), string match
(glob, array names), and scan. It occurs to me that [^ ]
matches could be added to [string match] etc to make them
more compatible with [scan], which does allow such exclusion
matches.
I wonder how often glob patterns use [^] to match circumflex
characters.
Donald Arseneau as...@triumf.ca
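For comparison, here is a sketch of the [scan] exclusion sets mentioned above, applied to the two IRC lines from earlier in the thread. The proc name isModeLine is made up, and the format assumes the text after the final colon starts with a non-blank word, since %s stops at whitespace.

```tcl
# Hypothetical helper: 1 if the line looks like ":server MODE nick :text"
proc isModeLine {line} {
    # %[^ ] reads a non-empty run of non-space characters;
    # scan returns how many conversions it completed
    set n [scan $line {:%[^ ] MODE %[^ ] :%s} server nick text]
    return [expr {$n == 3}]
}

puts [isModeLine ":server.irc MODE nick :craps"]
puts [isModeLine ":server.irc 461 nick MODE +l :craps"]
```

On the second line, scan stops at the first literal mismatch ("461" where MODE is expected), so fewer than three conversions complete and the helper reports no match.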
:Here there is no way for one mask to match both messages. It's pretty easy
:and useful.
:Please, don't tell me to use regexp. I know that it can give really nice
:results, but glob-style matching is much easier, much more user-friendly and
:known by more people, because bash etc. use it.
The answer to your question is that the feature does not currently exist.
Your choices are to write a function or extension to provide that function,
hack the core to provide that function, or find someone else who will do one
of these things.
--
Tcl - The glue of a new generation. <URL: http://wiki.tcl.tk/ >
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
<URL: mailto:lvi...@yahoo.com > <URL: http://www.purl.org/NET/lvirden/ >
If you do this, you'll need to submit a patch that modifies two places. One is
Tcl_StringCaseMatch in generic/tclUtil.c and the other is Tcl_UniCharCaseMatch
in generic/tclUtf.c (one works with 'char *' strings and the other with
'Tcl_UniChar *' strings; for altering the language accepted, the changes
required should be recognisably similar in the two functions.)
> I wonder how often glob patterns use [^] to match circumflex
> characters.
I wonder how often glob patterns use [] to match sets of characters. :^)
Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ donal....@man.ac.uk
-- There are worse futures that burning in hell. Imagine aeons filled with
rewriting of your apps as WinN**X API will mutate through eternity...
-- Alexander Nosenko <n...@titul.ru>
you mean things like
glob *.[cho]
I use that kind of thing a lot...
Yes.
> I use that kind of thing a lot...
I use that sort of thing a lot in shells (particularly *.[ch] in relation to
find and grep), but not in Tcl. Mind you, I'd really like Tcl's [glob] to
support the {thisword,thatword,theother} alternative selection syntax as well.
Indeed, I'd *really* welcome a TIP to add that, though it'd probably need to be
a Tcl9 change...
Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ donal....@man.ac.uk
-- OK, there is the MFC, but it only makes the chaos object orientated.
-- Thomas Nellessen <nell...@gmx.de>
You mean:
() 50 % glob *.{txt,tcl}
sig.txt dirs.tcl
The {,} handling is done specially by glob at a higher level, then
the rest is passed to Tcl_StringCaseMatch.
--
Jeff Hobbs The Tcl Guy
Senior Developer http://www.ActiveState.com/
Tcl Support and Productivity Solutions
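A rough script-level analogue of that two-stage design, for anyone who wants {a,b} alternatives with [string match]: expand the alternatives first, then hand each resulting pattern to the plain matcher. This is only a sketch, not how [glob] is implemented in C. The proc names expandAlternatives and matchAlternatives are made up, and only non-nested groups are handled.

```tcl
# Hypothetical helper: expand the first {a,b,...} group, then recurse
proc expandAlternatives {pattern} {
    # pre = text before the first "{", alts = comma list, post = the rest
    if {![regexp {^([^\{]*)\{([^\{\}]*)\}(.*)$} $pattern -> pre alts post]} {
        return [list $pattern]
    }
    set result {}
    foreach alt [split $alts ,] {
        foreach expanded [expandAlternatives $pre$alt$post] {
            lappend result $expanded
        }
    }
    return $result
}

# Hypothetical helper: 1 if any expanded pattern glob-matches the string
proc matchAlternatives {pattern str} {
    foreach p [expandAlternatives $pattern] {
        if {[string match $p $str]} {
            return 1
        }
    }
    return 0
}

puts [expandAlternatives {*.{txt,tcl}}]
puts [matchAlternatives {*.{txt,tcl}} sig.txt]
puts [matchAlternatives {*.{txt,tcl}} main.c]
```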