docs incorrectly mention pattern matching works like pathname expansion


Stormy

Mar 14, 2018, 6:14:45 PM
to bug-...@gnu.org
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic
uname output: Linux testc170 4.1.12-112.14.15.el7uek.x86_64 #2 SMP Thu Feb 8 09:58:19 PST 2018 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu

Bash Version: 4.2
Patch Level: 46
Release Status: release

Description:
 Section of 'case' in bash's man page says:

 case word in [ [(] pattern [ | pattern ] ... ) list ;; ] ... esac
              A  case  command  first expands word, and tries to match it against each pattern in turn, using the same matching
              rules as for pathname expansion (see Pathname Expansion below).

But that is not correct: the matching here does NOT follow pathname expansion, because the treatment of "/" is not the same.
The man page should explain that '/' is treated specially in pathname expansion but not in case pattern
matching.  For example, '/test/*' will match even '/test/one/two/three' in case matching, but NOT in pathname matching,
where "/" is a separator and stops the match.


Repeat-By:
  case matching on '/test/*' also matches '/test/one/two/three'. If it were true pathname matching, it should not match;
  only '/test/*/*/*' would match in that case.

Fix:
        Either change case to do pathname matching, as the doc says (but that would probably break many existing scripts), or correct
        the docs and explain that pathname matching is not possible in bash, i.e. there is no 'fnmatch' builtin in bash.
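
The difference described in this report can be reproduced without a real /test tree. The following is a minimal sketch, assuming bash and a scratch directory created with mktemp; the directory layout and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Scratch tree standing in for /test; the names are illustrative only.
root=$(mktemp -d)
mkdir -p "$root/one/two"
touch "$root/one/two/three" "$root/top"

# case: '*' is free to match across slashes.
case "$root/one/two/three" in
  "$root"/*) case_result=match ;;
  *)         case_result=nomatch ;;
esac
echo "case: $case_result"

# Pathname expansion: '*' stops at each slash, so only the
# direct children of the scratch directory are produced.
shopt -s nullglob
expanded=("$root"/*)
printf 'glob: %s\n' "${expanded[@]#"$root"/}"

rm -rf "$root"
```

The case statement reports a match for the deep path, while the glob lists only the two entries directly below the scratch directory.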

Chet Ramey

Mar 15, 2018, 3:26:44 PM
to Stormy, bug-...@gnu.org, chet....@case.edu
On 3/14/18 1:43 PM, Stormy wrote:

> Bash Version: 4.2
> Patch Level: 46
> Release Status: release
>
> Description:
>  Section of 'case' in bash's man page says:
>
>  case word in [ [(] pattern [ | pattern ] ... ) list ;; ] ... esac
>               A  case  command  first expands word, and tries to match it against each pattern in turn, using the same matching
>               rules as for pathname expansion (see Pathname Expansion below).
>
> but that is not correct, the matching here does NOT follow pathname expansion, the treatment of "/" is not the same.

The description of Pathname Expansion says, in part:

"When matching a
pathname, the slash character must always be matched explicitly."

but I can expand that to also say that in other contexts it does not need
to be.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ch...@case.edu http://tiswww.cwru.edu/~chet/

Stormy

Mar 15, 2018, 4:16:03 PM
to bug-...@gnu.org, Chet Ramey
Thanks for the reply.  I'm not sure we are talking about the same thing.. maybe..does this example help?
# case /test/test2/dir1/file in  /test/*) echo 'match';; *) echo 'nomatch';; esac
match

Here, the expectation is to NOT match, since '/test/*' in a normal shell command, e.g. "ls", would NOT match that long path.  By referring to pathname expansion, the man page hints that case behaves like "ls", but clearly it does not.
In summary, it seems bash has no internal fnmatch(3) implementation.

PePa

Mar 15, 2018, 4:32:57 PM
to bug-...@gnu.org
It is clear that the matching in 'case' does general pattern matching
but not pathname matching. I think the bash man page should say so,
clearly distinguishing different ways of matching in bash.

Peter

Chet Ramey

Mar 15, 2018, 6:29:15 PM
to Stormy, bug-...@gnu.org, chet....@case.edu
On 3/15/18 12:15 PM, Stormy wrote:
> Thanks for the reply.  I'm not sure we are talking about the same thing.. maybe..does this example help?
> # case /test/test2/dir1/file in  /test/*) echo 'match';; *) echo 'nomatch';; esac
> match
>
> here, the expectation is to NOT match, since '/test/*' in normal shell, i.e. "ls", would NOT match that long path.  by path name expansion, man page is hinting it behaves like "ls", but clearly it does not.

No, we're talking about the same thing.

There is a behavior difference between pathname and non-pathname matching
contexts: when matching a pathname, a slash in the pathname has to be
matched by a slash in the pattern.

The text I quoted, which appears in the "Pathname Expansion" section of
the man page (referenced in the description of `case') says exactly that.

What it doesn't say is the converse: that when not matching a pathname,
the slash doesn't need to be matched explicitly. The second part of my
comment proposes adding text to say that.

What I'll probably end up doing is to change some places to refer directly
to pattern matching instead of referring to pathname expansion and letting
that section refer to pattern matching.

> in summary, it seems bash has no internal fnmatch(3) implementation.

Nothing you've said implies that. In fact, the opposite is true.

Stormy

Mar 15, 2018, 6:50:38 PM
to bug-...@gnu.org, Chet Ramey
Chet,
ok, replacing 'pathname expansion' with 'pattern matching' is the right solution, otherwise u end up with a lot of confusing explanations :)

if u think bash has builtin 'fnmatch' functionality, do u have an example?  clearly we see that 'case' above is not doing it.  Similarly, [[ "$text" == $pattern ]] does not (also =~ is not relevant since a pathname pattern is not a pure regex). These all do pattern matching, but not PATH NAME matching (as what "ls" would do).  I'm not sure what I said to make it sound like bash does have it.
Sure, one can use "ls" to expand a path, but what if it does not exist or u just want to match paths directly in bash?

I ended up writing ~40 lines of code that implement fnmatch directly in bash, hence ran into this doc issue.
you can also search online 'fnmatch' bash, and see, folks cannot figure it out :)
anyways, I'm not complaining or anything, just thought it would be a good idea to bring to your attention, frankly I'm pleasantly surprised with the swift attention and replies.
Good luck.
Stormy.

Greg Wooledge

Mar 15, 2018, 6:53:45 PM
to bug-...@gnu.org
On Thu, Mar 15, 2018 at 06:50:24PM +0000, Stormy wrote:
> if u think bash has builtin 'fnmatch' functionality, do u have an example?

echo *

Chet Ramey

Mar 15, 2018, 7:17:56 PM
to Stormy, bug-...@gnu.org, chet....@case.edu
On 3/15/18 2:50 PM, Stormy wrote:
> Chet,
>
> ok, replacing 'pathname expansion' with 'pattern matching' is the right
> solution, otherwise u end up with a lot of confusing explanations :)
>
> if u think bash has builtin 'fnmatch' functionality, do u have an example
> clearly we see that 'case' above is not doing it.  Similarily, [[ "$text"
> == $pattern ]]  does not (also =~ is not relevant since pathname is not
> pure regex).. These all do pattern matching, but not PATH NAME matching (as
> what "ls" would do)..  I'm not sure what I said to make it sound like bash
> does have it..

I think you're confusing fnmatch with the FNM_PATHNAME option and fnmatch's
default behavior. When fnmatch is used to match a pathname, you pass the
FNM_PATHNAME flag to get the special behavior for slash (and, if desired,
the FNM_PERIOD flag to get the behavior of matching a leading `.'
explicitly).

The instances (case, [[, the pattern replacement and removal word
expansions) where bash's fnmatch equivalent isn't matching a pathname are
where it doesn't pass the FNM_PATHNAME or FNM_PERIOD options.
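
The non-pathname contexts listed above can be checked with a short sketch. It assumes bash; the sample path and the variable names are made up, and nothing here touches the filesystem.

```shell
#!/usr/bin/env bash
path=/test/one/two/three

# [[ ... == pattern ]]: '*' crosses slashes, just as in case.
if [[ $path == /test/* ]]; then cond=match; else cond=nomatch; fi
echo "[[ ]]: $cond"

# Prefix removal: the longest match of '/*/' runs through the last slash.
base=${path##/*/}
echo "longest prefix '/*/' removed: $base"

# Suffix removal: the longest match of '/o*' swallows the later slashes too.
head=${path%%/o*}
echo "longest suffix '/o*' removed: $head"
```

In each of these, the slash is an ordinary character, which is the behavior without the equivalent of FNM_PATHNAME.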

> Sure, one can use "ls" to expand a path, but what if it does not exist or u
> just want to match paths directly in bash?

What is it you want to do? Turn the equivalent of the FNM_PATHNAME flag
on when running `case'? That's not how `case' works.

If you want to expand filenames or match filenames using a pattern, you'll
have to use the patterns in a context where filename expansion is
performed. The man page is pretty good about listing those cases (e.g., the
arguments to a simple command).

Stormy

Mar 15, 2018, 7:26:30 PM
to bug-...@gnu.org, Chet Ramey
ok, I'm not that proficient with the inner workings.
I have a list of 'paths' as well as a list of 'patterns', and the bash script needs to decide if/what matches..
for example list of paths (these are highly simplified, real data can be anything under the sun):

/test/dir1/dir2/file2
/test/dir1/dir4/file3
/test/dir1/dir5/file4

list of patterns:

/test/dir?/dir?/file[a-z]
/test/dir?/dir?/file*
/test/dir*
/test/dir?/dir?/file[0-9]

again, patterns can be anything.  The script needs to "predict" correctly which paths match which pattern.
I'm not fixated on 'case' or anything, but could not find a way to do it simply in bash.  folks suggest to do 'cd $path' but clearly that is not relevant in this case b/c these paths may not even exist on the system that runs this script :) :)

like I said, I've already implemented, roughly 40 lines in bash, and it seems to work, but if there is some builtin option 'shopt' or similar that can turn the right flags you mentioned, I'm all for testing it :)
Cheers.
Stormy.

Chet Ramey

Mar 15, 2018, 7:44:46 PM
to Stormy, bug-...@gnu.org, chet....@case.edu
On 3/15/18 3:26 PM, Stormy wrote:

> like I said, I've already implemented, roughly 40 lines in bash, and it
> seems to work, but if there is some builtin option 'shopt' or similar that
> can turn the right flags you mentioned, I'm all for testing it :)

There isn't. Pathname expansion is done in the specific circumstances Posix
says it should be (and historical shells perform). The other contexts use
straight pattern matching.

Stormy

Mar 15, 2018, 10:52:34 PM
to bug-...@gnu.org, Chet Ramey
ok, thanks for the confirmation.  now u see what I meant before.. when saying bash does not have a builtin way to call fnmatch (I meant: for path name matching), clearly bash calls fnmatch, that is obvious, but there is no way to make it do pathname matching internally. (cd, ls, will surely do it, external to bash though)..

anyways, thanks for all the help..

PePa

Mar 16, 2018, 1:03:23 AM
to Stormy, bug-...@gnu.org
I think bash's echo does this, it doesn't do the pattern matching like
case, the slashes need to be there. You might need/want `shopt -s
dotglob nullglob`
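
What those two options change can be seen in a small sketch. It assumes bash and a scratch directory from mktemp; the file names are invented for the example.

```shell
#!/usr/bin/env bash
# Scratch directory; the file names are made up for the example.
dir=$(mktemp -d)
touch "$dir/.hidden" "$dir/visible"

# Start from bash's defaults so the contrast is visible.
shopt -u dotglob nullglob
plain=("$dir"/*)             # skips names that start with '.'
unmatched=("$dir"/nomatch*)  # no match: the pattern itself is kept

shopt -s dotglob nullglob
with_dots=("$dir"/*)         # now includes .hidden
empty=("$dir"/nomatch*)      # no match: expands to nothing

echo "default: ${#plain[@]} match, unmatched pattern leaves ${#unmatched[@]} word"
echo "dotglob nullglob: ${#with_dots[@]} matches, unmatched pattern leaves ${#empty[@]} words"

rm -rf "$dir"
```

Either way the expansion is driven by what exists on disk, so it only helps when the paths being tested are real files.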

Peter

Stormy

Mar 16, 2018, 9:38:59 AM
to bug-...@gnu.org, PePa
Thanks, however, I'm not aware how 'echo' can be used as a comparison tool in bash..  i thought it only prints text..  see my other email showing examples of inputs.. from searching online, folks said that bash does not have 'fnmatch' functionality for pathname expansion, it only has standard pattern matching, i.e. 'case' '[[ $a == $b ]]', etc.
anyways, I don't expect you guys to help beyond what you've already done.. if someone knows a solution, sure, otherwise, it's all good, my small fnmatch function seems to do the right thing thus far... :)

Robert Elz

Mar 16, 2018, 9:43:34 AM
to Stormy, bug-...@gnu.org
Date: Thu, 15 Mar 2018 22:52:24 +0000 (UTC)
From: Stormy <storm...@yahoo.com>
Message-ID: <68229887.149730...@mail.yahoo.com>

| but there is no way to make it do pathname matching internally. (cd, ls, will surely do it,

No, they surely don't - the pathname expansion that you seem to see in them is
done in the shell before they run. "find" (and a few other commands) does
matching, but there are only a few like that.

What might make things clearer is that in the normal case (I think bash has
mechanisms that extend this) pathname matching is just pattern matching
applied to each component of the path name, one at a time.

That is, when pathname expansion is done (and it is attempted on every
unquoted argument to every command, unless turned off by "set -f") it works
by separating the word into components at the /s and then doing matching
using the list of patterns created by this. It starts with the first pattern,
matched against the file names in the current directory (or the root directory
if the path started with '/'); then, as it proceeds along the path name
components, it matches the pattern from the next component against the file
names in every directory already found by earlier matching.

Whether the actual implementation works exactly like that or not does
not matter - there are some optimisations that can be used to speed
things up, but conceptually that is what happens.

The pattern matching that happens here is the same as it is in case
statements - the '/' does not match as it is not there in the patterns that
are used in the pathname match - the slashes were used to separate the
word into the list of component patterns, and removed (and because no
directory entry can contain a '/' character, inventing some method to
allow it would be pointless).

Other matching, such as in case statements, simply uses the pattern
word as the pattern, regardless of what characters (including /) are in it,
and matches it against some other string, which also might have / chars
in it. The matching is identical, here / is just the same as any other
ordinary character - it is only special in path names.

You're right that if you want to test to see if a string will match another
string, as if one is a pathname pattern, and the other is a potential
file name, then you have to write code to do that (it is not something that
most people ever need to do...)

But something based upon

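fnmatch()
{
    # "local -" (bash 4.4 and later) restores the shell options on return
    local IFS PATTERN PATHNAME A -

    PATTERN="$1"
    PATHNAME="$2"

    set -f              # no pathname expansion while splitting
    IFS=/               # split words at each slash
    set -- $PATTERN     # pattern components become $1, $2, ...

    for A in $PATHNAME
    do
        # more pathname components than pattern components
        [ $# -eq 0 ] && return 1

        case "$A" in
        $1) ;;          # this component matches, go on to the next
        *)  return 1;;
        esac
        shift
    done
    # succeed only if no pattern components are left over
    [ $# -eq 0 ]
}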

is a few less than 40 lines, and should be approximately what is wanted
(note this has not been tested much - it is just intended as a hint)
Also note that since this uses "local" and particularly "local -" it is not
even slightly portable (though more than one shell can run it.)

usage is:

fnmatch pattern potential-pathname

where you will usually need to quote the pattern arg (at least) to
prevent pathname expansion from altering it.

The result is its exit status, so it can be used like

    if fnmatch ...
    then
        ....
    fi

kre


Stormy

Mar 16, 2018, 10:04:25 AM
to Robert Elz, bug-...@gnu.org
Robert, all, you are such a friendly crowd.... maybe one day I'll write you about my car problems :) :) just kidding...
bash-only... :)
Thanks for the confirmation regarding the need to write code. and also about 'cd' 'ls', yeah, it's what I sort of mean, bash does when passing args to them, but there is no other way to 'access' this type of parsing..

What you describe about splitting is what I ended up doing in my function, splitting on the "/", however, both paths need to be normalized fully before anything will work, like '/data//dir1' is same as '/data/dir1', so that is another effort/function (to normalize), again, cannot 'cd'/pwd to the location, b/c it may not exist.
also I'm completely unaware of the "-" in local, it gave me:
./fnmatch.sh: line 3: local: `-': not a valid identifier

took it out, and script seems to work, with the exception of the normalization.. anyways, I'm good, I'll get out of your way for now...
Thanks for the help, next time I can't figure out the most trivial bash issue, no more reading the FAQ, just straight to this email list :) :) :)
accept that as a compliment :)

Stormy..
PS: I was kidding of course - and u knew it.. so.. go with your instincts :)

Robert Elz

Mar 16, 2018, 10:47:59 AM
to Stormy, bug-...@gnu.org
Date: Fri, 16 Mar 2018 10:04:07 +0000 (UTC)
From: Stormy <storm...@yahoo.com>
Message-ID: <1098519022.17507...@mail.yahoo.com>

| however, both paths need to be normalized fully before

Yes, you're right, I did not consider that, but skipping empty
pathname components should be easy enough to add
(and pattern components if you need to)

I also did not include code to make sure /fff only matches /???
and that fff only matches ??? (etc) - ie: to check that both
refer to the root, or neither does.
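
One way to cover both gaps (empty components and the root check) is sketched below. It is untested beyond the few cases shown and makes several assumptions: the name path_match is hypothetical, it relies on bash arrays and [[ ]], and it splits with read -a rather than with "set -f" so that no shell option has to be saved. Paths containing newlines are not handled.

```shell
#!/usr/bin/env bash
# path_match PATTERN PATHNAME   (hypothetical name, not a bash builtin)
# Succeeds when PATHNAME matches PATTERN one component at a time.
path_match() {
  local pattern=$1 pathname=$2 part i
  local -a pats=() names=() raw=()

  # Both must start at the root, or neither may.
  if [[ $pattern == /* ]]; then
    [[ $pathname == /* ]] || return 1
  else
    [[ $pathname != /* ]] || return 1
  fi

  # read -a splits on IFS without performing pathname expansion.
  # Empty components (as in '/data//dir1') are dropped.
  IFS=/ read -r -a raw <<< "$pattern"
  for part in "${raw[@]}"; do [[ -n $part ]] && pats+=("$part"); done
  IFS=/ read -r -a raw <<< "$pathname"
  for part in "${raw[@]}"; do [[ -n $part ]] && names+=("$part"); done

  # A '*' never spans a slash, so the component counts must agree.
  (( ${#pats[@]} == ${#names[@]} )) || return 1

  for i in "${!names[@]}"; do
    # Unquoted right-hand side: treated as a pattern, not a literal.
    [[ ${names[i]} == ${pats[i]} ]] || return 1
  done
  return 0
}

path_match '/test/*' /test/one/two/three || echo "nomatch"
path_match '/test/dir?/dir?/file[0-9]' /test/dir1//dir2/file2 && echo "match"
```

Note that [[ ]] does not treat a leading '.' specially, so '*' here also matches a component such as '.hidden', which pathname expansion would skip by default.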

| also I'm completely unaware of the "-" in local, it gave me:
| ./fnmatch.sh: line 3: local: `-': not a valid identifier

If you're using bash that means you're using an older version.
If you're not using bash, that is to be expected (the chances that
you're using the NetBSD or FreeBSD shell are small, and I am not sure
whether dash includes "local -" or whether it split from the other ash
derived shells before that was added.)

| took it out, and script seems to work,

Without "local -" you would need to restore the setting of
the 'f' option before the function returns. Something like

OPTS="$(set +o)"
in the init code (where OPTS is yet another local var...)
and then
eval "$OPTS"
just before each "return" (which will mean adding { } in
one case) and before the final [ ] command that sets
the exit code for when the function falls off the bottom.

This is kind of overkill for just -f and there are other ways
to save and restore just that option, but this is the easiest.
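
Put together, the save and restore described above might look like the sketch below. It assumes bash; split_on_slash is a hypothetical helper written only to show the pattern, and it has a single exit path so one eval is enough.

```shell
#!/usr/bin/env bash
# split_on_slash STRING   (hypothetical helper)
# Prints each '/'-separated field on its own line without globbing it.
split_on_slash() {
  local IFS field opts
  opts=$(set +o)      # commands that recreate the current option settings
  set -f              # keep '*' and '?' in the fields from being expanded
  IFS=/
  for field in $1; do
    printf '%s\n' "$field"
  done
  eval "$opts"        # put noglob (and every other option) back as it was
}

split_on_slash 'a/*/c'
```

A function with several return statements would need the eval before each of them, which is the bookkeeping that "local -" removes.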

kre


Stormy

Mar 16, 2018, 11:34:41 AM
to Robert Elz, bug-...@gnu.org
thanks for the help, yeah, i see, bash 4.2.26 is "old" :) :) found the 4.4:
https://tiswww.case.edu/php/chet/bash/NEWS
s. The `local' builtin takes a new argument: `-', which will cause it to save
and the single-letter shell options and restore their previous values at
function return.
ok, thanks for the help all.

Greg Wooledge

Mar 16, 2018, 12:24:41 PM
to Stormy, bug-...@gnu.org
On Fri, Mar 16, 2018 at 10:04:07AM +0000, Stormy wrote:
> Thanks for the confirmation regarding the need to write code. and also about 'cd' 'ls', yeah, it's what I sort of mean, bash does when passing args to them, but there is no other way to 'access' this type of parsing..

You are still quite confused. Globbing (filename expansion) is done by
bash when a glob is evaluated in several different contexts. There are
ways to use this. Real ways. This is not just esoteric bullshit.

Context #1 is when the glob is part of an argument list. I gave this
example previously in this thread:

echo *

Do you not understand how this works? Bash expands the glob (*) into
a list of filename argument-words. Each argument is passed to echo
as part of echo's argv[] array. Then, echo prints each of them, because
that's what echo does.

Passing an argument list of unbounded size to a command may run into
system-specific length limits. See
<https://www.in-ulm.de/~mascheck/various/argmax/> for details.

Context #2 is when the glob is part of an array assignment, like this:

files=(*)

In this case, the glob is expanded into a list of filenames, which are
then used as the argument-words to stuff into the array. The result
is an array containing all of the filenames that matched the glob, one
filename per array element, one array element per filename.

Since this is not a command's argv[] there is no limit on the length,
other than however much memory bash can allocate.

Context #3 is when the glob is one of the words following
"for varname in".

for f in *

The glob is expanded to a list of filenames which is stored internally
and used as the iteration list of the for loop. You don't have direct
access to the complete list of filenames at any point (use an array if
that's what you want), but you do get access to one filename at a time
as the for loop iterates. Each filename becomes the content of the
variable named "f", one by one.

Again, there is no limit on the length of this internal filename list,
other than the amount of memory bash can allocate.
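
The three contexts can be tried side by side. This is a minimal sketch, assuming bash and a scratch directory from mktemp; the file names and variable names are made up.

```shell
#!/usr/bin/env bash
# Scratch directory with three made-up file names.
old=$PWD
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.log"
cd "$dir" || exit 1

# Context 1: argument list. echo receives the names, not the '*'.
args=$(echo *.txt)

# Context 2: array assignment, one element per matching name.
files=(*)

# Context 3: for loop, one name per iteration.
count=0
for f in *.txt; do
  count=$((count + 1))
done

echo "args:  $args"
echo "files: ${#files[@]} entries"
echo "loop:  $count iterations"

cd "$old" && rm -rf "$dir"
```

In all three cases the expansion is done by bash itself, against the files that actually exist in the directory.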

Stormy

Mar 16, 2018, 1:50:14 PM
to Greg Wooledge, bug-...@gnu.org
Greg,
Thanks for all the detailed explanation, i think all your examples relate to existing files.

Maybe it was not clear from the start, so let me say that again.  In the input to my script, both the pattern and the filenames do not relate to actual files on any existing filesystem -- they are just 'strings', hence something like /data/dir1/* will not expand to anything, b/c that filesystem does not exist (heck, it may exist, but that is not a given and should not impact the decision/execution).. same for pathnames.
the goal was, given two sets of inputs one patterns, like: /data/dir1/*, /data/dir2/*/?/[0-9], etc. and another fixed full file names, i.e: /data/dir1/test, /data/dir2/test//test2   one has to print no/match for all these combinations; if bash had 'fnmatch' "builtin" it would be a one liner like [[ $a =/ $b ]] :)  I just invented an imaginary comparison operator :) :)

I do know how globbing works in "echo *", maybe the language I use is not as precise as you guys , excuse me for that, I'm just a simple end user of bash that wanted to report the original 'doc issue/bug' with regards to pattern/matching...

Cheers.
Stormy.