
why is it OK for grep to find files?


Ed Morton

Oct 1, 2023, 7:40:24 AM
A recent thread titled "Get a list of files that contain a string"
produced an answer suggesting the OP use "grep -rl regexp" with "-l
string" being the GNU grep option to only output the file name when
"regexp" matches a string in a file, and "-r" being the GNU grep option
to recursively find files.

I'm trying to understand why it's OK for GNU grep to have options to
find files:

  -d, --directories=ACTION     how to handle directories;
  -r, --recursive              like --directories=recurse
  -R, --dereference-recursive  likewise, but follow all symlinks
      --include=GLOB           search only files that match GLOB (a file
                               pattern)
      --exclude=GLOB           skip files that match GLOB
      --exclude-from=FILE      skip files that match any file pattern from
                               FILE
      --exclude-dir=GLOB       skip directories that match GLOB

when traditionally "grep" exists to "g/re/p" (the "ed" commands to
Globally match a Regular Expression within files and Print the result),
the tool "find" exists to "find" files, no other text processing
commands have options to find files, and #1 in the Unix philosophy
(https://en.wikipedia.org/wiki/Unix_philosophy) is to "Make each program
do one thing well. To do a new job, build afresh rather than complicate
old programs by adding new "features"."

If grep should have options to find files then maybe they should also:

a) Give "grep" additional options to "sort" its output, "tr"anslate
characters, "paste" results from multiple files, etc.
b) Give "sed", "awk", "tr", "cut", "paste" etc. the same options as GNU
grep now has so they can also "find" files.

I see grep commands these days that are a relatively long, complicated
mixture of options, some to find files and others to search within
files, e.g.:

grep -r --include='*.html' --include='*.php' --include='*.htm' -Fxl \
    'regexp' /some/path/
grep -R --include='*.'{html,php,htm} -Fxl 'regexp' /some/path

when "find ... -exec grep ... {} +":

find /some/path \( -name '*.html' -o -name '*.php' -o -name '*.htm' \) \
    -exec grep -Fxl 'regexp' {} +
find /some/path -regextype egrep -regex '.*\.(html|php|htm)$' \
    -exec grep -Fxl 'regexp' {} +

or similar would do the job about as briefly and efficiently as well as
making it easier to replace just the grep command with sed or awk
if/when the "search within files" part in future became more complex
than made sense for "grep".
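
A minimal sketch of that swap, assuming a POSIX sh and a scratch tree whose
file names and contents are invented for the illustration. The select_files
helper is hypothetical: it keeps the file selection in find and lets the
caller choose whichever tool searches within the files.

```shell
#!/bin/sh
# Scratch tree; every name and every line of content here is made up.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'needle\n'    > "$demo/a.html"
printf 'x\nneedle\n' > "$demo/sub/b.php"
printf 'haystack\n'  > "$demo/c.htm"
printf 'needle\n'    > "$demo/notes.txt"   # wrong extension, never searched

# Hypothetical helper: find selects the files, "$@" is the command run on them.
select_files() {
    dir=$1
    shift
    find "$dir" -type f \( -name '*.html' -o -name '*.php' -o -name '*.htm' \) \
        -exec "$@" {} +
}

# Search within the files using grep ...
by_grep=$(select_files "$demo" grep -Fxl 'needle' | sort)

# ... then replace only the grep part with awk; the find part is untouched.
by_awk=$(select_files "$demo" awk '
    FNR == 1 { seen = 0 }
    $0 == "needle" && !seen { print FILENAME; seen = 1 }
' | sort)

printf 'grep found:\n%s\nawk found:\n%s\n' "$by_grep" "$by_awk"
rm -rf "$demo"
```

Both variants should list the same two files; only the command handed to
-exec differs.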

So:

1) Is there some reason why the GNU folks having added options to find
files onto grep was a reasonable thing to do rather than flying in the
face of the Unix philosophy and unnecessarily complicating the interface
of grep?

2) Can we expect GNU grep to get additional options in future to do
other things that other Unix commands currently do?

3) Can we expect GNU sed, awk, etc. to also get options to find files
for consistency with GNU grep?

Ed.

Richard Kettlewell

Oct 1, 2023, 8:19:35 AM
Ed Morton <morto...@gmail.com> writes:
> 1) Is there some reason why the GNU folks having added options to find
> files onto grep was a reasonable thing to do rather than flying in the
> face of the Unix philosophy and unnecessarily complicating the
> interface of grep?

I expect they thought they would be useful; if so they were right.

You don’t have to use them if you don’t like them...

--
https://www.greenend.org.uk/rjk/

Chris Elvidge

Oct 1, 2023, 9:37:45 AM
On 01/10/2023 12:40, Ed Morton wrote:
> 1) Is there some reason why the GNU folks having added options to find
> files onto grep was a reasonable thing to do rather than flying in the
> face of the Unix philosophy and unnecessarily complicating the interface
> of grep?

Grep doesn't find files; the shell does that. Grep only searches the
files given for the "regular expression".

SYNOPSIS
grep [OPTION...] PATTERNS [FILE...]
grep [OPTION...] -e PATTERNS ... [FILE...]
grep [OPTION...] -f PATTERN_FILE ... [FILE...]

DESCRIPTION
grep searches for PATTERNS in each FILE.


--
Chris Elvidge, England
THE PRINCIPAL'S TOUPEE IS NOT A FRISBEE

Lew Pitcher

Oct 1, 2023, 10:03:12 AM
On Sun, 01 Oct 2023 14:37:37 +0100, Chris Elvidge wrote:

> On 01/10/2023 12:40, Ed Morton wrote:
>> 1) Is there some reason why the GNU folks having added options to find
>> files onto grep was a reasonable thing to do rather than flying in the
>> face of the Unix philosophy and unnecessarily complicating the
>> interface of grep?
>
> Grep doesn't find files; the shell does that. Grep only searches the
> files given for the "regular expression".

Oh???

>
> SYNOPSIS
> grep [OPTION...] PATTERNS [FILE...]
> grep [OPTION...] -e PATTERNS ... [FILE...]
> grep [OPTION...] -f PATTERN_FILE ... [FILE...]
>
> DESCRIPTION
> grep searches for PATTERNS in each FILE.




GREP(1) General Commands Manual GREP(1)

NAME
grep, egrep, fgrep - print lines matching a pattern

SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]

...

File and Directory Selection

...

-r, --recursive
Read all files under each directory, recursively, following
symbolic links only if they are on the command line. Note
that if no file operand is given, grep searches the working
directory. This is equivalent to the -d recurse option.
...

BUGS
Reporting Bugs
Email bug reports to the bug-reporting address
<bug-...@gnu.org>. An email archive
<http://lists.gnu.org/mailman/listinfo/bug-grep>
and a bug tracker
<http://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep>
are available.

--
Lew Pitcher
"In Skills We Trust"

Ed Morton

Oct 1, 2023, 10:43:52 AM
On 10/1/2023 7:19 AM, Richard Kettlewell wrote:
> Ed Morton <morto...@gmail.com> writes:
>> 1) Is there some reason why the GNU folks having added options to find
>> files onto grep was a reasonable thing to do rather than flying in the
>> face of the Unix philosophy and unnecessarily complicating the
>> interface of grep?
>
> I expect they thought they would be useful; if so they were right.

Something being useful is a long way from being the only criterion for
adding it to a command. Finding files would be useful for awk, sed,
sort, paste, and every other command that operates on files, so by that
criterion all commands should have options to find files in addition to
their existing options. Sorting output would be useful for all commands
that operate on text files too, so by that "it's useful" criterion all
commands should have options to sort content in addition to their
existing options. Down that path lies the conclusion that we should just
have 1 command with all options to do everything that all existing
commands do, since everything is useful.

>
> You don’t have to use them if you don’t like them...
>

But I do have to deal with existing software that uses them, I don't
have the luxury of pretending they don't exist.

Ed.

Ed Morton

Oct 1, 2023, 10:48:31 AM
On 10/1/2023 8:37 AM, Chris Elvidge wrote:
> On 01/10/2023 12:40, Ed Morton wrote:
>> 1) Is there some reason why the GNU folks having added options to find
>> files onto grep was a reasonable thing to do rather than flying in the
>> face of the Unix philosophy and unnecessarily complicating the
>> interface of grep?
>
> Grep doesn't find files; the shell does that. Grep only searches the
> files given for the "regular expression".
>
> SYNOPSIS
>        grep [OPTION...] PATTERNS [FILE...]
>        grep [OPTION...] -e PATTERNS ... [FILE...]
>        grep [OPTION...] -f PATTERN_FILE ... [FILE...]
>
> DESCRIPTION
>        grep searches for PATTERNS in each FILE.
>
>

I'm surprised to see you say that since it was your recent answer at
https://groups.google.com/g/comp.unix.shell/c/9mFNQ27MI0A/m/zwGln8V6AAAJ:

> Forget find, use grep -rl (or -Rl) (recursive, list)

where `-r` is the option to recursively find files, that inspired me to
start this thread.

Ed.

Richard Kettlewell

Oct 1, 2023, 11:25:12 AM
Ed Morton <morto...@gmail.com> writes:
> On 10/1/2023 7:19 AM, Richard Kettlewell wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>> 1) Is there some reason why the GNU folks having added options to find
>>> files onto grep was a reasonable thing to do rather than flying in the
>>> face of the Unix philosophy and unnecessarily complicating the
>>> interface of grep?
>> I expect they thought they would be useful; if so they were right.
>
> Something being useful is a long way from being the only criterion for
> adding it to a command.

That’s very situational. For a volunteer “useful and fun to do” might be
sufficient, for example.

--
https://www.greenend.org.uk/rjk/

David W. Hodgins

Oct 1, 2023, 11:57:34 AM
The find command is used to find files within a directory tree based on
the file name.

While find can be used to invoke grep on each file it returns, when searching
for text in all files in a directory tree, or in a list of files (as returned
by shell expansion), it's more efficient to only invoke grep once and let
it do all parts of the searching.

Regards, Dave Hodgins

Chris Elvidge

Oct 1, 2023, 12:58:34 PM
OK. Perhaps wrong terminology.

[FILE...] is a list of files - could be * (all files) - shell expands it
to the list of files.

PATTERNS may be a straight character string or a regular expression (see
what I said above) or several, or read from a file.

-R/-r = follow the list of files into subdirectories (doesn't do this by
default)

Of course this has the effect of finding files in the list containing
the pattern. If the list is *, all files are searched.


--
Chris Elvidge, England
I AM SO VERY TIRED

Kaz Kylheku

Oct 1, 2023, 1:02:50 PM
On 2023-10-01, Ed Morton <morto...@gmail.com> wrote:
> A recent thread titled "Get a list of files that contain a string"
> produced an answer suggesting the OP use "grep -rl regexp" with "-l
> string" being the GNU grep option to only output the file name when
> "regexp" matches a string in a file, and "-r" being the GNU grep option
> to recursively find files.
>
> I'm trying to understand why it's OK for GNU grep to have options to
> find files:

People often need to recursively grep a tree of files. They do it
interactively.

Using find together with a flat grep is annoyingly verbose for the task,
like:

find . -type f -exec grep foo {} /dev/null \;

Remember your /dev/null because grep behaves differently with one
file argument: won't print the file name.
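
A small sketch of that difference, assuming a POSIX grep and a throwaway
file whose name and content are made up for the demonstration:

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'foo\n' > "$demo/one.txt"

# One file operand: grep prints only the matching line.
single=$(grep foo "$demo/one.txt")

# Adding /dev/null makes it two operands, so grep prefixes the file name.
padded=$(grep foo "$demo/one.txt" /dev/null)

printf 'one operand:  %s\n' "$single"
printf 'two operands: %s\n' "$padded"
rm -rf "$demo"
```

/dev/null can never match, so it changes only the output format, not the
set of matches.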

Each user would end up writing a function or alias for this.

GNU Bash has a ** glob pattern (double star) which can replace uses
of find, and can potentially run into environmental passing limits
if used like this:

grep -l foo **/*.c

If the expansion is large you need either xargs, which has problems
with spaces in file names without GNU extensions, or write a loop:

for x in **/*.c; do grep -l foo "$x" /dev/null; done

Also an annoying mouthful for something you need often.
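
The xargs problem with spaces can be sketched as follows, assuming that the
extensions meant are find's -print0 and xargs's -0 (both began as GNU
extensions and are also found on the BSDs). The directory and file names are
invented for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/src dir"
printf 'foo\n' > "$demo/src dir/with space.c"
printf 'bar\n' > "$demo/plain.c"

# Plain xargs splits its input on whitespace, so the name with spaces reaches
# grep as several pieces that do not exist; errors are discarded here and the
# non-zero exit status is ignored.
naive=$(find "$demo" -name '*.c' | xargs grep -l foo 2>/dev/null) || true

# NUL-delimited names pass through intact.
safe=$(find "$demo" -name '*.c' -print0 | xargs -0 grep -l foo)

printf 'naive: [%s]\nsafe:  [%s]\n' "$naive" "$safe"
rm -rf "$demo"
```

The naive pipeline silently misses the only file that matches, which is the
kind of failure that is easy to overlook interactively.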

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Lew Pitcher

Oct 1, 2023, 1:38:48 PM
On Sun, 01 Oct 2023 17:02:44 +0000, Kaz Kylheku wrote:

> On 2023-10-01, Ed Morton <morto...@gmail.com> wrote:
>> A recent thread titled "Get a list of files that contain a string"
>> produced an answer suggesting the OP use "grep -rl regexp" with "-l
>> string" being the GNU grep option to only output the file name when
>> "regexp" matches a string in a file, and "-r" being the GNU grep option
>> to recursively find files.
>>
>> I'm trying to understand why it's OK for GNU grep to have options to
>> find files:
>
> People often need to recursively grep a tree of files. They do it
> interactively.
>
> Using find together with a flat grep is annoyingly verbose for the task,
> like:
>
> find . -type f -exec grep foo {} /dev/null \;
>
> Remember your /dev/null because grep behaves differently with one file
> argument: won't print the file name.

Unless you give grep(1) the -l argument[1]:
find . -type f -exec grep -l foo {} \;

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html

[snip]


HTH

Christian Weisgerber

Oct 1, 2023, 3:30:11 PM
On 2023-10-01, Ed Morton <morto...@gmail.com> wrote:

> I'm trying to understand why it's OK for GNU grep to have options to
> find files:

It all started when people added the -C flag for columnized output
to ls(1) and enabled it by default if stdout was a tty--instead of
manually invoking "ls | column" as God intended...

--
Christian "naddy" Weisgerber na...@mips.inka.de

Janis Papanagnou

Oct 1, 2023, 5:04:17 PM
On 01.10.2023 13:40, Ed Morton wrote:
> A recent thread titled "Get a list of files that contain a string"
> produced an answer suggesting the OP use "grep -rl regexp" with "-l
> string" being the GNU grep option to only output the file name when
> "regexp" matches a string in a file, and "-r" being the GNU grep option
> to recursively find files.
>
> I'm trying to understand why it's OK for GNU grep to have options to
> find files: [...]

When I read that other post's statement
"Forget find, use grep -rl (or -Rl) (recursive, list)"
I had some (partly similar) thoughts. That's why (in my post) I
emphasized my wording to distinguish the cases by the necessity
to "find" files or not. (Has that wording triggered your post?)

My random thoughts (in no particular order or any valuation) were...
a) Do Unix tools start implementing each function (like DOS does e.g.
with expansion of * and other "wildcards" in each tool vs. shell)
duplicated in every program now, just for (sort of) "convenience"?
Why is there a deviation from the "Separation of Duties" principle?
b) Pipe expressions may grow, it may be simpler to use one process.
Modern xargs or find syntaxes support efficient processing.
Implementing it in 'grep' may simplify logic in certain cases.
There are already deviations from the "Separation of Duties" in
other tools; see 'ls' with its sort extensions and GNU 'ls' yet
with more sort options and a lot of other options.
c) Which program shall at what stage do the directory tree walk?
We have a specialized 'find' (or a more powerful/modern 'tw').
We have it supported as ** in newer shell (ksh, zsh, bash)
(but exec buffer limit might pop in here, so safe options are
desirable, or using it in safe ways with built-ins only; ksh).
Now we see it in 'grep'. Will we see it in other programs as well?
(See point a)
d) It is non-standard.
Many folks don't care about standards, they want functionality
(provided by tools that are sometimes designed by "featuritis").
It will probably result in a wild proliferation of software versions
(as seen in the public domain sector).
e) Is it an ideological issue? Should I really care?
If it's no burden (in space or time complexity) but helps in some
cases, why not use it?

Janis

Ed Morton

Oct 1, 2023, 6:35:58 PM
find ... -exec grep 'regexp' {} +

doesn't call grep once per file, it calls grep on groups of files so as
to not exceed ARG_MAX for each group.
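
A rough way to see the batching, assuming a POSIX find and sh. The inner
sh -c command is only a stand-in for grep: it prints how many file names it
received, so each output line corresponds to one invocation. The five
scratch files are made up for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp -d)
for i in 1 2 3 4 5; do
    printf 'regexp\n' > "$demo/file$i"
done

# With \; find starts the command once per file: five output lines.
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# With + find packs as many names as fit into each invocation: one line here.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l)

echo "invocations with \\;: $per_file"
echo "invocations with +:  $batched"
rm -rf "$demo"
```

With only five short names everything fits into a single batch; a very large
tree would produce a handful of invocations rather than exactly one.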

So, yes it's probably a bit more efficient to call grep once on all
files but not enough so as to justify giving grep its own set of
file-finding arguments and if it did make sense for grep then it'd also
make sense for awk, sed, and every other tool that runs on text files
(or even giving find options to grep in files instead of the other way
around!).

Ed.

>
> Regards, Dave Hodgins

Ed Morton

Oct 1, 2023, 6:53:36 PM
On 10/1/2023 12:02 PM, Kaz Kylheku wrote:
> On 2023-10-01, Ed Morton <morto...@gmail.com> wrote:
>> A recent thread titled "Get a list of files that contain a string"
>> produced an answer suggesting the OP use "grep -rl regexp" with "-l
>> string" being the GNU grep option to only output the file name when
>> "regexp" matches a string in a file, and "-r" being the GNU grep option
>> to recursively find files.
>>
>> I'm trying to understand why it's OK for GNU grep to have options to
>> find files:
>
> People often need to recursively grep a tree of files. They do it
> interactively.

I find myself writing find+grep commands much less frequently than
find+sed or find+awk but YMMV I suppose.

> Using find together with a flat grep is annoyingly verbose for the task,
> like:
>
> find . -type f -exec grep foo {} /dev/null \;
>
> Remember your /dev/null because grep behaves differently with one
> file argument: won't print the file name.

If you're using GNU grep (which you'd need for the file-finding options)
give it the `-H` argument and it'll print file names as well as the
matching string without having to add /dev/null:

find . -type f -exec grep -H foo {} +

I changed the `\;` to `+` so grep gets called on multiple files at a
time instead of one at a time.

>
> Each user would end up writing a function or alias for this.
>
> GNU Bash has a ** glob pattern (double star) which can replace uses
> of find, and can potentially run into environmental passing limits
> if used like this:
>
> grep -l foo **/*.c
>
> If the expansion is large you need either xargs, which has problems
> with spaces in file names without GNU extensions, or write a loop:
>
> for x in **/*.c; do grep -l foo "$x" /dev/null; done
>
> Also an annoying mouthful for something you need often.

sed, awk, and every other Unix tool have the same behavior, writing
`find . -type f -exec grep -l foo {} +` is just not a big deal, avoids
the problems you mentioned with "**/*.c" and "xargs", and it's good for
people to know how to find files with "find" for when they need to use
any other Unix tool instead of grep on the resultant files.

So I don't see any of that as justifying adding a bunch of options to
grep to do something other than its primary purpose of g/re/p and
making it different from all other text processing tools in that regard.

Ed.

John D Groenveld

Oct 1, 2023, 7:01:46 PM
In article <ufct99$2hrnj$1...@dont-email.me>,
Ed Morton <morto...@gmail.com> wrote:
>If you're using GNU grep (which you'd need for the file-finding options)
>give it the `-H` argument and it'll print file names as well as the
>matching string without having to add /dev/null:

illumos, FreeBSD and OpenBSD grep(1) also include -H:
<URL:https://illumos.org/man/1/grep>
<URL:https://man.freebsd.org/cgi/man.cgi?grep(1)>
<URL:https://man.openbsd.org/grep>

YMMV
John
groe...@acm.org

Kaz Kylheku

Oct 2, 2023, 12:21:12 AM
Looking at just the OpenBSD one at the bottom, I see also that it
includes a -R recursive option.

Richard Kettlewell

Oct 2, 2023, 3:10:00 AM
Ed Morton <morto...@gmail.com> writes:
> So I don't see any of that as justifying adding a bunch of options to
> grep to do something other than its primary purpose of g/re/p and
> making it different from all other text processing tools in that
> regard.

Why do you think it needs justifying? It’s their code, they can add
whatever they like to it.

--
https://www.greenend.org.uk/rjk/

Kenny McCormack

Oct 2, 2023, 5:30:32 AM
In article <wwvr0md...@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <inv...@invalid.invalid> wrote:
>Ed Morton <morto...@gmail.com> writes:
>> So I don't see any of that as justifying adding a bunch of options to
>> grep to do something other than its primary purpose of g/re/p and
>> making it different from all other text processing tools in that
>> regard.
>
>Why do you think it needs justifying? It’s their code, they can add
>whatever they like to it.

1) "Justifying" was, perhaps, too strong of a word. Obviously, there's no
need to do so, in a strict, legalistic sense. Though (total aside coming
up), it reminds me of when lawyers get uppity when they see computer
messages like "illegal instruction" - when they know that, in their terms
and frame of reference, there's nothing illegal about it.

2) There will always be a tension between wanting programs to be as useful
as possible and being in conformance with the idea that there should be
limits on functionality (i.e., the concept of "feature creep"). The
standard cliche on this topic is "All programs expand to eventually include
email". We will know we've lost the battle when grep expands to the point
where it can email you the results of the search.

Anyway, this tension always exists and almost always functionality wins
out over conservatism. In the specific case under discussion (recursive
searching in grep), it is pretty common that when you are introducing grep
to a newuser, one of their first comments/requests will be "And, of course
I want to search all the files (meaning, all the subdirectories and files)
So, of course, your tool will do that, right?"

And, finally, let me add:

3) I personally find the "find" command archaic and hard to use (meaning:
If it were being designed today, it wouldn't be like it is). From a
usability standpoint, it is much better to be able to just include "-r" on
the grep command line, than to have to write out a long, ugly "find"
invocation.

--
To my knowledge, Jacob Navia is not a Christian.

- Rick C Hodgin -

Janis Papanagnou

Oct 2, 2023, 7:06:30 AM
On 02.10.2023 11:30, Kenny McCormack wrote:
> And, finally, let me add:
>
> 3) I personally find the "find" command archaic and hard to use (meaning:
> If it were being designed today, it wouldn't be like it is). From a
> usability standpoint, it is much better to be able to just include "-r" on
> the grep command line, than to have to write out a long, ugly "find"
> invocation.

What about the 'tw' (tree walk) command from AT&T; unfortunately I
cannot find it online at the moment. Basic usage is simple (less
"archaic" syntax) but it can get as complicated as a programming
language with complex actions possible.

I only found a man page lying around on my disk and put it here
http://volatile.gridbug.de/tw.out
With usage examples at the bottom of that file.
I found also an old Linux binary from 2006 that I put there
http://volatile.gridbug.de/tw
(Haven't used it regularly, though; was used to archaic 'find'.)

Janis

Ed Morton

Oct 2, 2023, 7:59:36 AM
The GNU people added functionality to grep in a way that is
contradictory to the Unix philosophy and makes it inconsistent with all
other text processing tools.

So far the only suggestion I've heard for why they did that is so people
could do:

grep -r 'regexp'

instead of:

find . -type f -exec grep -H 'regexp' {} +

which saves us about 20 simple, common characters over find+grep but
does nothing for find+awk, find+sed, etc.

If reducing how much typing we need to do for that was their only goal,
they could have introduced a separate tool named something like "ftf"
that does the equivalent of "Find -Type F" but takes the same arguments
they added to grep for finding files if they feel those are better than
the equivalent "find" args, and calls whatever command is also provided
in the args with the resulting files, and then we could do:

ftf grep 'regexp'

which is about the same number of chars as using "-r", is consistent
with the Unix philosophy, and has the huge benefit that we could use it
for every other text processing tool too:

ftf sed '...'
ftf awk '...'
etc.
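
To make the idea concrete, here is a minimal sketch of such a wrapper as a
shell function. "ftf" is the hypothetical name from this post, not an
existing tool, and this version only covers the "find . -type f" part; it
takes none of the --include/--exclude style arguments. The scratch files are
made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: run COMMAND [ARGS...] on every regular file below
# the current directory, batching file names the way "find -exec ... +" does.
ftf() {
    if [ "$#" -eq 0 ]; then
        echo 'usage: ftf command [args...]' >&2
        return 2
    fi
    find . -type f -exec "$@" {} +
}

demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'regexp here\n' > "$demo/sub/hit.txt"
printf 'nothing\n'     > "$demo/miss.txt"

# The same wrapper drives grep and awk alike.
hits=$(cd "$demo" && ftf grep -l 'regexp')
lines=$(cd "$demo" && ftf awk 'END { print NR }')

echo "files matching: $hits"
echo "total lines:    $lines"
rm -rf "$demo"
```

The tree walk lives in one place and the command after "ftf" keeps only its
own options, which is the separation argued for here.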

I assume they didn't spend their time and effort
designing/coding/testing/documenting/supporting the grep functionality
for finding files on a whim and I think it's reasonable to ask what the
rationale was for doing that, especially when they could have either
done nothing (20 simple chars - who cares?) or introduced a far more
generally useful separate tool for the purpose. Of the possibilities
they had, the one they chose to implement is concerning for other
potential changes that they might introduce in future.

Ed.



Damien Wyart

Oct 2, 2023, 8:26:13 AM
* Janis Papanagnou <janis_pap...@hotmail.com> in comp.unix.shell:
> What about the 'tw' (tree walk) command from AT&T; unfortunately I cannot find
> it online at the moment. Basic usage is simple (less "archaic" syntax) but it
> can get as complicated as a programming language with complex actions possible.

I found this https://github.com/att/ast/tree/master/src/cmd/tw
but it seems some work would be needed to compile it on a recent Linux.

--
DW

Janis Papanagnou

Oct 2, 2023, 9:21:11 AM
On 02.10.2023 14:26, Damien Wyart wrote:
> * Janis Papanagnou <janis_pap...@hotmail.com> in comp.unix.shell:
>> What about the 'tw' (tree walk) command from AT&T; unfortunately I cannot find
>> it online at the moment. Basic usage is simple (less "archaic" syntax) but it
>> can get as complicated as a programming language with complex actions possible.
>
> I found this https://github.com/att/ast/tree/master/src/cmd/tw

(I cannot even get or clone it from here.)

> but it seems some work would be needed to compile it on a recent Linux.

The AST software tools had their own build process, so probably you
have to get the whole AST tree with its build-system to create it.

Janis

Damien Wyart

Oct 2, 2023, 9:39:25 AM
* Janis Papanagnou <janis_pap...@hotmail.com> in comp.unix.shell:
> (I cannot even get or clone it from here.)

Sorry, just wanted to point out "tw" in a precise way.
The root of the repository is here: https://github.com/att/ast

--
DW

Janis Papanagnou

Oct 2, 2023, 9:40:54 AM
Ah, thanks!

Janis

Richard Kettlewell

Oct 3, 2023, 3:00:40 AM
Ed Morton <morto...@gmail.com> writes:
> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>> So I don't see any of that as justifying adding a bunch of options to
>>> grep to do something other than its primary purpose of g/re/p and
>>> making it different from all other text processing tools in that
>>> regard.
>> Why do you think it needs justifying? It’s their code, they can add
>> whatever they like to it.
>
> The GNU people added functionality to grep in a way that is
> contradictory to the Unix philosophy and makes it inconsistent with
> all other text processing tools.

Why does that matter?

> So far the only suggestion I've heard for why they did that is so
> people could do:
>
> grep -r 'regexp'
>
> instead of:
>
> find . -type f -exec grep -H 'regexp' {} +
>
> which saves us about 20 simple, common characters over find+grep but
> does nothing for find+awk, find+sed, etc.

Sounds good to me. Less typing = better. Given the uptake of the GNU
tools I suspect my opinion is widely shared.

--
https://www.greenend.org.uk/rjk/

Kaz Kylheku

Oct 3, 2023, 3:26:54 AM
On 2023-10-02, Ed Morton <morto...@gmail.com> wrote:
> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>> So I don't see any of that as justifying adding a bunch of options to
>>> grep to do something other than its primary purpose of g/re/p and
>>> making it different from all other text processing tools in that
>>> regard.
>>
>> Why do you think it needs justifying? It’s their code, they can add
>> whatever they like to it.
>>
>
> The GNU people added functionality to grep in a way that is
> contradictory to the Unix philosophy and makes it inconsistent with all
> other text processing tools.

- GNU stands for GNU's Not Unix.

- The GNU coding standards document says:

The GNU Project regards standards published by other organizations
as suggestions, not orders. We consider those standards, but we do
not “obey” them. In developing a GNU program, you should implement
an outside standard’s specifications when that makes the GNU system
better overall in an objective sense. When it doesn’t, you shouldn’t.

- Unix programs have recursion built in: from memory I can think of
ls, rm, cp, mv, chgrp, chmod and chown.

- BSD Unixes have a -R option on grep; some, like FreeBSD, have an -r
option as well as a rgrep utility.

Janis Papanagnou

Oct 3, 2023, 11:04:09 AM
On 03.10.2023 09:26, Kaz Kylheku wrote:
>
> - Unix programs have recursion built in: from memory I can think of
> ls, rm, cp, mv, chgrp, chmod and chown.

When I saw in the 'tw' examples command calls like 'tw chmod go-w' I
wondered whether it would have been a good design idea to use such a
paradigm generally; tw ls; tw rm, etc., instead of implementing some
-r (or -R or other) tree walks in each program separately. This ship
may have sailed already but the idea seems sensible. (And 'tw' seems
more powerful than 'find', yet with a less "archaic" syntax.) Options
to 'tw' would control the tree walk and the commands would have their
own options; a good separation of control, no multiple duplications.
Of course we have such a beast already as standard with find -exec +,
but the 'tw' interface seems smarter. I wonder why 'tw' didn't get in
wider use.

Janis

Janis Papanagnou

Oct 3, 2023, 11:20:05 AM
On 02.10.2023 13:59, Ed Morton wrote:
> [...]
>
> If reducing how much typing we need to do for that was their only goal,
> they could have introduced a separate tool named something like "ftf"
> that does the equivalent of "Find -Type F" [...]

Have a look at 'tw' (man page: http://volatile.gridbug.de/tw.out) which
is not restricted to '-type f' but can be parameterized with all options
necessary to control the search.

tw -tw-options... cmd -cmd-options... cmd-args...

All file subsets are determined by the 'tw' options and the commands use
just their own options. (The commands need not implement the tree walk
and the options to control it.) And each command used with 'tw' just
cares about its own command specific functional options.

Janis

Kaz Kylheku

Oct 3, 2023, 11:20:55 AM
On 2023-10-03, Janis Papanagnou <janis_pap...@hotmail.com> wrote:
> On 03.10.2023 09:26, Kaz Kylheku wrote:
>>
>> - Unix programs have recursion built in: from memory I can think of
>> ls, rm, cp, mv, chgrp, chmod and chown.
>
> When I saw in the 'tw' examples command calls like 'tw chmod go-w' I
> wondered whether it would have been a good design idea to use such a
> paradigm generally; tw ls; tw rm, etc., instead of implementing some
> -r (or -R or other) tree walks in each program separately. This ship
> may have sailed already but the idea seems sensible. (And 'tw' seems
> more powerful than 'find', yet with a less "archaic" syntax.)

In the case of chmod, you have to be careful about order. E.g. if you're
*removing* read/execute permissions from everything, you have to go
bottom up. If adding, then top down.
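
With find, the two visiting orders can be sketched like this, assuming a
POSIX find (-depth is the standard primary for a bottom-up walk). The tiny
tree is made up for the demonstration, and no permissions are actually
changed.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/top/mid"
: > "$demo/top/mid/leaf"

# Default walk is top-down: a directory is visited before its contents.
top_down=$(cd "$demo" && find top | tr '\n' ' ')

# -depth walks bottom-up: contents first, then the directory itself.
bottom_up=$(cd "$demo" && find top -depth | tr '\n' ' ')

echo "top-down:  $top_down"
echo "bottom-up: $bottom_up"
rm -rf "$demo"
```

A removal such as "find top -depth -exec chmod go-rx {} \;" would then reach
each directory only after everything inside it has been handled.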

> to 'tw' would control the tree walk and the commands would have their
> own options; a good separation of control, no multiple duplications.
> Of course we have such a beast already as standard with find -exec +,
> but the 'tw' interface seems smarter. I wonder why 'tw' didn't get in
> wider use.

Performance? Launch a new chmod process for every visited object.

That could be addressed by allowing executables to be loadable as shared
libraries that expose the main function.

Then tw could find chmod in PATH, try to dlopen it, and if that works
and a "main" symbol is found, repeatedly call that function rather
than fork + exec.

Janis Papanagnou

Oct 3, 2023, 11:32:43 AM
On 03.10.2023 17:20, Kaz Kylheku wrote:
> On 2023-10-03, Janis Papanagnou <janis_pap...@hotmail.com> wrote:
>> On 03.10.2023 09:26, Kaz Kylheku wrote:
>>>
>>> - Unix programs have recursion built in: from memory I can think of
>>> ls, rm, cp, mv, chgrp, chmod and chown.
>>
>> When I saw in the 'tw' examples command calls like 'tw chmod go-w' I
>> wondered whether it would have been a good design idea to use such a
>> paradigm generally; tw ls; tw rm, etc., instead of implementing some
>> -r (or -R or other) tree walks in each program separately. This ship
>> may have sailed already but the idea seems sensible. (And 'tw' seems
>> more powerful than 'find', yet with a less "archaic" syntax.)
>
> In the case of chmod, you have to be careful about order. E.g. if you're
> *removing* read/execute permissions from everything, you have to go
> bottom up. If adding, then top down.

My expectation is that a tree-walk tool would let you control (by
options) the order in which the tree nodes are visited.[*]

>
>> to 'tw' would control the tree walk and the commands would have their
>> own options; a good separation of control, no multiple duplications.
>> Of course we have such a beast already as standard with find -exec +,
>> but the 'tw' interface seems smarter. I wonder why 'tw' didn't get in
>> wider use.
>
> Performance? Launch a new chmod process for every visited object.

I don't see why that implementation would be unavoidable. After all,
find -exec + also collects more than one item per invocation. There
could even be a more flexible and sophisticated implementation than
find. (Not sure about what 'tw' does.) There are only a couple of
tree-walk strategies, which you could choose between by options.

(It may be worth studying 'tw' in more detail and making some tests.)

>
> That could be addressed by allowing executables to be loadable as shared
> libraries that expose the main function.
>
> Then tw could find chmod in PATH, try to dlopen it, and if that works
> and a "main" symbol is found, repeatedly call that function rather
> than fork + exec.

Janis

[*] Disclaimer: I haven't spent any time studying details of 'tw'.

Computer Nerd Kev

Oct 3, 2023, 5:29:59 PM
Kaz Kylheku <864-11...@kylheku.com> wrote:
> - Unix programs have recursion built in: from memory I can think of
> ls, rm, cp, mv, chgrp, chmod and chown.

It's also interesting to compare tar and cpio. Tar also has
recursion built in, and filtering functionality duplicated with
find. Cpio instead takes a list of files to archive from stdin,
which also avoids the maximum argument number and maximum command
length limits that complicate using command-line arguments.

GNU cpio actually supports creating tar archives, but it's notable
that people (including me) don't seem to use it for that, even
though its behaviour is more UNIXy than tar. POSIX apparently
recommends pax, which supports doing things both ways, but everyone
ignores that too.

--
__ __
#_ < |\| |< _#

Ed Morton

Oct 3, 2023, 5:43:35 PM
Yup, that looks like the kind of command we should be using instead of
grep having been given arguments to find files. Thanks for that link and
discussion.

Ed.

Ed Morton

Oct 4, 2023, 1:50:24 PM
On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
> Ed Morton <morto...@gmail.com> writes:
>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>> Ed Morton <morto...@gmail.com> writes:
>>>> So I don't see any of that as justifying adding a bunch of options to
>>>> grep to do something other than its primary purpose of g/re/p and
>>>> making it different from all other text processing tools in that
>>>> regard.
>>> Why do you think it needs justifying? It’s their code, they can add
>>> whatever they like to it.
>>
>> The GNU people added functionality to grep in a way that is
>> contradictory to the Unix philosophy and makes it inconsistent with
>> all other text processing tools.
>
> Why does that matter?

For the same reasons that cohesion and consistency matter in any system.
Google "software" with those 2 terms to find more information on how
those concepts apply to software.

If Unix were being invented today and the design decision being
presented in an attempt to answer the question "how do we call a text
processing tool on every file found under a directory?" was:

IF (tool == grep) && (provider == GNU) THEN
tool -r '...'
ELSE
find . -type f tool '...' {} +
ENDIF

instead of just:

find . -type f tool '...' {} +

the person suggesting it would be ridiculed and sent back to the drawing
board.

>
>> So far the only suggestion I've heard for why they did that is so
>> people could do:
>>
>> grep -r 'regexp'
>>
>> instead of:
>>
>> find . -type f -exec grep -H 'regexp' {} +
>>
>> which saves us about 20 simple, common characters over find+grep but
>> does nothing for find+awk, find+sed, etc.
>
> Sounds good to me. Less typing = better. Given the uptake of the GNU
> tools I suspect my opinion is widely shared.
>

The GNU providers have made many worthwhile contributions to many tools;
people are not adopting GNU tools because they added "-r" to grep.

Ed.

Javier

Oct 4, 2023, 2:59:39 PM
Later in its history, tar gained the option to work as a filter:

find . | tar --create --files-from=- --file=- > archive.tar

Had the feature not been added, cpio or pax would be more popular.

That is at least the case of GNU/tar. I just checked the OpenBSD/tar
manpage and the feature to get files from stdin does not appear.
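
For illustration, a sketch in which find chooses the archive members, assuming a tar that reads member names from stdin via -T - (the short form of GNU tar's --files-from=-; other tars may spell it differently or lack it). The file names are invented.

```shell
# Throwaway tree with two kinds of files (names invented for the example).
work=$(mktemp -d)
mkdir "$work/src"
: > "$work/src/keep.txt"
: > "$work/src/skip.log"

# find decides what goes in; tar only archives the names it reads on stdin.
# Assumption: -T - means "read member names from stdin" (true of GNU tar).
(cd "$work" && find src -type f -name '*.txt' | tar -cf archive.tar -T -)

# List what ended up in the archive.
members=$(tar -tf "$work/archive.tar")
printf '%s\n' "$members"
```

Names are newline-delimited here, so a file name containing a newline would need a NUL-delimited variant instead.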

Christian Weisgerber

Oct 4, 2023, 5:30:10 PM
On 2023-10-04, Javier <inv...@invalid.invalid> wrote:

> find . | tar --create --files-from=- --file=- > archive.tar
>
> That is at least the case of GNU/tar. I just checked the OpenBSD/tar
> manpage and the feature to get files from stdin does not appear.

Actually, it does: -I <file>

Where <file> can be "-" for stdin.

Richard Kettlewell

Oct 5, 2023, 1:40:48 PM
Ed Morton <morto...@gmail.com> writes:
> On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>> The GNU people added functionality to grep in a way that is
>>> contradictory to the Unix philosophy and makes it inconsistent with
>>> all other text processing tools.
>> Why does that matter?
>
> For the same reasons that cohesion and consistency matter in any
> system. Google "software" with those 2 terms to find more information
> on how those concepts apply to software.

Prioritizing consistency over usability, then.

> If Unix were being invented today and the design decision being
> presented in an attempt to answer the question "how do we call a text
> processing tool on every file found under a directory?" was:
>
> IF (tool == grep) && (provider == GNU) THEN
> tool -r '...'
> ELSE
> find . -type f tool '...' {} +
> ENDIF
>
> instead of just:
>
> find . -type f tool '...' {} +
>
> the person suggesting it would be ridiculed and sent back to the
> drawing board.

Are you somehow under the impression that the existence of “grep -r”
prevents the second answer from working? If not then why on earth would
you imagine anyone would give the first answer?

> The GNU providers have made many worthwhile contributions to many
> tools, people are not adopting GNU tools because they added "-r" to
> grep.

Well it’s demonstrably not an obstacle to many people.

--
https://www.greenend.org.uk/rjk/

Richard Kettlewell

Oct 5, 2023, 1:42:57 PM
gaz...@shell.xmission.com (Kenny McCormack) writes:
> Richard Kettlewell <inv...@invalid.invalid> wrote:
>>Ed Morton <morto...@gmail.com> writes:
>>> So I don't see any of that as justifying adding a bunch of options to
>>> grep to do something other than its primary purpose of g/re/p and
>>> making it different from all other text processing tools in that
>>> regard.
>>
>>Why do you think it needs justifying? It's their code, they can add
>>whatever they like to it.
>
> 1) "Justifying" was, perhaps, too strong of a word. Obviously, there's no
> need to do so, in a strict, legalistic sense. Though (total aside coming
> up), it reminds me of when lawyers get uppity when they see computer
> messages like "illegal instruction" - when they know that, in their terms
> and frame of reference, there's nothing illegal about it.

It’s the OP’s choice of word...

> 3) I personally find the "find" command archaic and hard to use (meaning:
> If it were being designed today, it wouldn't be like it is). From a
> usability standpoint, it is much better to be able to just include "-r" on
> the grep command line, than to have to write out a long, ugly "find"
> invocation.

Quite.

--
https://www.greenend.org.uk/rjk/

Kenny McCormack

Oct 5, 2023, 2:32:15 PM
In article <wwvv8bl...@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <inv...@invalid.invalid> wrote:
...
>>>Why do you think it needs justifying? It's their code, they can add
>>>whatever they like to it.
>>
>> 1) "Justifying" was, perhaps, too strong of a word. Obviously, there's no
>> need to do so, in a strict, legalistic sense. Though (total aside coming
>> up), it reminds me of when lawyers get uppity when they see computer
>> messages like "illegal instruction" - when they know that, in their terms
>> and frame of reference, there's nothing illegal about it.
>
>It's the OP's choice of word...

Indeed. I was basically a) rebuking OP for using too strong of a word, and
b) re-assuring you that your reaction to his wording was reasonable.

>> 3) I personally find the "find" command archaic and hard to use (meaning:
>> If it were being designed today, it wouldn't be like it is). From a
>> usability standpoint, it is much better to be able to just include "-r" on
>> the grep command line, than to have to write out a long, ugly "find"
>> invocation.
>
>Quite.

Agreed.

Note also that the real bad thing about the "find ... -exec grep ..." type
solution (that OP seems to favor) is that it inefficiently spawns a new
process for each file. This has, of course, been noted many times in this
thread, but it cannot be over-stressed. Note that you can use "xargs" to
mitigate this problem (as I routinely do) or use the (non-standard) "+"
option in (GNU) find. I've never used the latter option, just because I got
used to using xargs and never saw the need for a different method.
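
A sketch of the xargs route, for concreteness. It assumes find -print0 and xargs -0, which are common extensions (GNU, BSD, busybox) rather than something every older find/xargs has; the file names are invented.

```shell
# Throwaway tree; one file name deliberately contains a space.
work=$(mktemp -d)
printf 'needle\n' > "$work/with match"
printf 'hay\n'    > "$work/without"

# NUL-delimited names survive spaces and newlines, and xargs packs many
# names into each grep invocation instead of starting grep once per file.
hits=$(find "$work" -type f -print0 | xargs -0 grep -l -e needle)
printf '%s\n' "$hits"
```

Without -print0/-0, the plain "find | xargs" form splits names on whitespace, which is the usual argument for one of the safer variants.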

--
Elect a clown, expect a circus.

Kenny McCormack

Oct 5, 2023, 2:48:03 PM
In article <wwv1qe9...@LkoBDZeT.terraraq.uk>,
Richard Kettlewell <inv...@invalid.invalid> wrote:
...
>> IF (tool == grep) && (provider == GNU) THEN
>> tool -r '...'
>> ELSE
>> find . -type f tool '...' {} +
>> ENDIF
>>
>> instead of just:
>>
>> find . -type f tool '...' {} +
>>
>> the person suggesting it would be ridiculed and sent back to the
>> drawing board.
>
>Are you somehow under the impression that the existence of "grep -r"
>prevents the second answer from working? If not then why on earth would
>you imagine anyone would give the first answer?

I think OP is somewhat cryptically arguing that:

a) People can't use the new fancy GNU stuff in portable scripts, except
if they...
b) In order to use them in a portable script, they'd have to code as
shown above - that is, only use the new fancy GNU stuff if they are
running in an environment known to have that stuff. But ...
c) Nobody is going to do that, because it is just silly, so you end up
just coding it the old-fashioned way, and thus ...
d) The new stuff never gets used, so why should anyone bother designing
and implementing it?

I don't agree with OP, but I think that's the position he is trying to
argue.
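
To make (b) concrete, here is roughly what such conditional code might look like. This is only a sketch: search_tree is a made-up name, and the probe assumes the usual grep convention that exit status 0 or 1 means "ran fine" while 2 or more means an error such as an unrecognised option.

```shell
# search_tree PATTERN DIR
# Print the names of files under DIR that contain a line matching PATTERN.
search_tree() {
    # Probe whether this grep accepts -r. Status 0 or 1 means the option
    # was accepted; 2 or more is taken to mean it was rejected (assumption).
    probe=0
    grep -r -e x /dev/null >/dev/null 2>&1 || probe=$?

    if [ "$probe" -le 1 ]; then
        # Recursive grep is available: use it directly.
        grep -r -l -e "$1" "$2"
    else
        # Portable fallback: let find walk the tree and batch the names.
        find "$2" -type f -exec grep -l -e "$1" {} +
    fi
}
```

Called as, say, search_tree needle . it lists matching files either way, which rather supports (c): the fallback branch alone already does the whole job.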

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Windows

Ben Bacarisse

Oct 5, 2023, 3:37:29 PM
Ed Morton <morto...@gmail.com> writes:

> On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>>> Ed Morton <morto...@gmail.com> writes:
>>>>> So I don't see any of that as justifying adding a bunch of options to
>>>>> grep to do something other than its primary purpose of g/re/p and
>>>>> making it different from all other text processing tools in that
>>>>> regard.
>>>> Why do you think it needs justifying? It’s their code, they can add
>>>> whatever they like to it.
>>>
>>> The GNU people added functionality to grep in a way that is
>>> contradictory to the Unix philosophy and makes it inconsistent with
>>> all other text processing tools.
>> Why does that matter?
>
> For the same reasons that cohesion and consistency matter in any
> system. Google "software" with those 2 terms to find more information on
> how those concepts apply to software.

Those are not the only concepts that matter.

> If Unix were being invented today and the design decision being presented
> in an attempt to answer the question "how do we call a text processing tool
> on every file found under a directory?" was:
>
> IF (tool == grep) && (provider == GNU) THEN
> tool -r '...'
> ELSE
> find . -type f tool '...' {} +
> ENDIF
>
> instead of just:
>
> find . -type f tool '...' {} +

find . -type f -exec tool '...' '{}' +

> the person suggesting it would be ridiculed and sent back to the drawing
> board.

It's not "just" find ... find is a mess. If the file system, the shell
and all the rest were such that adding some command or prefix like

rec grep pattern .

made grep behave pretty much like grep -r then no one would bother to
add options like -r to grep.

But find is the best we have, and it's a mess. It uses mandatory syntax
that needs to be quoted (the {}) and does not always preserve the desired
behaviour. For example

grep -rq needle haystack

can succeed (exit 0) where

find haystack -type f -exec grep -q needle '{}' +

can fail (exit non-zero). This is because find can run multiple
commands and does not always combine the exit statuses correctly. I
would be happy to be wrong about this, but I'm sure I tested it some
time ago.

Of course what's really needed is proper functional composition of
tools. What I mean is that where a list of arguments is needed a lazy
list resulting from a program (something like a generator if you prefer
the term) could be given. We can't use bash's **/* nor $(find . -type
f) for various reasons, not least because the argument list must be
fully evaluated before being passed to the exec'd program. A lot would have
to change before (to invent syntax on the fly)

grep needle $[files-in haystack]

started to produce output immediately and could not run out of
resources.

Unix tries with pipes and macro expansion, but the input stream is not
always available (or logical to use) and the shell's expansion and
quoting rules are so intricate that I doubt there is anyone here who has
not been caught out by them even after years of Unix use.
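
As a rough illustration of the pipe approach and its limits, a sketch in which find streams names and a loop consumes them one at a time, so nothing has to fit into a single argument list. The names are invented, and the caveats in the comments are the price paid.

```shell
# Throwaway tree (names invented for the example).
work=$(mktemp -d)
printf 'needle\n' > "$work/a"
printf 'hay\n'    > "$work/b"

# find writes names as it discovers them; the loop handles each one as it
# arrives, so output can start before the walk has finished.
# Caveats: a file name containing a newline breaks this, and grep is
# started once per file.
matches=$(find "$work" -type f | while IFS= read -r file; do
    grep -l -e needle "$file" || :   # "no match" is not an error here
done)
printf '%s\n' "$matches"
```

It works, but the generator has been moved from the argument list to stdin, which is exactly the slot that is not always free to use.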

--
Ben.

Janis Papanagnou

Oct 5, 2023, 4:01:00 PM
On 05.10.2023 20:32, Kenny McCormack wrote:
>
> Note also that the real bad thing about the "find ... -exec grep ..." type
> solution (that OP seems to favor) is that it inefficiently spawns a new
> process for each file.

Then just use: find ... -exec grep ... {} +

> This has, of course been noted many times in this
> thread, but it cannot be over-stressed. Note that you can use "xargs" to
> mitigate this problem (as I routinely do) or use the (non-standard) "+"
> option in (GNU) find. [...]

It is non-standard? - Here's a quote from the standard...

-exec utility_name [argument ...] ;
-exec utility_name [argument ...] {} +
[...]
If the primary expression is punctuated by a <plus-sign>, the primary
shall always evaluate as true, and the pathnames for which the primary
is evaluated shall be aggregated into sets. The utility utility_name
shall be invoked once for each set of aggregated pathnames. [...]
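
A small sketch of the difference between the two forms, counting how often the utility is started. The tree is invented, and the batched count assumes the three names fit into one set, which they should for a tree this small.

```shell
# Throwaway tree with three files (names invented for the example).
work=$(mktemp -d)
: > "$work/a"
: > "$work/b"
: > "$work/c"

# Each invocation of the inner sh prints one line, so counting lines
# counts invocations.
per_file=$(find "$work" -type f -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$work" -type f -exec sh -c 'echo run' sh {} + | wc -l)

printf 'with ";": %d invocations, with "+": %d\n' "$per_file" "$batched"
```

With ";" the utility runs once per file; with "+" it runs once per aggregated set.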

Did I misread the POSIX standard? [*]

Janis

[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html

Ed Morton

Oct 5, 2023, 6:03:22 PM
On 10/5/2023 2:37 PM, Ben Bacarisse wrote:
> Ed Morton <morto...@gmail.com> writes:
>
>> On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
>>> Ed Morton <morto...@gmail.com> writes:
>>>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>>>> Ed Morton <morto...@gmail.com> writes:
>>>>>> So I don't see any of that as justifying adding a bunch of options to
>>>>>> grep to do something other than its primary purpose of g/re/p and
>>>>>> making it different from all other text processing tools in that
>>>>>> regard.
>>>>> Why do you think it needs justifying? It’s their code, they can add
>>>>> whatever they like to it.
>>>>
>>>> The GNU people added functionality to grep in a way that is
>>>> contradictory to the Unix philosophy and makes it inconsistent with
>>>> all other text processing tools.
>>> Why does that matter?
>>
>> For the same reasons that cohesion and consistency matter in any
>> system. Google "software" with those 2 terms to find more information on
>> how those concepts apply to software.
>
> Those are not the only concepts that matter.

Of course, they're just the answer to the question Richard asked.

>
>> If Unix were being invented today and the design decision being presented
>> in an attempt to answer the question "how do we call a text processing tool
>> on every file found under a directory?" was:
>>
>> IF (tool == grep) && (provider == GNU) THEN
>> tool -r '...'
>> ELSE
>> find . -type f tool '...' {} +
>> ENDIF
>>
>> instead of just:
>>
>> find . -type f tool '...' {} +
>
> find . -type f -exec tool '...' '{}' +

Right, thanks. I typed my response once and my newsreader discarded it
so I had to type it again and rushed it.

>> the person suggesting it would be ridiculed and sent back to the drawing
>> board.
>
> It's not "just" find ... find is a mess. If the file system, the shell
> and all the rest were such that adding some command or prefix like
>
> rec grep pattern .
>
> made grep behave pretty much like grep -r then no one would bother to
> add options like -r to grep.

Agreed, that is the right solution if people are unhappy with `find` and
apparently there is an existing tool to do just that, see Janis's posts
in this thread.

>
> But find is the best we have, and it's a mess. It uses mandatory syntax
> that needs to be quoted (the {}) and does not always preserve the desired
> behaviour. For example
>
> grep -rq needle haystack
>
> can succeed (exit 0) where
>
> find haystack -type f -exec grep -q needle '{}' +
>
> can fail (exit non-zero). This is because find can run multiple
> commands and does not always combine the exit statuses correctly. I
> would be happy to be wrong about this, but I'm sure I tested it some
> time ago.

Interesting - that's not something I've ever come across but thanks for
bringing it up as a possible concern.

Ed.

Ed Morton

Oct 5, 2023, 6:23:28 PM
On 10/5/2023 12:40 PM, Richard Kettlewell wrote:
> Ed Morton <morto...@gmail.com> writes:
>> On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
>>> Ed Morton <morto...@gmail.com> writes:
>>>> The GNU people added functionality to grep in a way that is
>>>> contradictory to the Unix philosophy and makes it inconsistent with
>>>> all other text processing tools.
>>> Why does that matter?
>>
>> For the same reasons that cohesion and consistency matter in any
>> system. Google "software" with those 2 terms to find more information
>> on how those concepts apply to software.
>
> Prioritizing consistency over usability, then.

No. Having one doesn't mean you can't have the other. Assuming that some
people find it unusable to type "find . -type f -exec grep" for calling
grep when they already do "find . -type f" for finding files and "find .
-type f -exec sed" for calling sed, then had the GNU people solved that
usability problem by providing a tool to let us do:

tool finding-args grep greping-args
tool finding-args sed seding-args
tool finding-args awk awking-args

instead of:

grep finding-args greping-args
find finding-args -exec sed seding-args
find finding-args -exec awk awking-args

we would have usability and consistency.

My point was that if what you're considering doing will introduce
inconsistencies then you should think long and hard about whether or not
it's the right solution to any given problem and there will almost
always be a better possible alternative, as there is in this case.
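
As a sketch of what such a front end could look like, here is a thin wrapper over find. walk_exec is a made-up name, not an existing tool, and the only "finding-args" it takes is the starting directory.

```shell
# walk_exec DIR COMMAND [ARG...]
# Run COMMAND ARG... on batches of the regular files found under DIR.
# (walk_exec is hypothetical; it only wraps find's -exec ... {} + form.)
walk_exec() {
    dir=$1
    shift
    find "$dir" -type f -exec "$@" {} +
}

# The same front end then serves any tool, for example:
#   walk_exec . grep -l needle
#   walk_exec . sed -n '1p'
#   walk_exec . awk 'FNR == 1'
```

The point is only that the tree walk lives in one place and every tool keeps its own options.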

>
>> If Unix were being invented today and the design decision being
>> presented in an attempt to answer the question "how do we call a text
>> processing tool on every file found under a directory?" was:
>>
>> IF (tool == grep) && (provider == GNU) THEN
>> tool -r '...'
>> ELSE
>> find . -type f tool '...' {} +
>> ENDIF
>>
>> instead of just:
>>
>> find . -type f tool '...' {} +
>>
>> the person suggesting it would be ridiculed and sent back to the
>> drawing board.
>
> Are you somehow under the impression that the existence of “grep -r”
> prevents the second answer from working? If not then why on earth would
> you imagine anyone would give the first answer?

The imaginary proposed existence of "GNU grep -r" when first designing
Unix is what I described in that first answer and no, you cannot give
the second answer when there are exceptions to the rule. I don't know
how to restate what I'm saying in a way you might better understand, sorry.

>> The GNU providers have made many worthwhile contributions to many
>> tools, people are not adopting GNU tools because they added "-r" to
>> grep.
>
> Well it’s demonstrably not an obstacle to many people.
>

Of course not.

Ed.

Ben Bacarisse

Oct 5, 2023, 10:25:45 PM
Ed Morton <morto...@gmail.com> writes:

> On 10/5/2023 2:37 PM, Ben Bacarisse wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>
>>> On 10/3/2023 2:00 AM, Richard Kettlewell wrote:
>>>> Ed Morton <morto...@gmail.com> writes:
>>>>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>>>>> Ed Morton <morto...@gmail.com> writes:
>>>>>>> So I don't see any of that as justifying adding a bunch of options to
>>>>>>> grep to do something other than its primary purpose of g/re/p and
>>>>>>> making it different from all other text processing tools in that
>>>>>>> regard.
>>>>>> Why do you think it needs justifying? It’s their code, they can add
>>>>>> whatever they like to it.
>>>>>
>>>>> The GNU people added functionality to grep in a way that is
>>>>> contradictory to the Unix philosophy and makes it inconsistent with
>>>>> all other text processing tools.
>>>> Why does that matter?
>>>
>>> For the same reasons that cohesion and consistency matter in any
>>> system. Google "software" with those 2 terms to find more information on
>>> how those concepts apply to software.
>> Those are not the only concepts that matter.
>
> Of course, they're just the answer to the question Richard asked.

They are /your/ answer to his question -- these are the concepts that
make it matter to you. Other concepts (like ease of use) don't
necessarily give the same answer which is why I mention them. /My/
answer would be "It doesn't matter much -- ease of use is too important
in this case".

(I hope you are not being hyper literal here. My "answer" is not an
answer in the literal sense. If I took "Why does that matter?"
absolutely literally, I'd have to give your answer as well with maybe
just "but only a teeny, tiny bit" added. But ripostes like "Why does
that matter?" are almost always rhetorical, and not literal requests for
the reasons, no matter how insignificant the respondent might consider
them to be.)

>>> If Unix were being invented today and the design decision being presented
>>> in an attempt to answer the question "how do we call a text processing tool
>>> on every file found under a directory?" was:
>>>
>>> IF (tool == grep) && (provider == GNU) THEN
>>> tool -r '...'
>>> ELSE
>>> find . -type f tool '...' {} +
>>> ENDIF
>>>
>>> instead of just:
>>>
>>> find . -type f tool '...' {} +
>> find . -type f -exec tool '...' '{}' +
>
> Right, thanks. I typed my response once and my newsreader discarded it so I
> had to type it again and rushed it.
>
>>> the person suggesting it would be ridiculed and sent back to the drawing
>>> board.
>> It's not "just" find ... find is a mess. If the file system, the shell
>> and all the rest were such that adding some command or prefix like
>> rec grep pattern .
>> made grep behave pretty much like grep -r then no one would bother to
>> add options like -r to grep.
>
> Agreed, that is the right solution if people are unhappy with `find` and
> apparently there is an existing tool to do just that, see Janis's posts in
> this thread.

tw may be better than find, but that's a low bar. My "rec" was intended
to be hypothetical. I should maybe have said that if it were possible
to have a rec prefix that magically just "did the right thing" then -r
would not be needed.

But your remark raises an interesting question. Are you happy with
find? I use it quite a lot, but I am unhappy with it pretty much every
time!

>> But find is the best we have, and it's a mess. It uses mandatory syntax
>> that needs to be quoted (the {}) and does not always preserve the desired
>> behaviour. For example
>> grep -rq needle haystack
>> can succeed (exit 0) where
>> find haystack -type f -exec grep -q needle '{}' +
>> can fail (exit non-zero). This is because find can run multiple
>> commands and does not always combine the exit statuses correctly. I
>> would be happy to be wrong about this, but I'm sure I tested it some
>> time ago.
>
> Interesting - that's not something I've ever come across but thanks for
> bringing it up as a possible concern.

It's not a possible concern, it's an actual concern! If you want to
know if "pattern" occurs in any file from . down, you /can't/ use

find . -type f -exec grep -q pattern '{}' +

because you will get the wrong answer in some cases.

But consider the general case... Imagine a collection of tools to test
if a file has this or that property. The tools can all have multiple
arguments but what do they do with conflicting answers? For some
properties it might be useful for the tool to report success (0) if
/all/ the arguments have that property and for other properties it might
be desirable to report success if /any/ of the arguments have it. If
find splits an argument list into two and the tool reports 0 on one half
and 1 on the second, what should find report? If the tool is an "all"
tool it should report 1, and if the tool is an "any" tool it should
report 0.
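
To make the any/all distinction concrete, a sketch of both as shell functions (made-up names). The "any" version ignores exit statuses and looks at output instead; the "all" version leans on the POSIX rule that find exits non-zero when any "+" invocation does.

```shell
# any_file_matches PATTERN DIR
# Succeed if at least one regular file under DIR contains PATTERN.
# The answer comes from whether grep -l printed a name, not from find's
# exit status. (2>/dev/null also hides genuine errors; acceptable in a sketch.)
any_file_matches() {
    find "$2" -type f -exec grep -l -e "$1" {} + 2>/dev/null | grep -q .
}

# all_files_match PATTERN DIR
# Succeed only if every regular file under DIR contains PATTERN.
# One failing batch makes find exit non-zero, which is what "all" wants.
all_files_match() {
    find "$2" -type f -exec sh -c '
        pattern=$1
        shift
        for file do
            grep -q -e "$pattern" "$file" || exit 1
        done
    ' sh "$1" {} +
}
```

So find's rule happens to suit "all" tools directly, while "any" tools have to work around it.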

--
Ben.

Richard Kettlewell

Oct 6, 2023, 3:12:26 AM
Ed Morton <morto...@gmail.com> writes:
> On 10/5/2023 12:40 PM, Richard Kettlewell wrote:
>> Prioritizing consistency over usability, then.
>
> No. Having one doesn't mean you can't have the other. Assuming that
> some people find it unusable to type "find . -type f -exec grep" for
> calling grep when they already do "find . -type f" for finding files
> and "find . -type f -exec sed" for calling sed, then had the GNU
> people solved that usability problem by providing a tool to let us do:
>
> tool finding-args grep greping-args
> tool finding-args sed seding-args
> tool finding-args awk awking-args
>
> instead of:
>
> grep finding-args greping-args
> find finding-args -exec sed seding-args
> find finding-args -exec awk awking-args
>
> we would have usability and consistency.

To me, banning ‘grep -r’ would not introduce enough additional
consistency to justify the reduction in convenience. Not even close.

>>> If Unix were being invented today and the design decision being
>>> presented in an attempt to answer the question "how do we call a text
>>> processing tool on every file found under a directory?" was:
>>>
>>> IF (tool == grep) && (provider == GNU) THEN
>>> tool -r '...'
>>> ELSE
>>> find . -type f tool '...' {} +
>>> ENDIF
>>>
>>> instead of just:
>>>
>>> find . -type f tool '...' {} +
>>>
>>> the person suggesting it would be ridiculed and sent back to the
>>> drawing board.
>> Are you somehow under the impression that the existence of “grep -r”
>> prevents the second answer from working? If not then why on earth would
>> you imagine anyone would give the first answer?
>
> The imaginary proposed existence of "GNU grep -r" when first designing
> Unix is what I described in that first answer and no, you cannot give
> the second answer when there are exceptions to the rule. I don't know
> how to restate what I'm saying in a way you might better understand,
> sorry.

What prevents you from giving the second answer? We know from the
real-life situation that exists today that it’ll work if you use it.

--
https://www.greenend.org.uk/rjk/

Janis Papanagnou

Oct 6, 2023, 5:40:27 AM
On 06.10.2023 04:25, Ben Bacarisse wrote:
>
> But your remark raises an interesting question. Are you happy with
> find? I use it quite a lot, but I am unhappy with it pretty much every
> time!

In time you get used to find's conventions. But GNU 'find' can even
be dangerous(!), given find's generally crude syntax and logic;
I faintly recall a thread where we discussed the -delete option...

Janis

hymie!

Oct 6, 2023, 7:44:34 AM
In our last episode, the evil Dr. Lacto had captured our hero,
Kenny McCormack <gaz...@shell.xmission.com>, who said:
> I think OP is somewhat cryptically arguing that:
>
> a) People can't use the new fancy GNU stuff in portable scripts, except
> if they...
> b) In order to use them in a portable script, they'd have to code as
> shown above - that is, only use the new fancy GNU stuff if they are
> running in an environment known to have that stuff. But ...
> c) Nobody is going to do that, because it is just silly, so you end up
> just coding it the old-fashioned way, and thus ...
> d) The new stuff never gets used, so why should anyone bother designing
> and implementing it?

This is starting to remind me of the csh vs bash argument, where csh is
viewed as "better than bash for interactive use, but worse than bash for
shell scripting." Similarly, "assuming the added features of GNU grep"
would be useful for interactive use (where I know GNU grep is available)
while shell scripting would expect the standard grep.

--hymie! http://nasalinux.net/~hymie hy...@nasalinux.net
===============================================================================
Another quality sig from hymie! Collect the whole set - trade with your friends
===============================================================================

Geoff Clare

Oct 6, 2023, 9:11:10 AM
Ben Bacarisse wrote:

> But find is the best we have, and it's a mess. It uses mandatory syntax
> that needs to be quoted (the {})

The {} does not need to be quoted. That's because { is a reserved
word in the shell, not an operator like (.
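
A quick way to check this in a given shell, as a sketch. It has only been reasoned through for POSIX-style shells such as sh, dash and bash; other shells may treat braces differently.

```shell
# In argument position a bare {} is passed through as an ordinary word:
# it is not the reserved word "{", and there is nothing in it to expand.
unquoted=$(printf '%s' {})
quoted=$(printf '%s' '{}')

printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"
```

If both values come out the same, quoting {} on the find command line is harmless but unnecessary in that shell.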

> and does not always preserve the desired
> behaviour. For example
>
> grep -rq needle haystack
>
> can succeed (exit 0) where
>
> find haystack -type f -exec grep -q needle '{}' +
>
> can fail (exit non-zero). This is because find can run multiple
> commands and does not always combine the exit statuses correctly.

It combines them in the way specified in POSIX, which says "If any
invocation returns a non-zero value as exit status, the find utility
shall return a non-zero exit status." If I recall correctly, the
POSIX spec for this was based on SVR4. Presumably the SVR4 developers
did it that way because that's what you want for the usual case where
non-zero means an error occurred. I.e. if find exits non-zero you
know that either find itself or at least one of the utility
invocations encountered an error.

However, it does mean find's exit status is misleading if the invoked
utility can exit with non-zero in some non-error situations (such as
grep's exit of 1 indicating no match). It shouldn't be too hard to
come up with a tweak that gives the right info. Maybe something like
(untested):

test -n "$(find haystack -type f -exec \
sh -c 'grep -q needle "$@" && echo found' sh {} +)"

Obviously it's only worth going to that much trouble in a script.
For casual interactive use, I'd be inclined to use:

find haystack -type f -exec grep -l needle '{}' + | head -n 1

and just observe whether it writes a pathname or not. (Or even just
omit the "head -n 1" and hit Ctrl-C if it starts writing pathnames.)

--
Geoff Clare <net...@gclare.org.uk>

Ben Bacarisse

unread,
Oct 6, 2023, 9:57:17 AM10/6/23
to
Geoff Clare <ge...@clare.See-My-Signature.invalid> writes:

> Ben Bacarisse wrote:
>
>> But find is the best we have, and it's a mess. It uses mandatory syntax
>> that needs to be quoted (the {})
>
> The {} does not need to be quoted. That's because { is a reserved
> word in the shell, not an operator like (.

Is this true of all shells? I was going by the find man page. Maybe
it needs an update?

For casual interactive use I'd be inclined to use grep -r :-)

An alternative would be to give the user a way to tell find how to
combine the exit statuses: -minstatus and -maxstatus? A couple more
options can't hurt.

> and just observe whether it writes a pathname or not. (Or even just
> omit the "head -n 1" and hit Ctrl-C if it starts writing pathnames.)

The bigger picture here is that find is not a simple way to make
commands handle nested file structures. It mostly works, but many cases
need care. find imposes a cognitive load that undoes the supposed gain
from having everything consistent. For day-to-day use, it can be easier
to remember that -r exists (if you are lucky enough to have it) than it
is to do the mental check "will find get this one right?".

--
Ben.

Andy Walker

unread,
Oct 6, 2023, 10:16:33 AM10/6/23
to
On 06/10/2023 08:12, Richard Kettlewell wrote:
> To me, banning ‘grep -r’ would not introduce enough additional
> consistency to justify the reduction in convenience. Not even close.

Well, that's the trouble with feeping creaturism. Once the
creature has fept, it's impossible to reconsider it; there are always
some people using it who mustn't be upset. The time to decide is
/before/ adding it. My point would be that the addition is not cost-
free to those who don't use it. Grep is a case in point:

$ man grep | wc
645 4249 34365

For comparison, the 7th Edition manual entry is 61 lines. Is Gnu grep
more than 10x as useful as V7 grep? How many users read and understand
the whole entry? Or try

$ man cc | wc
22363 125223 1063468

That's a whole /book/ size! [V7 entry is less than two pages.] Worse,
it's unreadable by normal people. The same applies to command after
command. It's not just the size of the manual entries that are a bar
to understanding, it's their number. There are 3447 commands in my
$PATH, the great majority of which I've never heard of, never used,
and didn't do anything [conscious] to install; they just got dragged
in by other packages and commands.

The result is that no-one, but no-one, actually understands the
whole, or even half, of Linux. For V7, I have the entire manual plus all
the supporting documentation, plus about half as much again in local and
imported add-ons, in two A4 document folders totalling about 20cm thick;
and the entire source code [inc the operating system and the same add-ons]
in a modest pile of print-out on a shelf in my garage. Lots of people
understood the /whole/ of Unix-on-the-PDP11 [and later the VAX, Sun, SGI
Indy, ...], inc the PDP11 Processor Manual [about the size of a modest
paperback novel]. Today, it's a black box which changes faster than you
can keep up and only a tiny corner of which is within human grasp.

Well, I suppose that's progress, of a sort. But I'm not sure it's
been in the right direction. End of rant.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Forbes

Adam Funk

unread,
Oct 6, 2023, 11:15:12 AM10/6/23
to
On 2023-10-03, Richard Kettlewell wrote:

> Ed Morton <morto...@gmail.com> writes:
>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>> Ed Morton <morto...@gmail.com> writes:
>>>> So I don't see any of that as justifying adding a bunch of options to
>>>> grep to do something other than it's primary purpose of g/re/p and
>>>> making it different from all other text processing tools in that
>>>> regard.
>>> Why do you think it needs justifying? It’s their code, they can add
>>> whatever they like to it.
>>
>> The GNU people added functionality to grep in a way that is
>> contradictory to the Unix philosophy and makes it inconsistent with
>> all other text processing tools.
>
> Why does that matter?
>
>> So far the only suggestion I've heard for why they did that is so
>> people could do:
>>
>> grep -r 'regexp'
>>
>> instead of:
>>
>> find . -type f -exec grep -H 'regexp' {} +
>>
>> which saves us about 20 simple, common characters over find+grep but
>> does nothing for find+awk, find+sed, etc.
>
> Sounds good to me. Less typing = better. Given the uptake of the GNU
> tools I suspect my opinion is widely shared.

I have to admit that I have been using grep -lr foo/ for so long that
it would never occur to me to try to wrangle find's arguments into
place for the purpose. The philosophical argument against grep -r is
understandable but for my own practical purposes I am grateful for it.


--
...the reason why so many professional artists drink a lot is not
necessarily very much to do with the artistic temperament, etc. It is
simply that they can afford to, because they can normally take a large
part of a day off to deal with the ravages. ---Amis _On Drink_

Janis Papanagnou

unread,
Oct 6, 2023, 11:40:23 AM10/6/23
to
On 06.10.2023 16:16, Andy Walker wrote:
>
> Well, that's the trouble with feeping creaturism. Once the
> creature has fept, it's impossible to reconsider it; there are always
> some people using it who mustn't be upset. The time to decide is
> /before/ adding it. [...]

Yes.

> [ disproportionally growing sizes of man-pages (since V7) ]

In our (post-V7, post-SysV) Linux era we now find also man pages
of smaller size - carrying only a hint to an info-page hierarchy.

Yes, it can get even worse.

Janis

Keith Thompson

unread,
Oct 6, 2023, 1:28:51 PM10/6/23
to
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> Geoff Clare <ge...@clare.See-My-Signature.invalid> writes:
>> Ben Bacarisse wrote:
>>> But find is the best we have, and it's a mess. It uses mandatory syntax
>>> that needs to be quoted (the {})
>>
>> The {} does not need to be quoted. That's because { is a reserved
>> word in the shell, not an operator like (.
>
> Is this true of all shells? I was going by the find man page. Maybe
> it needs an update?

`echo {}` prints "{}" in every shell I've tried (bash, csh, tcsh, ksh,
zsh, fish, dash, busybox sh).

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

David W. Hodgins

unread,
Oct 6, 2023, 3:54:43 PM10/6/23
to
On Fri, 06 Oct 2023 13:28:43 -0400, Keith Thompson <Keith.S.T...@gmail.com> wrote:
> `echo {}` prints "{}" in every shell I've tried (bash, csh, tcsh, ksh,
> zsh, fish, dash, busybox sh).

From "man bash" ...
{ and } are reserved words and must occur where a reserved word is permitted to be recognized.

So if they occur somewhere where a reserved word is not permitted, then they are
just normal characters.
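
A minimal sketch of that rule in a POSIX-style shell (the variable names are only for illustration): the same brace characters are literal in argument position and special in command position.

```shell
# Argument position: a reserved word cannot be recognized here,
# so {} is passed to echo as an ordinary word.
plain=$(echo {})

# Command position: { is recognized as a reserved word and opens a
# command group, which must be closed with "; }".
grouped=$( { echo grouped; } )

printf '%s\n' "$plain" "$grouped"
```

This is why find's {} normally survives unquoted: it only ever appears as an argument to find.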

Regards, Dave Hodgins

Ben Bacarisse

unread,
Oct 6, 2023, 4:28:26 PM10/6/23
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>> Geoff Clare <ge...@clare.See-My-Signature.invalid> writes:
>>> Ben Bacarisse wrote:
>>>> But find is the best we have, and it's a mess. It uses mandatory syntax
>>>> that needs to be quoted (the {})
>>>
>>> The {} does not need to be quoted. That's because { is a reserved
>>> word in the shell, not an operator like (.
>>
>> Is this true of all shells? I was going by the find man page. Maybe
>> it needs an update?
>
> `echo {}` prints "{}" in every shell I've tried (bash, csh, tcsh, ksh,
> zsh, fish, dash, busybox sh).

The find man page repeats that {} "might" have to be escaped or quoted
several times. The EXAMPLES section quotes it, and then some text goes
on to explain why!

I accept that it's wrong to say that.

So I'll change my remark back to the one I had intended: find uses
syntax that needs to be quoted (the ;). Annoyingly, I'd flipped from ;
to {} because I didn't think that ; was used in the quoted text!
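
A minimal sketch of the ; case, assuming a throwaway directory created with mktemp (the file name is made up for the demonstration). The semicolon is a shell operator, so it has to be escaped or quoted to reach find as an argument; either spelling works.

```shell
# Hypothetical fixture: a directory holding a single empty file.
dir=$(mktemp -d)
: > "$dir/only.txt"

# Backslash-escaped and single-quoted forms both hand find a literal ";"
# that terminates the -exec primary.
escaped=$(find "$dir" -type f -exec basename {} \;)
quoted=$(find "$dir" -type f -exec basename {} ';')

printf '%s\n' "$escaped" "$quoted"
```

The + terminator, by contrast, is not special to the shell and needs no quoting.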

--
Ben.

Keith Thompson

unread,
Oct 6, 2023, 5:34:09 PM10/6/23
to
The find(1) man page (for GNU findutils 4.8.0 and in the latest version
in the git repo) says in one place (under "-exec command ;") that {} and ;
*might* need to be escaped, and in another (under "-exec command {} +")
that {} *needs* to be escaped.

Kaz Kylheku

unread,
Oct 6, 2023, 7:35:14 PM10/6/23
to
Same as:

echo while

etc. while is a reserved word but not everywhere.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Ed Morton

unread,
Oct 7, 2023, 6:03:16 AM10/7/23
to
I said "The GNU people added functionality to grep in a way that is
contradictory to the Unix philosophy and makes it inconsistent with all
other text processing tools." and in response Richard asked "Why does
that matter?"

I don't see how "ease of use" is an answer to Richard's question.

/My/
> answer would be "It doesn't matter much -- ease of use is too important
> in this case".

I'm claiming that X matters and being asked why, so responding that it
doesn't matter wouldn't make sense or be useful.
It's only an actual concern if it actually happens and you're saying
above that you're not sure if it happens or not, it's just something you
think you saw some time in the past ("I would be happy to be wrong about
this, but I'm sure I tested it some time ago."), hence my saying it's a
possible concern.

If you want to
> know if "pattern" occurs in any file from . down, you /can't/ use
>
> find . -type f -exec grep -q pattern '{}' +
>
> because you will get the wrong answer in some cases.

OK, then don't do that, do it one of the various other simple ways it
can be done, e.g.:

find . -type f -exec grep -m1 pattern '{}' +

and test for any output, but I'm not arguing that `find` is the best
possible answer to the issues, I'm saying that adding `-r` to grep is
the wrong answer for the reasons I've already stated.

>
> But consider the general case... Imagine a collection of tools to test
> if a file has this or that property. The tools can all have multiple
> arguments but what do they do with conflicting answers? For some
> properties it might be useful for the tool to report success (0) if
> /all/ the arguments have that property and for other properties it might
> be desirable to report success if /any/ of the arguments have it. If
> find splits an argument list into two and the tool reports 0 on one half
> and 1 on the second, what should find report? If the tool is an "all"
> tool it should report 1, and if the tool is an "any" tool it should
> report 0.

How does adding "-r" to grep solve that problem for awk and sed?

Ed.

Ed Morton

unread,
Oct 7, 2023, 6:18:49 AM10/7/23
to
On 10/5/2023 9:25 PM, Ben Bacarisse wrote:
<snip>

I forgot to answer your question below in my preceding post, sorry:

> But your remark raises an interesting question. Are you happy with
> find? I use it quite a lot, but I am unhappy with it pretty much every
> time!

I find its arguments clunky and arcane and, after 40+ years of using
it, any time I want to do more than `find . -type f \( -name X -o -name Y
\) -exec foo {} +` I still have to look up the man page BUT I rarely
have to do more than that with it so that doesn't overly bother me.

That doesn't mean that giving grep (and only grep among all the text
processing tools) the ability to find files was the best possible
solution to that problem, if it needs to be solved.

Ed.

Ed Morton

unread,
Oct 7, 2023, 6:37:31 AM10/7/23
to
On 10/3/2023 2:26 AM, Kaz Kylheku wrote:
> On 2023-10-02, Ed Morton <morto...@gmail.com> wrote:
>> On 10/2/2023 2:09 AM, Richard Kettlewell wrote:
>>> Ed Morton <morto...@gmail.com> writes:
>>>> So I don't see any of that as justifying adding a bunch of options to
>>>> grep to do something other than it's primary purpose of g/re/p and
>>>> making it different from all other text processing tools in that
>>>> regard.
>>>
>>> Why do you think it needs justifying? It’s their code, they can add
>>> whatever they like to it.
>>>
>>
>> The GNU people added functionality to grep in a way that is
>> contradictory to the Unix philosophy and makes it inconsistent with all
>> other text processing tools.
>
> - GNU stands for GNU is Not Unix.
>
> - The GNU coding standards document says:
>
> The GNU Project regards standards published by other organizations
> as suggestions, not orders. We consider those standards, but we do
> not “obey” them. In developing a GNU program, you should implement
> an outside standard’s specifications when that makes the GNU system
> better overall in an objective sense. When it doesn’t, you shouldn’t.

To be clear, I'm not talking about "grep -r" violating any standards,
I'm talking about it violating the Unix philosophy and software
engineering fundamentals of cohesion and consistency.

>
> - Unix programs have recursion built in: from memory I can think of
> ls, rm, cp, mv, chgrp, chmod and chown.

Those all operate on file attributes, not on the contents of files as
text processing tools like grep do. Tools like `rm` or `chmod` being
able to find and perform actions on files is equivalent to text
processing tools like `grep` or `awk` being able to find and perform
actions on text inside files.

>
> - BSD Unixes have a -R option on grep; some, like FreeBSD, have an -r
> option as well as a rgrep utility.

Yes, GNUisms are apparently creeping into BSD tools, usually for the
better though.

Ed.

Kaz Kylheku

unread,
Oct 7, 2023, 11:09:32 AM10/7/23
to
On 2023-10-07, Ed Morton <morto...@gmail.com> wrote:
> To be clear, I'm not talking about "grep -r" violating any standards,
> I'm talking about it violating the Unix philosophy and software
> engineering fundamentals of cohesion and consistency.

Unix doesn't withstand a moment's scrutiny in the light of any of
these lofty words.

Ben Bacarisse

unread,
Oct 7, 2023, 6:18:02 PM10/7/23
to
Yes, I know. I thought that reading was "hyper literal" as explained
here:

>> (I hope you are not being hyper literal here. My "answer" is not an
>> answer in the literal sense. If I took "Why does that matter?"
>> absolutely literally, I'd have to give your answer as well with maybe
>> just "but only a teeny, tiny bit" added. But ripostes like "Why does
>> that matter?" are almost always rhetorical, and not literal requests for
>> the reasons, no matter how insignificant the respondent might consider
>> them to be.)

I think Richard's question was largely rhetorical. But maybe I'm
wrong. Maybe he did not know why inconsistency matters, and he's
learned from your reply.

>>>> But find is the best we have, and it's a mess. It uses mandatory syntax
>>>> that needs to be quoted (the {}) and does not always preserve the desired
>>>> behaviour. For example
>>>> grep -rq needle haystack
>>>> can succeed (exit 0) where
>>>> find haystack -type f -exec grep -q needle '{}' +
>>>> can fail (exit non-zero). This is because find can run multiple
>>>> commands and does not always combine the exit statuses correctly. I
>>>> would be happy to be wrong about this, but I'm sure I tested it some
>>>> time ago.
>>>
>>> Interesting - that's not something I've ever come across but thanks for
>>> bringing it up as a possible concern.
>> It's not a possible concern, it's an actual concern!
>
> It's only an actual concern if it actually happens and you're saying above
> that you're not sure if it happens or not, it's just something you think
> you saw some time in the past ("I would be happy to be wrong about this,
> but I'm sure I tested it some time ago."), hence my saying it's a possible
> concern.

Sure, but I went on to explain that it can't get it right in all cases
even if it happened to get this case right. "It" (relying on find to
combine exit statuses) really is an actual concern.

>> If you want to
>> know if "pattern" occurs in any file from . down, you /can't/ use
>> find . -type f -exec grep -q pattern '{}' +
>> because you will get the wrong answer in some cases.
>
> OK, then don't do that, do it one of the various other simple ways it can
> be done, e.g.:
>
> find . -type f -exec grep -m1 pattern '{}' +
>
> and test for any output, but I'm not arguing that `find` is the best
> possible answer to the issues, I'm saying that adding `-r` to grep is the
> wrong answer for the reasons I've already stated.

Yes, we all know how to do it other ways (and I like grep -r as the way
to do it). My point was that you exaggerated the simplicity of just
using find as an alternative. The consistent solution (no tool to
provide a -r option because that's a job for find) imposes a significant
cognitive load: remember to change the command when you want to rely on the
(possibly) combined exit status.

The first time I encountered this problem was in a script that just
stopped working one day. The set of files had got large enough that
find had split them into two executions of grep. Now you can argue that
I should have been more careful. It's obvious (once you think about it)
that find can't get the right answer in all cases, but it's still part
of what makes find complicated.
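
The failure described above can be reproduced by hand without needing thousands of files. This is only a sketch: the two batch directories are hypothetical stand-ins for the two argument lists find would build, and the combining logic follows the POSIX rule quoted earlier in the thread.

```shell
# Hypothetical fixture: pretend find split the file list into two batches.
dir=$(mktemp -d)
mkdir "$dir/batch1" "$dir/batch2"
printf 'needle\n' > "$dir/batch1/a.txt"
printf 'hay\n' > "$dir/batch2/b.txt"

# One grep per batch, as find -exec ... + would run them.
s1=0; grep -q needle "$dir"/batch1/* || s1=$?   # match: status 0
s2=0; grep -q needle "$dir"/batch2/* || s2=$?   # no match: status 1

# find's rule: non-zero if ANY invocation returned non-zero.
if [ "$s1" -ne 0 ] || [ "$s2" -ne 0 ]; then find_style=1; else find_style=0; fi

# What "does the pattern occur anywhere?" needs: zero if ANY invocation matched.
if [ "$s1" -eq 0 ] || [ "$s2" -eq 0 ]; then any_style=0; else any_style=1; fi

echo "find-style: $find_style, any-style: $any_style"
```

With a single batch the two rules agree, which is why the script worked until the file list grew.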

>> But consider the general case... Imagine a collection of tools to test
>> if a file has this or that property. The tools can all have multiple
>> arguments but what do they do with conflicting answers? For some
>> properties it might be useful for the tool to report success (0) if
>> /all/ the arguments have that property and for other properties it might
>> be desirable to report success if /any/ of the arguments have it. If
>> find splits an argument list into two and the tool reports 0 on one half
>> and 1 on the second, what should find report? If the tool is an "all"
>> tool it should report 1, and if the tool is an "any" tool it should
>> report 0.
>
> How does adding "-r" to grep solve that problem for awk and sed?

Eh?

--
Ben.

Ed Morton

unread,
Oct 8, 2023, 12:46:03 AM10/8/23
to
On 10/7/2023 5:17 PM, Ben Bacarisse wrote:
> Ed Morton <morto...@gmail.com> writes:
>
>> On 10/5/2023 9:25 PM, Ben Bacarisse wrote:
<snip>
>>> But consider the general case... Imagine a collection of tools to test
>>> if a file has this or that property. The tools can all have multiple
>>> arguments but what do they do with conflicting answers? For some
>>> properties it might be useful for the tool to report success (0) if
>>> /all/ the arguments have that property and for other properties it might
>>> be desirable to report success if /any/ of the arguments have it. If
>>> find splits an argument list into two and the tool reports 0 on one half
>>> and 1 on the second, what should find report? If the tool is an "all"
>>> tool it should report 1, and if the tool is an "any" tool it should
>>> report 0.
>>
>> How does adding "-r" to grep solve that problem for awk and sed?
>
> Eh?
>

The problem you describe above isn't unique to grep, it also exists for
sed, awk and other tools. Giving GNU grep a "-r" option to solve the
problem for grep does nothing to solve the problem for any other tool,
but there are alternative solutions that could be implemented to solve
it for all similar tools as discussed elsethread.

Ed.

Ben Bacarisse

unread,
Oct 8, 2023, 9:43:11 AM10/8/23
to
Ed Morton <morto...@gmail.com> writes:

> On 10/7/2023 5:17 PM, Ben Bacarisse wrote:
>> Ed Morton <morto...@gmail.com> writes:
>>
>>> On 10/5/2023 9:25 PM, Ben Bacarisse wrote:
> <snip>
>>>> But consider the general case... Imagine a collection of tools to test
>>>> if a file has this or that property. The tools can all have multiple
>>>> arguments but what do they do with conflicting answers? For some
>>>> properties it might be useful for the tool to report success (0) if
>>>> /all/ the arguments have that property and for other properties it might
>>>> be desirable to report success if /any/ of the arguments have it. If
>>>> find splits an argument list into two and the tool reports 0 on one half
>>>> and 1 on the second, what should find report? If the tool is an "all"
>>>> tool it should report 1, and if the tool is an "any" tool it should
>>>> report 0.
>>>
>>> How does adding "-r" to grep solve that problem for awk and sed?
>> Eh?
>>
>
> The problem you describe above isn't unique to grep, it also exists for
> sed, awk and other tools.

Yes. The problem is even described in a tool-agnostic way -- no mention
of grep at all!

> Giving GNU grep a "-r" option to solve the
> problem for grep does nothing to solve the problem for any other tool, but
> there are alternative solutions that could be implemented to solve it for
> all similar tools as discussed elsethread.

Two points. I don't think there are alternative solutions, but if so,
can you point me to them as I've lost track? I want to find one (since
I've been bitten by this problem) so I would like to check them out.

Secondly, I am pretty sure that any near solution to this problem will
come with complications. Your original complaint was based on adding -r
to grep (a very common use case) when find is the consistent option.
But find's behaviour is /not/ consistent with what grep does as I have
detailed above. I would go so far as to argue that a solution that is
apparently coherent and consistent, right up until it breaks because of
something arbitrary, like when find breaks its argument list, is
borderline dangerous. At the very least, it requires more careful
thought than remembering which tools have -r.

A good, consistent solution would be to allow an argument list to be
supplied by a generator. The program could get to work as soon as it
had a file to process, but there would be no limit on the number of
files. But this is not possible with the Unix model of what constitutes
a command's arguments.
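
The nearest approximation available today is probably a pipe rather than an argument list. The following is a rough sketch under stated assumptions: first_match is a made-up name, pathnames are assumed not to contain newlines, and one grep process is started per file, which is the price of starting work as soon as the first pathname arrives.

```shell
# Hypothetical fixture: two files, only one of which contains the pattern.
dir=$(mktemp -d)
printf 'hay\n' > "$dir/a.txt"
printf 'needle\n' > "$dir/b.txt"

# first_match PATTERN DIR (hypothetical helper name)
# find acts as the generator; the loop consumes pathnames one at a time,
# with no limit on how many there are, and stops at the first match.
first_match() {
  find "$2" -type f | while IFS= read -r path; do
    if grep -q -- "$1" "$path"; then
      printf '%s\n' "$path"
      break
    fi
  done
}

first_match needle "$dir"
```

The caller tests for output rather than an exit status, so there is no combined status to get wrong.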

--
Ben.

Ed Morton

unread,
Oct 9, 2023, 9:18:54 AM10/9/23
to
A tool such as the tree-walk tool "tw"
(https://github.com/att/ast/tree/master/src/cmd/tw), which apparently
can find files and call a tool on the files it finds, as suggested by me
and others in this thread. I'm not saying that THAT tool is the
solution, though (I'd never heard of it till it was mentioned in this
thread and I don't know any more about it); I'm saying a solution for all
tools could have been created instead of modifying grep as was done.

>
> Secondly, I am pretty sure that any near solution to this problem will
> come with complications. Your original complaint was based on adding -r
> to grep (a very common use case) when find is the consistent option.

My complaint is that adding -r to grep makes it inconsistent with and
doesn't solve the issues for any of the other similar tools.

> But find's behaviour is /not/ consistent with what grep does as I have
> detailed above. I would go so far as to argue that a solution that is
> apparently coherent and consistent, right up until it breaks because of
> something arbitrary, like when find breaks its argument list, is
> borderline dangerous. At the very least, it requires more careful
> thought than remembering which tools have -r.

I'm not saying that "find" solves all of the issues, I'm saying that
adding "-r" to grep was the wrong thing to do as it doesn't solve all
the issues and introduces other issues.

Ed.

Janis Papanagnou

unread,
Oct 9, 2023, 10:50:32 AM10/9/23
to
On 09.10.2023 15:18, Ed Morton wrote:
> On 10/8/2023 8:43 AM, Ben Bacarisse wrote:
>>
>> Secondly, I am pretty sure that any near solution to this problem will
>> come with complications. Your original complaint was based on adding -r
>> to grep (a very common use case) when find is the consistent option.
>
> My complaint is that adding -r to grep makes it inconsistent with and
> doesn't solve the issues for any of the other similar tools.

I think there's even a bigger issue. It's usually not sufficient to add
only a -r option; sooner rather than later you want to control subsets of
the directory tree, thus have to add yet more options - to *every* tool
that is to be enhanced by a [built-in] recursive tree-walk function.
And every tool will choose its own option subset (and usually also its
own syntax) to implement that.

This observation (incl. inconsistencies between tools' implementations)
makes it (design-wise) straightforward to favour a separation (as already
posted) like tw -tw-options... cmd -cmd-options... cmd-args...

Looking into the original post's GNU grep's options example is quite
enlightening, I'd say.

Janis

Ed Morton

unread,
Oct 9, 2023, 2:15:14 PM10/9/23
to
On 10/9/2023 9:50 AM, Janis Papanagnou wrote:
> On 09.10.2023 15:18, Ed Morton wrote:
>> On 10/8/2023 8:43 AM, Ben Bacarisse wrote:
>>>
>>> Secondly, I am pretty sure that any near solution to this problem will
>>> come with complications. Your original complaint was based on adding -r
>>> to grep (a very common use case) when find is the consistent option.
>>
>> My complaint is that adding -r to grep makes it inconsistent with and
>> doesn't solve the issues for any of the other similar tools.
>
> I think there's even a bigger issue. It's usually not sufficient to add
> only a -r option,

Right, I'm just using the phrase "adding -r" as an abbreviation for
"adding a bunch of options to find files".

> sooner rather than later you want to control subsets of
> the directory tree, thus have to add yet more options - to *every* tool
> that is to be enhanced by a [built-in] recursive tree-walk function.
> And every tool will choose its own option subset (and usually also its
> own syntax) to implement that.
>
> This observation (incl. inconsistencies between tools' implementations)
> makes it (design-wise) straightforward to favour a separation (as already
> posted) like tw -tw-options... cmd -cmd-options... cmd-args...
>
> Looking into the original post's GNU grep's options example is quite
> enlightening, I'd say.

Right again. Adding those options to 1 of the text processing tools was
absurd and adding them to every similar tool would also be absurd -
there is no good solution down that path.

Ed.
>
> Janis
>

Ben Bacarisse

unread,
Oct 9, 2023, 3:22:28 PM10/9/23
to
For the record, it does not even address the problem I outlined above,
much less solve it. The man page does not even say what its return
status is, so it can't be relied on to do anything predictable.

> though, (I'd never heard of it till it was mentioned in this thread and I
> don't know any more about it) I'm saying a solution for all tools could
> have been created instead of modifying grep as was done.

Right. But until there is such a tool, it's hard to be definite about
the merits or demerits of grep -r. With a simple, easy-to-use "rec"
tool, I would ditch using -r because a general solution is obviously
better. But, given the limitations of the Unix process model, it would
have to have flags to allow the user to choose how exit statuses are
combined. It's never going to be totally trivial so there will always
be a slight pressure to implement tool-specific solutions.
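
To make the idea concrete, here is a rough sketch of what such a tool might look like as a POSIX sh function. Everything about it is hypothetical: no "rec" command exists, the any/all mode argument is invented for this illustration, and the sketch runs the command once per file and discards its output, so it only suits predicate-style commands such as grep -q.

```shell
# rec MODE DIR CMD [ARGS...]   (hypothetical interface, not a real tool)
#   MODE "any": succeed if CMD succeeded on at least one file under DIR
#   MODE "all": succeed if CMD succeeded on every file under DIR
rec() {
  rec_mode=$1 rec_dir=$2
  shift 2
  # Run CMD once per regular file and collect one exit status per line.
  rec_statuses=$(find "$rec_dir" -type f -exec \
    sh -c '"$@" >/dev/null 2>&1; echo "$?"' sh "$@" {} \;)
  case $rec_mode in
    any) printf '%s\n' "$rec_statuses" | grep -qx 0 ;;
    all) [ -n "$rec_statuses" ] &&
         ! printf '%s\n' "$rec_statuses" | grep -qvx 0 ;;
    *)   echo "rec: mode must be 'any' or 'all'" >&2; return 2 ;;
  esac
}

# Hypothetical fixture and usage.
dir=$(mktemp -d)
printf 'needle in hay\n' > "$dir/a.txt"
printf 'hay\n' > "$dir/b.txt"
rec any "$dir" grep -q needle && echo "needle: in at least one file"
rec all "$dir" grep -q hay && echo "hay: in every file"
```

Even this toy version needs a mode argument, which illustrates the point that the combining rule has to come from the user.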

>> Secondly, I am pretty sure that any near solution to this problem will
>> come with complications. Your original complaint was based on adding -r
>> to grep (a very common use case) when find is the consistent option.
>
> My complaint is that adding -r to grep makes it inconsistent with and
> doesn't solve the issues for any of the other similar tools.

Yes, that's well understood and I am not disputing it.

Adding -R to ls introduced the same inconsistency a long time ago and
it's not alone. chown, chgrp, diff, rm and cp come to mind. Does the
fact that some of these need to operate in a specific order give them a
pass in your opinion? If so, why does grep -r not get the same pass
given that find -exec grep can't work for all of its use cases either?

And if not (i.e. if you think none of these should have -r/-R options
either) why is your complaint not about find? Surely the problem would
then be that find is not up to the job, thus forcing the issue for some
tools.

>> But find's behaviour is /not/ consistent with what grep does as I have
>> detailed above. I would go so far as to argue that a solution that is
>> apparently coherent and consistent, right up until it breaks because of
>> something arbitrary, like when find breaks its argument list, is
>> borderline dangerous. At the very least, it requires more careful
>> thought than remembering which tools have -r.
>
> I'm not saying that "find" solves all of the issues, I'm saying that adding
> "-r" to grep was the wrong thing to do as it doesn't solve all the issues
> and introduces other issues.

Yes, I get that that is your opinion. My opinion is that since find is
not up to the job, it's hard to say that grep -r was the wrong thing to
do. Do you think that ls -R, rm -r and chmod -R were the wrong thing to
do? What about cp -r? The syntax of find would have to be made even
more complex to make that one work.

I agree that a better solution, in some cases, would be a better find,
but given the limitations it has to work under, even then one would have
to weigh up the pros and cons of tool-specific recursive options
because getting it right for grep, rm, chmod and so on is going to
involve more than just a simple "tree-walk" command.

--
Ben.

Ed Morton

unread,
Oct 10, 2023, 12:14:31 PM10/10/23
to
On 10/9/2023 2:22 PM, Ben Bacarisse wrote:
<snip>
> Do you think that ls -R, rm -r and chmod -R were the wrong thing to
> do? What about cp -r?

As I mentioned elsethread, those are tools that operate on file
attributes, not file contents, so it seems as reasonable to me for them
to be able to find files to operate on as it does for tools that operate
on file contents, such as grep and awk, to be able to find that text
within files to operate on.

If someone decided to add arguments to `rm` or `cp` to change the
contents of files then we could have this same discussion about breaking
with Unix philosophy, loose cohesion, etc. about those tools.

Ed.

Kaz Kylheku

unread,
Oct 10, 2023, 3:01:28 PM10/10/23
to
They do change the contents of files.

One makes them disappear (if the link count drops to zero). The other
can change the content of an existing file B to be the same as that
of existing file A:

cp A B

It is grep that doesn't change file contents, only reporting on them.

Ben Bacarisse

unread,
Oct 11, 2023, 8:19:41 PM10/11/23
to
Ed Morton <morto...@gmail.com> writes:

> On 10/9/2023 2:22 PM, Ben Bacarisse wrote:
> <snip>
>> Do you think that ls -R, rm -r and chmod -R were the wrong thing to
>> do? What about cp -r?
>
> As I mentioned elsethread, those are tools that operate on file attributes,
> not file contents, so it seems as reasonable to me for them to be able to
> find files to operate on as it does for tools that operate on file
> contents, such as grep and awk, to be able to find that text within files
> to operate on.

I was going to suggest that, as we have both made our points well enough
by now, we could call it a day, but you've introduced a whole new
mystery (to me at least) about what kinds of tool can get away with
having -r and which can't.

To my mind, if I were re-designing the way programs get their arguments,
I would want to come up with one solution that worked for as many of the
above as possible.

--
Ben.

Keith Thompson

unread,
Oct 11, 2023, 9:06:19 PM10/11/23
to
I've submitted a bug report. No response so far.

https://lists.gnu.org/archive/html/bug-findutils/2023-10/msg00002.html

Ed Morton

unread,
Oct 13, 2023, 2:48:40 PM10/13/23
to
On 10/11/2023 7:19 PM, Ben Bacarisse wrote:
> Ed Morton <morto...@gmail.com> writes:
>
>> On 10/9/2023 2:22 PM, Ben Bacarisse wrote:
>> <snip>
>>> Do you think that ls -R, rm -r and chmod -R were the wrong thing to
>>> do? What about cp -r?
>>
>> As I mentioned elsethread, those are tools that operate on file attributes,
>> not file contents, so it seems as reasonable to me for them to be able to
>> find files to operate on as it does for tools that operate on file
>> contents, such as grep and awk, to be able to find that text within files
>> to operate on.
>
> I was going to suggest that, as we have both made our points well enough
> by now, we could call it a day,

Me too but you asked me a question so I answered it.

> but you've introduced a whole new
> mystery (to me at least) about what kinds of tool can get away with
> having -r and which can't.
>
> To my mind, if I were re-designing the way programs get their arguments,
> I would want to come up with one solution that worked for as many of the
> above as possible.

Also me too but I'm not too bothered by tools that operate on file
properties (rm, mv, ls, etc.) having different ways to find files just
like I'm not too bothered by tools that operate on text file contents
(grep, sed, awk, etc.) having different ways to find the text within
those files.

Ed.

Stan Moore

Nov 25, 2023, 3:39:58 AM
A couple of thoughts. First, grep has been ported far and wide to systems
that bear little resemblance to Unix/POSIX/Linux. Some fraction of those
systems and users will not have access to find.

Second, some of those same ported systems are not process-friendly, and
creating one process for find and another for grep might not be a reasonable
solution. I suspect these two things may have had some influence on
the current state.
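
To make the process-count point concrete, here is a rough sketch comparing the two approaches. It assumes a grep that supports -r and --include (GNU grep does), and the directory layout and the search string 'needle' are made up for illustration:

```shell
# Build a small throwaway tree; all names here are hypothetical.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'needle\n' > "$dir/a.txt"
printf 'hay\n'    > "$dir/b.txt"
printf 'needle\n' > "$dir/sub/c.txt"
printf 'needle\n' > "$dir/sub/d.log"

# Two tools: find selects the files, grep searches them.
# This needs at least one find process plus one or more grep processes.
with_find=$(find "$dir" -type f -name '*.txt' -exec grep -l 'needle' {} + | sort)

# One tool: grep walks the tree and filters file names itself.
with_grep=$(grep -rl --include='*.txt' 'needle' "$dir" | sort)

[ "$with_find" = "$with_grep" ] && echo "same file list"
```

Both forms should list the same files here; the difference is only in how many programs, and therefore processes, are involved.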

Finally, I find calling a modern-day car a Ford Mustang a travesty, since I'm old
enough to remember 1968. Similarly, I find very little Unix in today's systems,
including the BSDs (genealogy notwithstanding). Linux has never bought into
"the Unix way", and any doubts were eliminated by systemd. GNU has never
made any real attempt to be limited to the existing Unix conventions or standards.
Considering the above, saying GNU grep doesn't exactly act like a true
Unix command is like saying "water is wet".

Personally, I see grep -r as a gray area between inconsistent and probably
helpful to some users. I'm more concerned with the general quality of
the software I deal with every day, and I have bigger concerns about the
future of both software developers and users.

Grep may be the most widely known Unix command outside of hardcore
Unix types, and I don't expect you will get any traction with your
consistency concerns. I would also be surprised if you get any
relief. Don't let it keep you from having a happy holiday!

Stan