
trying to parametrize find's name filter ...


qwert...@syberianoutpost.ru

Sep 21, 2012, 8:41:33 PM
Most shell scripts you find out there are not parametrized. I am trying to
parametrize find only using its own options, so that only one shell process
is started by the OS
~
I think what is causing problems is my attempt to include a combined "-name"
instruction to filter the kinds of files I need (based on the extensions).
After trying different ways around it and searching for a solution I think
I need help.
~
Also I would like to know what exactly is causing this problem, not just a
way out of it
~
thanks
lbrtchx
comp.unix.shell: trying to parametrize find's name filter ...

# ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
#!/bin/bash

# directory separator
_DIR_SEP="/"

_SDIR="/media/sda3/tmp"

# directory separator
_DIR_SEP="/"

# EOL (eval parses command twice in the first pass it eats the first"\")
_EOL="\\n"

# QUOT (eval parses command twice in the first pass it eats the "\" leaving
# the quote)
_QT="\""

# ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

# excluding section (should be done in a loop (and using awk?))
_XKLD="-noleaf -wholename '/proc' -prune -or -wholename '/media/sda1' -prune"

# testing extensions (should be done in a loop)
_Xs=" -name '*.js' "
_Xs=" -name '*.js' -or -name '*.txt' "
_Xs=" -name '*.js' -or -name '*.txt' -or -name '*.gif' "

# size clause (non-empty files)
_SZ=" ! -size 0 "

eval find ${_SDIR} ${_XKLD} ${_Xs} -type f ${_SZ} -printf \
 '%T@,%A@,%C@,${_QT}%M${_QT},%n,${_QT}%l${_QT},${_QT}%u${_QT},${_QT}%g${_QT},%s,%d,${_QT}${_SDIR}${_DIR_SEP}%P${_QT}${_EOL}'

Ben Bacarisse

Sep 21, 2012, 9:27:37 PM
qwert...@syberianoutpost.ru writes:

> Most shell scripts you find out there are not parametrized. I am trying to
> parametrize find only using its own options, so that only one shell process
> is started by the OS
> ~
> I think what is causing problems is my attempt to include a combined "-name"
> instruction to filter the kinds of files I need (based on the extensions).
> After trying different ways around it and searching for a solution I think
> I need help.
> ~
> Also I would like to know what exactly is causing this problem, not just a
> way out of it

I'm stumped. I don't know what it is you are trying to do, and I don't
know what the problem is that you've encountered.
Since nothing here depends on any input or on any parameters, this last
line is just a complex way of writing a single find command. I know you
want to parametrise this in some way, but because there is no attempt
at doing so, I don't know what to suggest.

--
Ben.

qwert...@syberianoutpost.ru

Sep 21, 2012, 10:06:48 PM
Well, if you actually try that short script, you will notice that you won't get
all "*.js", "*.txt" and "*.gif" files in a directory, but just the files matching
the last extension indicated, in that case "*.gif"
~
lbrtchx

Bill Marcum

Sep 22, 2012, 4:12:13 AM
You may need parentheses around the -or expressions so that -type and
-printf apply to all of them.

qwert...@syberianoutpost.ru

Sep 22, 2012, 4:51:04 AM
bash does not complain if you use parentheses, a la:
~
_Xs=" \( -name '*.js' -or -name '*.txt' -or -name '*.gif' \) "
~
but find does not work.
~
And, as I said, if I use the filter as:
~
_Xs=" -name '*.js' -or -name '*.txt' -or -name '*.gif' "
~
it at least finds the files matching the last name, namely, '*.gif'
~
lbrtchx

Janis Papanagnou

Sep 22, 2012, 5:06:58 AM
Before you post again:
1. Post to the original thread that you opened; don't open a new
   thread on the same subject for every new reply that you make.
2. Quote posting context from preceding answers in the same thread
   so that it's clear which part you refer to and so that every
   posting is complete by itself; don't assume that people browse
   the newsgroup for possibly related information that you might
   be referring to.
3. Explain what the actual problem is by providing error messages
   or results, in addition to test data samples (your input, data
   and command, and expected and actual output); don't just say
   "it does not work" because that phrase does not tell anything.
Understanding and following those points will help you get
your problems solved quickly; ignoring them will likely
get you ignored or kill-filed.
If you are uncertain about this basic "Usenet netiquette", google
for that term, or inspect other threads and postings.

Janis

Dave Gibson

Sep 22, 2012, 5:26:01 AM
qwert...@syberianoutpost.ru wrote:
> Most shell scripts you find out there are not parametrized. I am trying to
> parametrize find only using its own options, so that only one shell process
> is started by the OS
> ~
> I think what is causing problems is my attempt to include a combined "-name"
> instruction to filter the kinds of files I need (based on the extensions).
> After trying different ways around it and searching for a solution I think
> I need help.
> ~
> Also I would like to know what exactly is causing this problem, not just a
> way out of it

Operator precedence, mostly.

> ~
> thanks
> lbrtchx
> comp.unix.shell: trying to parametrize find's name filter ...
>
> # ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> #!/bin/bash
>
> # directory separator
> _DIR_SEP="/"
>
> _SDIR="/media/sda3/tmp"
>
> # directory separator
> _DIR_SEP="/"
>
> # EOL (eval parses command twice in the first pass it eats the first"\")
> _EOL="\\n"
>
> # QUOT (eval parses command twice in the first pass it eats the "\" leaving
> the quote)
> _QT="\""
>
> # ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
>
> # excluding section (should be done in a loop (and using awk?))
> _XKLD="-noleaf -wholename '/proc' -prune -or -wholename '/media/sda1' -prune"

The two -wholename tests need grouping, and an -o is missing after -prune.
A trailing -true keeps the expression valid even if nothing follows the -o.

_XKLD="-noleaf \\( -wholename '/proc' -or -wholename '/media/sda1' \\) -prune -o -true"

>
> # testing extensions (should be done in a loop)
> _Xs=" -name '*.js' "
> _Xs=" -name '*.js' -or -name '*.txt' "
> _Xs=" -name '*.js' -or -name '*.txt' -or -name '*.gif' "

Grouping required.

_Xs=" \\( -name '*.js' -or -name '*.txt' -or -name '*.gif' \\) "

To illustrate, (the -a before -print is usually omitted, added
here for clarity), this:

find . -name '*.js' -o -name '*.txt' -o -name '*.gif' -a -print

means this:

find . -name '*.js' -o -name '*.txt' -o \( -name '*.gif' -a -print \)

what you want is this:

find . \( -name '*.js' -o -name '*.txt' -o -name '*.gif' \) -a -print
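
For anyone who wants to see the precedence effect without touching real data, here is a small sketch. It assumes a throwaway directory created with mktemp (the three file names are made up for the test) and only counts what each form prints:

```shell
# Scratch directory with one file per extension (hypothetical names).
tmp=$(mktemp -d)
touch "$tmp/a.js" "$tmp/b.txt" "$tmp/c.gif"

# Ungrouped: the implicit -a ties -print to the last -name only.
ungrouped=$(find "$tmp" -name '*.js' -o -name '*.txt' -o -name '*.gif' -print | wc -l | tr -d ' ')

# Grouped: -print applies to the whole parenthesised -o list.
grouped=$(find "$tmp" \( -name '*.js' -o -name '*.txt' -o -name '*.gif' \) -print | wc -l | tr -d ' ')

echo "ungrouped: $ungrouped file(s), grouped: $grouped file(s)"

rm -rf "$tmp"
```

The ungrouped form reports only the .gif file, which matches the behaviour described earlier in the thread.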

>
> # size clause (non-empty files)
> _SZ=" ! -size 0 "
>
> eval find ${_SDIR} ${_XKLD} ${_Xs} -type f ${_SZ} -printf '%T@,%A@,%C@,${_QT}
> %M${_QT},%n,${_QT}%l${_QT},${_QT}%u${_QT},${_QT}%g${_QT},%s,%d,${_QT}${_SDIR}
> ${_DIR_SEP}%P${_QT}${_EOL}'

As you're using a shell which supports arrays, eval can be avoided.

#! /bin/bash

dirsep=/
sdir=/media/sda3/tmp

xkld=( -noleaf \( -wholename /proc -o -wholename /media/sda1 \) -prune -o -true )

exts=( -type f \( -name '*.js' -o -name '*.txt' -o -name '*.gif' \) )

find "$sdir" "${xkld[@]}" "${exts[@]}" \
  -printf '%T@,%A@,%C@,"%M",%n,"%l","%u","%g",%s,%d,"'"${sdir}${dirsep}"'%P"\n'

qwert...@syberianoutpost.ru

Sep 22, 2012, 5:39:49 PM
~
the thing is that I am trying to specify the directories to exclude and
extensions to search for as input parameters (as files with one-liners), not
hardcoded in the script, but even though the variable echoes exactly what you
coded, find reports the error: "No such file or directory"
~
these are the script and the output:
~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~
# directory separator
_DIR_SEP="/"

# ~ ~ ~ ~ ~ ~ ~ ~ reading files line by line ~ ~ ~ ~ ~ ~ ~ ~
# save the field separator
_IFS00=$IFS
# end of line as field separator for file with one liners
IFS=$'\n'

# search dir
_SDIR="/media/sda3/tmp"

# number of parameters passed to script
_ARGSL=$#

# debug
echo "// __ number of parameters passed to script: "${_ARGSL}

_FLS_XFLTR=" -type f \("

# Loop index
_IxL=0

while read X
do
echo "// __ ${_IxL} ${X}"
if [ ${_IxL} -eq 0 ]; then
_FLS_XFLTR="${_FLS_XFLTR} -name '${X}'"
elif [ ${_IxL} -gt 0 ]; then
_FLS_XFLTR="${_FLS_XFLTR} -or -name '${X}'"
fi
let _IxL=(${_IxL} + 1)
done <$1
_FLS_XFLTR="${_FLS_XFLTR} \) "

echo "// __ \${_FLS_XFLTR}: |"${_FLS_XFLTR}"|"

find ${_SDIR} ${_FLS_XFLTR} -printf \
 '%T@,%A@,%C@,"%M",%n,"%l","%u","%g",%s,%d,'"${_SDIR}${_DIR_SEP}"'%P"\n'

# restore default field separator
IFS=$_IFS00
~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~
// __ number of parameters passed to script: 1
// __ 0 *.js
// __ 1 *.txt
// __ 2 *.gif
// __ ${_FLS_XFLTR}: | -type f \( -name '*.js' -or -name '*.txt' -or -name '*.gif' \) |
find: ` -type f \\( -name \'*.js\' -or -name \'*.txt\' -or -name \'*.gif\' \\) ': No such file or directory
...
~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~
lbrtchx

Ed Morton

Sep 22, 2012, 6:27:58 PM
On 9/22/2012 4:39 PM, qwert...@syberianoutpost.ru wrote:
> ~
> the thing is
<snip>
...that you keep starting new threads with every post. As you've been told
before - stop doing that. Also, as you've been told before too, quote enough
context from previous post(s) that your current post makes sense stand-alone.

You're coming across as very rude and/or very stupid.

Ed.

Kenny McCormack

Sep 22, 2012, 8:33:57 PM
In article <k3le1h$cp0$1...@dont-email.me>,
I'm assuming you are castigating the castigator, not the castigate-e.
That is good. (If, instead, you are "piling on" in castigating the OP, then
please ignore the rest of this post)

So, assuming we're still on the same page, let me say that I find this
particular phrasing - "... as we've told you before and told you many times
..." - as if that was the final word. This happens a lot in comp.lang.c.
Keith Thompson will say things like this - telling newbies that they have
had it explained to them many times in the past - feigning surprise that for
some reason that wasn't the end of the conversation.

I often imagine that it would be as if Keith showed up in alt.abortion and
explained to a participant there that he had explained it all many times in
the past and that therefore the moral question of abortion was a solved
issue and they could all just move on from there.

(My point, just in case it isn't clear, is that a lot of the issues that
Keith weighs in on in comp.lang.c are as subjective as the question of
abortion. You can't act as if your word is the final word in such
situations)

--
Modern Christian: Someone who can take time out from
complaining about "welfare mothers popping out babies we
have to feed" to complain about welfare mothers getting
abortions that PREVENT more babies to be raised at public
expense.

Dave Gibson

Sep 23, 2012, 6:33:09 AM
qwert...@syberianoutpost.ru wrote:
> ~
> the thing is that I am trying to specify the directories to exclude and
> extensions to search for as input parameters (as files with one liners) not
> hardcoded in the script, but even though the variable echoes exactly what you
> coded find reports error: "No such file or directory"
> ~
> these are the script and the ouput:
> ~
> ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> ~
> # directory separator
> _DIR_SEP="/"
>
> # ~ ~ ~ ~ ~ ~ ~ ~ reading files line by line ~ ~ ~ ~ ~ ~ ~ ~
> # save the field separator
> _IFS00=$IFS
> # end of line as field separator for file with one liners
> IFS=$'\n'

Remove those four lines -- use an array or $@.
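
As to what exactly goes wrong with the string version, the following sketch may help. It assumes a made-up helper, count_args, that only reports how many arguments it received, and a shortened copy of the filter string. With IFS set to a newline the unquoted variable is not split at spaces, so find would receive the whole filter as a single argument and treat it as a path; with the default IFS it is split, but the backslashes and single quotes stored in the string remain literal characters.

```shell
# count_args is a hypothetical helper: it prints its argument count.
count_args() { echo "$#"; }

# The same kind of string the script builds (shortened to one pattern).
filter=" -type f \( -name '*.js' \) "

set -f                  # no globbing, so '*.js' is never expanded here
old_ifs=$IFS

# Newline-only IFS: the string contains no newline, so it stays one word.
IFS='
'
n_newline=$(count_args $filter)

# Default IFS: split at spaces into separate words.
IFS=$old_ifs
n_default=$(count_args $filter)

# Look at individual words: the quoting characters are part of the data.
set -- $filter
third=$3
fifth=$5
set +f

printf '%s\n' "IFS=newline: $n_newline argument(s)"
printf '%s\n' "default IFS: $n_default argument(s)"
printf '%s\n' "third word: $third" "fifth word: $fifth"
```

An array (or "$@") sidesteps both effects because each element is passed to find as exactly one argument, with no second round of parsing.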

>
> # search dir
> _SDIR="/media/sda3/tmp"
>
> # number of parameters passed to script
> _ARGSL=$#
>
> # debug
> echo "// __ number of parameters passed to script: "${_ARGSL}
>
> _FLS_XFLTR=" -type f \("

To use an array, replace that line with:

_FLS_XFLTR=( -type f )

>
> # Loop index
> _IxL=0
>
> while read X
> do
> echo "// __ ${_IxL} ${X}"
> if [ ${_IxL} -eq 0 ]; then
> _FLS_XFLTR="${_FLS_XFLTR} -name '${X}'"
> elif [ ${_IxL} -gt 0 ]; then
> _FLS_XFLTR="${_FLS_XFLTR} -or -name '${X}'"
> fi
> let _IxL=(${_IxL} + 1)
> done <$1
> _FLS_XFLTR="${_FLS_XFLTR} \) "

Replace from "while read X" to here with the following:

# The first time in the loop, $op will be '('. This will initialise
# the parenthesised list. Subsequently, $op will be '-o'.
op='('
# -r stops read from treating backslashes in a pattern as escapes.
while IFS= read -r X ; do
    echo "// __ ${_IxL} ${X}"
    _FLS_XFLTR+=( "$op" -name "$X" )
    _IxL=$(( _IxL + 1 ))
    op='-o'
done < "$1"
if [ "$_IxL" -gt 0 ]; then
    # If any input was read, terminate the parenthesised list.
    _FLS_XFLTR+=( ')' )
fi

>
> echo "// __ \${_FLS_XFLTR}: |"${_FLS_XFLTR}"|"

echo "// __ \${_FLS_XFLTR}: |${_FLS_XFLTR[*]}|"

>
> find ${_SDIR} ${_FLS_XFLTR} -printf \
> '%T@,%A@,%C@,"%M",%n,"%l","%u","%g",%s,%d,'"${_SDIR}${_DIR_SEP}"'%P"\n'

Quote $_SDIR and replace ${_FLS_XFLTR}:

find "$_SDIR" "${_FLS_XFLTR[@]}" -printf ...

>
> # restore default field separator
> IFS=$_IFS00

Not needed.

> ~
> ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> ~
> // __ number of parameters passed to script: 1
> // __ 0 *.js
> // __ 1 *.txt
> // __ 2 *.gif
> // __ ${_FLS_XFLTR}: | -type f \( -name '*.js' -or -name '*.txt' -or -name
> '*.gif' \) | find: ` -type f \\( -name \'*.js\' -or -name \'*.txt\' -or -name
> \'*.gif\' \\) ': No such file or directory

Here's the loop to load the glob patterns from the file without arrays:

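# Save the file name first: set -- overwrites the positional parameters.
ifile=$1
set -- -type f
op='('
_IxL=0
while IFS= read -r X ; do
    set -- "$@" "$op" -name "$X"
    _IxL=$(( _IxL + 1 ))
    op='-o'
done < "$ifile"
if [ "$_IxL" -gt 0 ]; then
    # If any input was read, terminate the parenthesised list.
    set -- "$@" ')'
fi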

The find command would then be modelled on:

find "$_SDIR" "$@" -printf ...
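
The original question also asked for the excluded directories to come from a file. The same "$@" technique can be extended to that; what follows is only a sketch under some assumptions: the file names exclude.list and patterns.list and the scratch tree are invented for the test, -path is used as the portable spelling of GNU find's -wholename, and -print stands in for the long -printf format.

```shell
# Scratch tree and input files (all names here are hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/tree/keep" "$tmp/tree/skip"
echo data > "$tmp/tree/keep/a.js"
echo data > "$tmp/tree/keep/b.gif"
echo data > "$tmp/tree/skip/c.js"
: > "$tmp/tree/keep/empty.js"
printf '%s\n' "$tmp/tree/skip" > "$tmp/exclude.list"
printf '%s\n' '*.js' '*.txt' > "$tmp/patterns.list"

# Part 1: \( -path D1 -o -path D2 ... \) -prune -o
set --
op='('
while IFS= read -r dir ; do
    set -- "$@" "$op" -path "$dir"
    op='-o'
done < "$tmp/exclude.list"
# Close the group only if at least one directory was read.
[ "$op" = '-o' ] && set -- "$@" ')' -prune -o

# Part 2: -type f \( -name P1 -o -name P2 ... \)
set -- "$@" -type f
op='('
while IFS= read -r pat ; do
    set -- "$@" "$op" -name "$pat"
    op='-o'
done < "$tmp/patterns.list"
[ "$op" = '-o' ] && set -- "$@" ')'

# Non-empty files only, as in the original script.
found=$(find "$tmp/tree" "$@" ! -size 0 -print)
printf '%s\n' "$found"

rm -rf "$tmp"
```

Only keep/a.js is listed: skip/ is pruned, b.gif does not match a pattern, and empty.js fails the size test.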