How to separate a File-Extension

Michael Hufschmidt

Feb 8, 2012, 5:46:00 AM
Hello @ all,

I am writing a shell script to do some conversion:
#!/bin/bash
ls -1 *.cwk | while read FILE_OLD
do
FILE_NEW=$FILE_OLD".txt"
cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
done

This converts myFile.cwk into myFile.cwk.txt. How could I calculate
FILE_NEW without the extension in order to get myFile.txt?

Thanks in advance for any hints - Michael

Dave Gibson

Feb 8, 2012, 6:11:15 AM
Michael Hufschmidt <Michael_H...@omnis.net> wrote:
> Hello @ all,
>
> I am writing a shell script to do some conversion:
> #!/bin/bash
> ls -1 *.cwk | while read FILE_OLD
> do
> FILE_NEW=$FILE_OLD".txt"
> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
> done
>
> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
> FILE_NEW without the extension in order to get myFile.txt?

With the ${var%pattern} (strip trailing pattern) parameter expansion:

FILE_NEW=${FILE_OLD%.cwk}.txt

Consider using a for loop instead of piping ls' output to a while loop:

for FILE_OLD in *.cwk
do
...
done
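
Putting the two suggestions together, a minimal sketch might look like
this. The recode function is only a hypothetical stand-in for the real
conversion step (here it just lower-cases the text), and convert_dir is
a name made up for this illustration:

```shell
#!/bin/bash
# Hypothetical stand-in for the real recoding pipeline.
recode() {
    tr 'A-Z' 'a-z'
}

# Convert every *.cwk file in the given directory to a .txt file.
convert_dir() {
    local dir="$1" old new
    for old in "$dir"/*.cwk
    do
        # With no matches the pattern stays literal, so skip it.
        [ -e "$old" ] || continue
        new=${old%.cwk}.txt     # strip the trailing ".cwk", append ".txt"
        recode <"$old" >"$new"
    done
}
```

Calling convert_dir . would process the current directory. Because every
expansion is quoted, names containing spaces should be handled as single
items.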

Ben Bacarisse

Feb 8, 2012, 6:14:16 AM
Michael Hufschmidt <Michael_H...@omnis.net> writes:

> Hello @ all,
>
> I am writing a shell script to do some conversion:
> #!/bin/bash
> ls -1 *.cwk | while read FILE_OLD
> do
> FILE_NEW=$FILE_OLD".txt"
> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
> done
>
> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
> FILE_NEW without the extension in order to get myFile.txt?

${FILE_OLD%.cwk}.txt

In addition, I'd
(a) quote all variables so as to be safe against file names with spaces,
(b) use file globbing rather than ls | while read, and
(c) remove the use of cat:

for f in *.cwk; do recode-program <"$f" >"${f%.cwk}.txt"; done

If the "recode-program" is actually a pipeline, it's nice to make use of
some little-used syntax:

for f in *.cwk; do <"$f" head | tr A-Z a-z >"${f%.cwk}.txt"; done

This keeps the pipe together in one place.

--
Ben.

Michael Hufschmidt

Feb 8, 2012, 6:30:10 AM
Hello Dave,

thank you, this works perfectly. I do not use a loop because some of the
filenames may contain spaces.

Regards - Michael

Aragorn

Feb 8, 2012, 7:47:14 AM
On Wednesday 08 February 2012 12:30, Michael Hufschmidt conveyed the
following to comp.unix.shell...
That shouldn't be a problem if you quote the variable.

--
= Aragorn =
(registered GNU/Linux user #223157)

Ben Bacarisse

Feb 8, 2012, 8:46:28 AM
Your code does not work if you have spaces in a file name, and the use of
a 'for' loop does not cause any problems. Using a 'for' loop with file
globbing works just as well as 'ls | while read' when there are spaces,
but your use of unquoted variable expansions ($FILE_OLD rather than
"$FILE_OLD" for example) causes problems no matter how you get the names
to start with.
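
A small sketch of the difference, assuming a default IFS. The helper
count_args and the file name are made up for this illustration:

```shell
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

f='my File.cwk'     # hypothetical name containing one space

count_args $f       # unquoted: split into "my" and "File.cwk", prints 2
count_args "$f"     # quoted: passed as a single argument, prints 1
```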

--
Ben.

Dave Gibson

Feb 8, 2012, 9:20:34 AM
Michael Hufschmidt <Michael_H...@omnis.net> wrote:
> Am 08.02.2012 12:11 schrieb Dave Gibson:

>> Consider using a for loop instead of piping ls' output to a while loop:
>>
>> for FILE_OLD in *.cwk

> thank you, this works perfectly. I do not use a loop because some of the
> filenames may contain spaces.

Globbing (wildcard expansion) is performed after word splitting so the
'for' command will receive each of the names in the generated list as
a distinct item.

Try:

for f in *.cwk ; do printf ' -->%s<--\n' "$f" ; done

Robert Bonomi

Feb 9, 2012, 1:02:48 AM
In article <9pf234...@mid.individual.net>,
The "classical" method uses basename(1).

Modern shells have string-manipulation constructs that let you do it without
the need to call an external utility.


Chris F.A. Johnson

Feb 9, 2012, 4:08:42 PM
That is part of the standard Unix shell.

--
Chris F.A. Johnson, author <http://shell.cfajohnson.com/>
===================================================================
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)

Ralph Spitzner

Feb 10, 2012, 2:33:10 AM
Robert Bonomi wrote:
[...]
> The "classical" method uses basename(1).
[...]


Am I missing something here?


bash-4.1# basename 01.\ Whiteman\ macht\ den\ Reim.mp3
01. Whiteman macht den Reim.mp3

Does basename somehow get confused if the filename has more
than one dot? (a la Windblo$)

I usually use `awk -F "something" {'print $x'}` to get rid
of weird pre/extensions...

-rasp




--
RTMPDump & ffmpeg are your friends..
-icke

Stachu 'Dozzie' K.

Feb 10, 2012, 3:09:09 AM
On 2012-02-10, Ralph Spitzner <ra...@spitzner.org> wrote:
> Am I missing somethinge here ?
>
>
> bash-4.1# basename 01.\ Whiteman\ macht\ den\ Reim.mp3
> 01. Whiteman macht den Reim.mp3
>
> Does basename somehow get confused if the filename has more
> that one dot ? (a la Windblo$)

And what did you expect from this command?

--
Secunia non olet.
Stanislaw Klekot

Janis Papanagnou

Feb 10, 2012, 4:41:23 AM
Am 10.02.2012 08:33, schrieb Ralph Spitzner:
> Robert Bonomi wrote:
> [...]
>> The "classical" method uses basename(1).
> [...]
>
>
> Am I missing somethinge here ?

Yes; reading the man page:

"[...] any leading directory components removed. [...]"

>
>
> bash-4.1# basename 01.\ Whiteman\ macht\ den\ Reim.mp3
> 01. Whiteman macht den Reim.mp3
>
> Does basename somehow get confused if the filename has more
> that one dot ? (a la Windblo$)
>
> I usually use `awk -F "something" {'print $x'}` to get rid
> of weird pre/extensions...

Bad and unnecessary use of awk. For the given purpose,
in shell, you have

${parameter#word}
${parameter##word}
${parameter%word}
${parameter%%word}
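
As a rough illustration of the four forms, applied to a made-up path
(the variable p and its value are assumptions for this sketch only):

```shell
p='archive/foo-2.1.tar.gz'   # hypothetical example path

echo "${p#*/}"     # shortest leading match of "*/" removed: foo-2.1.tar.gz
echo "${p##*.}"    # longest leading match of "*." removed:  gz
echo "${p%.*}"     # shortest trailing match of ".*" removed: archive/foo-2.1.tar
echo "${p%%.*}"    # longest trailing match of ".*" removed:  archive/foo-2
```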


Janis

>
> -rasp
>
>
>
>

Thomas 'PointedEars' Lahn

Feb 10, 2012, 6:46:15 AM
Robert Bonomi wrote:

> Michael Hufschmidt <Michael_H...@omnis.net> wrote:
>> I am writing a shell script to do some conversion:
>> #!/bin/bash
>> ls -1 *.cwk | while read FILE_OLD
>> do
>> FILE_NEW=$FILE_OLD".txt"
>> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
>> done
>>
>> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
>> FILE_NEW without the extension in order to get myFile.txt?
>
> The "classical" method uses basename(1).

basename(1) does not remove filename suffixes ("extensions"). It removes
the leading directory components of a file path.

--
PointedEars

Please do not Cc: me. / Bitte keine Kopien per E-Mail.

Thomas 'PointedEars' Lahn

Feb 10, 2012, 6:47:15 AM
Janis Papanagnou wrote:

> Am 10.02.2012 08:33, schrieb Ralph Spitzner:
>> Robert Bonomi wrote:
>> [...]
>>> The "classical" method uses basename(1).
>> [...]
>> Am I missing somethinge here ?
>
> Yes; reading the man page:
>
> "[...] any leading directory components removed. [...]"

Which is not what the OP is looking for.

Stephane Chazelas

Feb 10, 2012, 6:57:52 AM
2012-02-10 12:46:15 +0100, Thomas 'PointedEars' Lahn:
[...]
> basename(1) does not remove filename suffixes ("extensions"). It removes
> the leading directory components of a file path.
[...]

~$ basename /path/to/foo.cwk .cwk
foo

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html

--
Stephane

Casper H.S. Dik

Feb 10, 2012, 7:02:36 AM
Thomas 'PointedEars' Lahn <Point...@web.de> writes:

>Robert Bonomi wrote:

>> Michael Hufschmidt <Michael_H...@omnis.net> wrote:
>>> I am writing a shell script to do some conversion:
>>> #!/bin/bash
>>> ls -1 *.cwk | while read FILE_OLD
>>> do
>>> FILE_NEW=$FILE_OLD".txt"
>>> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
>>> done
>>>
>>> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
>>> FILE_NEW without the extension in order to get myFile.txt?
>>
>> The "classical" method uses basename(1).

>basename(1) does not remove filename suffixes ("extensions"). It removes
>the leading directory components of a file path.

It also removes the listed extension:


$ basename ~casper/file.c .c
file

Casper

Thomas 'PointedEars' Lahn

Feb 10, 2012, 7:22:01 AM
Stephane Chazelas wrote:

> Thomas 'PointedEars' Lahn:
>> basename(1) does not remove filename suffixes ("extensions"). It removes
>> the leading directory components of a file path.
> [...]
>
> ~$ basename /path/to/foo.cwk .cwk
> foo
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html

Thanks, that also works in GNU. However, this does not help with unknown
or multiple "extensions", and requires another tool, so I think I will
usually stick to the equally POSIX-compliant shell parameter expansions
${x%.*} and ${x%%.*} (which also work in GNU bash).

<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02>

Stephane Chazelas

Feb 10, 2012, 7:39:29 AM
2012-02-10 13:22:01 +0100, Thomas 'PointedEars' Lahn:
> Stephane Chazelas wrote:
>
> > Thomas 'PointedEars' Lahn:
> >> basename(1) does not remove filename suffixes ("extensions"). It removes
> >> the leading directory components of a file path.
> > [...]
> >
> > ~$ basename /path/to/foo.cwk .cwk
> > foo
> >
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html
>
> Thanks, that also works in GNU. However, this does not help with unknown
> or multiple "extensions", and requires another tool, so I think I will
> usually stick to the equally POSIX-compliant shell parameter expansions
> ${x%.*} and ${x%%.*} (which also work in GNU bash).
[...]

careful with that though:

$ x=/etc/init.d/foo
$ echo ${x%.*}
/etc/init

zsh/csh:
$ echo $x:r
/etc/init.d/foo

You may also need to consider /path/to/foo.x/ and foo-2.1.tar.gz
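
One way to handle these cases in bash might be to split off the last
path component first and strip only there. This is just a sketch; the
function name strip_ext is made up, and it removes only the final
".ext" (so foo-2.1.tar.gz becomes foo-2.1.tar):

```shell
# strip_ext: remove the last ".ext" from the final path component only.
strip_ext() {
    local path="${1%/}"        # drop one trailing slash, if any
    local dir='' base="$path"
    case $path in
        */*) dir=${path%/*}/ base=${path##*/} ;;
    esac
    case $base in
        ?*.*) base=${base%.*} ;;   # leave dot files such as ".profile" alone
    esac
    printf '%s\n' "$dir$base"
}
```

With this, strip_ext /etc/init.d/foo leaves the path unchanged, since
the dot is in a directory name rather than in the last component.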

--
Stephane

Thomas 'PointedEars' Lahn

Feb 10, 2012, 7:52:21 AM
Stephane Chazelas wrote:

> 2012-02-10 13:22:01 +0100, Thomas 'PointedEars' Lahn:
>> Stephane Chazelas wrote:
>> > Thomas 'PointedEars' Lahn:
>> >> basename(1) does not remove filename suffixes ("extensions"). It
>> >> removes the leading directory components of a file path.
>> > [...]
>> >
>> > ~$ basename /path/to/foo.cwk .cwk
>> > foo
>> >
>> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html
>>
>> Thanks, that also works in GNU. However, this does not help with unknown
>> or multiple "extensions", and requires another tool, so I think I will
>> usually stick to the equally POSIX-compliant shell parameter expansions
>> ${x%.*} and ${x%%.*} (which also work in GNU bash).
> [...]
>
> careful with that though:
>
> $ x=/etc/init.d/foo
> $ echo ${x%.*}
> /etc/init

OK, so one needs to trim the directories first:

my_basename3 () {
    x=${1%/}
    x=${x##*/}
    printf "%s" "${x%%.*}"
}

> zsh/csh:
> $ echo $x:r
> /etc/init.d/foo

So zsh and csh are not POSIX-compliant?

> You may also need to consider /path/to/foo.x/ and foo-2.1.tar.gz

See above.

Ben Bacarisse

Feb 10, 2012, 8:29:40 AM
Thomas 'PointedEars' Lahn <Point...@web.de> writes:

> Janis Papanagnou wrote:
>
>> Am 10.02.2012 08:33, schrieb Ralph Spitzner:
>>> Robert Bonomi wrote:
>>> [...]
>>>> The "classical" method uses basename(1).
>>> [...]
>>> Am I missing somethinge here ?
>>
>> Yes; reading the man page:
>>
>> "[...] any leading directory components removed. [...]"
>
> Which is not what the OP is looking for.

They want to strip the suffix. Of course, basename can do that too, but
if the OP were ever to generalise the script to work with paths other
than ., the full behaviour of basename would get in the way.

"${f%.cwk}.txt" is surely the right answer.

--
Ben.

Ben Bacarisse

Feb 10, 2012, 8:31:59 AM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

> Thomas 'PointedEars' Lahn <Point...@web.de> writes:
>
>> Janis Papanagnou wrote:
>>
>>> Am 10.02.2012 08:33, schrieb Ralph Spitzner:
>>>> Robert Bonomi wrote:
>>>> [...]
>>>>> The "classical" method uses basename(1).
>>>> [...]
>>>> Am I missing somethinge here ?
>>>
>>> Yes; reading the man page:
>>>
>>> "[...] any leading directory components removed. [...]"
>>
>> Which is not what the OP is looking for.
>
> They want to strip the suffix...
<snip irrelevant clarification>

Oh dear. I somehow missed the word "not". Please ignore.

--
Ben.

Stephane Chazelas

Feb 10, 2012, 12:52:05 PM
2012-02-10 13:52:21 +0100, Thomas 'PointedEars' Lahn:
[...]
> > zsh/csh:
> > $ echo $x:r
> > /etc/init.d/foo
>
> So zsh and csh are not POSIX-compliant?
[...]

No, and they never claimed to be. zsh has a "sh" emulation mode
(also enabled when called as "sh") that aims at improving POSIX
compliance, where you need ${x:r} instead (but of course you
wouldn't if you were writing a POSIX script).

~$ a=a/a.b zsh -c 'echo $a:r'
a/a
~$ a=a/a.b ARGV0=sh zsh -c 'echo $a:r'
a/a.b:r
~$ a=a/a.b zsh -c 'emulate sh; echo $a:r'
a/a.b:r
~$ a=a/a.b ARGV0=sh zsh -c 'echo ${a:r}'
a/a

(ARGV0=x cmd is the zsh way to set the argv[0] of cmd.)

(zsh also has csh and ksh emulation modes)

--
Stephane

Barry Margolin

Feb 10, 2012, 6:21:56 PM
In article <10386993....@PointedEars.de>,
Thomas 'PointedEars' Lahn <Point...@web.de> wrote:

> Robert Bonomi wrote:
>
> > Michael Hufschmidt <Michael_H...@omnis.net> wrote:
> >> I am writing a shell script to do some conversion:
> >> #!/bin/bash
> >> ls -1 *.cwk | while read FILE_OLD
> >> do
> >> FILE_NEW=$FILE_OLD".txt"
> >> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
> >> done
> >>
> >> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
> >> FILE_NEW without the extension in order to get myFile.txt?
> >
> > The "classical" method uses basename(1).
>
> basename(1) does not remove filename suffixes ("extensions"). It removes
> the leading directory components of a file path.

It optionally removes a suffix, but you have to supply the suffix
explicitly as the second argument.
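
For comparison, a short sketch of the three behaviours side by side
(the path in f is a made-up example):

```shell
f='/path/to/foo.cwk'           # hypothetical input path

basename "$f"                  # directory removed only:          foo.cwk
basename "$f" .cwk             # directory and suffix removed:    foo
printf '%s\n' "${f%.cwk}"      # suffix removed, directory kept:  /path/to/foo
```

Note that basename always drops the directory part, so it is probably
less convenient than the parameter expansion when the output file should
land next to the input file.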

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Thomas 'PointedEars' Lahn

Feb 10, 2012, 7:06:12 PM
Stephane Chazelas wrote:

> i2012-02-10 13:52:21 +0100, Thomas 'PointedEars' Lahn:
> [...]
>> > zsh/csh:
>> > $ echo $x:r
>> > /etc/init.d/foo
>>
>> So zsh and csh are not POSIX-compliant?
> [...]
>
> No, and they never claimed to be. […]

OK, thanks for the explanation. But then you do not have a point there.

Thomas 'PointedEars' Lahn

Feb 10, 2012, 7:08:02 PM
Barry Margolin wrote:

> Thomas 'PointedEars' Lahn <Point...@web.de> wrote:
>> basename(1) does not remove filename suffixes ("extensions"). It removes
>> the leading directory components of a file path.
>
> It optionally removes a suffix, but you have to supply the suffix
> explicitly as the second argument.

11-hours newsfeed?

Robert Bonomi

Feb 12, 2012, 5:09:44 AM
In article <q7pd09-...@cjlocal.com>,
Chris F.A. Johnson <cfajo...@gmail.com> wrote:
>On 2012-02-09, Robert Bonomi wrote:
>> In article <9pf234...@mid.individual.net>,
>> Michael Hufschmidt <Michael_H...@omnis.net> wrote:
>>>Hello @ all,
>>>
>>>I am writing a shell script to do some conversion:
>>> #!/bin/bash
>>> ls -1 *.cwk | while read FILE_OLD
>>> do
>>> FILE_NEW=$FILE_OLD".txt"
>>> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
>>> done
>>>
>>>This converts myFile.cwk into myFile.cwk.txt. How could I calculate
>>>FILE_NEW without the extension in order to get myFile.txt?
>>>
>>>Thanks in advance for any hints - Michael
>>
>> The "classical" method uses basename(1).
>>
>> Modern shells have string-manipulation constructs that let you do it without
>> the need to call an external utility.
>
> That is part of the standard Unix shell.

True _TODAY_. *Not* historically. Thus the 'modern' qualifier.

Robert Bonomi

Feb 12, 2012, 5:17:53 AM
In article <10386993....@PointedEars.de>,
Thomas 'PointedEars' Lahn <use...@PointedEars.de> wrote:
>Robert Bonomi wrote:
>
>> Michael Hufschmidt <Michael_H...@omnis.net> wrote:
>>> I am writing a shell script to do some conversion:
>>> #!/bin/bash
>>> ls -1 *.cwk | while read FILE_OLD
>>> do
>>> FILE_NEW=$FILE_OLD".txt"
>>> cat $FILE_OLD | [... do some recoding ...] > $FILE_NEW
>>> done
>>>
>>> This converts myFile.cwk into myFile.cwk.txt. How could I calculate
>>> FILE_NEW without the extension in order to get myFile.txt?
>>
>> The "classical" method uses basename(1).
>
>basename(1) does not remove filename suffixes ("extensions").

Really?? What does 'basename foobar ar' do on _your_ system?


Robert Bonomi

Feb 12, 2012, 5:20:08 AM
In article <arte09-...@spitzner.org>,
Ralph Spitzner <ra...@spitzner.org> wrote:
>Robert Bonomi wrote:
>[...]
>> The "classical" method uses basename(1).
>[...]
>
>
>Am I missing somethinge here ?

Apparently. :)
>
>
>bash-4.1# basename 01.\ Whiteman\ macht\ den\ Reim.mp3
>01. Whiteman macht den Reim.mp3
>
>Does basename somehow get confused if the filename has more
>that one dot ? (a la Windblo$)

No. In fact, it _doesn't_care_ whether the filename has a dot in it or not.
It will remove the specified trailing string, e.g.:

$ basename foobar ar
foob



Robert Bonomi

Feb 12, 2012, 5:24:27 AM
In article <10072393....@PointedEars.de>,
Thomas 'PointedEars' Lahn <use...@PointedEars.de> wrote:
>Janis Papanagnou wrote:
>
>> Am 10.02.2012 08:33, schrieb Ralph Spitzner:
>>> Robert Bonomi wrote:
>>> [...]
>>>> The "classical" method uses basename(1).
>>> [...]
>>> Am I missing somethinge here ?
>>
>> Yes; reading the man page:
>>
>> "[...] any leading directory components removed. [...]"
>
>Which is not what the OP is looking for.

Per the OP's _example_ he was dealing with files in the 'current directory',
so there was no directory component to remove.

Try 'basename foobar ar' on _your_ system, and see what happens.

Applying it to the process of renaming a bunch of files with a specified
common trailing part to have a different trailing part is left as an
exercise for the student.

Janis Papanagnou

Feb 12, 2012, 5:37:37 AM
Which era do you consider historical and which not? Ksh88 already had
those string manipulation functions, and those ksh features made it
into the POSIX standard quite early as well.

Janis

Janis Papanagnou

Feb 12, 2012, 5:48:55 AM
On 12.02.2012 11:24, Robert Bonomi wrote:
> In article <10072393....@PointedEars.de>,
> Thomas 'PointedEars' Lahn <use...@PointedEars.de> wrote:
>> Janis Papanagnou wrote:
>>
>>> Am 10.02.2012 08:33, schrieb Ralph Spitzner:
>>>> Robert Bonomi wrote:
>>>> [...]
>>>>> The "classical" method uses basename(1).
>>>> [...]
>>>> Am I missing somethinge here ?
>>>
>>> Yes; reading the man page:
>>>
>>> "[...] any leading directory components removed. [...]"
>>
>> Which is not what the OP is looking for.
>
> Per the OPs _example_ he was dealing with files in the 'current directory',
> so there was no directory component to remove.

Correct. That's why basename, as used by the poster, did not work.

>
> Try 'basename foobar ar' on _your_ system, and see what happens.

That does not match what the poster tried; which was...

>>>> bash-4.1# basename 01.\ Whiteman\ macht\ den\ Reim.mp3
>>>> 01. Whiteman macht den Reim.mp3

IOW, he used basename with just one argument. (If he had read the
manpage he would have known that, and also that it can be used with
another argument in case he cares, which he shouldn't, since there
are the built-in shell constructs that I suggested in my posting.)

Janis

Thomas 'PointedEars' Lahn

Feb 18, 2012, 2:00:10 PM
When you are getting back to a thread, please read all new postings in it
before you reply. IOW: Read, think, post. In that order. TIA.

Eric

Feb 18, 2012, 2:51:03 PM
You should try that yourself. You come back after a week and complain
about things that all happened on the same day, and we have no way of
telling what posts he could see when he wrote that. And in any case the
post makes sense in context. You have neither read carefully enough nor
thought clearly enough.

But then we have past evidence that you do neither of these things.

Eric

--
ms fnd in a lbry