RFE: brace expansion sequences should do zero padding [patch]

Martin von Gagern

Aug 29, 2007, 8:12:27 PM
to bug-...@gnu.org
Hi!

Suppose you have a set of numbered files whose numbers are zero padded.
Of course you can list them using some magic with printf or some such,
but I would really love a simple feature like this:

cat x{000..123}

to concatenate files x000 through x123, not x0 through x123 as bash
currently does.
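
For comparison, the printf workaround mentioned above might look roughly like the sketch below. It is only an illustration of the workaround, not part of the patch; the helper name padded_names is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print PREFIX followed by each number from FIRST to
# LAST, zero padded to WIDTH digits, one name per line.
padded_names() {
    local prefix=$1 first=$2 last=$3 width=$4 n
    for ((n = first; n <= last; n++)); do
        printf "%s%0${width}d\n" "$prefix" "$n"
    done
}

padded_names x 8 10 3
# To feed the names to a command, the unquoted substitution below relies on
# word splitting, so it is only safe for names without whitespace or globs:
#   cat $(padded_names x 0 123 3)
```

The call above prints x008, x009 and x010, one per line. Compared with cat x{000..123}, this needs a helper and a command substitution for what is conceptually a single word list.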

The attached patch should do the trick. I hope you agree with the place
where I chose to implement this feature. Do you think this has a chance
of getting implemented into the official source tree? If so, what is
left to do? Documentation? Any kind of test cases? Anything else?

Greetings,
Martin von Gagern

P.S.: This is the second time I have posted this message here.
The first attempt was via NNTP, and it seems it didn't make it.

seqpad.patch

Martin von Gagern

Aug 29, 2007, 6:33:22 PM
to gnu-ba...@moderators.isc.org
seqpad.patch

Martin von Gagern

Sep 3, 2007, 4:16:58 PM
to bug-...@gnu.org
Hi again!

I saw my first post made it to the list eventually as well. Sorry for
the duplicate. I hadn't realized that the newsgroup was moderated.

I'm a bit disheartened at the lack of response. On IRC many people pointed
out that this kind of issue can usually be solved by passing a sequence
to printf. Now I've come up with a real-life example that isn't easy to
hack together using printf:
wget -x http://some.really.long/url/prefix/{,{000..123}.{html,jpg}}
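
To make the comparison concrete, here is a rough sketch of what that one-liner turns into without padded sequences. It assumes the intended word list is the bare prefix followed by NNN.html and NNN.jpg for every number; the helper build_urls and the array name urls are made up for this illustration.

```shell
#!/usr/bin/env bash
# Placeholder URL prefix taken from the wget example above.
prefix='http://some.really.long/url/prefix/'

# Hypothetical helper: fill the global array "urls" with the bare prefix,
# followed by NNN.html and NNN.jpg for each number from FIRST to LAST.
build_urls() {
    local first=$1 last=$2 n padded ext
    urls=("$prefix")
    for ((n = first; n <= last; n++)); do
        # printf -v stores the padded number in a variable instead of printing it
        printf -v padded '%03d' "$n"
        for ext in html jpg; do
            urls+=("${prefix}${padded}.${ext}")
        done
    done
}

build_urls 0 123
printf '%s\n' "${urls[@]:0:3}"   # show the first three generated words
# wget -x "${urls[@]}"           # left commented out: the host is a placeholder
```

The nested loops reproduce the order brace expansion would give (the prefix alone, then 000.html, 000.jpg, 001.html, and so on), but at the cost of a dozen lines instead of one.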

There have been some concerns about changing the behaviour and thus
breaking existing scripts. The preferred solution in this case would be
to use "..." instead of ".." if one wanted to activate this feature.
Personally, I believe that zero-padded sequences in an existing
application that cares about the exact string, and not only the numeric
value, are so unlikely that adding another piece of syntax is not worth
the trouble, but I'd like your opinion on this.

Another thing worth mentioning is negative numbers. My padding pads all
numbers to a common width, not a common number of digits. This is what
printf does, and it's a wee little bit easier to implement. However it
could be changed to common number of digits as well. On IRC I got the
idea that {-07..003} should do common width, whereas {-007..003} should
do common number of digits. This, however, would add a lot of code. I
think negative numbers are so rare that they are not worth the effort.
Do you agree?
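
The difference between the two padding styles can be illustrated with printf itself. This is only a sketch of the two behaviours, not code from the patch; the function names pad_width and pad_digits are made up here.

```shell
#!/usr/bin/env bash
# Common width: the minus sign counts towards the field width,
# which is what printf's %0Nd does.
pad_width() {
    printf "%0${1}d\n" "$2"
}

# Common number of digits: the sign is kept outside the padded field.
pad_digits() {
    local width=$1 n=$2
    if ((n < 0)); then
        printf -- "-%0${width}d\n" "$((-n))"
    else
        printf "%0${width}d\n" "$n"
    fi
}

pad_width 3 -7     # -07
pad_width 3 3      # 003
pad_digits 3 -7    # -007
pad_digits 3 3     # 003
```

With common width every word has the same length; with common digits the negative words are one character longer than the positive ones.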

I hope to generate some feedback here. If you think this is useful, tell me
about it, and I'll try a bit harder to get this into the official
sources. If I don't get a single answer this time either, I'll probably post
the patch somewhere online, patch my own version of bash, and that's it.

Greetings,
Martin von Gagern



Eric Blake

Sep 3, 2007, 4:38:39 PM
to Martin von Gagern, bug-...@gnu.org
According to Martin von Gagern on 9/3/2007 2:16 PM:


>
> Another thing worth mentioning is negative numbers. My padding pads all
> numbers to a common width, not a common number of digits. This is what
> printf does, and it's a wee little bit easier to implement. However it
> could be changed to common number of digits as well. On IRC I got the
> idea that {-07..003} should do common width, whereas {-007..003} should
> do common number of digits. This, however, would add a lot of code. I
> think negative numbers are so rare that they are not worth the effort.
> Do you agree?

Perhaps rather than trying to improve bash {} expansion, you could use
coreutils seq instead. For example,

$ seq -f 'a/%03g' -007 003
a/-07
a/-06
a/-05
a/-04
a/-03
a/-02
a/-01
a/000
a/001
a/002
a/003

--
Don't work too hard, make some time for fun as well!

Eric Blake eb...@byu.net


Martin von Gagern

Sep 3, 2007, 4:58:29 PM
to Eric Blake, bug-...@gnu.org
Eric Blake wrote:
> Perhaps rather than trying to improve bash {} expansion, you could use
> coreutils seq instead.

Hi Eric, thanks for taking an interest.

seq is not that much different from printf here, although I hadn't known
of its formatting capabilities. This doesn't change the fact that
there are situations, like the wget command in my last posting, where
calling a subcommand isn't going to help much.

If I followed your argument, brace number sequences would not be needed
at all, as you can somehow use seq in most cases. However, brace
sequences are part of bash, and for good reason. First, they are a
lot easier to write than a subcommand invocation; second, you don't
have to worry so much about word splitting; and finally, there are some
more complex cases where using a subcommand will make the command much
more complicated.

To give you another example, highlighting the word-splitting issue:

for i in $'This is record\n'{000..007}" of ${PWD}"; do
    echo "> $i <"
done

OK, yes, I could add all that text inside the loop, but I believe you
can think of uses of word lists where this isn't so easy. As you can
see, the escape string introduces spaces and a newline, and the variable
may introduce arbitrary other characters. Together, you have no obvious
safe character at which you can split the output of any subprocess into
words. So word splitting now becomes a real problem, whereas with brace
expansion the word splitting is implicit.

Yes, I know, this too can be solved using other tools, probably
together with bash arrays. However, there will be a huge overhead. On the
other hand, brace expansion is intuitive, quick to write, easy to read,
and gets the job done - if the job doesn't need the zero padding, or you
apply my patch.
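
For illustration, an array-based version of the loop above might look like the sketch below. It assumes a bash with printf -v and array appending (+=); the array name records is made up for this example.

```shell
#!/usr/bin/env bash
# Build the word list in an array so that the output of a subprocess never
# has to be split into words.
records=()
for n in {0..7}; do
    # printf -v stores the zero-padded number in a variable
    printf -v padded '%03d' "$n"
    records+=($'This is record\n'"${padded} of ${PWD}")
done

# Each array element stays a single word, embedded newline and all.
for i in "${records[@]}"; do
    echo "> $i <"
done
```

This works, but it takes two loops and a temporary array to express what the brace version states in the for line alone.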

Greetings,
Martin von Gagern
