I'm looking for some community input for ksh 93u+m on the default set of
path-bound commands built in to the ksh binary. Providing commands as
built-ins can greatly increase performance for those commands, but
increases the size of the ksh binary (particularly as they all have
built-in --man documentation as well), so a tradeoff is necessary.
Shipping all of them by default would be far too much bloat.
Distributors and power users can of course make their own choices when
compiling ksh.
The source distribution comes with many such commands as part of libcmd
(these commands are built in when /opt/ast/bin is prefixed to $PATH, or
as of ksh 93u+m, you can also invoke /opt/ast/bin/somecommand directly).
The full list of possibilities is:
    basename  cp       head     mv       sync
    cat       cut      id       paste    tail
    chgrp     date     join     pathchk  tee
    chmod     dirname  ln       pids     tty
    chown     expr     logname  rev      uname
    cksum     fds      md5sum   rm       uniq
    cmdinit   fmt      mkdir    rmdir    wc
    cmp       fold     mkfifo   stty
    comm      getconf  mktemp   sum
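
For anyone who wants to try the $PATH method, here is a minimal sketch of how a script might opt in. The function name use_ast_builtins is made up for illustration; in a shell other than ksh, the extra $PATH entry should simply match nothing, assuming no real /opt/ast/bin directory exists on the system.

```sh
# Hypothetical helper: put /opt/ast/bin first in PATH so that, on ksh,
# the path-bound built-ins are found before any external commands.
use_ast_builtins() {
    case $PATH in
        /opt/ast/bin:*) ;;                  # already first; nothing to do
        *) PATH=/opt/ast/bin:$PATH ;;       # otherwise prefix it
    esac
}

use_ast_builtins
```

The alternative, as of ksh 93u+m, is to leave $PATH alone and invoke a single command as /opt/ast/bin/somecommand.
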
Only a few of them are chosen to be compiled in by default -- a
selection originally made by AT&T. However, I don't think their default
selection makes a lot of sense:
$ builtin | grep ^/
/opt/ast/bin/basename
/opt/ast/bin/cat
/opt/ast/bin/chmod
/opt/ast/bin/cmp
/opt/ast/bin/cut
/opt/ast/bin/dirname
/opt/ast/bin/getconf
/opt/ast/bin/head
/opt/ast/bin/logname
/opt/ast/bin/mkdir
/opt/ast/bin/sync
/opt/ast/bin/uname
/opt/ast/bin/wc
IMO, good choices are: basename, cat, cut, dirname. Those are often used
in performance-sensitive code paths like loops, and/or in command
substitutions. Having these as built-ins on $PATH can greatly increase
performance.
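
To make that concrete, here is a rough sketch of the kind of loop I have in mind. The function name strip_dirs and the paths are placeholders, not taken from any real script. With an external basename, every iteration runs a separate program; with the built-in, it does not.

```sh
# Hypothetical micro-benchmark: strip the directory part from COUNT
# paths, using one command substitution per path.
strip_dirs() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        name=$(basename "/some/dir/file$i.txt")
        i=$((i + 1))
    done
    printf '%s\n' "$name"    # print the last result so the work is observable
}
```

Running "time strip_dirs 1000" once with /opt/ast/bin prefixed to $PATH and once without should show the difference on a given system. I'm not claiming any particular numbers here.
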
We may want to keep getconf as well, since it can report on some of the
internal libast state, which an external getconf clearly cannot.
However, we might want to consider whether it should be in release
builds by default or only in development builds. The AST userland
universe is dead except for ksh, so it doesn't seem likely that many
people still run scripts that depend on getconf's AST-specific
functionality.
I'm not so sure about having the following as defaults. I'd be inclined
to remove them unless someone can give me a reason why they should stay:
* chmod. Some scripts do change the permissions of lots of files, but
  if they care about performance then xargs chmod… is typically used
  which, being external, will not invoke any built-ins.
* cmp. I think it's relatively rarely used. People use diff a lot more,
  but it's external.
* head. Why have this, but not tail? Also, neither of these is
  typically performance-sensitive. Some process that produces a lot of
  output gets piped into head or tail and that generally only needs to
  be done once.
* uname. This is fairly frequently used in system scripts but if the
  script is any good, it'll store the value in a variable before using
  it in any performance-sensitive manner.
* wc. This is for counting lines/words in files. I don't think this is
  commonly used in a loop either, though I could be wrong.
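
To illustrate the uname point, this is roughly what I would expect a well-written script to do. The loop items and messages are placeholders; the only thing that matters is that uname runs once, outside the loop.

```sh
# Call uname once and reuse the cached value inside the loop.
os=$(uname -s)

for f in one two three; do        # placeholder items
    case $os in
        Linux)  printf '%s: linux code path\n' "$f" ;;
        *)      printf '%s: generic code path\n' "$f" ;;
    esac
done
```

Written like this, it makes no measurable difference whether uname is built in or external.
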
I think the following are almost certainly unnecessary defaults, and I
would need a pretty solid reason to keep them:
* logname. Getting your login name only needs to be done once.
* sync. This generally only needs to be done at shutdown time by system
  scripts, which will not be using any ksh built-ins. Even if you do
  occasionally want to sync your disk during regular usage, the external
  command should do just fine for that.
The following are *not* included in the current defaults list, but I
think they would be nice to have:
* cp, ln, mv. Having these as built-ins can be a good performance
  optimisation. This is done in loops all the time, often with one
  file at a time.
* fds and pids. These list open file descriptors and process IDs,
  respectively. They could not possibly work unless built in. Also,
  they're very small.
* mktemp. Not for performance reasons (clearly), but because external
  mktemp implementations are so different and incompatible on various
  systems that it would be good for ksh to come with a known interface
  to this important functionality.
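
As an example of the one-file-at-a-time pattern from the cp/ln/mv item, here is a minimal sketch. The function name rename_txt_to_bak and the suffixes are invented for illustration. With an external mv, each file costs one external command invocation.

```sh
# Hypothetical helper: rename every *.txt file in a directory to *.bak,
# invoking mv once per file.
rename_txt_to_bak() {
    dir=$1
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue            # glob matched nothing
        mv -- "$f" "${f%.txt}.bak"         # strip .txt, append .bak
    done
}
```

Usage would be something like "rename_txt_to_bak /path/to/dir", run with and without /opt/ast/bin prefixed to $PATH on a ksh that has mv compiled in.
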
Thoughts/opinions?
--
|| modernish -- harness the shell
|| https://github.com/modernish/modernish
||
|| KornShell lives!
|| https://github.com/ksh93/ksh