One way to view this is, sure, your arrays should have bounds checking.
And so should every other level of every other part of your program.
You should everywhere find words running paranoid checks on values
right before those values are fed into other words that run the same
paranoid checks against them. Nowhere in your program may you say "I
trust what has gotten this far, because it's surely been checked by
now." You may, at best, have transition points where the tests kill
your program with "the impossible happened--bailing out!" rather than
gracefully handle an error condition. The only escape from this
drudgery?  A really smart compiler that can know when the checks
are unnecessary, and that can even come up with checks by itself.
The advantage of this view, if you stick with it in Forth, is that you
can, with some effort, achieve something like the absolutely miserable
results that people manage with other languages that offer more
advanced support for it.
Another way to view this is... huh? What do *internet devices* have to
do with *arrays*?! Let's say you're writing an IRC server. One thing
IRC servers do is allow, into a room, a number of users. Faced with
the famous "zero, one, or many" question of what limits to have, you of
course pick an arbitrary configurable number, and then document the
limits and resource implications of that number. Software, you see,
does not in fact run on hardware possessed of infinite resources, and
network services lack magic hardware even more severely than most
software. So your configurable number makes its way to two separate
places in your code: the size of an array of users in a room, where
there is no bounds checking whatsoever, and also to the 'join a user to
a room' function of the server, which refuses to join users to a room
that, by that number, is full. The cost of the bounds checking is
directly tied to a user's attempt to enter a room, which helps you
consider the costs of a user's total actions, which helps you kick
users who are abusing your server. The error condition of a user
trying to join a full room is obvious: "Sorry, the room is full." You
don't need to throw an exception from deep within your code--you're
already where you want to be, processing a user request and prepared to
reply to it.
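A rough sketch of that arrangement in Forth, for concreteness. Every
name below (MAX-USERS, ROOM, USER-COUNT, SEAT, JOIN) is made up for
illustration, users are just cell-sized tokens, and a real server
would presumably keep one such array per room and read the limit from
its configuration rather than hard-code it:

```forth
\ Sketch only: hypothetical names, one room, users are plain cells.
4 constant max-users                    \ the one configurable limit
create room  max-users cells allot      \ user slots, no bounds checking
variable user-count   0 user-count !    \ how many slots are in use

: room-full? ( -- flag )  user-count @ max-users < 0= ;

\ Store a user in the next free slot. Trusts its caller completely.
: seat ( user -- )  room user-count @ cells + !  1 user-count +! ;

\ The word a client request reaches. The limit is tested here, where
\ we are already prepared to reply to the user.
: join ( user -- )
  room-full? if  drop ." Sorry, the room is full." cr  exit  then
  seat ." Joined." cr ;
```

With the limit at 4, a fifth JOIN gets the "full" reply and SEAT is
never called, so the array is never indexed past its end.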
Or you could, the moment you have a need for an array, spend some time
adding bounds checking to it, just so that you can kill the program
with an "index out of bounds" error. What a wonderful thing for a
server to do out of the blue. And it *will* be 'out of the blue'
because the only reason you would code like this is if the rest of the
server's code was a dark and scary labyrinth to you. You don't know
what you're doing, so you write defensive code, and the result is that
you still don't know what you're doing and your defensive code now
sometimes kills your server based on decisions made far away from your
code. So have fun debugging that!
-
Of course, dark and scary labyrinths are certainly out there. A while
back I added a feature to such a body of code. A bug in my addition
would have harmlessly caused the program to fail to do its job, which
would've been noticed and which would've resulted in mild
embarrassment, if not for the bug resulting later on in a function
getting called with a rather unpleasant argument. The result? The
program failed to do its job -- and also inflicted enormous performance
costs on the servers it ran on. It wasn't noticed for some time that
my code push coincided with the sudden spate of performance issues. I
spent a lot of time cursing and swearing while personally fighting the
fires that resulted from these performance issues, without suspecting
that I'd caused them...
Would that have been helped by boundary-checked data structures? No, I
had them. Would it have been helped by algebraic types and
compile-time checking of them by a really smart compiler? No, the
argument was validly typed as far as the performance-killing function
was concerned.  And since we weren't asking whether the potential
arguments to this function should be more constrained than the
function was designed to accept, we didn't answer that question with a
new constrained type, or an additional test, or scrutiny of the
arguments to it that could arise in the body of code.
Additionally, I wasn't aware of how the function would respond to the
argument it got; if you'd asked me, I would've guessed that it would
very quickly do nothing.
What might've actually helped: instead of trusting in my ability to
dive into a 500-line or whatever *massive* routine and make just the
demanded changes, I could've pulled out the section I was editing into
its own function, Forth-style, as a new factor with precisely
understood and considered inputs and outputs. I also could've followed
up on the logs and performance of this program after the push even
though I knew I had made very minor changes.
That is, what would've helped was all programmer-side, not
language-side.
-
Still, I introduced a bug and bad things happened. So also could
someone edit the 'join an IRC channel' code and accidentally introduce a
bug that then caused a buffer overflow in that internal 'users in a
channel' array. And so, even with a clear and complete understanding
of your flawless IRC server, you might add some checks at some
boundaries, to protect parts of it from bugs that may later get
introduced by other people into other parts of it.
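Such a boundary check might look something like the following. This
is only a sketch under assumed names (MAX-USERS, SLOTS, SLOT are all
hypothetical), and it is a guard against bugs in the calling code, not
a way of handling anything a user did:

```forth
\ Sketch only: hypothetical names. The check is for bugs in callers.
4 constant max-users
create slots  max-users cells allot

\ Turn a user index into a slot address, or bail out loudly.
\ U< is an unsigned compare, so negative indices are rejected too.
: slot ( i -- addr )
  dup max-users u< 0= abort" impossible: user index out of range"
  cells slots + ;
```

Here 2 SLOT returns an address inside the array, while 4 SLOT aborts
with the message rather than quietly handing back an address past the
end of it.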
There's a fractal here: the network devices *could* be trusted because
the whole network is trusted and carefully run, and then there'd be no
reason even to write in "Sorry, the channel is full" sorts of code. But
you don't trust the network, and the same impulse you have to consider
such 'errors' coming from the network, you can repeat at deeper and deeper
levels of your server. The network could be at harmony but instead the
server distrusts the network and the clients all distrust the server;
the server could be in harmony but instead libraries within the server
distrust the code that uses them; individual libraries could be in
harmony but instead functions distrust other functions within the same
library. At a deep enough level your real concern with that 1DIM array
word is that code from a completely different module can alter the
memory space 1DIM uses for its array, without going through proper
array words or 'join a user to a channel' code or anything.
The point is, Alex wrote an array word in Forth. If you're going to
bother with Forth, you may as well try to get as far as you can with
harmonious, fully understood, brightly-lit code. There's where the
benefits lie. There's also where the deficiencies don't worry you as
much. You do not whinge, for example, about not having a potentially
untrustworthy array library that you'd defend your code against.
You're not dropping knowledge bombs on feckless monkeys who would come
out with software ridden through with buffer overflows. You just kinda
come off like a tourist.
Meanwhile, here is an array library with bounds checking:
https://code.google.com/p/ffl/wiki/car
Yes, even CAR-SET throws an exception. CAR-OFFSET does the work:
https://code.google.com/p/ffl/source/browse/trunk/ffl/car.fs
-- Julian
http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.html
(Wisdom that's appropriate when "I ran out of memory and crashed
trying to do what you just stupidly told me to do" is an acceptable
behavior. So, rarely. And never if you're writing a server.)