Much of the distress in the program verification community seems to me
to have arisen from a failure to adequately distinguish between vari-
ous different positions that are possible here. I freely admit that I
was less concerned with drawing distinctions between various positions
at the time than I was to assess those that concerned me. Let me there-
fore attempt to make a modest contribution toward that end by specify-
ing several different positions that no doubt deserve to be separated,
where some concern logic, others methodology, and others verification:
Positions of LOGIC:
(T1) Formal proofs of program correctness can provide an absolute, con-
clusive guarantee of program performance;
(T2) Formal proofs of program correctness can provide only relative, in-
conclusive evidence concerning program performance.
Positions of METHODOLOGY:
(T3) Formal proofs of program correctness should be the exclusive method-
ology for assessing software reliability;
(T4) Formal proofs of program correcness should be the principal method-
ology for assessing software reliability;
(T5) Formal proofs of program correctness should be one among various
methodologies for assessing software reliability.
Positions on VERIFICATION:
(T6) Program verifications always require formal proofs of correctness;
(T7) Program verifications always require proof sketches of correctness;
(T8) Program verifications always require the use of deductive reasoning.
My original paper was devoted to the distinction between (T1) and (T2) as
matters of logic. I contend that (T1) is false but that (T2) is true. In
relation to questions of methodology, I also maintain that (T3) is false,
and at the ACM CSC 90 debate, I argued further that (T4) is also false. I
have no doubt, however, that (T5) is true. With respect to verificaiton in
the broad sense, it is my view that (T6) is clearly false and that (T8) is
clearly true, but that the truth-value of (T7) is subject to debate. Much
might hinge upon whether these proof sketches had to be written down, could
be merely thought-through, or whatever. In the former case, (T7) becomes
close to (T6), in the latter case, closer to (T8). However these matters
may finally be decided, these seem to be the principal distinctions which
need to be drawn to get at the issues that lie beneath the surface here.
Let me conclude with one further gesture of agreement with Charles Martin.
He acknowledges that there are important differences between accounting
procedures, for example, and life-critical situations. On this we could
not more strongly agree. The greater the seriousness of the consequences
that would ensue from making a mistake, the greater our obligation to in-
sure that it is not made. This suggests that the role for prototyping and
testing increases dramatically as life-critical tasks come into play. We
can afford to run a system that has only been formally assessed when the
consequences of mistakes are relatively minor (Nintendo games, for example),
but not when they are serious (radiation therapy, for example). So it be-
comes a matter of how confident we need to be that a system will perform
as it is intended to perform. We cannot know without testing the system.
James H. Fetzer
In article <32...@umn-d-ub.D.UMN.EDU> ph...@umn-d-ub.D.UMN.EDU (Philosophy Dept) writes:
..Positions of LOGIC:
..
..(T1) Formal proofs of program correctness can provide an absolute, con-
..clusive guarantee of program performance;
..
..(T2) Formal proofs of program correctness can provide only relative, in-
..conclusive evidence concerning program performance.
..
..Positions of METHODOLOGY:
..
..(T3) Formal proofs of program correctness should be the exclusive method-
..ology for assessing software reliability;
..
..(T4) Formal proofs of program correcness should be the principal method-
..ology for assessing software reliability;
..
..(T5) Formal proofs of program correctness should be one among various
..methodologies for assessing software reliability.
..
..Positions on VERIFICATION:
..
..(T6) Program verifications always require formal proofs of correctness;
..
..(T7) Program verifications always require proof sketches of correctness;
..
..(T8) Program verifications always require the use of deductive reasoning.
..My original paper was devoted to the distinction between (T1) and (T2) as
..matters of logic. I contend that (T1) is false but that (T2) is true.
Agreed.
..In
..relation to questions of methodology, I also maintain that (T3) is false,
..and at the ACM CSC 90 debate, I argued further that (T4) is also false. I
..have no doubt, however, that (T5) is true.
Also agreed.
(I'd still like to know, someday, who came up with the thesis statement
you all started with. I'd have hated to be Ardis and Gries.)
..With respect to verificaiton in
..the broad sense, it is my view that (T6) is clearly false and that (T8) is
..clearly true, but that the truth-value of (T7) is subject to debate. Much
..might hinge upon whether these proof sketches had to be written down, could
..be merely thought-through, or whatever.
Absolutely. My use of hand-proof of programs is somewhere in the middle
of the T6..T8 spectrum, attempting to present a "literate program" that
is comparable with good mathematical prose in level of detail and depth
of argument. One of the unfortunate things with most mechanical theorem
provers is that the level of proof needed is so much more complex
(deeper, less abstract) that it's kind of hard to follow the proofs at
all. As I've said, I think this technique seems very promising, but more
experimentation is needed. (Funding agencies please note!)
(One problem is the question of what "formal proof" means. If I were
feeling disputatious (:-)) I'd argue that a hand proof checked by
several others *is* a formal proof. But let's not start that debate
'till next week, okay?)
..
..Let me conclude with one further gesture of agreement with Charles Martin.
..He acknowledges that there are important differences between accounting
..procedures, for example, and life-critical situations. On this we could
..not more strongly agree. The greater the seriousness of the consequences
..that would ensue from making a mistake, the greater our obligation to in-
..sure that it is not made. This suggests that the role for prototyping and
..testing increases dramatically as life-critical tasks come into play. We
..can afford to run a system that has only been formally assessed when the
..consequences of mistakes are relatively minor (Nintendo games, for example),
..but not when they are serious (radiation therapy, for example). So it be-
..comes a matter of how confident we need to be that a system will perform
..as it is intended to perform. We cannot know without testing the system.
I'm with you here, in spades. Especially since the thing that formal
methods usually give little insight into is the APPROPRIATENESS of the
specifications; does it REALLY do what we want?
For that reason, I question whether only formal assessment is EVER
appropriate. The Nintendo game analogy is an apt one: while we can
formally get quite good assurance that our program realizes the program
for a game that was formally specified (formally specifying a game
program like trek might make an interesting exercise, at that... hmmm)
anyway, while we can get good assurance of the correspondence between
the formally-specified game and the realization, this formal assurance
tells us little about the "goodness" of the human-machine interface;
however, the user's satisfaction with the game relies heavily on this
very "goodness" as perceived.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
>(T2) Formal proofs of program correctness can provide only relative, in-
>conclusive evidence concerning program performance.
The problems with these statements is that T1 is not ~T2.
NOTHING provides an absolute, conclusive guarentee of program
performance. Certainly testing doesn't. However, words like
"inconclusive" are loaded with hidden meaning, and I don't
want to waste my time doing experiments if I know that they
will be inconclusive. I want to conclude something! As a matter
of fact, program verification lets you conclude that your program
does not have dumb coding errors. Thus, I don't buy either T1 or T2.
>(T3) Formal proofs of program correctness should be the exclusive method-
>ology for assessing software reliability;
>(T4) Formal proofs of program correcness should be the principal method-
>ology for assessing software reliability;
>(T5) Formal proofs of program correctness should be one among various
>methodologies for assessing software reliability.
I don't agree with T3, T4, or T5, either. Program verification has
nothing to do with assessing software reliability. Program verification
make your software more reliable, though, but it doesn't help you measure
the reliability, and "assess" means measure to me.
"So it becomes a matter of how confident we need to be that a system will
perform as it is intended to perform. We cannot know without testing the
system."
Djikstra: Testing never proves the absence of faults,
it only shows their presence.
I usually disagree with Djikstra, but this quote of his seems obviously
true. However, I will counter with a quote of my own
Johnson: If it hasn't been tested then it doesn't work.
Specifications need testing more than anything else. Prototyping
is a way of testing specifications. Program verification doesn't
help specifications very much. Instead, it helps ensure that
the code that you produce matches the specification. Thus, testing
and verification play complementary roles.
However, there are cases where program verification is much better
than testing at showing that a system behaves as expected. For
example, a couple of years ago there were some people who were
killed because an X-ray machine emitted much more power than it
should have. The problem was in supposedly in the user interface;
the operator missed a decimal point or something and specified
too large a number. People claimed that the user interface should
have caught this. Maybe, but if I had been building the system and
I had been told that NEVER, under ANY circumstances, should the
machine emit more than X units of power, then I would have found the
one subroutine that turned on power, and written it as
EmitPower(power) {
if power < X then
do lots of low level stuff
}
Of course, you would have to test that the low level stuff really
implemented the high level abstraction. However, after that you
would KNOW that the program would never ask for the fatal power.
Naturally, if the < operation on the computer breaks, then a
patient might still be zapped, but that is a lot less likely than
operator mistake.
Ralph Johnson -- University of Illinois at Urbana-Champaign
In article <1021...@p.cs.uiuc.edu> joh...@p.cs.uiuc.edu writes:
>
>For example, a couple of years ago there were some people who were
>killed because an X-ray machine emitted much more power than it
>should have.
The correct radiation was released, but it was released without the
proper filter being in place to reduce the amount of radiation that
reached the patient.
>The problem was in supposedly in the user interface;
This is wrong (along with everything in the article that follows this
statement). The problem (actually two problems found so far) was in
the code. I think this wrong information originally appeared in an
article about this incident that appeared in Datamation, and it has been
repeated widely. Although the user interface was far from desirable, it
did not cause the accident -- poor programming did.
My paper describing the incident details will be available in a month or
so, and I will try to get it published quickly -- I have not decided where
to send it yet. In the meantime, please be careful when repeating rumors
about the Therac accident. So far, everything I have seen written about it
has been completely wrong.
nancy
--
Nancy Leveson
In article <1021...@p.cs.uiuc.edu> joh...@p.cs.uiuc.edu writes:
>
>For example, a couple of years ago there were some people who were
>killed because an X-ray machine emitted much more power than it
>should have.
The correct radiation was released, but it was released without the
proper filter being in place to reduce the amount of radiation that
reached the patient.
>The problem was in supposedly in the user interface;
This is wrong (along with everything in the article that follows this