I'm trying to discover the real reason why people do not prove their programs
are correct. I would like to collect those reasons --- even those snappy
one liners that we all use as excuses.
Please forward your comments ( short flames ok) to me by E-mail. I'll post
the replies if there is sufficient interest. Thanks.
Steve (really "D. E.") Stevenson st...@hubcap.clemson.edu
Department of Computer Science, (803)656-5880.mabell
Clemson University, Clemson, SC 29634-1906
Theorem provers are too slow and expensive.
Programmers are too lazy to write assertions and invariants, especially
for complex code (which if they weren't lazy wouldn't be so complex).
Efficient compilers for programming languages that support automatic
verification are not widely available.
Dave
Theorem provers might be slow and expensive, but program provers are
not available at any speed or price. Indeed it is doubtful that such
programs can be written at all. (See the CACM article entitled
"Program Verification: The Very Idea". I can no longer remember the
exact CACM it was in, sorry. My best guess would be Sept. 1986 or 1989.)
>Programmers are too lazy to write assertions and invariants, especially
>for complex code (which if they weren't lazy wouldn't be so complex).
As was discussed in another newsgroup (comp.ai or comp.object), assertions
have little to do with proving a program correct. The best they can
do is prove that the assertions are met. While this is a good thing,
it does not PROVE anything. How do you prove that the assertions assert
anything useful?
Blaming the programmers is a cheap shot. If that were the only problem,
then some programs would be proven correct. The fact that almost no commercial
or military programs have ever been proven correct suggests that the
technology is not available. Remember that assertions are not nearly
enough.
Dave Decot makes it sound like there is a choice. You can choose to
prove your program correct, or not prove it correct, but this is false.
With current technology you cannot prove a (real-world) program correct.
Joshua Levy
-------- Quote: "If you haven't ported your program, it's not
Addresses: a portable program. No exceptions."
jos...@atherton.com
{decwrl|sun|hpda}!athertn!joshua work:(408)734-9822 home:(415)968-3718
> Theorem provers might be slow and expensive, but program provers are
> not available at any speed or price. Indeed it is doubtful that such
> programs can be written at all. (See the CACM article entitled
> "Program Verification: The Very Idea". I can no longer remember the
> exact CACM it was in, sorry. My best guess would be Sept. 1986 or 1989.)
This article is only of interest because of the resulting CACM Flame-Fest
that ensued. The article itself merely makes the obvious statement that
a program-proof is wrt some set of axioms about the behaviour of the hardware
that the program runs on. The article's author then concludes that you "cannot
prove programs correct" because the hardware may not satisfy the axioms.
Personally, I would be very happy with a program that would only malfunction
if the hardware adds 1+1 and comes up with 3.
What the author (and apparently the poster) fails to realize is that the
purpose of a program proof in the real world is to increase your confidence
that the program will do what you want it to do. Sure, some academics with
axes to grind (and grants to obtain, and tenure to achieve) indulge in some
faulty advertising, but anyone seriously working in the area is aware that
a proven program can still fail to do what the user wants it to do.
Of course the hardware axioms could
be unsatisfied, the proof itself could be faulty, a faulty compiler could be
used, etc., but that does not eliminate the fact that a program proof
increases your confidence that the program does what you want it to do.
> As was discussed in another newsgroup (comp.ai or comp.object), assertions
> have little to do with proving a program correct. The best they can
> do is prove that the assertions are met. While this is a good thing,
> it does not PROVE anything. How do you prove that the assertions assert
> anything useful?
This comment applies to any technique that can be used to increase your
confidence in a program, and is totally pointless. To repeat, the point of
a program proof is to increase your confidence that the program does what
you want. Neither it, nor any other technique will guarantee it. Sometimes
testing is better, sometimes code walkthroughs, usually a combination of
all the various techniques is best.
Geoff Clemm
I can't answer the question "why I don't prove my programs correct"
because I do. All the time. As a matter of course. Almost without
exception.
As a practical matter, the issues Fetzer raised in his diatribe passing
falsely as a technical article can be largely ignored, both from the
standpoint of engineering and mathematically. You simply must know
where to draw the line. You assert a semantics for whatever you can't
prove, assume that as part of your theory, and proceed. When you
realize your code, you firewall the parts which are assumed but unproven
with assertions that are executed at run time, and ensure that the
semantics you assumed are the semantics you got. Thus as a matter of
engineering, Fetzer's Dilemma -- which we might state as "how can
mathematics which is deductive apply to the real world, about which we
can only gather data and validate hypotheses?" -- can be (almost)
ignored. All the issues raised are issues that apply equally to other
pure vs. applied mathematics dichotomies; all must be handled by the
(implicit or explicit) assumption that our physical models are a
sufficiently good approximation of the "unspeakable" real world.
But, in fact, these issues had been examined at some length in the
formal methods community. Had Dr Fetzer been conversant with the
literature, he would have known it; had the reviewers been
appropriately chosen, they would have called him on it. The fact that
Dr Fetzer's article was published without appropriate examination of
the literature is failure on the part of CACM which I think is very
hard to excuse.
*On the other hand*, Fetzer's Dilemma does expose an unfortunate
tendency in some parts of the formal methods community that we might
term "Hoare's Folly." Tony Hoare has said in print that proven
programs remove the need to debug *and to test*. For the reasons that
Fetzer described, this is clearly untrue. You don't know the way the
apple tastes until you taste the apple. The program as a mathematical
construct can be proven; the realized program as a physical construct
-- a sequence of executable operations in a physical memory and their
execution in a physical processor -- can AND MUST be tested.
It is a myth that programs must be verified by theorem provers to be
proven. Hand proofs can be done; in several small experiments I have
seen very large improvements in quality as measured by number of
defects seen per thousand lines of code (I know, I know, but propose a
better measure and I'll use it.) These are only slightly better than
pure anecdotes; I am currently looking for research funding to perform
experiments which would more strongly confirm the hypothesis.
It is a fact that many of the clients or customers for very highly
trusted programs require verification using a mechanical prover or
proof checker. I am willing to work with one when it is required, but
there is as yet little evidence that they provide more certainty than
hand proofs, and hand proofs are lots cheaper. I can't help but think
that the use of theorem provers in this context is often a case of the
old myth that if it's been blessed by a machine, it is somehow ennobled
and incapable of error.
(I'm not claiming that theorem provers are useless, either. There are
contexts in which they can be very useful. There is ongoing research
which will help to delimit these contexts. We need to know more about
it.)
Program proving isn't a panacea: among other things, a proof doesn't
tell us a damned thing about the suitability of the requirements to
the user's needs. I don't see how formal methods can help much with
that question, but there is a lot of "requirements engineering"
literature that I should read before I have an informed opinion.
Pant, pant.
These are, of course, only my opinions. They are not necessarily held
by Duke University, the Department of Computer Science, or any of my
consulting clients, and would probably be bitterly contested by some
subset of the above.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
> Theorem provers might be slow and expensive, but program provers are
> not available at any speed or price. Indeed it is doubtful that such
> programs can be written at all. (See the CACM article entitled
> "Program Verification: The Very Idea". I can no longer remember the
> exact CACM it was in, sorry. My best guess would be Sept. 1986 or 1989.)
Could someone post the exact reference, please ?
Periklis Tsahageas
Software Eng. IV
Imperial College, London
In my corner of the real world, most programs consist mainly of operating
system and other outside vendor supplied subsystem calls glued together with
rather trivial computations. It seems to me that it is futile to attempt to
prove my programs correct if the low level operations (from my point of view)
are not rigorously defined (let alone proven to be correct themselves).
Thanks . . .
John Kimball
Domain: jkim...@src.honeywell.com Honeywell Systems and Research Center
postm...@src.honeywell.com Computer Sciences/Software Technology
uucp: <any-smart-host>!srcsip!jkimball 3660 Technology Drive, MN65-2100
voice: 612/782-7343 fax: 612/782-7438 Minneapolis, MN 55418-1006
Well, I don't excuse it. Has anyone noticed the dumbing down of CACM in
order to try to get a broader membership? Long gone are the days when
you could find mainly articles like "On the Probability Distribution of
the Values of Binary Trees", "Application of Game Tree Searching Techniques
to Sequential Pattern Recognition", and "Complex Interval Arithmetic" (these
titles taken at random from CACM, Vol. 14, No. 2, Feb. '71 - and, BTW, the
sixties issues were even better).
Instead we get whole issues devoted to "Computers and Society", which, although
an important subject, is not one which can be very usefully applied to most
Comp. Sci. professionals (although we build the tools, we often have little
say in usage policy). The inclusion of more "practicum" articles rather than
research is another sign of decline. And the final nail in the coffin
has been the recent inclusion of a Personal Computing column. Any "technical"
magazine which allows one of these rotten apples into its pages has never stayed
"technical" long. I don't know if it's because the editorial policy has
shifted before the column appears or if it's just some sort of karmic influence.
However, I still believe that before long we will be seeing articles on "How I
Used Lotus 1-2-3 to do a Least-Squares Analysis" or others of its ilk (after
all, we do have to increase membership, eh?).
All in all, I wish that I could drop the $30 charge for CACM and keep my
membership for Computing Surveys, ToPLAS, SIGArch, SIGPLAN, etc. However,
ACM will not allow me to do this. It is becoming more and more obvious to
me that CACM is becoming more of an embarrassment than a flagship publication.
Maybe it's just that I'm grumpy today. Has anyone else noticed this, or am I
just getting continually more grouchy as the years go along?
--
Frank A. Adrian
Mentor Graphics, Inc.
fra...@mntgfx.com
I've been trying to follow this string, remembering those days in college
when the instructor indicated that program proof of correctness was
possible, but intractable. Now someone claims to do it. Apparently,
often.
o- Do you use a program or tool to do it?
o- How complex a program are we talking about?
o- If a tool is used, how would I go about acquiring it? If a tool isn't
used, what literature (books, articles, or pamphlets) is available to
the "engineer in the field"?
And, if this posting makes me sound ignorant or ill-informed, please
be kind in your flames...
Mark Runyan
SCENARIO: you're working for MicroS*ft on their new whiz bang C++
compiler (I don't even know if they *have* a C++ compiler :-). You
work hard at your assignment (the parser, for example). You tell your
boss that you think it works, but you have discovered that your code
is in the class of undecidable programs, being transformed from
Godel's famous 1931 example.
You think your boss will be pleased? I doubt it!
Indeed, I doubt *ANYONE* on the entire net has *EVER* written a
real-life program which couldn't in theory be verified. We all (well,
except for a few of those who posted earlier) know that finding a
decision procedure for the predicate calculus (called the
"Entscheidungsproblem" and "Hilbert's Program" in various places) is
impossible, and therefore the halting problem and program verification
are undecidable.
But this is a *theoretical* concern which *rarely* impacts our practical
"here's a program, boss" world.
The fact of the matter is that program verification is *hard*. Its
limitation in everyday life is that it's *hard*, not that it's undecidable.
Furthermore it *is* semi-decidable. I.e., there exist procedures which
will always detect that correct programs *are* correct. The problem
is that no *algorithm* can exist for this, since somewhere out there
there exist programs which will defy proof (or perhaps they are false
but they defy finding a counter-example).
So don't think you can't find *anything* via program verification.
It's just that you can't find *everything*.
Marshall Cline
--
===============================================================================
Marshall Cline/ECE Department/Clarkson University/Potsdam NY 13676/315-268-3868
cl...@sun.soe.clarkson.edu, bitnet: BH0W@CLUTX, uunet!clutx.clarkson.edu!bh0w
Career search in progress: ECE faculty. Research oriented. Will send vitae.
===============================================================================
There does not seem to be much recognition by anyone in any of the
groups which are discussing this subject that there is a difference
between *UNDECIDABILITY* and *INCOMPLETENESS*. (Sorry, this isn't
meant to be a flame, I just don't want to sound like I'm picking on you
in particular, Marshall.) The fact that the halting problem is
undecidable does not mean that we cannot prove whether or not a program
halts. What it does mean is that there is no algorithm which will come
up with the answer. The impact of this is that there can be no program
which will take specification and implementation as inputs and decide
whether the implementation is correct. However, this does not mean
that there is no proof of correctness. It certainly does not mean that
there is no proof checker. In other words, there can be a machine
which will take specification, implementation and `proof' of
correctness and decide whether the `proof' really is a proof.
Insomuch as a verification relies on `theorems' for which there is no
known proof, I for one would much prefer to know that the theorem is
taken as an axiom. It is not beyond the state of the art to have a
verification theory which is complete whenever the theory of the
underlying data types is complete (i.e. the things you want to
manipulate in the program). This is the best we could hope for. It
means, for instance, that we do not know a proof that
IF ((n>2) AND (a>0) AND (b>0) AND (c>0) AND
((a^n) + (b^n) = (c^n))) THEN p ELSE q
is equivalent to q under the assumptions that a, b, c and n are all
integers and we have some way of evaluating the equality correctly.
On the other hand, anything which cannot be proved can be reduced to a
statement in the theory of some of the data types which cannot be
proved. I think most programmers would shy away from depending on such
theorems for correctness without some explicit recognition of the fact.
Geoff Barrett
Fetzer, James H. "Program Verification: The Very Idea."
Communications of the ACM, Volume 31, Number 9 (September 1988), pp. 1048-1063.
Hope this helps.
chris...
--
c...@tove.umd.edu Computer Science Dept, U. Maryland at College Park
4122 A.V.W. 301-454-8711 <standard disclaimers>
As a publication which is directed to ALL MEMBERS, CACM must publish
material which is relevant to its audience. Those who want narrowly
targeted technical material should read the more specialized technical
journals. CACM, which once seemed to feature more Greek letters than
ASCII characters, has now become a (relatively) lively and readable
publication. Given that CACM's audience includes both researchers
and practitioners, it would seem that a judicious mixture of research
and practicum would be most appropriate. Research results need to be
communicated to practitioners in a readable form, and practical
considerations should help to guide researchers in their search
for things which will be of significant benefit to society. CACM
serves to prevent either community from becoming too absorbed in
itself, and facilitates communication within the computer science
community.
> All in all, I wish that I could drop the $30 charge for CACM and keep my
> membership for Computing Surveys, ToPLAS, SIGArch, SIGPLAN, etc. However,
> ACM will not allow me to do this.
As described above, there may well be good reasons for this policy.
> Maybe it's just that I'm grumpy today.
Let's hope so.
Bill Wolfe, wtw...@hubcap.clemson.edu
True, but for safety critical systems there is verified hardware out there:
the VIPER chip.
Also safety critical stuff tends to be small and embedded (no OS).
There are folk working on verified OS's.
The approach that you criticize is similar to that of the IEEE societies
(e.g., the Computer and Communications Societies) and their 1) general
purpose magazines and 2) special purpose transactions. What would you
propose that the ACM do differently? I certainly prefer the ability to
select the technical areas that I pay for by joining SIGs and
subscribing to transactions.
The trend that you have noticed in publications is paralleled in
conferences. While the NCC has fallen on hard times, conferences with a
narrower focus (e.g., CSCW, HCI, OOPSLA and TRI-Ada) are flourishing. I
think these trends are symptoms of growth in both scope and depth in the
CS/IS/SE field(s). Some realignment of ACM publications (baseline 1971)
was necessary. I would hope that our (the ACM) publications board would
welcome suggestions on how to better serve the growing and diverse
interests of our fellow members.
--
/s/ George vmail: 703/883-6029
email: mitc...@community-chest.mitre.org [alt: gmit...@mitre.arpa]
snail: GB Mitchell, MITRE, MS Z676, 7525 Colshire Dr, McLean, VA 22102
>>I've been trying to follow this string, remembering those days in college
>>when the instructor indicated that program proof of correctness ...
Note the term I used, "proof of correctness"...
>Indeed, I doubt *ANYONE* on the entire net has *EVER* written a
>real-life program which couldn't in theory be verified. We all (well,
>except for a few of those who posted earlier) know that finding a
>decision procedure for the predicate calculus (called the
>"Entscheidungsproblem" and "Hilbert's Program" in various places) is
>impossible, and therefore the halting problem and program verification
>are undecidable.
There is a difference (perhaps only in *my* mind) between using the
predicate calculus proof of correctness and program verification. If
we call code inspections and regression testing "proofs", then, I do
prove my programs. I was under the mistaken impression that someone
was using an automated proof assistant. My apologies to netters for
stepping into an argument where I didn't share the same understanding
of the terms.
Mark Runyan
How about:
- Careful review by researchers in the field, so that something like
the Fetzer paper doesn't happen again
- "targeted" issues that explore topics in depth with research
papers and in-depth survey papers, such as IEEE Computer does
- A detailed goal for CACM -- right now it seems to try to be
- a technical magazine
- a club news journal
- a practitioner's magazine ("the new Dr. Dobbs")
and it ends up not doing any of these very well
Actually, at this point ACM has managed to alienate large portions of
its research audience, and recovery is going to be difficult. That's a
pity since it used to be a flagship publication, useful to researchers
and practitioners alike. Right now, there really isn't enough meaningful
content to be useful to many people.
Tim
I guess it depends upon what role you think the "flagship" publication
of the ACM should fulfil. I believe that generalist articles create a
common ground of interest. The SIG publications contain enough technical
literature without filling up CACM as well.
I think we have to acknowledge that CS professionals play a very
important role in society and issues such as database matching, organizational
impact of IT, computer based elections etc. should be the concern of these
people. I'm always reminded of people like Oppenheimer who have had an enormous
effect on the kind of world/societies we live in. Frankly, I'd rather read
about his pre and post reactions to the Manhattan project or something like
that than another paper on "how I made a bigger bang", important though
that might be.
I guess it's a sense of balance that we should pursue. I certainly
don't agree that CACM has been "dumbed down" by being more generalist. If
anything, many of the problems raised in the CACM generalist pieces are
more difficult to solve than the papers which describe a problem with say,
data structures and "How I did something clever with B-Trees" to solve it.
The latter is probably more relevant to our daily work, but we need a
perspective beyond this as well ? :-)
Cheers.
__ _______________________W_(Not Drowning...Waving!)___________________________
Perry Morrison Ph.D
SNAIL: Maths, Stats and Computing Science, UNE, Armidale, 2351, Australia.
per...@neumann.une.oz or pmor...@gara.une.oz Ph:067 73 2302
With regard to the former. While some of you may be involved in
projects with verified code; most of us are forced to use UNIX, VMS,
MSDOS, CMS, ORACLE, USENET, TCP/IP and other *useful* systems which
are most definitely *not* verified. The compilers are not verified. The
hardware is not verified. And we are writing software for use
by human beings whose behavior must be included in the model of the
system we might attempt to verify. Now, this is not to say that the real
world of "software engineering" would not be improved by "proven"
programs, nor even to say that systems *cannot* be proven correct. But
there is a difference between mathematical models of computation and
practical implementations of computing systems (just as there is a
difference between theoretical and applied mathematics). Whether
Fetzer's arguments are correct or not, the questions raised are
important for the practicing programmer.
Now, ACM is not just for grad students and researchers. There are
various archival journals (e.g. JACM) published by ACM which are
not "dumb." For the professional practitioner there are many real
societal issues that are rightfully addressed in CACM. Personally,
I would prefer more "Programming Pearls" and "Literate Programming"
type content and less Greek gobbledegook; however, those of you
fortunate enough not to have to get working systems out the door
in the next 3 months might have other preferences. And that's okay.
--
Brad Sherman (b...@alfa.berkeley.edu)
> With regard to the former. ... Whether
> Fetzer's arguments are correct or not, the questions raised are
> important for the practicing programmer.
That is exactly the point. The questions are important. The problem with
the CACM is that the article was a travesty. The arguments ranged from
trivial to misleading (at least they were not actually false). There was
a recent posting in this news group that was of far higher quality. Back in
the 60's, CACM would have published an article that accurately portrayed
the current state of program proving, and detailed a new result in the
area, understandable by a non-expert who was willing to work at the article.
Perhaps it is not CACM's fault ... perhaps (shudder) those are the best
articles that are being submitted. The problem is that IEEE Computer is
doing a much better job, and THAT indicates that there may well be something
wrong with CACM.
Geoff Clemm
***************************************************************
**** For a large class of programs, "correct" is UNDEFINED. ***
***************************************************************
In general, the big, real-world programs that we spend big bucks to
create, things like telephone systems, air traffic control, SDI, 1-2-3,
and "unix" don't support the abstraction of "correct." Their
specifications are partly on paper, but mostly exist as concepts of
greater or lesser exactness in the minds of a large number of people.
Obviously, these concepts are not identical from one person to another.
When a program functions in ways that most of those people decide agree
with their concepts, it is said to be "working;" when it functions in
ways that most decide disagree, it is said to be "wrong." (When there is
no majority opinion either way, it is "unfinished.")
I'm afraid that belief that we will someday be able to prove our
largest, most important programs correct has become one of my tests of
native intelligence, along with belief in astrology and support of the
flag-burning amendment.
-- Bob <Mu...@MITRE.ORG>, linus!munck.UUCP
-- MS Z676, MITRE Corporation, McLean, VA 22120
-- 703/883-6688
I just gave a talk on myths of verification, and this was one of the
ones I mentioned. It was true in a sense, a long time ago, when
Floyd-style proofs were the only ones that could be done. It is no
longer true now, in the sense that proof of program texts or notations
that are semanitcally reasonably well behaved can be done. It is still
true in the sense that there are classes of programs and programming
languages which are so badly behaved that it's hard to prove anything
about them. Ada and C are both in this class. Interprocess
communication tends to be difficult, asynchrony is hard (but the
technology is catching up, see e.g. some of Leslie Lamport's work.)
The basic trick comes down to this:
(1) describe your specification as formally as you can. Things like Z
specifications are very handy, but a little mathematical maturity and
some thought can lead to a document in mathematical prose which encodes
most of the hard parts in a crisp way. (And recognize that there will
be requirements which are not amenable to formality: "the system will
be built in a well-structured fashion"; "the human interface will be easy
to use and consistent." You have to pay lip service to them, because
saying "this is crap" will get you talked about at annual reviews, but
you should generally try to do a good job and challenge them to show you
didn't meet it if pressed.)
(2) describe your program first in a notation that is well-behaved, and
for which the semantics are simple. I use Dijkstra's notation ("the
programming calculus") because it is the first one I used, I'm
comfortable with it, and I find the reasoning rules intuitive. But use
the _Science of Programming_ [Gries] to get a start with it, not
Dijkstra's book.
(3) complex programs, especially programs that depend on the system in
which they are embedded (e.g., window programs), will have things that
you can't prove, but must assume. This need not stop you. You must
first try to understand those things formally; second, state them
formally in the semantic formalism you are using. I would, for example,
try to state the weakest-precondition specification for a window routine
using informal arguments. Third, when you write your program, make sure
that you TEST, yes test, the program with particular attention to the
things you couldn't prove.
As an aside, this does NOT mean that there is no "proof" because the
proof depends on components that are unprovable; it does not. What
you are doing is introducing new axioms into your theory; these axioms
are statements about the semantics of the operations which you can't
prove. As in any usual deductive system, either these statements are
theorems or they are not. If they are theorems (i.e., the unproven
chunks do behave as specified) then you're in fat city. If not, you
will have a defect there, so long as the defect in the underlying stuff
is exercised by what you write. Conveniently, most of these things are
also things which are extremely well tested, so you have good evidence
to believe that it will work. This is very much like the way
mathematics has been done in history: Euclid's axioms are formal
statements of things that were observed to be true in many cases. The
formal system derives results that always correspond to reality on
Earth so long as the scale is limited. But when you get outside of the
range that Euclid could observe, or the limits of classical Greek
measuring instruments, you observe the axioms do not correspond to
truth in the large.
As another aside, there is a sort of competing body of work on the IBM
Software Clean Room. I don't have references right at hand, but they
should be easily found. They use a notation that I consider exceedingly
ugly, but they get good results in their horde-of-programmers
environment. I don't personally think their methods are completely
workable for smaller groups, or groups with less formal structure, which
also describes the kind of group I like to work in.
(4) Code the program in any convenient language. I use C or C++ almost
exclusively. This coding is translation from the design notation to an
executable notation; you should argue for each translation that the
semantics of the executable match what you describe in the design. This
is easier than it sounds, but you have to be careful about places where
the language is weird. One that I've run into recently is that there
are cases where a[i] is NOT the same as *(a+i), in all compilers I can
test with.
>o- Do you use a program or tool to do it?
Not regularly, although I have done so. It is much harder using tools
to do the proofs, in general.
>o- How complex a program are we talking about?
I've done programs up to about 10K source lines of C. The IBM Clean
Room people have done several bigger ones with many programmers working
together.
>o- If a tool is used, how would I go about acquiring it? If a tool
isn't used, what literature (books, articles, or pamphlets) is
available to the "engineer in the field"?
As I said above, I like Science of Programming by Gries. I have a book
called _Program Derivation_ by Geoff Dromey (Addison Wesley 1989) which
looks pretty good, but I haven't had time to read it closely. I hope
someday to write The Ultimate Book On Proof In Software Engineering, but
I wouldn't hold my breath if I were you. There are a number of other
books that others may recommend, which I like more or less well.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
In real programs, the whole halting problem (and the stronger
equivalence problem) does not apply, because all real realizations of
programs are finite state machines; both of these are decidable for
finite state machines.
If someone wants to claim they are NOT finite state machines, I want to
see your machine with infinite memory registers.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
That simply isn't true. Fetzer's argument is precisely that program
proofs are flawed on the face, that no proof can ever give insight or
certainty to the behavior of a real program on a real machine, because
of the distinction between the mathematical model and the physical
world.
But one can argue on the same basis that Euclidean geometry has no use
and gives no insight, because the same distinction between mathematical
model and physical world exists; worse, Euclid's axioms for plane
geometry can be demonstrated false in the real world. Thus it is
useless and morally wrong for surveyors to learn Euclidean plane
geometry, because the real world doesn't fit the model.
Furthermore, the questions raised -- while they are real and significant
questions, and must be considered, and while there clearly are people
who needed the reminder -- have been explored in the verification
literature. They form the basis of at least three research efforts I
know about (the CL Inc Trusted Stack, the Odyssey hardware verification
efforts, and the UK verified hardware/software stuff). Fetzer's paper
did not explore that existing research, and further used very old
publications exclusively. Fetzer came up with a real issue but made no
contribution other than filling a number of pages, because this
literature was not reviewed. It was an extremely poor showing on the
part of CACM; my personal opinion is that the editors involved should
resign.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
This is quite true. It does not mean, however, that there are not
PROPERTIES of such programs which we want them to have which we could
conceivably formally verify.
I agree completely that we cannot in general say what it means for
something like an operating system to be "correct". We therefore can't
prove the OS "correct" because we don't even know what that means.
Even if we did take a stab at writing down in complete detail
everything we wanted the OS to do, it would probably wind up being so
huge that we'd never be able to verify it, formally or otherwise.
In general, though, we don't consider it critical that a program like
an OS perform EXACTLY as we'd like it to in every single instance.
Imagine a system which must function in a faulty environment. We might
consider it critical that the system fail safely, that is, shut down
rather than making certain kinds of mistake. It is conceivable that we
could formally specify and verify that such a system fails safely in
this sense without proving that the system always did exactly what we
wanted it to. In fact, the system might still have lots of bugs in it,
and make a host of less critical mistakes.
Even if our specification only covers SOME of the critical requirements
of the system, it may still be worthwhile to verify that the system has
the properties which ARE covered by the spec.
In a nutshell: formal verification aims to prove that programs have
desirable properties, not that they are "correct" in some generic
sense.
P.S. Please no responses arguing with my informal specification of
what it means to "fail safely". I understand that fault tolerance is
not so simply stated.
--
Ian Sutherland ian%orava...@cu-arpa.cs.cornell.edu
Sans Peur
> No one has said this yet, though James Adcock and Joshua Levy have
> alluded to it:
> **** For a large class of programs, "correct" is UNDEFINED. ***
> In general, the big, real-world programs that we spend big bucks to
> create, things like telephone systems, air traffic control, SDI, 1-2-3,
> and "unix" don't support the abstraction of "correct." Their
> specifications are partly on paper, but mostly exist as concepts of
> greater or lesser exactness in the minds of a large number of people.
> Obviously, these concepts are not identical from one person to another.
It is true that a proof of correctness requires that you come to some kind
of agreement on what the program should do. I find it amusing
(although perhaps a bit sad ...) that this is used as an argument
_AGAINST_ using proofs of correctness.
You prefer your air traffic control to exist "as a concept of greater
or lesser exactness in the minds of a large number of people" ?
I realize that it involves more mental effort to decide exactly what you
want your program to do, and then to make it do that, than it does to hack
something together and say "the system does some combination of whatever
vague idea each programmer had in mind while they were hacking".
I propose that the effort be made ... certainly your users would appreciate it.
You will find that anyone trying to maintain the system will appreciate it
as well.
Geoff Clemm
The fact that the requirements for a very large system are not completely
and unambiguously specified (or understood) by either the user or the
procuring agency has nothing to do with whether the implementation of
that system is based on "hacking" or accepted software engineering
practices (including extensive requirements analysis).
Indeed, the larger the system the less likely it is that anyone knows
"exactly" what the program should do AND the less likely it is that the
development team can "hack something together" that can even be
satisfactorily integrated much less satisfy the large number of
requirements that have been expressed.
Yes--it started, IMHO, about the time they switched from the black, white, and
blue covers. It was truly sad to see the journal that once gave us papers
such as in the twenty-fifth-anniversary issue shortly thereafter run a
comparison of CP/M-80 and HDOS.
In a way, though, perhaps it's hard to avoid--the neat papers these days
probably go into the more specialized publications. (And even the remodeled
CACM sometimes has some worthwhile stuff--the one that sticks out in my
mind was a study of the politics of introducing or changing a computerized
system.)
James Jones
>In article <1990Jan22....@mentor.com> fra...@mntgfx.UUCP (Frank A. Adrian) writes:
>>Well, I don't excuse it. Has anyone noticed the dumbing down of CACM in
>>order to try to get a broader membership?
>Yes--it started, IMHO, about the time they switched from the black, white, and
>blue covers. It was truly sad to see the journal that once gave us papers
>such as in the twenty-fifth-anniversary issue shortly thereafter run a
>comparison of CP/M-80 and HDOS.
One wonders what the 50th anniversary issue will look like :-)
>In a way, though, perhaps it's hard to avoid--the neat papers these days
>probably go into the more specialized publications. (And even the remodeled
>CACM sometimes has some worthwhile stuff--the one that sticks out in my
>mind was a study of the politics of introducing or changing a computerized
>system.)
Yes, there is a lot of truth in this and the statements put forth by
others that the aim of CACM has changed over the years. However, I think
that they have watered things down so much that they miss the mark completely
most of the time. Let me explain.......
I rarely find interesting papers in CACM anymore. I am a practitioner,
not a researcher. Although I tend to pay attention to research work more
so than the average practitioner. I do subscribe to some of the SIG
publications for areas that I am interested in. However, I can't afford
to subscribe to *all* of the "Transactions on ..." series of journals.
Unless your organization has a well stocked library, you miss out on these
papers.
Most of the CACM papers are useless to me. I would much prefer that
perhaps some of the papers that go into the specialized journals could
go into CACM later on for exposure to the general ACM population. Perhaps
the papers could even be rewritten for a less knowledgeable audience, but
not as a tutorial or survey.
The computing literature is extremely hard to keep up with. Even
for a specialist there are so many journals, societies, conferences,
and workshops to keep up with. For the practitioner, who may need to
keep abreast of many areas at once, this is a nightmare. A flagship
publication, like CACM, could help to disseminate some of the more
outstanding work to the entire community.
Someone else stated that the best part of CACM these days are the
Programming Pearls and Literate Programming columns. But how often
have we seen this during the past year?????
George W. Leach AT&T Paradyne
(uunet|att)!pdn!reggie Mail stop LG-133
Phone: 1-813-530-2376 P.O. Box 2826
FAX: 1-813-530-8224 Largo, FL 34649-2826 USA
Actually, all that is needed is for a machine to have *infinitely
expandable* memory, and mine has that -- I can always go out and buy
another disk. (One of my professors once described a Turing Machine
as having a tape that ran out the door and into a paper factory.)
In any case you can probably show that in order to solve the halting
problem for FSM's with < N states, you need a FSM with > N states.
So the problem is still effectively unsolvable on any given set of
computers -- i.e. no computer in the set can solve the halting problem
for all computers in the set.
IMHO the *real* problem with verification is that it gives you a false
sense of security because (1) you don't know whether what you
initially specified is what you really wanted, and (2) you don't know
what real requirements you left out. (E.g. reasonable bounds on time,
memory, and accuracy.)
--
\ Steve Savitzky \ ADVANsoft Research Corp \ REAL hackers use an AXE!
\ st...@arc.UUCP \ 4301 Great America Pkwy \ #include<std_disclaimer.h>
\ arc!st...@apple.COM \ Santa Clara, CA 95954 \ 408-727-3357
\__________________________________________________________________________
I'm not sure that you're drawing a reasonable analogy here. Do surveyors
routinely include proofs of correctness with the maps that they construct?
--
Brad Sherman (b...@alfa.berkeley.edu)
You don't really believe that sin^2 theta + cos^2 theta = 1, do you?
But even if we use Turing machine or lambda-calculus as our "models",
the halting problem and related phenomena have little relevance. Just
because we can't prove that an *arbitrary* program will halt given an
arbitrary input does not mean we can't prove whether a *specific* program
will halt or not. And we're always working with *specific* programs!
The next time you come up with a useful program which inherently includes
the statement "You can't prove me correct", please let me know!
There is a parallel in the reaction of mathematicians to Godel's
Incompleteness Theorem, where they basically said "Ok, we can't prove
everything -- but we can still prove a lot of things."
Mike Lutz
Mike Lutz Rochester Institute of Technology, Rochester NY
UUCP: {rutgers,cornell}!rochester!rit!mjl
INTERNET: m...@csrit.edu
Sigh. Ever get the feeling you were trying to have a discussion across
the Grand Canyon? No wonder recent graduates often spend their first
few years in the "real world" in a state of shock. It's a second birth
trauma.
My point, of course, is that "exactly what you want your program to do"
is not decidable. No one person can grasp the totality of a system of
the size we're trying to create. The interactions of parts are beyond
our ability to understand. There are an immense number of logic
contradictions, misunderstandings, and blank areas.
Remember, no two people can be assumed to take the same meaning from a
single word like "safety." How, then, can we possibly prove the meeting
of a requirement that uses that word? Human limitations aside, how
could we ever program a computer to consider all possible functional
interactions of a system and decide if the result is "safe?"
Moreover, the world is not static (even at universities). Anecdote:
through a series of accidents, I once found myself in charge of a gang
of COBOL programmers maintaining a payroll system for a national
company, a relatively small system of about 0.5 MLoC. A great deal of
their time was spent changing the code to reflect changes in state and
local tax withholding. With 50 states and a great many towns and cities
churning out new rules, there was always a long queue of changes waiting
to be made, some for rules that had already taken effect. It's not much
of an exaggeration to say that the changes were continuous. What is a
"correct" program in such circumstances? ATC, telephone, C3I, and other
real-world software products have very similar situations; think of the
poor bastards trying to keep running a telephone company's billing
system these days!
Mr. Clemm (Ah Clemm's brother?) suggests that we "make the effort." I
hope he realizes that we'll have to enlist him, his children, his
grandchildren, and every other human being for a century or so to do it.
Heck, we wouldn't even be able to get everyone currently working on one
of these systems to whistle "Yankee Doodle" together.
Does anybody know where they can be found ?
Equivalently:
Since everybody has to get CACM it is inappropriate to make it so
poorly refereed and low in content that the majority of readers throw
it away unread.
The concern is NOT that there is practical
information in CACM. The concern is for when CACM deals with
a subject in a manner that is misleading to practitioners, either by
trivialization or by misleading statements, and for when CACM replaces
applicable research (not esoterica, but material that may affect how
you do your job in 1-5 years) with tips on how to run a PC.
Tim
This is an excellent idea. Science Magazine, the flagship publication of
the AAAS, is a good example of such a publication. I find it fascinating
although I am far from an expert in most of the fields it covers. In
contrast, I do not even bother to remove the plastic cover from my CACM
these days.
--
Nancy Leveson
This is not really a problem with verification, it's a problem with
people's expectations. Any time you want to apply mathematics to a
real world situation, you have to (1) model the real world situation by
a mathematical object, and (2) state the properties you'd like the
situation to have in mathematical terms. Any time you do this, you
are vulnerable to (1) mismatches between the world and your model, and
(2) mistakes in your formal expression of the desirable properties.
In most engineering disciplines, mathematics is used to increase
confidence that undesirable things will not happen, but is not
regarded as an absolute guarantee. If people had the same attitude
about verification, they wouldn't get a false sense of security,
they'd get a realistic sense that their security had been increased,
and they'd know that they still have to test.
Unfortunately, verification has been "oversold" by a collection of
overly zealous advocates. Not everyone working in the field is such
an advocate.
Cleanroom Software Engineering
Harlan D. Mills, Michael Dyer, and Richard Linger
IEEE Software, September, 1987
You will always always always run into a limit: the number of slots for
disk controllers, the number of disks per controller, the addressing
capability of the bus, something.
>
>In any case you can probably show that in order to solve the halting
>problem for FSM's with < N states, you need a FSM with > N states.
Given an FSM, you can immediately derive the set of all input strings
which lead to acceptance, and all other strings which do not lead to
acceptance. "Halting" per se isn't a problem in a deterministic
machine; it always halts when the tape runs out. And equivalence, which
is "harder," can be proven by reducing two machines to canonical
(minimized) form, and comparing.
>
>IMHO the *real* problem with verification is that it gives you a false
>sense of security because (1) you don't know whether what you
>initially specified is what you really wanted, and (2) you don't know
>what real requirements you left out. (E.g. reasonable bounds on time,
>memory, and accuracy.)
>
This is a semantic problem; you're using the word "specification" in a
different sense than I am. In my terminology you are saying that you
don't know if what you stated in the requirements really meets your
needs, and you're right. However, current methods give no assurance
that the results you get from programming really correspond to what you
thought you asked for. The point of verification is to try to get as
close as possible to having the suitability of the realized program be
dependent on the suitability of the requirements stated.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
Oh, crap. I'm sorry, but this is simply the single most asinine
argument that I've seen proposed. Never in twenty years of software
engineering have I seen a system where the question "does it work?" was
undefined. Sometimes it's POORLY defined, sometimes the definition
turns out to have contradictions, sometimes it turns out to be vacuous,
but all of them have some statement of the expected behavior.
Telephone systems, contrary to your assertion, have extremely stringent
definitions of what is "correct", stated in everything from CCITT
interface specifications to performance statements (e.g., 99.95 % correct
call completion at standard load.)
Program proof does in fact require more care in specification: you must
now state the conditions under which the program is considered "correct"
in a more formal way. The specification must be derived from
requirements, and making sure that the requirements are adequate is a
hard problem. However, the only sense of correctness that makes sense
in terms of program correctness is one which states that the program
produced meets the specification.
>I'm afraid that belief that we will someday be able to prove our
>largest, most important programs correct has become one of my tests of
>native intelligence, along with belief in astrology and support of the
>flag-burning amendment.
I only include this paragraph as an apology in advance for the somewhat
fiery tone of my reply. Clearly, I don't agree.
Charlie Martin (c...@cs.duke.edu,mcnc!duke!crm)
SCENARIO: you're working for MicroS*ft on their new whiz bang C++
compiler (I don't even know if they *have* a C++ compiler :-). You
work hard at your assignment (the parser, for example). You tell your
boss that you think it works, but you have discovered that your code
is in the class of undecidable programs, being transformed from
Godel's famous 1931 example.
You think your boss will be pleased? I doubt it!
Indeed, I doubt *ANYONE* on the entire net has *EVER* written a
real-life program which couldn't in theory be verified...
[more stuff in the same vein deleted]
This is stupid, and shows that you don't understand the problem.
In my experience, anyone who has written a compiler for a real
language (and I have) KNOWS that the compiler (and in particular the
parser) cannot be verified even in theory. This is true even though
he may have a formal definition of the source language as complete as
the PL/I standard. (Actually the compiler writer more often knows of
remaining bugs, so he knows that it is not correct.)
In your specific example of a C++ front-end, I would take this
statement as evidence of promoteability in a junior programmer. The
statement is correct, of course, and indicates that he actually
understands what he was taught. Before you disagree (or flame), go
back and read Godel. His famous paper is all about recognizing
languages.
[Later:]
Furthermore it *is* semi-decidable. That is, there exist procedures which
will always detect that correct programs *are* correct. The problem
is that no *algorithm* can exist for this, since somewhere out there
there exist programs which will defy proof (or perhaps they are false
but they defy finding a counter-example).
Nonsense again, you have it backwards. There is a large class of
programs, including compilers and operating systems, which can never
be proved "correct." It is often disgustingly easy to prove a
compiler or operating system is incorrect.
Note: Security verifications are a different class of animal. It
is possible to prove that an operating system never allows
unauthorized access, even though the system cannot be proved to never
crash. In fact, I know of one case where a certain type of attempt at
unauthorized access, if it got past certain controls could be shown to
always crash the system. Fine, the controls did not need to be
verified. If it only crashed the system if the file existed, the
controls would have had to be verified.
So don't think you can't find *anything* via program verification.
It's just that you can't find *everything*.
True. But the class of large software systems I seem to spend
most of my time dealing with are almost always in the *everything*
part. Proofs of correctness work on parts of the system, and are
very useful in some areas to improve reliability. But when someone
starts talking about verification and parsing in the same breath, or
provably correct operating systems, I get my crucifix, garlic and
wooden stakes. It's time to do another exorcism.
--
Robert I. Eachus
with STANDARD_DISCLAIMER;
use STANDARD_DISCLAIMER;
function MESSAGE (TEXT: in CLEVER_IDEAS) return BETTER_IDEAS is...
I was referring to diskettes, not drives -- I never said that manual
intervention by an operator might not be required. Yes, you will run
into limits (including the heat death of the universe); the point is
that a computer can be treated as a Turing machine FOR ALL PRACTICAL
PURPOSES, (Oops; I forgot -- verification isn't practical! :-),
because you can't put a finite upper bound on its number of states.
>>In any case you can probably show that in order to solve the halting
>>problem for FSM's with < N states, you need a FSM with > N states.
>
>Given an FSM, you can immediately derive the set of all input strings
>which lead to acceptance, and all other strings which do not lead to
>acceptance. "Halting" per se isn't a problem in a deterministic
>machine; it always halts when the tape runs out. And equivalence, which
>is "harder," can be proven by reducing two machines to canonical
>(minimized) form, and comparing.
Actually, it is perfectly feasible for an FSM with a finite number of
states to accept an infinite number of distinct strings (e.g. the set
of all strings of digits terminated by a period).
I think you missed my point, which is that whether or not the halting
problem can be solved *in theory* for real programs on real computers,
it cannot be solved *in practice*.
In any case, I don't think that any practical proof methods operate by
treating programs as FSM's.
>>IMHO the *real* problem with verification is that it gives you a false
>>sense of security because (1) you don't know whether what you
>>initially specified is what you really wanted, and (2) you don't know
>>what real requirements you left out. (E.g. reasonable bounds on time,
>>memory, and accuracy.)
>>
>This is a semantic problem; you're using the word "specification" in a
>different sense than I am. In my terminology you are saying that you
>don't know if what you stated in the requirements really meets your
>needs, and you're right.
The problem with "verification" is precisely that the *word*
"verification", with all its mathematical associations, leads to this
kind of semantic problem! (I.e. the expectation that a "verified"
program will work the way the customer wanted it to.) The more
complicated and formal the specification, the greater the likelihood
that (a) the customer will believe it and (b) the spec will be wrong.
As you say, ...
> However, Current methods give no assurance
>that the results you get from programming really correspond to what you
>thought you asked for. The point of verification is to try to get as
>close as possible to having the suitability of the realized program be
>dependent on the suitability of the requirements stated.
How about dropping the term "verification" in favor of something that
more accurately describes the state of the art, like "compile-time
assertion-checking" (or, if appropriate, "coding-time assertion
checking" ("semantics-directed editing?!") and "design-time
consistency checking").
Certifying the Reliability of Software. P. Allen Currit, Michael
Dyer, and Harlan D. Mills. Transactions on Software Engineering.
V. SE-12, No 1, January 1986.
Cleanroom Software Development, An Empirical Evaluation. Richard W.
Selby, Victor R. Basili, and F. Terry Baker. Transactions on Software
Engineering, V. SE-13, No. 9, September 1987.
Does anyone have any more recent papers that cover new ground?
Brian Marick
Motorola & University of Illinois
mar...@cs.uiuc.edu, uunet!uiucuxc!marick
This means that "formal verification" is just a sort of run-time lint.
Except, instead of checking for bad arguments, it checks for timing
violations, or something else you care about. I do not see how that can
be referred to as proving a program correct, which is how this thread got
started. You are talking about some sort of supe