Tcl on the pathfinder


Dan Kuchler

Jun 12, 2000, 3:00:00 AM
Several people showed interest in the use of Tcl on the Mars
Pathfinder.

I found the following by doing a web search:

(The paper is available from the URL)

http://www.neosoft.com/tcl/ftparchive/sorted/misc/Tcl_on_Pathfinder/

Tcl and Concurrent Object-Oriented Flight Software: Tcl on Mars

David E. Smyth
da...@devvax.jpl.nasa.gov
Mars Pathfinder Flight Software Team
Jet Propulsion Laboratory
MIDCOM Corporation

...This paper describes the current early development effort,
where we are using Tcl and its object oriented extension itcl,
combined with tclX, blt, and tk, as the language for inter-object
messages, for the monitor and control environment, and for the
initial implementation of several flight software responsibilities.
As the system develops, the flight software may remain as Tcl, or
it may evolve into C. The similarity between Tcl and C makes the
translation of objects from Tcl to C reasonably straightforward....

I haven't read the paper yet, but it sounds interesting.

--Dan

laurent....@uforce.com

Jun 13, 2000, 3:00:00 AM
On 12 Jun, Dan Kuchler wrote:
>
> Several people showed interest in the use of tcl in the Mars
> Pathfinder
>
> I found the following by doing a web search:
>
> (The paper is available from the Url)
>
> http://www.neosoft.com/tcl/ftparchive/sorted/misc/Tcl_on_Pathfinder/
>
> Tcl and Concurrent Object-Oriented Flight Software: Tcl on Mars

Yeah, that's the one!

L

--
Laurent Duperval "Montreal winters are an intelligence test,
mailto:laurent....@uforce.com and we who are here have failed it."
-Doug Camilli
Penguin Power! ***Nothing I say reflects the views of my employer***


Dan Kuchler

Jun 13, 2000, 3:00:00 AM
to laurent....@uforce.com

On Tue, 13 Jun 2000 laurent....@uforce.com wrote:

> On 12 Jun, Dan Kuchler wrote:
> >
> > Several people showed interest in the use of tcl in the Mars
> > Pathfinder
> >
> > I found the following by doing a web search:
> >
> > (The paper is available from the Url)
> >
> > http://www.neosoft.com/tcl/ftparchive/sorted/misc/Tcl_on_Pathfinder/
> >
> > Tcl and Concurrent Object-Oriented Flight Software: Tcl on Mars

I actually got ambitious and e-mailed the original author of the paper
to see if there was any followup paper done at a later point in the
project. I thought his reply was interesting, and thought some here might
be interested in reading about it. Enjoy.

--Dan

--------------------------------------------------------------------------

Hi Dan,

Feel free to post this.

No follow up paper was written. I used itcl/Tk (with object extensions) to
prototype the priority based communication strategy that we implemented on
Pathfinder, but I re-implemented it all in C before we flew it because we
could exhaustively test the object-oriented C, while we could not do the
same for the interpreted Tcl.

As the paper described, the Pathfinder software conformed (mostly) to the
Law of Demeter. All the parts that were cheap to develop, integrate, and
prove correct were Demeter compliant. LoD is the principle of responsible
objects that act on what they know. Essentially, all methods return null.
This means the objects can be in separate threads, running off message
queues (they don't have to be, but they can be, and you can't tell if they
are or are not). This message based architecture seemed a natural place to
apply the Tcl concept of sending scripts around, with each object implementing a
Tcl interpreter. So that was how we started.
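
A minimal sketch of that idea, objects that each own a Tcl interpreter and receive scripts as messages, might look like the following. This is not the Pathfinder code: the "battery" object, its "drain" method, and the plain list used as a message queue are all hypothetical, and the sketch assumes a Tcl version that provides the interp command.

```tcl
# Each "object" is a separate interpreter with its own private state.
# The name "battery" and the method "drain" are made up for illustration.
set battery [interp create]
interp eval $battery {
    set level 100
    proc drain {amount} {
        global level
        set level [expr {$level - $amount}]
        return  ;# methods return nothing
    }
}

# A message is a target plus a script; the queue is just a list here.
set queue {}
lappend queue [list $battery {drain 10}]
lappend queue [list $battery {drain 5}]

# Deliver the messages in order. The sender never looks at a return value,
# so delivery could just as well happen from a different thread.
foreach message $queue {
    set target [lindex $message 0]
    set script [lindex $message 1]
    interp eval $target $script
}

puts "battery level: [interp eval $battery {set level}]"
```

Because no caller depends on a return value, the loop that drains the queue can be moved elsewhere without the objects noticing, which is the property the paragraph above describes.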

The itcl prototype was not intended to be a prototype, it was intended to
be candidate flight code and candidate ground control and monitoring
software. However, when we started doing careful testing, with the
(eventually achieved for LoD compliant code) goal of exhaustive testing,
the interpretive nature of Tcl meant we had to not only test all code
paths and values, but also all types and combinations thereof. That was simply
too tough.

By the time we launched, the entire flight system was implemented in
object-oriented C, and the ground system was a combination of legacy
systems and dynamically created HTML generated predominately by Perl CGI.

Note that a spacecraft system presents a couple of serious constraints.
First, the flight software has got to work without any bugs. When there is
a bug, the spacecraft is often lost. That is almost always the cause of
the end of all spacecraft missions (sometimes hardware failures are blamed,
but that's really the software not being able to deal with failing
hardware). Second, the stuff on the ground can be crufty and nobody cares.

Specifically, the Pathfinder software had three (exactly) bugs in the
flight code. One was that certain options of one command could not be
enabled, but we didn't care (a truncation error in a command word) so there
was no effect. Second was the famous "wrong bit" in the VxWorks kernel's
message queue semaphore creation function call, which did not enable the
"priority inversion" protection for that semaphore (we changed that on all
other semaphores), and this wrong bit caused the system to halt and reboot
when everything was running so far beyond our wildest imaginations that we
got the entire system running almost saturated across all resources. We
fixed that one bit bug with a patch, and continued. The third bug ended
the mission -- we specifically did not address the scenario of "what
happens when the battery goes totally dead overnight" because we didn't expect
that the battery would be the first thing to go -- we thought most of
the chips would fail due to radiation and thermal cycling within 30
days, 45 tops, whereas there were in fact zero silicon errors for 90 days,
until the battery died one night... So that was a bug of omission, one that we
consciously chose not to implement. We could have, but we spent the time
everywhere else first.

It's an amazing thing to develop 125K lines of real-time multi-threaded
code (I think about 120 threads) running on about 400K lines of VxWorks
code and have exactly three bugs over a nearly year long total mission
life that exceeded all mission goals by very far (something like a
hundred times the expected return of science and engineering data), all
developed in a dozen or so person years.

Because we had that goal, we had to be ruthless about technology
selections.

Nevertheless, that itcl/tk system provided impetus for several other Tcl
efforts at JPL.

Clearly, the most successful Tcl project we started at JPL during the
Pathfinder years was called WWWorkflow, a web based workflow system. The
core of that system is still implemented as a Tcl engine (Pxl is the Tcl
extension for workflow process programming), and is used in several
interesting situations. Now named Oak Grove Reactor, it provides the
workflow system to track changes to Space Shuttle Operations Documents at
Johnson Space Center, about a thousand engineers scattered around the
world. This Tcl system also provides the workflow capabilities for the
Xerox DocuShare web-based document repository (that product is called Dx
for Document Accelerator). In one of my lives, as Chief Architect for Oak
Grove Systems, we have sold thousands of seats of this product
commercially over the past month or so.

See dx.oak-grove-systems.com for a live, downloadable Reactor system (you
do need Xerox DocuShare).

David Smyth
Chief Architect, Oak Grove Systems
CEO and Senior Consultant, Oak Grove Consulting
Chief Scientist, Innerland
(626) 296-6405
dsmyth at www.oak-grove-consulting.com


Kristoffer Lawson

Jun 14, 2000, 3:00:00 AM
Dan Kuchler <kuc...@lime.cs.wisc.edu> wrote:
: The itcl prototype was not intended to be a prototype, it was intended to
: be candidate flight code and candidate ground control and monitoring
: software. However, when we started doing careful testing, with the
: (eventually achieved for LoD compliant code) goal of exhaustive testing,
: the interpretive nature of Tcl meant we had to not only test all code
: paths and values, but also all types and combinations thereof. That was simply
: too tough.

A fascinating commentary, but I must admit I didn't really get this part.
What is it in Tcl which prevented it from being as useful as C++? How
is Tcl more difficult to test than C++?


--
- ---------- = = ---------//--+
| / Kristoffer Lawson | www.fishpool.fi|.com
+-> | se...@fishpool.com | - - --+------
|-- Fishpool Creations Ltd - / |
+-------- = - - - = --------- /~setok/

Dan Kuchler

Jun 14, 2000, 3:00:00 AM
Kristoffer Lawson wrote:
>
> Dan Kuchler <kuc...@lime.cs.wisc.edu> wrote:
> : The itcl prototype was not intended to be a prototype, it was intended to
> : be candidate flight code and candidate ground control and monitoring
> : software. However, when we started doing careful testing, with the
> : (eventually achieved for LoD compliant code) goal of exhaustive testing,
> : the interpretive nature of Tcl meant we had to not only test all code
> : paths and values, but also all types and combinations thereof. That was simply
> : too tough.
>
> A fascinating commentary, but I must admit I didn't really get this part.
> What is it in Tcl which prevented it from being as useful as C++? How
> is Tcl more difficult to test than C++?
>

The fact that tcl is interpreted. NASA, the military, etc. don't like running
anything that is interpreted because of the problem of validating it. The same
thing can be said for dynamic (on the fly) code optimization (which has been
discussed for some modern compilers/OSes). Because of the mission critical
nature of the software they like to make sure they are testing the *exact* same
thing that will be running in the production system.

I think that NASA and the military like to do a build and then run that static
binary through the tests, and then take that same static binary and run it
on the production system.

Does that make sense?

--Dan

Kristoffer Lawson

Jun 14, 2000, 3:00:00 AM
Dan Kuchler <kuc...@ajubasolutions.com> wrote:

:> A fascinating commentary, but I must admit I didn't really get this part.
:> What is it in Tcl which prevented it from being as useful as C++? How
:> is Tcl more difficult to test than C++?
:>

: The fact that tcl is interpreted. NASA, the military, etc. don't like running
: anything that is interpreted because of the problem of validating it. The same
: thing can be said for dynamic (on the fly) code optimization (which has been
: discussed for some modern compilers/OSes). Because of the mission critical
: nature of the software they like to make sure they are testing the *exact* same
: thing that will be running in the production system.

Yes but nobody is forcing anyone to use the dynamics that might cause
pains? I.e. you don't *have* to replace "set" or whatever and it's probably
better not to in cases like the above unless everyone really really knows
what they're doing. Of course I guess that was one of the intents of
mobile programming where scripts are passed back and forth -- so that
you can do dynamic things like that (one of the more fascinating things
about script programming btw). So maybe they decided such a design model
isn't appropriate. Of course, they still wouldn't have to use it with
Tcl.

Not really criticizing NASA's decision. Just wondering about it a bit...

Dan Kuchler

Jun 14, 2000, 3:00:00 AM
Kristoffer Lawson wrote:
>
> Yes but nobody is forcing anyone to use the dynamics that might cause
> pains? Ie. you don't *have* to replace "set" or whatever and it's probably
> better not to in cases like the above unless everyone really really knows
> what they're doing. Of course I guess that was one of the intents of
> mobile programming where scripts are passed back and forth -- so that
> you can do dynamic things like that (one of the more fascinating things
> about script programming btw). So maybe they decided such a design model
> isn't appropriate. Of course, they still wouldn't have to use it with
> Tcl.

I guess the point is that the tcl interpreter byte-compiles and interprets the
code at run-time. This means that to truly validate the code you would have
to validate the tcl core and ensure that it always does the right thing, then
you would need to validate your script running on top of the core.

It is much easier to do this with an executable. The code is only generated once.
Then you can test whether there were any problems during code generation and
validate that final executable.

--Dan

David Cuthbert

Jun 15, 2000, 3:00:00 AM
Kristoffer Lawson wrote:
> Yes but nobody is forcing anyone to use the dynamics that might cause
> pains? Ie. you don't *have* to replace "set" or whatever and it's
> probably better not to in cases like the above unless everyone really
> really knows what they're doing.

It's much more subtle than that. Consider the following code (a blatant
example):

if {$some_variable_which_is_nonzero_99_percent_of_the_time} {
    puts "Hello world."
} else {
    this is a syntax error
}

Exercising this bug is difficult, though it has the advantage of being
immediately observable. Other bugs may be more difficult to exercise *and*
observe. Tcl will parse this statement without so much as a blink. C/C++
compilers will curse at you for syntax errors.
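
A small runnable sketch of the same point follows. The proc name "greet" and its "flag" argument are hypothetical stand-ins for the variable in the example above. Strictly speaking, Tcl reports the bad line as an unknown command rather than a syntax error, but the effect is the same: nothing complains until the rare branch actually runs.

```tcl
# "greet" and "flag" are made-up names standing in for the example above.
proc greet {flag} {
    if {$flag} {
        return "Hello world."
    } else {
        this is a syntax error
    }
}

# Defining the proc and taking the common branch both succeed.
puts [greet 1]

# Only a test that forces the rare branch exposes the problem, at run time.
if {[catch {greet 0} message]} {
    puts "rare branch failed: $message"
}
```

A test suite that never passes a zero flag would report success here, which is why the rare branch has to be driven explicitly.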

A coverage tool helps immensely here; it indicates which lines of code your
test suite has missed. It's still up to you to write the test cases and
interpret the output, though.

Finally, note that even C/C++ have been derided (at times, though obviously
not in this particular study) as not performing enough type checking for
mission critical software.
--
David Cuthbert
da...@kanga.org

Paul Duffin

Jun 15, 2000, 3:00:00 AM

If you go to http://www.eiffel.com and follow the "500 million dollar mistake"
link you will end up at
http://www.eiffel.com/doc/manuals/technology/contract/ariane/page.html
This has a link to the official report as to why the Ariane 5 crashed, a brief
summary of the problem and then attempts to prove that this would not happen
with Eiffel.

The problem I have with the 'proof' is that while they can add invariants etc.
to the code which can be compiled into runtime checks, if those checks fail
they simply cause an exception (which is what led to the crash in the first
place).

The only way that the error could have been found before the crash was for
that piece of code to be tested across the full range of its inputs. If
this was done then it would be irrelevant whether the code was written in
Eiffel, C++ or Tcl.

The point I am trying to make is that syntax checking / type checking is no
substitute for perfect test coverage.

Juan Carlos Gil Montoro

Jun 15, 2000, 3:00:00 AM
to pdu...@hursley.ibm.com
Paul Duffin wrote:
>
> If you go to http://www.eiffel.com and follow the "500 million dollar mistake"
> link you will end up at
> http://www.eiffel.com/doc/manuals/technology/contract/ariane/page.html
> This has a link to the official report as to why the Ariane 5 crashed, a brief
> summary of the problem and then attempts to prove that this would not happen
> with Eiffel.
>
> The problem I have with the 'proof' is that while they can add invariants etc.
> to the code which can be compiled into runtime checks, if those checks fail
> they simply cause an exception (which is what led to the crash in the first
> place).
>
> The only way that the error could have been found before the crash was for
> that piece of code to be tested across the full range of its inputs. If
> this was done then it would be irrelevant whether the code was written in
> Eiffel, C++ or Tcl.
>
Not really. Adding a precondition to the offending subroutine which states
that it works for accelerations from 0 to 9 (or whatever), together with a
similar statement in the caller which informs it is using accelerations from
0 to 12, could catch the bug during testing when all preconditions,
postconditions and invariants are on.
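
As a rough illustration of such a precondition (written in Tcl rather than Eiffel, since that is this group's language), consider the sketch below. The helper "require", the routine "scale_acceleration", and the scaling it performs are all hypothetical; only the 0 to 9 and 0 to 12 ranges come from the paragraph above.

```tcl
# Hypothetical helper: evaluate a condition in the caller's scope and
# raise an error if it does not hold.
proc require {condition message} {
    if {![uplevel 1 [list expr $condition]]} {
        error "precondition violated: $message"
    }
}

# Hypothetical routine that is only specified for accelerations 0..9.
proc scale_acceleration {a} {
    require {$a >= 0 && $a <= 9} "acceleration must be within 0..9, got $a"
    return [expr {$a * 100}]
}

# A caller that exercises the range 0..12 trips the check during testing.
foreach a {0 5 9 12} {
    if {[catch {scale_acceleration $a} result]} {
        puts "a=$a -> $result"
    } else {
        puts "a=$a -> ok ($result)"
    }
}
```

The check fires only because the test loop actually supplies the value 12; with test inputs limited to 0..9 it would stay silent.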

Juan Carlos---

lvi...@cas.org

Jun 15, 2000, 3:00:00 AM

According to Paul Duffin <pdu...@hursley.ibm.com>:

:David Cuthbert wrote:
:>
:> Kristoffer Lawson wrote:
:> > Yes but nobody is forcing anyone to use the dynamics that might cause
:> > pains? Ie. you don't *have* to replace "set" or whatever and it's
:> > probably better not to in cases like the above unless everyone really
:> > really knows what they're doing.
:>
:> It's much more subtle than that. Consider the following code (a blatant
:> example):
:>
:> if {$some_variable_which_is_nonzero_99_percent_of_the_time} {
:> puts "Hello world."
:> } else {
:> this is a syntax error
:> }
:>
:> Exercising this bug is difficult, though it has the advantage of being
:> immediately observable. Other bugs may be more difficult to exercise *and*
:> observe. Tcl will parse this statement without so much as a blink. C/C++
:> compilers will curse at you for syntax errors.


My take on things was that it was even more subtle. Imagine a line like:


eval $var $arg

just because a test case covers this line doesn't mean that you've tested
the code ; that only occurs when you have tested every possible value that
var and arg can take on... eval's are the bane of automated testing...
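
To make that concrete, here is a short sketch in which one proc contains the single line in question. The proc name "dispatch" and the sample values are hypothetical. Every call below executes the same line, so a line-coverage tool would report it as covered after the first call, yet each call does something entirely different.

```tcl
# Hypothetical wrapper around the line under discussion.
proc dispatch {var arg} {
    eval $var $arg
}

# One call is enough to mark the eval line as covered...
puts [dispatch string "length hello"]

# ...but other values run completely different code through the same line.
puts [dispatch expr "1 + 1"]
if {[catch {dispatch no_such_command x} message]} {
    puts "failed: $message"
}
```

Coverage of the line therefore says little; what matters is the set of values that can reach it.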

--
<URL: https://secure.paypal.com/refer/pal=lvirden%40yahoo.com>
<URL: mailto:lvi...@cas.org> <URL: http://www.purl.org/NET/lvirden/>
Unless explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.

Paul Duffin

Jun 15, 2000, 3:00:00 AM

You are correct. That does require that they know the possible range
of accelerations that the hardware can return, which of course is probably
tested to destruction. The Eiffel team should beef up their page to make
it a little clearer how Eiffel would address the issue, especially as it is
going to be read by people who know next to nothing about Eiffel.

laurent....@uforce.com

Jun 15, 2000, 3:00:00 AM
On 14 Jun, Dan Kuchler wrote:
>
> I guess the point is that the tcl interpreter byte-compiles and interprets the
> code at run-time. This means that to truly validate the code you would have
> to validate the tcl core and ensure that it always does the right thing, then
> you would need to validate your script running on top of the core.
>

Hmmm... Weren't they using 7.6?

Darren New

Jun 15, 2000, 3:00:00 AM
David Cuthbert wrote:
> A coverage tool helps immensely here; it indicates which lines of code your
> test suite has missed. It's still up to you to write the test cases and
> interpret the output, though.

I think the problem is likely with the coverage. In C, you can say

if (w < 0) { this(); } else { that(); }

And you only have to test once for w < 0 and once for w >= 0 (assuming w is
an int).

In Tcl, you can try
if {$w < 0} {this} {that}
and you not only have to test the <0 and >=0 values, but also "0.0", "-0.0",
"yada", and so on for w. That's the only way I can think of to explain the
comment about the typeless nature of Tcl.
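
A sketch of what that means in practice, assuming the Tcl 8.x behaviour I am aware of, where a comparison with a non-numeric operand falls back to string comparison instead of raising an error. The proc names "sign_unchecked" and "sign_checked" are hypothetical.

```tcl
# Accepts any string at all; non-numeric input silently picks a branch.
proc sign_unchecked {w} {
    if {$w < 0} { return negative } else { return non-negative }
}

# Rejects anything that is not a number before comparing.
proc sign_checked {w} {
    if {![string is double -strict $w]} {
        error "expected a number but got \"$w\""
    }
    if {$w < 0} { return negative } else { return non-negative }
}

foreach w {-5 0 0.0 -0.0 yada} {
    if {[catch {sign_checked $w} checked]} {
        set checked rejected
    }
    puts "w=$w unchecked=[sign_unchecked $w] checked=$checked"
}
```

The explicit check narrows the input space back to something closer to the C case, at the cost of one more branch that itself needs a test.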

--
Darren New / Senior MTS & Free Radical / Invisible Worlds Inc.
San Diego, CA, USA (PST). Cryptokeys on demand.
In the worst case scenario, a giant meteor wipes out all life,
in which case we'll have a hard time meeting analyst expectations.

Dan Kuchler

Jun 15, 2000, 3:00:00 AM
laurent....@uforce.com wrote:
>
> On 14 Jun, Dan Kuchler wrote:
> >
> > I guess the point is that the tcl interpreter byte-compiles and interprets the
> > code at run-time. This means that to truly validate the code you would have
> > to validate the tcl core and ensure that it always does the right thing, then
> > you would need to validate your script running on top of the core.
> >
>
> Hmmm... Weren't they using 7.6?
>

I think the paper was written in 1994 so it probably started with something
even earlier than 7.6.

I guess that means that byte-compilation isn't an issue, but there is still
the interpreted nature of the language.

--Dan

Cameron Laird

Jun 15, 2000, 3:00:00 AM
In article <8iajkk$lth$1...@srv38.cas.org>, <lvi...@cas.org> wrote:
>
>According to Paul Duffin <pdu...@hursley.ibm.com>:
>:David Cuthbert wrote:
>:>
>:> Kristoffer Lawson wrote:
>:> > Yes but nobody is forcing anyone to use the dynamics that might cause
>:> > pains? Ie. you don't *have* to replace "set" or whatever and it's
>:> > probably better not to in cases like the above unless everyone really
>:> > really knows what they're doing.
>:>
>:> It's much more subtle than that. Consider the following code (a blatant
>:> example):
>:>
>:> if {$some_variable_which_is_nonzero_99_percent_of_the_time} {
>:> puts "Hello world."
>:> } else {
>:> this is a syntax error
>:> }
>:>
>:> Exercising this bug is difficult, though it has the advantage of being
>:> immediately observable. Other bugs may be more difficult to exercise *and*
>:> observe. Tcl will parse this statement without so much as a blink. C/C++
>:> compilers will curse at you for syntax errors.
>
>
>My take on things was that it was even more subtle. Imagine a line like:
>
>
>eval $var $arg
>
>just because a test case covers this line doesn't mean that you've tested
>the code ; that only occurs when you have tested every possible value that
>var and arg can take on... eval's are the bane of automated testing...
.
.
.
I'm going to rant on this subject some day.

The people involved in NASA--well, I have no argument with
them. They have a lot of constraints on their choices (but
why didn't they use Ada for the "real" version?).

In general, though, the untestability of Tcl is VASTLY over-
estimated. Take the example above: if it's an essential
coding (by which I mean, roughly, that there's no strictly
superior mechanical transformation of the source that yields
indistinguishable functionality), then Tcl's coding is MORE
easily provable than the C equivalent.

Testability is one of the great things about Tcl. It's just
a more sophisticated form of testability than some QA depart-
ments seem ready to accept.
--

Cameron Laird <cla...@NeoSoft.com>
Business: http://www.Phaseit.net
Personal: http://starbase.neosoft.com/~claird/home.html

Tom Krehbiel

Jun 15, 2000, 3:00:00 AM
If I recall the story correctly, neither Eiffel nor any other language would
have caught the problem. The guidance system and its software were lifted from
Ariane 4 and were assumed good for Ariane 5. A lack of hardware integration
testing was the ultimate cause of the failure (which is a better way to test
flight software).
-tjk

bo...@aol.com

Jun 15, 2000, 3:00:00 AM
In article <8iajkk$lth$1...@srv38.cas.org>,
lvi...@cas.org wrote:
>
> According to Paul Duffin <pdu...@hursley.ibm.com>:
> :David Cuthbert wrote:
> :>
> :> Kristoffer Lawson wrote:
> :> > Yes but nobody is forcing anyone to use the dynamics that might cause
> :> > pains? Ie. you don't *have* to replace "set" or whatever and it's
> :> > probably better not to in cases like the above unless everyone really
> :> > really knows what they're doing.
> :>
> :> It's much more subtle than that. Consider the following code (a blatant
> :> example):
> :>
> :> if {$some_variable_which_is_nonzero_99_percent_of_the_time} {
> :> puts "Hello world."
> :> } else {
> :> this is a syntax error
> :> }
> :>
> :> Exercising this bug is difficult, though it has the advantage of being
> :> immediately observable. Other bugs may be more difficult to exercise *and*
> :> observe. Tcl will parse this statement without so much as a blink. C/C++
> :> compilers will curse at you for syntax errors.
>
> My take on things was that it was even more subtle. Imagine a line like:
>
> eval $var $arg
>
> just because a test case covers this line doesn't mean that you've tested
> the code ; that only occurs when you have tested every possible value that
> var and arg can take on... eval's are the bane of automated testing...

True for the general case you presented. Often it isn't necessary to
put the command name in a variable. You could have a coding standard
that says write:

eval cmd $arg1 $arg2 ...

unless there is a good reason why the command name must be in a
variable.


bob



Eric Taylor

Jun 15, 2000, 3:00:00 AM
Dan Kuchler wrote:

> The fact that tcl is interpreted. NASA, the military, etc. don't like running
> anything that is interpreted because of the problem of validating it. The same
> thing can be said for dynamic (on the fly) code optimization (which has been
> discussed for some modern compilers/OSes). Because of the mission critical
> nature of the software they like to make sure they are testing the *exact* same
> thing that will be running in the production system.
>

> I think that NASA and the military like to do a build and then run that static
> binary through the tests, and then take that same static binary and run it
> on the production system.
>


The big decision maker is whose last mission worked
and whose didn't. If they used TCL or C or Java or VxWorks,
then those are the current gods of software. At least until
the next failure.

I don't like to argue. When I suggested that future missions
should consider a multitasking o.s. that used memory protection
and that VxWorks did not do a good job of this, I was
quickly hushed up and told: who needs an mmu. The problem
is C; if we go with Java we don't need an mmu.

In addition, the goal now is to create an object oriented
mission design; not a design that just so happens to use
oop methods.

In other words, NASA is just like everywhere else. Religion
and politics rule and opinions are like you know what,
everyone has them.

et

Juan Carlos Gil Montoro

Jun 16, 2000, 3:00:00 AM

Tom Krehbiel wrote:
>
> If I recall the story correctly neither Eiffel nor any other language
> would have caught the problem. The guidance system and its software was
> lifted from Ariane 4 and was assumed good for Ariane 5. A lack of hardware
> integration testing was the ultimate cause of the failure (which is a
> better way to test flight software).
> -tjk
>

If the Ariane 4 software had been written in Eiffel with proper contract
clauses, including the hardware specification (maximum allowed lateral
thrust equal to 200K newtons, or whatever), then when 'reused' for Ariane 5
there is a good chance that the bug would have been caught during testing,
even without hardware integration.

That's the beauty of designing by contract.

Juan Carlos---

Darren New

Jun 16, 2000, 3:00:00 AM
Juan Carlos Gil Montoro wrote:
> there are good chances that the bug were caught during testing, even without
> hardware integration.

This isn't the right group to discuss this, but... if the uncaught exception
wasn't triggered when they were testing the Ada version, what makes you
think it would have been triggered in the Eiffel version? Remember, too, the
reason it was uncaught was because there wasn't enough CPU time to *check*
the exceptions and still have the system work right, meaning the Eiffel
would have been running with exceptions unchecked as well.

mjs...@my-deja.com

Jun 17, 2000, 3:00:00 AM
In article <394A0383...@gmv.es>,

Juan Carlos Gil Montoro <jc...@gmv.es> wrote:
>
>
> Tom Krehbiel wrote:
> >
> > If I recall the story correctly neither Eiffel nor any other language
> > would have caught the problem. The guidance system and its software was
> > lifted from Ariane 4 and was assumed good for Ariane 5. A lack of hardware
> > integration testing was the ultimate cause of the failure (which is a
> > better way to test flight software).
> > -tjk
> >
> If the Ariane 4 software were written in Eiffel with proper contract
> clauses, including the hardware specification (maximum allowed lateral
> thrust equal to 200K newtons, or whatever), when 'reused' for Ariane 5
> there are good chances that the bug were caught during testing, even without
> hardware integration.

No, there was never any testing/simulation of the offending input at
the value where the software failed! If they *had* run the simulated
input to the failing value the existing software (in Ada) would have
"caught" the problem via an exception (just as it did on the
spacecraft...). As was stated, no language could have discovered the
problem on the ground, because the software was never tested using
Ariane-5 data!

Mike

Donal K. Fellows

Jun 18, 2000, 3:00:00 AM
In article <3949040D...@ajubasolutions.com>, Dan Kuchler
<kuc...@ajubasolutions.com> writes

>I think the paper was written in 1994 so it probably started with something
>even earlier than 7.6

Hmm. Probably 7.3. I started using Tcl in 1995[*] with one of the 7.4
betas, but I doubt those were used in anything mission critical! So 7.3
is the likely candidate, especially since it was current for quite a
long period of time (only 8.0 was current for a comparable stretch
recently.)

Donal.
[* I remember which bookshop I bought JO's book at, what I was doing in
that bookshop at the time, and on whose recommendation I was
investigating this language-with-a-GUI. I didn't think a program a
colleague was writing the year before would make such a difference to
my online life and career; it just goes to show... ]
--
Donal K. Fellows (at home)
--
FOOLED you! Absorb EGO SHATTERING impulse rays, polyester poltroon!!
(WARNING: There is precisely one error in this message.)

Juan Carlos Gil Montoro

Jun 19, 2000, 3:00:00 AM
>
> No, there was never any testing/simulation of the offending input at
> the value where the software failed! If they *had* run the simulated
> input to the failing value the existing software (in Ada) would have
> "caught" the problem via an exception (just as it did on the
> spacecraft...). As was stated, no language could have discovered the
> problem on the ground, because the software was never tested using
> Ariane-5 data!
>

As Darren says, this is not the best place to discuss this but ...

I've just checked how the Eiffel environment works, and it turns out that
you both are right. At

http://www.eiffel.com/eiffel/nutshell.html

in the 'Assertions look nice, but how do they help me?' point it says:

* Testing and debugging mechanism. Using the ISE Eiffel compiler, you
select which assertions will be monitored at run time; you can set different
levels (no check, preconditions only, preconditions and postconditions,
everything) separately for each class. Then if an assertion is found at run
time to be violated -- meaning a bug remains in your software -- an exception
will interrupt execution. This is a tremendous help for getting software
right quickly: testing and debugging are no longer blind searches; they
are helped by a precise description both of what the software does (the
actual executable texts, given by the do clauses) and of what it should do
(the assertions).

So, given that the software was not tested with Ariane-5 data, the bug would
have remained unnoticed.

My point was that a clever system *MIGHT* catch the error statically, as a
routine asserts 'I work for a < 10 only' and its user needs 'A routine which
works for a < 15': if all contract clauses are checked, the error could
be detected as the needs of the user exceed the client range. Maybe in the
next round of technology ...
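
A toy sketch of what such a static check could look like, using Tcl only as notation. Everything here is hypothetical: the routine name "convert", the idea of recording contracts as {min max} pairs, and the helper "range_covers". The bounds 10 and 15 come from the paragraph above. No routine is ever run; only the declared ranges are compared.

```tcl
# Declared ranges as {min max} pairs, keyed by a made-up routine name.
array set supplier_accepts {convert {0 10}}
array set client_needs     {convert {0 15}}

# True when the outer range fully contains the inner range.
proc range_covers {outer inner} {
    expr {[lindex $outer 0] <= [lindex $inner 0] &&
          [lindex $inner 1] <= [lindex $outer 1]}
}

# Compare what each client needs with what the supplier promises.
foreach routine [array names client_needs] {
    set offered $supplier_accepts($routine)
    set needed  $client_needs($routine)
    if {![range_covers $offered $needed]} {
        puts "contract mismatch for $routine: needs {$needed}, offered {$offered}"
    }
}
```

This only works when both sides write their ranges down in a machine-readable form, which is the hard part in practice.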

Juan Carlos---

Donal K. Fellows

Jun 19, 2000, 3:00:00 AM
In article <8ib4ej$5nb$1...@nnrp1.deja.com>, <bo...@aol.com> wrote:
> True for the general case you presented. Often it isn't necessary to
> put the command name in a variable. You could have a coding standard
> that says write:
>
> eval cmd $arg1 $arg2 ...
>
> unless there is a good reason why the command name must be in a
> variable.

It doesn't actually help that much unless you constrain the contents
of the arg1 and arg2 variables even more strongly (e.g. by forcing
them to be produced by proper list operations.) For example:

    set arg1 ";exit;"
    # ...
    eval cmd $arg1 $arg2 ...

See? Of course, this might actually be a reasonable argument in
favour of the {}$... and {}[...] constructs that have been proposed in
the past, since there at least you can know for sure that there are no
evil tricks being pulled. Hmmm...
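
To make the difference concrete, here is a small sketch contrasting the plain
eval with one built by proper list operations. The procs cmd, run_unsafe and
run_safe are placeholders invented for the example, and the injected command
is a harmless set rather than exit so the script can be run safely.

```tcl
# Placeholder for "cmd": simply returns the arguments it received.
proc cmd {args} {
    return $args
}

# Plain eval: the arguments are concatenated into one script, so a
# semicolon inside arg1 ends the cmd call and starts a new command.
proc run_unsafe {arg1 arg2} {
    set injected 0
    eval cmd $arg1 $arg2
    return $injected
}

# List-built eval: [list] quotes each value, so cmd receives exactly
# two arguments and nothing inside them is run as code.
proc run_safe {arg1 arg2} {
    set injected 0
    set received [eval [list cmd $arg1 $arg2]]
    return [list $injected [llength $received]]
}

# The trailing "#" turns whatever follows into a comment.
set hostile ";set injected 1;#"

set unsafe_result [run_unsafe $hostile "two words"]
set safe_result   [run_safe   $hostile "two words"]
```

With the plain eval the smuggled set command runs, so the injected flag is
flipped; with the list-built form the same string arrives as ordinary data
and cmd sees exactly two arguments.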

Donal.
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
-- I may seem more arrogant, but I think that's just because you didn't
realize how arrogant I was before. :^)
-- Jeffrey Hobbs <jeffre...@scriptics.com>

Donal K. Fellows

Jun 19, 2000, 3:00:00 AM
In article <394E1739...@gmv.es>,
Juan Carlos Gil Montoro <jc...@gmv.es> wrote:
> My point was that a clever system *MIGHT* catch the error
> statically, as a routine asserts 'I work for a < 10 only' and its
> user needs 'A routine which works for a < 15': if all contract
> clauses are checked, the error could be detected as the needs of the
> user exceed the client range. Maybe in the next round of
> technology...

The trouble with this is that reality isn't really all that good at
holding to the contracts that we (attempt to) impose upon it. And too
many people like to make last-minute "tweaks" that end up breaking
fundamental assumptions (which has scuppered at least one hardware
project that I've heard of; it turned out that the clock speed of the
original design was pretty much the maximum it could be given the
fabrication technology...)

Darren New

Jun 19, 2000, 3:00:00 AM
Juan Carlos Gil Montoro wrote:
> So, given that the software was not tested with Ariane-5 data, the bug would
> have remained unnoticed.

Doubly so, considering it was a 16-bit integer overflow which Eiffel
wouldn't have caught anyway, IIRC. :-)
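
For reference, the kind of guard in question can be sketched as follows. This
assumes the failure was a conversion of a wider floating-point value into a
16-bit signed integer, and to_int16 is a name invented here, not anything
taken from the flight software.

```tcl
# Convert a number to a 16-bit signed integer, raising an error
# instead of silently wrapping when the truncated value does not fit.
proc to_int16 {value} {
    if {$value <= -32769 || $value >= 32768} {
        error "value $value does not fit in 16 bits"
    }
    # int() truncates toward zero.
    return [expr {int($value)}]
}

set small_ok  [to_int16 1234.7]
set too_large [catch {to_int16 40000.0} overflow_msg]
```

Note that the guard still reports the problem by raising an error, so it only
helps if something handles that error sensibly or if testing actually
exercises the out-of-range case.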

> My point was that a clever system *MIGHT* catch the error statically, as a
> routine asserts 'I work for a < 10 only' and its user needs 'A routine which
> works for a < 15': if all contract clauses are checked, the error could
> be detected as the needs of the user exceed the client range. Maybe in the
> next round of technology ...

The technology has been around for quite some time. The problem is that
nobody has found a good way to evaluate the conditions with better than
exponential time & space. But with a sufficiently formal language, it's
possible to actually *prove* that the implementation conforms to the spec.
But languages this formal are not especially easy to write in, compile, or
for that matter test. They're great for specifications, tho.

--
Darren New / Senior MTS & Free Radical / Invisible Worlds Inc.
San Diego, CA, USA (PST). Cryptokeys on demand.

"You know Lewis and Clark?" "You mean Superman?"

Juan Carlos Gil Montoro

Jun 20, 2000, 3:00:00 AM
Juan Carlos Gil Montoro wrote:
>
> Paul Duffin wrote:
> >
> > If you go to http://www.eiffel.com and follow the "500 million dollar mistake"
> > link you will end up at
> > http://www.eiffel.com/doc/manuals/technology/contract/ariane/page.html
> > This has a link to the official report as to why the Ariane 5 crashed, a brief
> > summary of the problem and then attempts to prove that this would not happen
> > with Eiffel.
> >
> > The problem I have with the 'proof' is that while they can add invariants etc.
> > to the code which can be compiled into runtime checks, if those checks fail
> > they simply cause an exception (which is what led to the crash in the first
> > place).
> >
> > The only way that the error could have been found before the crash was for
> > that piece of code to be tested across the full range of its inputs. If
> > this was done then it would be irrelevant whether the code was written in
> > Eiffel, C++ or Tcl.
> >

You are right. I refer the interested reader to

Critique of "Put it in the contract: The lessons of Ariane"
http://home.flash.net/~kennieg/ariane.html

Juan Carlos---

Donal K. Fellows

Jun 21, 2000, 3:00:00 AM
In article <394E5333...@san.rr.com>, Darren New <dn...@san.rr.com>
writes

>The technology has been around for quite some time. The problem is that
>nobody has found a good way to evaluate the conditions with better than
>exponential time & space. But with a sufficiently formal language, it's
>possible to actually *prove* that the implementation conforms to the spec.
>But languages this formal are not especially easy to write in, compile, or
>for that matter test. They're great for specifications, tho.

You can do it in less, provided your language is sufficiently rigorous.
The best example I can think of is a system called B (which is unrelated
to the predecessor of C, which the author probably hadn't heard of)
which has been successfully used on a number of safety-critical projects
like the Paris Metro. But the key to these sorts of schemes is that
they have pretty powerful proof engines and *everything* is coded up
within the language; not just lock, stock and barrel, but also the shot,
powder, user and target. This takes a lot of time, but the integration
and system testing phases of the project are much quicker, with the
result that the project as a whole can even come in under budget and on
time...

If anyone is really interested, pester me and I'll hunt down a few
(online and/or published) references.

Frankly, compared to the kind of development support offered by the
likes of B (not just contract enforcement, but refinement and
reformulation of contractual requirements, and proof of satisfaction of
contractual obligations too) Eiffel is something of a poor cousin. This
is because the only way to tractably apply formal analysis to real world
designs is by building it into the development process itself. And this
has been known in the formal software development community for a long
time (i.e. longer than I've been studying or working in CS).

Donal.
--

Darren New

Jun 23, 2000, 3:00:00 AM
Donal K. Fellows wrote:
> If anyone is really interested, pester me and I'll hunt down a few
> (online and/or published) references.

I'm interested!

Of course, the problem with using a name like "B" is it's impossible to do a
web search for it. :-)
