Summary Of May lisp-nyc meeting, aka "Meet Thomas F Burdick Night"

Kenny Tilton

unread,

May 28, 2003, 1:53:44 PM5/28/03

to

OK, I promised to share, here goes nothing.

Disclaimer: I don't know what I am talking about, and I did not clear
this with Thomas beforehand.[1] Corrections from better-informed
attendess are welcome, and if you really want to know what Thomas said
(about program verification without compiler support), ask him. :)

Finding Lisp
------------
But first, it turns out Thomas has a "how I found Lisp" story that is
hard to beat, and he shared that as well. It seems he was looking at the
code generated by gcc and was horrified by some of it. He concluded that
better code could have been generated if he could have given gcc some
hints. he was hard at revising gcc to do this when he came upon some
obscure code and turned to a gcc veteran for assistance.

the veteran informed Thomas that (a) no delivered app should require a
custom compiler and (b) that Thomas was greenspunning: Lisp already did
as a matter of course what Thomas was trying to coax from gcc.[2]

Hilarious Mis-summary of a Highly Technical Talk From An LWUI Un-versed
in the Area, From Memory Three Weeks After the Fact of Thomas's Talk on
"Testa: Program Verification Without Compiler Support".
-----------------------------------------------------------------------

Thomas and cohort (Edwin Westbrook) started from, inter alia, Necula's
ideas:

http://www.cs.berkeley.edu/~necula/

What they do is try to identify problems (program verification) in
object code (without compiler support).

Step one is to disassemble binary to a RISC ASM they concocted (hence
chip-neutral). This step is assumed to be correct in the overarching
determination of program correctness, which i guess just says Testa does
not verify Testa.

Step two is to create a flow graph (?) which looks like hell when
scribbled on a napkin and passed around the table in a sports bar lit
mostly by TV screens.

Now Testa walks thru the code and tries to infer as much type
information as it can from that code. A simple example is that after a
test for null, one knows one has a null in one branch and not-null in
the other branch.

This process is repeated because things learned in one pass can make
possible new inferences a second time through. As long as new inferences
are found, these iterations continue. Else...

Testa writes a proof of the program's correctness. This bit, like the
disassembly, must be trusted. Then the easy part: a proof-checker checks
the proof. Can you sense the hand-waving?

One big thing is that Testa has its sights set not on finding every
fault in some code, but on finding at least some, such as memory-safety
violations.

Experience story: (and I am sure I have this wrong because Thomas denied
it when I told it in front of him): Testa found a problem in a
real-world module, one which looks traceable to GCC itself.

Recent developments: at a recent presentation on Testa by Edwin, someone
else working in the same research area asked how they handled certain
types of offset pointers. Doh!, cried Edwin, god bless the intellectual
honesty. His questioner comforted him by saying they did not know what
to do either. Team Testa has taken this in stride. Thomas wrote: "we, uh
... had a setback with our recent theoretical advance. Our cool
machine-language type system can't deal with some types of offset
pointers. Oops. I'm pretty sure we can work around this, and we've
done some work on this, but haven't tried our thoughts out yet in a
practical manner."

OK, I don't want to do any more manage, I'll shut up.

Footnotes
---------
[2] Thomas cannot recall the details of all this. We may need hypnosis
or the Vulcan mind-meld to recover these memories.

[1] In Thomas's own words:

"Testa: Software verification without compiler support

"Testa is a software verification system, written in Common Lisp. It
supports the Proof-Carrying Code model put forward by Necula, among
others. Unlike current systems, which require special compiler
support, Testa operates directly on machine code, which not only
allows it to be used with systems written in arbitrary programming
languages, but also makes it robust in the face of compiler bugs.
Currently, Testa attempts to ensure the memory safety of Darwin (the
Mac OS X kernel) extensions. In the future, we plan to have Testa
work with more complicated safety predicates, including various
definitions of thread safety.

"Software verification of arbitrary machine code is a difficult
problem, that has not been sufficiently explored, and cannot be solved
in the general case. Because of this, Common Lisp was a major asset
in our development process. We have regularly found ourselves in the
debugger and explored the current state of our in-memory graphs, to
discover that we needed to change a part of our approach. We were
able to prototype a solution, and restart without losing our state.
If we had used a more conventional language, we would have spent so
much of our time battling the language, that we could not have
attempted the problem, given our limited man-hours for the project.

"About the name: "testa" is a term from plant biology, for the maternal
diploid tissue that protects the rest of the seed. That is to say, it
protects the kernel."

--

kenny tilton
clinisys, inc
http://www.tilton-technology.com/
---------------------------------------------------------------
"Everything is a cell." -- Alan Kay

Thomas F. Burdick

unread,

May 28, 2003, 10:48:08 PM5/28/03

to

Kenny Tilton <kti...@nyc.rr.com> writes:

> Finding Lisp
> ------------
[...]

> [2] Thomas cannot recall the details of all this. We may need hypnosis
> or the Vulcan mind-meld to recover these memories.

Partly can't, partly won't, because he's not quite sure how much he
can say without violating an NDA ;-)

Kenny Tilton <kti...@nyc.rr.com> writes:

> Hilarious Mis-summary of a Highly Technical Talk From An LWUI
> Un-versed in the Area, From Memory Three Weeks After the Fact of
> Thomas's Talk on "Testa: Program Verification Without Compiler
> Support".

Pretty good summary. Of course, anyone who wants to know more can
e-mail me. One minor detail with major implications:

> Step one is to disassemble binary to a RISC ASM they concocted (hence
> chip-neutral). This step is assumed to be correct in the overarching
> determination of program correctness, which i guess just says Testa
> does not verify Testa.

Right, and this is okay because it's Really F***ing Easy to make a
reliable disassembler, especially for a language like the PowerPC,
which has all its instructions defined in the spec using a semi-formal
notation (we implemented the pseudo-code -- I love macros).

> Testa writes a proof of the program's correctness. This bit, like the
> disassembly, must be trusted.

Ah, no! The generated proofs can be correct or not, just like
everything after disassembly, because if the generated proof amounts
to (= 0 1), it'll get caught by...

> Then the easy part: a proof-checker checks the proof.

*This* needs to be trusted. Again, if you use a reasonable logic
system, this is Really F***ing Easy to get right. Whereas
proof-generation is not.

--
/|_ .-----------------------.
,' .\ / | No to Imperialist war |
,--' _,' | Wage class war! |
/ / `-----------------------'
( -. |
| ) |
(`-. '--.)
`. )----'

Kenny Tilton

unread,

May 29, 2003, 1:21:01 AM5/29/03

to

Thomas F. Burdick wrote:
> Kenny Tilton <kti...@nyc.rr.com> writes:
>
>
>>Finding Lisp
>>------------
>
> [...]
>
>>[2] Thomas cannot recall the details of all this. We may need hypnosis
>>or the Vulcan mind-meld to recover these memories.
>
>
> Partly can't, partly won't, because he's not quite sure how much he
> can say without violating an NDA ;-)

OK. I got the feeling you were describing something akin to a macro
deciding into what code to expand the macro arguments. Can you
confirm/deny that much given the constraints of recall and NDA?

>
> Kenny Tilton <kti...@nyc.rr.com> writes:
>>Testa writes a proof of the program's correctness. This bit, like the
>>disassembly, must be trusted.
>
>
> Ah, no! The generated proofs can be correct or not, just like
> everything after disassembly, because if the generated proof amounts
> to (= 0 1), it'll get caught by...
>
>
>>Then the easy part: a proof-checker checks the proof.
>
>
> *This* needs to be trusted.

I just remembered there were two things to be trusted. :)

Thomas F. Burdick

unread,

May 29, 2003, 2:08:43 PM5/29/03

to

Kenny Tilton <kti...@nyc.rr.com> writes:

> Thomas F. Burdick wrote:
> > Kenny Tilton <kti...@nyc.rr.com> writes:
> >
> >>Finding Lisp
> >>------------
> > [...]
> >
> >>[2] Thomas cannot recall the details of all this. We may need hypnosis
> >>or the Vulcan mind-meld to recover these memories.
> > Partly can't, partly won't, because he's not quite sure how much he
> > can say without violating an NDA ;-)
>
> OK. I got the feeling you were describing something akin to a macro
> deciding into what code to expand the macro arguments. Can you
> confirm/deny that much given the constraints of recall and NDA?

Yeah, pretty much. I was essentially applying Greenspun's 10th to
compiler macros. C++ being C++, though, my wanna-be compiler macros
had to expand into the compiler's internal representation.