Google Groupes

boringcc


D. J. Bernstein 21 déc. 2015 06:40
Envoyé au groupe : Boring cryptography
As a boring platform for the portable parts of boring crypto software,
I'd like to see a free C compiler that clearly defines, and permanently
commits to, carefully designed semantics for everything that's labeled
"undefined" or "unspecified" or "implementation-defined" in the C
"standard". This compiler will provide a comprehensible foundation for
people writing C code, for people auditing C code, and for people
formally verifying C code.

For comparison, gcc and clang both feel entitled to arbitrarily change
the behavior of "undefined" programs. Pretty much every real-world C
program is "undefined" according to the C "standard", and new compiler
"optimizations" often produce new security holes in the resulting object
code, as illustrated by

   https://lwn.net/Articles/342330/
   https://kb.isc.org/article/AA-01167

and many other examples. Crypto code isn't magically immune to this---
one can easily see how today's crypto code audits will be compromised by
tomorrow's compiler optimizations, even if the code is slightly too
complicated for today's compilers to screw up. A boring C compiler will
eliminate these nasty surprises; it will prioritize predictability.

I should note that this plan, throwing away gcc and clang in favor of a
boring C compiler, isn't the only possible response to these types of
security holes. Here are several other responses that I've seen:

   * Attack the messenger. "This code that you've written is undefined,
     so you're not allowed to comment on compiler behavior!" The most
     recent time I saw this, another language lawyer then jumped in to
     argue that the code in question _wasn't_ undefined---as if this
     side discussion had any relevance to the real issue.

   * Say that the C "standard" allows gcc and clang to do whatever they
     want to all of these "undefined" programs. Yes, we know that it's a
     stupid "standard"; that isn't the question. What do we do about the
     resulting security problems?

   * Blame the security holes on the C programmers who wrote "undefined"
     programs. This _might_ be reasonable if there were a plausible plan
     to reduce the fraction of "undefined" programs from ~100% to ~0%,
     but there isn't. Even if there were, how would this massive code
     revision be better than keeping the code and switching to a boring
     C compiler?

   * Claim that earthquakes in the behavior of "undefined" programs will
     teach C programmers to stop writing such programs. "That'll show
     'em!" But the reality is that this hasn't worked and won't work.

   * Claim that all we need is for some particular "undefined"-catching
     tool to be widely used. In fact, these tools are full of false
     positives and false negatives; at best they catch a few limited
     types of "undefined" behavior, not changing the big picture.

   * Claim that a boring C compiler can't possibly support the desired
     system _performance_. Even if this were true (which I very much
     doubt), why would it be more important than system _correctness_?

Overall I think that these people simply don't understand what most C
programmers want. A boring C compiler will very quickly gain users---not
merely for security but also for predictability in general; people will
appreciate, e.g., having variables automatically initialized to 0. Of
course, the compiler has to support the usual C ABI, so that programs
compiled with this compiler can be linked to libraries compiled with
other compilers (including other languages), and vice versa, allowing an
incremental upgrade process.

I'm told that the Plan 9 compiler tries to be somewhat predictable---

   http://article.gmane.org/gmane.os.plan9.general/76989

---but it still leaves many things unspecified, and I'm under the
impression that it has its own ABI. There's a "Fail-Safe C" compiler
that sounds like it defines much more, including safe semantics for
buffer overflows---

   https://staff.aist.go.jp/y.oiwa/FailSafeC/index-en.html

---but it's clear that this has its own ABI. There are various projects
such as

   http://www.doc.ic.ac.uk/~awl03/projects/miro/

that add more useful semantics to some parts of C _without_ changing the
ABI, but the same compilers will happily screw up other parts of C.
There's a "Friendly C" proposal that targets other parts of C more
relevant to core crypto---

   http://blog.regehr.org/archives/1180

---but the big picture is still tons of misguided flexibility for the
compiler writer.

---Dan