Statically enforcing Barendregt's variable convention

31 views
Skip to first unread message

nha...@gmail.com

unread,
Jan 14, 2023, 7:07:21 PM1/14/23
to Shen

I was reading a paper on the GHC compiler's inliner, which they call The Rapier, and found a follow up paper ("The Foil: Capture-Avoiding Substitution With No Sharp Edges" https://arxiv.org/pdf/2210.04729.pdf) which tries to statically enforce the invariants needed to avoid name capture during substitutions etc.

The Foil typing in that paper is really convoluted, since Haskell's type system can't really cope well. Having dependent types available makes things a lot easier.

I took a crack at something similar (with stronger requirements) in Shen: https://github.com/NHALX/shen-misc/blob/master/foil.shen

The intent is to prevent shadowed name bindings and statically prove all bindings in the expression have unique names, as well as forbidding free variables. 

Some notes:

- Names and Foil expressions are annotated with a list describing the expanding scope, which gets set in order from outer expression to inner.

- New names are generated with their scope variable unbound, waiting for let and lambda to fix them with a unique id.

- The substitution lists are able to statically grow their domain due to scope expansion.

There's some annoying flaws at the moment:

- If the unbound scope variable isn't bound to [] or something else at the top level, then it won't detect some invalid expressions. 

- Scope shrinking with foil-dissolve doesn't seem like it has proper justification.

- T* needs to be patched.

   



Mark Tarver

unread,
Jan 15, 2023, 1:12:29 AM1/15/23
to Shen
I think what you are doing here is looking for a way of statically guaranteeing that
a certain set of operations over a data structure is preserving of an invariant property P.
You are trying maybe to work at the level of purely concrete data structures and typecheck
these P-preserving functions which can be
difficult.  I think one can get to the point where one's type theory/specification
becomes as complex as the program you are specifying.

As an example we'll consider the task of specifying a sorted list as a datatype where
we want to add things to a list and maintain the invaraint property that it remains sorted.  If
you work at a purely concrete level it can be difficult.  Here is a solution which
uses a semi-abstract approach.

Shen, www.shenlanguage.org, copyright (C) 2010-2022, Mark Tarver
version: S34.2, language: Common Lisp, platform: SBCL 2.0.0
port 3.2, ported by Mark Tarver

(0-) (datatype sorted   \\ high to low

 ____________
 [] : sorted;

 N : number; L : sorted;
 ========================
 (sorted N L) : sorted;)
sorted#type

(1-) (define sorted
   N [] -> [N]
   N [M | Sorted] -> [N M | Sorted]   where (> N M)
   N [M | Sorted] -> [M | (sorted N Sorted)])
(fn sorted)

(2-) (defmacro listsort-macro
  [listsort]       -> []
  [listsort X | Y] -> [sorted X [listsort | Y]])
listsort-macro

(3-) (tc +)
true

(4+) (listsort 56 7 89 3)
[89 56 7 3] : sorted


It looks very simple until you compare it with the alternative.  Hence in your example
I'd try to build basic useful operations like sorted into your definition until you have
enough machinery to solve the problem.  I see that
the authors say that they have made several attempts to get their system right.

But it‘s really easy to screw up. We’ve messed it up again and again and again
and again and again.


That suggests to me that they are overthinking the problem.

M.

nha...@gmail.com

unread,
Jan 15, 2023, 7:19:55 AM1/15/23
to Shen
"But it‘s really easy to screw up. We’ve messed it up again and again and again
and again and again. "

The way I read it was that they are referring to having to manually maintain the invariants in the original Rapier algorithm, which is what their Foil system is supposed to automate. 

This kind of echos the debate of hygienic scheme macros vs lisp defmacro though. At what point does the cure become worse than the disease? I can't say I've spent much time having to debug macro related bugs, and I don't consider myself a particularly disciplined or precise programmer wrt dealing with annoying low level details.

"It looks very simple until you compare it with the alternative.  Hence in your example
I'd try to build basic useful operations like sorted into your definition until you have
enough machinery to solve the problem."

That's a good point that macros can enforce static invariants just as well (or better) than the type system. I suppose it should cut the problem in half, since you don't have to deal with the constructors in the type system.


 

Mark Tarver

unread,
Jan 15, 2023, 7:42:37 AM1/15/23
to Shen
There is a passage from Peyton-Jones who started this line

  Earlier versions of GHC used a brutal approach to avoid the name-capture problem: during inlining, GHC would simply rename, or clone, every single bound variable, to give: let s796 = 7 in (a+b) + s796 This renaming made use of a supply of fresh names that, in this example, has arbitrarily renamed a to s796. This approach suffers from two disadvantages: • It allocates far more fresh names than are actually necessary, and there is sure to be a compile-time performance cost to this. • Plumbing the supply of fresh names to the places those names are required is sometimes very painful.

It seems to me to be an artificial problem  that Peyton-Jones has created.  The brutal approach works, its tedious but its a computer so who cares, and the human time costs are not that great.
Compilers being fast is not as important as compilers producing fast code.   It's the kind of optimisation that would have engrossed a programmer in the '60s when machines were very slow.
PJ has instead spawned a research program into developing something that is very difficult to reliably implement.  The one plus being that it enables the group to write papers where they
correct their own work.

M.

nha...@gmail.com

unread,
Jan 15, 2023, 9:57:26 AM1/15/23
to Shen
Yeah the unique symbol plumbing issue is a Haskell only problem. Gensym can be made to be very fast anyway (chez scheme defers the generation of a unique name until they are actually printed I think).

My code above can't use their symbol-reuse optimisation anyway because I wanted some more uniqueness guarantees than their system offers.

Mark Tarver

unread,
Jan 16, 2023, 1:12:50 AM1/16/23
to Shen
I've never gone for Haskell as a language; the reason being that trying to banish state from a programming language 
carries a penalty and I see that many of the Haskellers seem to be effectively dealing with problems arising
from  that design choice.  

M.
Reply all
Reply to author
Forward
0 new messages