"First presentation" of type theory: replace lambda-abstraction by defining equation?


Sunrise Confusion

Sep 3, 2018, 3:17:56 PM9/3/18
to HoTT Cafe
I am referring to section A.1 of the HoTT book, i.e. the "first presentation" of type theory using defining equations. This presentation is based on the lambda-calculus, and it treats functions introduced by lambda-abstraction and functions introduced by recursors on somewhat different footings: the former get their own term-forming operation (lambda x. B), while the latter are implied by rules of the form "if ..., then we can introduce a defined constant ...".

Is there a problem with throwing away lambda-abstraction as a term-forming operation and instead defining all functions via "defining equations"? We could then still introduce lambda-abstraction as syntactic sugar (as a shorthand for defining a new function and referencing it). Besides my original motivation, it seems to me that, if it works, this would simplify the definitions and discussion in A.1 and A.4 a bit, since one no longer needs to discuss the computation rule of the lambda-calculus etc. separately from the rules for defined constants. I have appended a patch to make this more precise. Is this done somewhere in the literature?

I also have the following possibly related question about the definition of convertibility in section A.1. Shortly before A.1.1, the book says that "Convertibility between terms t and t' is the equivalence relation generated by the defining equations for constants, ...". Does this mean that, if all defining equations for two constants c and c' are the same, we should introduce a convertibility between c and c' (the judgmental analog of function extensionality)? This seems to be done in the rest of the book (e.g. in the structural rules of the second presentation). Is it implied by the statement here as well, or is there an argument that we don't need it for functions defined by recursors? If it is implied, I think a sentence clarifying this would help.
On the other hand, when is such a rule actually necessary, or when does it shorten a proof, if the final judgment we want to derive has the form a : A (i.e. is a statement of constructive mathematics) rather than a == b : A?

Patch:

diff --git a/formal.tex b/formal.tex
index b22593b..ab6a7ec 100644
--- a/formal.tex
+++ b/formal.tex
@@ -40,8 +40,7 @@ inference---the manner in which judgments can be derived from other judgments.
 In this appendix, we present two formulations of Martin-L\"{o}f type
 theory, and of the extensions that constitute homotopy type theory.
 The first presentation (\cref{sec:syntax-informally}) describes the syntax of
-terms and the forms of judgments as an extension of the untyped
-$\lambda$-calculus, while leaving the rules of inference informal.
+terms and the forms of judgments while leaving the rules of inference informal.
 The second (\cref{sec:syntax-more-formally}) defines the terms, judgments,
 and rules of inference inductively in the style of natural deduction, as
 is customary in much type-theoretic literature.
@@ -137,7 +136,7 @@ whose purpose is to express the fact that $x$ is ``local'' to $B$, i.e., it
 is not to be confused with other occurrences of $x$ appearing
 elsewhere. Bound variables are familiar to programmers, but less so to mathematicians.
 Various notations are used for binding, such as $x \mapsto B$,
-$\lam x B$, and $x \,.\, B$, depending on the situation. We may write $C[a]$ for the
+$\lam x B$, $x \,.\, B$, and $f(x) \defeq B$, depending on the situation. We may write $C[a]$ for the
 substitution of a term $a$ for the variable in the abstracted expression, i.e.,
 we may define $(x.B)[a]$ to be $B[a/x]$. As discussed in
 \cref{sec:function-types}, changing the name of a bound variable everywhere
 within an expression (``$\alpha$-conversion'')
@@ -160,7 +159,7 @@ consisting of the expressions $A_{i+1},
 \label{sec:syntax-informally}
 
 The objects and types of our type theory may be written as terms using
-the following syntax, which is an extension of $\lambda$-calculus with
+the following syntax, which is a calculus with
 \emph{variables} $x, x',\dots$,
 \index{variable}%
 \emph{primitive constants}
@@ -170,14 +169,13 @@ $c,c',\dots$, \emph{defined constants}\index{constant!defined} $f,f',\dots$, and operations
 %
 \[
- t \production x \mid \lam{x} t \mid t(t') \mid c \mid f
+ t \production x \mid t(t') \mid c \mid f
 \]
 %
 The notation used here means that a term $t$ is either a variable $x$, or it
-has the form $\lam{x} t$ where $x$ is a variable and $t$ is a term, or it has
-the form $t(t')$ where $t$ and $t'$ are terms, or it is a primitive constant
-$c$, or it is a defined constant $f$. The syntactic markers '$\lambda$', '(',
-')', and '.' are punctuation for guiding the human eye.
+has the form $t(t')$ where $t$ and $t'$ are terms, or it is a primitive constant
+$c$, or it is a defined constant $f$. The syntactic markers '(' and
+')' are punctuation for guiding the human eye.
 
 We use $t(t_1,\dots,t_n)$ as an abbreviation for the repeated application
 $t(t_1)(t_2)\dots (t_n)$. We may also use \emph{infix}\index{infix notation} notation, writing $t_1\;
@@ -221,18 +219,13 @@ of a function on a recursive data type has been called a \define{definition by
 \index{convertibility of terms}%
 \index{term!convertibility of}%
 $t \conv t'$ between terms $t$
-and $t'$ is the equivalence relation generated by the defining equations for constants,
-the computation rule\index{computation rule!for function types}
-%
-\[
- (\lam{x} t)(u) \defeq t[u/x],
-\]
-%
-and the rules which make it a \emph{congruence} with respect to application and $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction}:
+and $t'$ is the equivalence relation generated by the defining equations for
+constants (, an equivalence of any two defined constants whose defining equations
+are equivalent,?) and the rule which makes it a \emph{congruence} with respect to
+application:
 %
 \begin{itemize}
-\item if $t \conv t'$ and $s \conv s'$ then $t(s) \conv t'(s')$, and
-\item if $t \conv t'$ then $(\lam{x} t) \conv (\lam{x} t')$.
+\item If $t \conv t'$, and $s \conv s'$, then $t(s) \conv t'(s')$.
 \end{itemize}
 \noindent
 The equality judgment $t \jdeq u : A$ is then derived by the following single rule:
@@ -282,11 +275,10 @@ The following conversion rule allows us to replace a type by one equal to it in
 
 We introduce a primitive constant $c_\Pi$, but write
 $c_\Pi(A,\lam{x} B)$ as $\tprd{x:A}B$. Judgments concerning
-such expressions and expressions of the form $\lam{x} b$ are introduced by the following rules:
+such expressions are introduced by the following rules:
 %
 \begin{itemize}
 \item if $\Gamma \vdash A:\UU_n$ and $\Gamma,x:A \vdash B:\UU_n$, then $\Gamma \vdash \tprd{x:A}B : \UU_n$
-\item if $\Gamma, x:A \vdash b:B$ then $\Gamma \vdash (\lam{x} b) : (\tprd{x:A} B)$
 \item if $\Gamma\vdash g:\tprd{x:A} B$ and $\Gamma\vdash t:A$ then $\Gamma\vdash g(t):B[t/x]$
 \end{itemize}
 %
@@ -300,10 +292,15 @@ Using non-dependent function types and leaving implicit the context $\Gamma$, th
 %
 \begin{itemize}
 \item if $A:\UU_n$ and $B:A\to\UU_n$, then $\tprd{x:A}B(x) : \UU_n$
-\item if $x:A \vdash b:B$ then $ \lam{x} b : \tprd{x:A} B(x)$
 \item if $g:\tprd{x:A} B(x)$ and $t:A$ then $g(t):B(t)$
 \end{itemize}
 %
+If $\Gamma, x:A \vdash b:B$, then we can introduce a defined constant
+$f:(\tprd{x:A} B)$, with $f$ also written as $\lam{x} b$,
+with the defining equation
+\[
+ f(x) \defeq b.
+\]
 
 \subsection{Dependent pair types (\texorpdfstring{$\Sigma$}{Σ}-types)}
 
@@ -1048,13 +1045,7 @@ In $\ind{\Sn^1}$, $x$ is bound in $C$. The notation ${\dpath C \lloop b b}$ for
 This section discusses the meta-theoretic properties of the type theory presented in
 \cref{sec:syntax-informally}, and similar results hold for
 \cref{sec:syntax-more-formally}. Figuring out which of these still hold when we
 add the features from \cref{sec:hott-features} quickly leads to open
 questions,\index{open!problem} as discussed at the end of this section.
 
-Recall that \cref{sec:syntax-informally} defines the terms of type theory as
-an extension of the untyped $\lambda$-calculus. The $\lambda$-calculus
-has its own notion of computation, namely the computation rule\index{computation rule!for function types}:
-\[
- (\lam{x} t)(u) \defeq t[u/x].
-\]
-This rule, together with the defining equations for the defined constants form
+The defining equations for the defined constants form
 \emph{rewriting rules}\index{rewriting rule}\index{rule!rewriting} that
 determine reduction steps for a rewriting
 system. These steps yield a notion of computation in the sense that each rule
 has a natural direction: one simplifies $(\lam{x} t)(u)$ by evaluating the
@@ -1112,7 +1103,7 @@ can explicitly describe all normal forms:
 The terms in normal form can be described by the following syntax:
 %
 \begin{align*}
- v & \production k \mid \lam{x} v \mid c(\vec{v}) \mid f(\vec{v}), \\
+ v & \production k \mid c(\vec{v}) \mid f(\vec{v}), \\
 k &\production x \mid k(v) \mid f(\vec{v})(k),
 \end{align*}
 %

Michael Shulman

Sep 4, 2018, 12:34:10 AM9/4/18
to hobbye...@gmail.com, HoTT Cafe
On Mon, Sep 3, 2018 at 12:17 PM Sunrise Confusion
<hobbye...@gmail.com> wrote:
> Is there a problem with throwing away lambda abstraction as a term forming
> operation, and instead defining all functions using "defining equations"? Then
> we can still introduce lambda-abstraction as syntactic sugar (as a
> shorthand for defining a new function, and referencing it).

I think you can sort of do this, but you have to be more careful about
where and how "defining new constants" is allowed. The way the system
in A.1 is presented, it appears that the collection of constants and
their defining equations is a parameter of the theory, i.e. it is
fixed once and for all at the beginning. This already doesn't match
the way "defining a function by equations" is used in the main body of
the text, where we introduce *new* functions with their defining
equations all the time. But if you are trying to do away with
lambda-abstraction completely, then I think you also need to be able
to define new constants "in context" and "inside a term". That is, if
a lambda-abstraction is constructed on-the-fly as an argument to some
other function, maybe even inside *another* lambda-abstraction:

H (\lambda x. K (\lambda y. y+x))

then you need to translate that to something like

H f
where f x = K g
where g y = y + x

(which is pretty close to actual Agda syntax), i.e. introducing a new
function f "locally" by a defining equation and then *inside* the body
of f, in particular inside the scope of the variable x, introducing
another new function g by a defining equation. This is essentially
the same as let-binding a lambda abstraction:

let f := (\lambda x. let g := (\lambda y. y + x) in K g) in H f

One would then have to give rules for manipulating such "local
definitions", which I don't think would end up being any simpler than
the rules for lambda-abstractions. (Moreover, lambda-notation is
universally used in type theory, so even if there were a slightly
simpler alternative it wouldn't be helpful to insulate the student
from it.)
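To make the equivalence of these three renderings concrete, here is a small sketch in Python (the definitions of H and K and the numeric arguments are made up purely so the example runs; they stand in for arbitrary terms):

```python
# Hypothetical H and K, chosen only so the example executes.
def K(h):
    return h(10)   # apply the argument function at 10

def H(f):
    return f(3)    # apply the argument function at 3

# 1. Lambda-abstractions constructed on the fly, one nested in the other:
r1 = H(lambda x: K(lambda y: y + x))

# 2. "Local definitions": f introduced by a defining equation, and g
#    introduced by another defining equation *inside* the scope of x.
def f(x):
    def g(y):
        return y + x
    return K(g)

r2 = H(f)

assert r1 == r2 == 13
```

Both renderings compute the same value, which is exactly the point: the local defining equations carry the same information as the nested lambdas, including the capture of the outer variable x inside g.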

> I also have the following possibly related question on the definition of
> convertibility in section A.1. The book says shortly before A.1.1 that
> "Convertibility between terms t and t' is the equivalence relation generated by
> the defining equations for constants,...". Does this mean that, if all defining
> equations for two constants c, c' are the same, we should introduce an
> equivalence between c and c' (the judgmental analog of function
> extensionality)?

I believe that this would follow, at least for functions defined by
single equations, if we add eta-conversion to the system in section
A.1 (which should really be done, since eta-conversion is used in the
main text of the book and is also present in the presentation of A.2).
For then if f and g are defined by equations

f(x) = M
g(x) = M

where M is some expression involving x, then we would have

f = (\lambda x. f(x)) = (\lambda x. M) = (\lambda x. g(x)) = g.
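
For single-equation definitions this chain of conversions is easy to check in a proof assistant whose definitional equality includes eta; a sketch in Lean 4 (the body `x + x` is an arbitrary example):

```lean
-- Two constants with the same single defining equation.
def f (x : Nat) : Nat := x + x
def g (x : Nat) : Nat := x + x

-- With eta for functions, both unfold to `fun x => x + x`,
-- so reflexivity proves the (judgmental) equality:
example : f = g := rfl
```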

For functions defined by multiple equations using recursion/induction,
things are less clear. Most of the book I believe takes the
perspective of A.2, that defining functions by equations is really
just syntactic sugar for defining them to equal lambda-abstractions or
recursor-expressions, in which case two functions defined by the same
equations would indeed be judgmentally equal. However, the
presentation in A.1 does seem to be unclear about what attitude is
taken towards this question.

Mike