What is the SKI calculus?
Q: What's a combinator?
A: A function that operates on functions to produce a function.
Lambda calculus is the basis of Scheme and Haskell functions. It's based on substitution of parameters into function bodies. E.g.,
\x -> \y -> x
is a function that takes a value, and returns a function that takes a second value, but returns the first value. So, applying it to 1 and 2:
(\x -> \y -> x) 1 2
--> 1
In Haskell, this is called const.
It's easy to write functions in this way, and lambda has become popular, appearing in languages such as Ruby, Python, and C#.
But, when it comes to doing mathematical proofs about functions, it's a bit of a pain. You have to make up variable names, and you have to worry about clashes. For example, which x's in the following expression refer to the same thing?
\x -> \x -> x
In combinatory logic, you don't define functions with parameters, you just apply a small set of functions to each other. A complete set of functions that can be used to represent all computations is called a basis.
One such basis is the set of three functions S, K, and I: the SKI calculus.
I'll write them in uppercase to make them stand out, but in Haskell you'd
have to define them with lowercase letters.
I f = f
I is the identity function, called id in Haskell. It returns its argument.
K f g = f
K is the constant function, called const in Haskell. It takes two (curried) arguments and returns the first, ignoring the second argument.
S f g h = (f h) (g h)
S is the substitution function. It's a generalization of function application that lets you apply both the function and the argument to an extra parameter before applying the function to the argument.
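In Haskell, all three combinators are one-liners. Here's a sketch, with lowercase names since Haskell reserves uppercase for constructors:

```haskell
-- I: identity
i :: a -> a
i x = x

-- K: return the first argument, ignore the second
k :: a -> b -> a
k x _ = x

-- S: apply both f and g to h, then apply the results to each other
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)
```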
It helps for me to think of this in terms of functional reactive programming. I think of a function that is changing over time. Like one second it's adding .5, but a second later it has become a function to add 1.5. That's your f. Then imagine g is a value that's changing over time. Then at some specific point in time h, you want to apply the current snapshot of the function to the current snapshot of the value.
Except that, in combinatory logic, there are no numbers. There are only functions. Functions that take functions as arguments, and return functions, that take functions as arguments, and return functions. It is completely untyped. We will soon see how to recover a notion of number.
It may surprise you to find that S can also be found in the standard libraries of Haskell. It turns out to be the ((->) t) instance of the applicative functor operator <*>.
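You can check that equivalence directly. For the function applicative, (f <*> g) h is f h (g h), which is exactly S (check is my own name for the comparison here):

```haskell
-- S, written by hand
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

-- For the ((->) t) applicative, <*> behaves identically:
-- (f <*> g) h = f h (g h)
check :: Int -> Bool
check h = s (+) (*2) h == ((+) <*> (*2)) h
```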
You can pull lots of tricks with S. It's where all the power comes from. It may at first seem unsatisfying. Why is that bothersome extra h in the way? What about a function to swap argument order like flip in Haskell?
flip f x y = f y x
Well, you're not alone. Curry called flip C, and defined
B f g h = f (g h)
Schönfinkel called them T and Z, and also showed how they could be defined in terms of S and K.
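As a sketch of what such definitions look like in Haskell: the expressions below are the standard textbook ones built from S and K (not necessarily Schönfinkel's originals), with lowercase names.

```haskell
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

k :: a -> b -> a
k x _ = x

-- Curry's B (function composition), built from S and K
b :: (b -> c) -> (a -> b) -> a -> c
b = s (k s) k

-- Curry's C (flip), built from S, K, and b
c :: (a -> b -> c) -> b -> a -> c
c = s (s (k b) s) (k k)
```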
But give S a chance. Isn't it fun to be minimalist?
Let's try a sample reduction:
S I K S
I S (K S)
S (K S)
We can't reduce any further, because S needs three arguments to reduce, and only has one. Let's try another:
S K S K
K K (S K)
K
Well, that turned out nice and short!
But how exactly are you supposed to compute things with these functions, if they only return other functions? There are no datatypes whatsoever!
Think for a second about how computer memory works. Everything is stored in binary. Looking at a byte, you don't know if it's supposed to be a letter, a number, or part of a pointer to some other value. It's all untyped. The typing is all in the interpretation. If you expect something to return an integer, you regard the result as an integer. So maybe there is a way to encode numbers in combinatory logic?
Let's take a step back and look at a few ways the natural numbers can be encoded. Computer hardware uses a binary representation, so
[3] -> 00000011
The set-theoretic definition of a natural number is the set of all natural numbers smaller than itself. So,
[3] -> {0, 1, 2} -> {{}, {{}}, {{}, {{}}}}
Even though numbers are normally seen as values of all the same size, e.g. 32-bit numbers, it turns out you can treat numbers like a structured datatype. Let's make constructors for natural numbers:
data Nat = Zero | Succ Nat
Now we can translate the number three:
[3] -> Succ (Succ (Succ (Zero)))
Probably not very efficient, but it has a nice symmetry with our use of structured datatypes.
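In Haskell, with a hypothetical toInt helper to read the result back as an ordinary number:

```haskell
-- Peano naturals as a structured datatype
data Nat = Zero | Succ Nat deriving (Show, Eq)

-- count the Succ constructors to recover an Int
toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n
```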
But we can't do any of those things with these untyped functions. So Alonzo Church came up with a way of encoding numerals in the lambda calculus. We call these the Church numerals.
It's closely related to the Peano numerals. Note that Succ is a function that takes a number and returns its successor. What if we abstracted out the constructors, just treated them as arbitrary functions?
zero s z = z
one s z = s z
two s z = s (s z)
three s z = s (s (s z))
...
These are the Church numerals. Feed in Succ and Zero as parameters, and you get back the Peano numerals!
three Succ Zero = Succ (Succ (Succ Zero))
Alternately, you could apply it to (+1) and 0:
three (+1) 0 = ((+1) ((+1) ((+1) 0))) = 3
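The numerals above are already valid Haskell; all they need is a type signature:

```haskell
-- Church numerals: n is "apply s to z, n times"
zero, one, two, three :: (a -> a) -> a -> a
zero  _ z = z
one   s z = s z
two   s z = s (s z)
three s z = s (s (s z))
```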
(Nerd note: the Church numerals are catamorphisms over the natural numbers: each one is a fold that has already been applied to a number, and is waiting for an F-algebra as parameters.)
Because the Church numerals are purely functions and don't rely on datatypes, we can represent them in the lambda calculus.
And anything you can represent in the lambda calculus, you can also represent with the SKI-calculus.
So how can we represent zero? It needs to take two curried arguments, and return the second.
zero = K I
K I s z
I z
z
By passing I to K, we can skip the first argument, instead of the second!
Now how about one?
one = I
I s z
s z
Even simpler!
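Both encodings can be checked in Haskell (zeroC and oneC are names I've made up to avoid clashing with the numerals above):

```haskell
i :: a -> a
i x = x

k :: a -> b -> a
k x _ = x

-- zero = K I: discard s, return z
zeroC :: (a -> a) -> a -> a
zeroC = k i

-- one = I: apply s to z once
oneC :: (a -> a) -> a -> a
oneC = i
```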
Two is a lot trickier. Somehow we need to take an argument, and make it repeat itself. Note again the definition of S:
S f g h = f h (g h)
S is the only combinator that lets you repeat something! It repeats the h. But we don't want to apply s to s, we want to apply it to s applied to z. Hmm. So what we want to do is compose s with itself.
Let's break it down into two pieces: function composition, and duplication.
For function composition, we want a function . such that
(f . g) h = f (g h)
In other words, it's shifting the parenthesization from the (implicit) left-hand side to the right-hand side. Well, S has parentheses on the right:
S f g h
f h (g h)
But there's that problematic h on the left. How can we get rid of the function in the second position? K!
K f h (g h)
f (g h)
So our answer is K S, right? Not so fast:
K S f g h
S g h
We want... S K?
S K f g h
K g (f g) h
No, it's not even getting to the h! We need K to be applied to f first:
S (K f) g h
K f h (g h)
f (g h)
Yeah. Make that K eat the first h. But... how do we get there? We can't just stick the f in with the K like that. It looks like we need the . operator to solve our problem!
(S . K) f g h
S (K f) g h
So we are stuck needing . to write .! But it turns out that we can take advantage of the fact that we know what functions we are trying to compose.
Consider the following, where E1 and E2 are known combinatory expressions:
S (K E1) E2 f
K E1 f (E2 f)
E1 (E2 f)
So, we can't just shove E1 in there in the case it's some parameter, but we can do it when we know in advance what it's supposed to be. OK, let's apply that, where E1 = S and E2 = K:
S (K S) K f g h
K S f (K f) g h
S (K f) g h
K f h (g h)
f (g h)
Voilà! So S (K S) K is our composition operator:
f . g = S (K S) K f g
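You can check the derived composition operator, S (K S) K, in Haskell (compose is my name for it):

```haskell
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

k :: a -> b -> a
k x _ = x

-- composition built only from S and K
compose :: (b -> c) -> (a -> b) -> a -> c
compose = s (k s) k
```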
How can we use our handy-dandy S combinator to replicate arguments? Consider again the definition of S:
S f g h = f h (g h)
What if g were I? Then we'd get f h h. That's great... except we can't control g... we can only substitute predefined arguments starting on the left.
But wait... what if f were the composition operator? After all, what we want to do is compose h with itself. So,
S (.) I h
(.) h (I h)
(.) h h
h . h
Great! Now all we need to do is expand . into our definition:
S (S (K S) K) I s z
S (K S) K s (I s) z
S (K S) K s s z
K S s (K s) s z
S (K s) s z
K s z (s z)
s (s z)
And we have our definition for the Church numeral two!
two = S (S (K S) K) I
Wow, that was complicated. Fortunately, it turns out all the other numerals just involve repeating the magical S (S (K S) K).
three = S (S (K S) K) (S (S (K S) K) I)
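Both derivations type-check and run in Haskell:

```haskell
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

k :: a -> b -> a
k x _ = x

i :: a -> a
i x = x

-- two = S (S (K S) K) I
two :: (a -> a) -> a -> a
two = s (s (k s) k) i

-- three: wrap another S (S (K S) K) around two
three :: (a -> a) -> a -> a
three = s (s (k s) k) two
```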
You may have noticed that the SKI calculus is verbose compared to the lambda calculus. That's the down side. Due to the smaller number of symbols, it requires lengthier encodings for the same functions.
But just like with the lambda calculus, you can combine functions and give the new functions their own names, to ease the notational burden.
Although SKI-calculus is untyped, we have seen that we can regard output as representing particular values. Church numerals can be converted to Peano numerals just by applying them to Succ and Zero.
Reverse application:
S (K (S I)) K f g
K (S I) f (K f) g
S I (K f) g
I g (K f g)
g (K f g)
g f
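The reverse-application combinator happens to be typeable in Haskell (rev is my name for it):

```haskell
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

k :: a -> b -> a
k x _ = x

i :: a -> a
i x = x

-- reverse application: rev x f = f x
rev :: a -> (a -> b) -> b
rev = s (k (s i)) k
```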
Self-application:
S I I f
I f (I f)
f f
Crazy infinite loop:
S I I (S I I)
I (S I I) (I (S I I))
S I I (S I I)
Y combinator for ultimate recursion power:
Y = S (K (S I I)) (S (S (K S) K) (K (S I I)))
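S I I and Y have no Hindley-Milner type (f f can't be typed), so they can't be written directly as ordinary Haskell functions. But the equation Y satisfies, Y f = f (Y f), can be sketched with plain recursion:

```haskell
-- a fixed-point operator with Y's defining equation
fix' :: (a -> a) -> a
fix' f = f (fix' f)

-- recursion recovered without the function naming itself in its body
factorial :: Integer -> Integer
factorial = fix' (\rec n -> if n <= 1 then 1 else n * rec (n - 1))
```

(The standard library already provides this as fix in Data.Function.)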
Since the SKI calculus and lambda calculus are equivalent, it turns out there is a straightforward transformation from one to the other, and vice versa.
So if we can write something as a function in lambda calculus, we can mechanically translate it to SKI.
So you thought it was fun doing everything with only three combinators, right? It turns out I isn't necessary. Consider:
S K K f
K f (K f)
f
So S K K is the same as I! I is just a convenience. You may also notice that the second K was ignored, so S K S works just as well.
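A quick check in Haskell:

```haskell
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g h = f h (g h)

k :: a -> b -> a
k x _ = x

-- I recovered as S K K (the second K is never used)
identity :: a -> a
identity = s k k
```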
It turns out we can do it all with just one combinator! Chris Barker invented a combinator called iota, or ι, that can be used to generate S and K.
Sort of like how the square root of -1, when multiplied by itself, gives you -1, iota applied to itself gives you I! Here, I'll refer to iota as the X combinator for easier reading.
I = X X
Here's the definition of X:
X f = f S K
That's it. It feeds S and K to a function as arguments. Let's see what happens when we keep chaining S and K:
S K S K
K K (S K)
K
So, X (X (X X)) = K! (X (X X) = X I = I S K = S K, and X (S K) is exactly the S K S K we just reduced.) Let's keep going:
K S K
S
So, X (X (X (X X))) = S! We have recovered the original combinators from self-application of X. Keep going:
S S K
Well, that can't be reduced. One more cycle:
S S K S K
S S (K S) K
S K (K S K)
Hmm... that looks complicated. But wait: we know that S K followed by any function is I! So it just reduces to I.
We have a five-step rotation:
X X = I
X I = S K
X (S K) = K
X K = S
X S = S S K
X (S S K) = I
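Iota itself can't be typed in plain Haskell either (X X applies a function to itself), but we can check the rotation in an untyped universe. Term, app, and unV are names invented for this sketch:

```haskell
-- an untyped universe: either a function, or a base value we can observe
data Term = F (Term -> Term) | V Int

app :: Term -> Term -> Term
app (F f) t = f t
app (V _) _ = error "applied a base value"

unV :: Term -> Int
unV (V n) = n
unV (F _) = error "not a base value"

sT, kT, xT :: Term
sT = F $ \f -> F $ \g -> F $ \h -> (f `app` h) `app` (g `app` h)
kT = F $ \a -> F $ \_ -> a
xT = F $ \f -> (f `app` sT) `app` kT   -- iota: X f = f S K
```

Applying xT to itself behaves as I, and three or four applications recover K and S, matching the rotation table.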
You can play with iota in Javascript on Barker's page.
Here is a step-by-step translation of iota to SKI:
\f -> (f S) K
S (\f -> f S) (\f -> K)
S (\f -> f S) (K K)
S (S (\f -> f) (\f -> S)) (K K)
S (S I (K S)) (K K)
Here is a proof that it works:
S (S I (K S)) (K K) f
S I (K S) f (K K f)
I f (K S f) (K K f)
f (K S f) (K K f)
f S (K K f)
f S K
Barker didn't stop there. He thought, don't all those parentheses make reasoning a bit hard? You need a parser to read them out. So he also invented Jot, a representation with two combinators, and no parentheses.
To facilitate writing things in binary, he called the combinators 0 and 1. The 0 combinator is just iota.
0 f = f S K
The 1 combinator is just function composition!
1 f g h = f (g h)
Just for fun, the empty string is defined to be I.
Here's how to encode K in Jot. I'll put brackets around the Jot code, and incrementally evaluate it into SKI.
[11100]
[1110]SK
[111]SKSK
[11](SK)SK
[1](SKS)K
[](SKSK)
ISKSK
SKSK
K
We know that last step from the reasoning above. And here's S:
[11111000]
[1111100]SK
[111110]SKSK
[11111]SKSKSK
[1111](SK)SKSK
[111](SKS)KSK
[11](SKSK)SK
[1](SKSKS)K
[](SKSKSK)
SKSKSK
S
It may seem like the 1s are unnecessary, but they are needed if you are nesting these inside some larger expression, to maintain grouping.
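As a final sketch, the whole Jot scheme can be run with the same kind of untyped encoding (again, Term, app, unV, and jot are invented names; the evaluation rules are the ones above: the empty program is I, a trailing 0 applies the program so far to S and then K, and a trailing 1 composes with the program so far):

```haskell
data Term = F (Term -> Term) | V Int

app :: Term -> Term -> Term
app (F f) t = f t
app (V _) _ = error "applied a base value"

unV :: Term -> Int
unV (V n) = n
unV (F _) = error "not a base value"

sT, kT, iT :: Term
sT = F $ \f -> F $ \g -> F $ \h -> (f `app` h) `app` (g `app` h)
kT = F $ \a -> F $ \_ -> a
iT = F $ \a -> a

-- Jot: [] = I, [w0] = [w] S K, [w1] = S (K [w])
jot :: String -> Term
jot = foldl step iT
  where
    step t '0' = (t `app` sT) `app` kT
    step t '1' = sT `app` (kT `app` t)
    step _ c   = error ("not a Jot digit: " ++ [c])
```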