Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Phantom type representation of integers as differences of natural numbers

12 views
Skip to first unread message

vesa.k...@cs.helsinki.fi

unread,
Mar 4, 2006, 8:47:00 PM3/4/06
to
Matthias Blume's comment that

``There is still no type that, for example, would force two otherwise
arbitrary arrays to differ in size by exactly one.''

in his article

No-Longer-Foreign: Teaching an ML compiler to speak C ``natively''
(http://citeseer.ist.psu.edu/blume01nolongerforeign.html)

led me to experiment with phantom type number representation
techniques. In this article I will describe a very simple phantom type
representation of integers as differences of natural numbers that
allows us to assert, for example, that the length of the concatenation
of vectors (or arrays) of length n and m is of length n+m. Still, this
technique is by no means a panacea.

It would surprise me if this simple technique has not been discovered
earlier. However, after skimming the page

http://www.haskell.org/tmrwiki/NumberParamTypes

I got the impression that the technique may not at least be widely
known. At any rate, it was quite fun to (re)discover the technique!

...

It is probably well known that one can represent the natural numbers
as, essentially, the Peano axioms:

structure N :> (* N stands for Natural number *)
sig
eqtype z (* zero *)
eqtype 'a s (* successor of 'a *)
eqtype 'a n (* assertion that 'a is a "number" *)
val z : z n (* constant 0 *)
val s : 'a n -> 'a s n (* successor function *)
val p : 'a s n -> 'a n (* predecessor function *)
val toInt : 'a n -> int (* conversion to integer *)
end =
struct
type z = unit
type 'a s = unit
type 'a n = int
val z = 0
fun s n = n+1
fun p n = n-1
fun toInt n = n
end

This representation allows us to express constraints such that one
vector must have exactly n (where n is any natural number) more
elements than another array. For example, with an appropriate
definition for the `n_vector' type constructor, the specification

val foo : ('a N.n, 'b) n_vector -> ('a N.s N.s N.s N.n, 'b) n_vector

would essentially assert that `foo' takes a vector of any length l and
produces a vector of length l+3. Similarly,

val bar : ('a N.s N.s N.s N.n, 'b) n_vector -> ('a N.n, 'b) n_vector

would assert that `bar' takes a vector of any length l+3 and produces a
vector of length l.

However, this is pretty much the extent of the representation. The
problem is that the only operation that can be done in a constant
number of unification steps on the above representation is incrementing
or decrementing by a constant. In order to allow more general
constraints to be expressed, we must have a more "expressive" or,
rather, a more "efficient" representation of numbers. In particular,
the representation must make it possible to perform more complicated
arithmetic operations in a constant number of unification steps.

Thinking about the problem (sleeping on it a couple of nights), it
occured to me that this is exactly the kind of problem that is solved
by difference lists in logic programming and that the Hindley-Milner
type system can be seen as a very restricted logic programming
language.

The basic idea of difference lists is to represent a list as a
difference of two lists. As a consequence of the representation, two
difference lists can be concatenated in O(1) time. You can find
explanations of difference lists from various sources (try google) and
many books, for example the book The Art of Prolog, 2nd ed., by Leon
Sterling and Ehud Shapiro explains the technique. So, I will not
explain the technique further.

Applying the idea of "difference structures" to the problem of
representing integers one fairly mechanically arrives at the following
kind of representation of integers as differences of natural numbers:

structure Z :> (* Z stands for "Zahlen" *)
sig
eqtype 'a s (* successor of 'a *)
eqtype ('a, 'b) z (* difference 'a-'b *)
val z : ('a, 'a) z (* constant 0 ("zero") *)
val u : ('a s, 'a) z (* constant 1 ("unit") *)
val ~ : ('a, 'b) z -> ('b, 'a) z (* negation function *)
val + : ('a,'b) z * ('b,'c) z -> ('a,'c) z(* sum fn *)
val - : ('a,'b) z * ('c,'b) z -> ('a,'c) z(* difference fn *)
val toInt : ('a, 'b) z -> int (* conversion to int *)
end =
struct
type 'a s = unit
type ('a, 'b) z = int
val z = 0
val u = 1
val ~ = Int.~
val op+ = Int.+
val op- = Int.-
fun toInt n = n
end

The specification of Z.~ can be read as

~(a-b) = (b-a) ,

the specification of Z.+ can be reads as

(a-b) + (c-b) = (a-c) ,

and the specification of Z.- can be read as

(a-b) - (b-c) = (a-c) .

Note that the module Z is not "minimal". For example, the value Z.z
can be obtained by Z.- (Z.u, Z.u) and one of {Z.+, Z.-, Z.~} can be
implemented in terms of the other two operations and the constant Z.z.

Due to the "efficient" sum and difference functions, we can now express
equality constraints that involve sums and differences of integers.
OTOH, an obvious drawback is that we are using integers (which can be
negative) rather than natural numbers (which are non-negative).

As an example, here is a *simplistic* N_Vec implementation:

signature N_VEC =
sig
type ('n, 'a) n_vec

val fromList : ('n,'m) Z.z * 'a list -> (('n,'m) Z.z,'a) n_vec
val length : (('n, 'm) Z.z, 'a) n_vec -> ('n, 'm) Z.z
val toList : (('n, 'm) Z.z, 'a) n_vec -> 'a list
end

structure N_Vec :> N_VEC =
struct
type ('n, 'a) n_vec = 'n * 'a vector

fun fromList (n, l) =
let val a = Vector.fromList l
in if Z.toInt n<>Vector.length a then raise Size else (n,a)
end
fun length (n, _) = n
fun toList (_, a) = rev (Vector.foldl op:: [] a)
end

As you can see, the fromList function performs a run-time check of the
array size. This is not unlike the Haskell technique described on the
page

http://www.haskell.org/tmrwiki/NumberParamTypes .

However, the implementation also carries an integer value, which is not
needed in the Haskell technique. We could get rid of the extra integer
value by providing a non-abstract (and consequently unsafe) view of the
Z module, exposing the fact that

type ('a, 'b) z = int ,

for situations like this where we would like to cast a run-time integer
to a compile-time integer.

Let's then implement some further operations on N_Vecs.

open N_Vec

fun append (a, b) =
fromList (Z.+ (length a, length b), toList a @ toList b)
fun split (i, a) =
(fromList (i, List.take (toList a, Z.toInt i)),
fromList (Z.- (length a, i), List.drop (toList a, Z.toInt i)))
fun rotate (n, a) = append ((fn (a,b) => (b,a)) (split (n, a)))

Type inference yields the following types

val append = fn
: (('a,'b) Z.z,'c) n_vec * (('b,'d) Z.z,'c) n_vec
-> (('a,'d) Z.z,'c) n_vec
val split = fn
: ('a,'b) Z.z * (('c,'b) Z.z,'d) n_vec
-> (('a,'b) Z.z,'d) n_vec * (('c,'a) Z.z,'d) n_vec
val rotate = fn
: ('a,'b) Z.z * (('c,'b) Z.z,'d) n_vec -> (('c,'b) Z.z,'d) n_vec

which, as one can verify, do express the desired assertions, although
some checks are performed at run-time.

(It is getting a bit late, so I'll stop here.)

-Vesa Karvonen

Vesa Karvonen

unread,
Mar 5, 2006, 4:12:58 AM3/5/06
to
vesa.k...@cs.helsinki.fi wrote:
[...]

> val + : ('a,'b) z * ('b,'c) z -> ('a,'c) z(* sum fn *)
> val - : ('a,'b) z * ('c,'b) z -> ('a,'c) z(* difference fn *)
[...]

> the specification of Z.+ can be reads as

> (a-b) + (c-b) = (a-c) ,

> and the specification of Z.- can be read as

> (a-b) - (b-c) = (a-c) .

[...]

I guess I was already sleeping when I wrote the formulas. The correct
formulas are obviously

(a-b) + (b-c) = (a-c)

for Z.+, and

(a-b) - (c-b) = (a-c)

for Z.-.

Also, I should point out another (fairly obvious) drawback of this
technique (when used in SML). Consider the expression

let val two = Z.+ (Z.u, Z.u) in Z.+ (two, two) end

that fails to type check. The problem is that the type variables in
the type of `two' aren't generalized (due to value restriction).

-Vesa Karvonen

0 new messages