Bitvectors of statically-checked width: decimal numbers as types

ol...@pobox.com

unread,

Feb 20, 2002, 8:53:56 PM2/20/02

to

The problem is to define a datatype of bitvectors of a
statically-specified and *statically-checked* width. That is, we would
like to ensure that both
BitVector (-1) ...
and (BitVector 10 v) `b_and` (BitVector 11 v)

were _type_ errors. Attempts to specify a negative width and attempts
to `and` two bitvectors of different lengths should be flagged as
errors at compile time. We would also like to avoid Church numerals
when specifying dimensions of the vector [1]. In short, we would like
to
have a _type_ of decimal numbers. For the current application, we
don't actually need to perform any operations on 'decimal types' other
than comparison (which the typechecker does automatically for us).

The present article shows the solution. The idea belongs to Matthias
Blume, who wrote a similar code in SML. The present article is a rough
translation of that code (of whatever I remember from that code, to be
precise).

The code starts with boring details. Therefore, it is relegated to the
Appendix. The code defines classes of digits and of cardinals
(non-negative numbers):

class Card c where
c2Int:: (Num a) => c -> a -- convert to a number

Single-digit, two-digit and three-digit decimal numbers are declared
cardinals.

-- two-digit numbers are also cardinal numbers
instance (Digit d1,Digit d2) => Card (d1,d2) where
c2Int (d1,d2) = 10*(d2Int d1) + (d2Int d2)

-- three-digit numbers are also cardinal numbers
instance (Digit d1,Digit d2,Digit d3) => Card (d1,d2,d3) where
c2Int (d1,d2,d3) = 100*(d2Int d1) + 10*(d2Int d2) + (d2Int d3)

Note that we're talking about decimal number _types_ rather than
decimal number values.

We are ready to define a BitVector of statical width:

type Value = Integer
data (Card c) => BitVector c = BV c Value

bv_val (BV c v) = v
bv_width (BV c v) = c2Int c

We make our BitVector a member of Show so we can print bitvectors
nicely:

instance (Card c) => Show (BitVector c) where
show (BV c val) = "Bitvector of width " ++ (show $ c2Int c) ++ "
and value " ++ show val

Examples:
Main> BV D1 0
Bitvector of width 1 and value 0
Main> BV (D1,D2) 123
Bitvector of width 12 and value 123
Main> BV (D5,D7,D9) 31415
Bitvector of width 579 and value 31415

Let's generalize and define a class Bitable, of abstract bit fields:

class Bitable c where
b_and:: c -> c -> c
b_or:: c -> c -> c
b_not:: c -> c
b2Int:: (Num a) => c -> a
b_width:: c -> Int

and its instances Bool and BoolL [Bool]. Bool is a concrete bit field
of
width 1. The list of booleans is a concrete representation of
arbitrary bit fields.

Main> bv_to_booll $ BV (D1,D0) 10
[False,False,False,False,False,False,True,False,True,False]
Main> b2Int $ BoolL $ bv_to_booll $ BV (D1,D0) 10
10

Our static bitvector is also an implementation of a bit field, with a
compacted value:

instance (Card c) => Bitable (BitVector c) where
b_and bv1@(BV c1 v1) bv2@(BV c2 v2) =
BV c1 $ b2Int $ b_and (BoolL $ bv_to_booll bv1)
(BoolL $ bv_to_booll bv2)
b2Int = fromInteger . bv_val
b_width = bv_width

Thus, we have two implementations of arbitrary bit fields: as list of
booleans and as of bitvectors:

Main> (BoolL [False,True,True]) `b_and` (BoolL [False,False,True])
BoolL [False,False,True]
Main> (BV (D1,D0) 10) `b_and` (BV (D1,D0) 9)
Bitvector of width 10 and value 8

These two implementations have rather different static property. It's
OK to AND two bit lists of different widths:

Main> (BoolL [False,True,True]) `b_and` (BoolL [True])
BoolL [False]

However, if we try
Main> (BV (D1,D0) 10) `b_and` (BV (D1,D1) 7)
we get
ERROR - Type error in application
*** Expression : b_and (BV (D1,D0) 10) (BV (D1,D1) 7)
*** Term : BV (D1,D0) 10
*** Type : BitVector (D1,D0)
*** Does not match : BitVector (D1,D1)

This is a compile-time, TYPE error!

[1] The solution utilizing Church numerals as number types is given
in
http://pobox.com/~oleg/ftp/Haskell/number-parameterized-types.txt

Appendix. The complete code.
-- Types and constructors for individual digits
-- Depending on the context, D5 stands either for the type for numeral
-- 5, or for the data constructor for numeral 5

data D0 = D0
data D1 = D1
data D2 = D2
data D3 = D3
data D4 = D4
data D5 = D5
data D6 = D6
data D7 = D7
data D8 = D8
data D9 = D9

class Digit d where -- class of digits
d2Int:: (Num a) => d -> a -- convert to a number

instance Digit D0 where
d2Int _ = 0
instance Digit D1 where
d2Int _ = 1
instance Digit D2 where
d2Int _ = 2
instance Digit D3 where
d2Int _ = 3
instance Digit D4 where
d2Int _ = 4
instance Digit D5 where
d2Int _ = 5
instance Digit D6 where
d2Int _ = 6
instance Digit D7 where
d2Int _ = 7
instance Digit D8 where
d2Int _ = 8
instance Digit D9 where
d2Int _ = 9

-- Class of cardinal numbers
class Card c where
c2Int:: (Num a) => c -> a -- convert to a number

-- Single-digit numbers are cardinal numbers
instance Card D0 where
c2Int _ = 0
instance Card D1 where
c2Int _ = 1
instance Card D2 where
c2Int _ = 3
instance Card D3 where
c2Int _ = 3
instance Card D4 where
c2Int _ = 4
instance Card D5 where
c2Int _ = 5
instance Card D6 where
c2Int _ = 6
instance Card D7 where
c2Int _ = 7
instance Card D8 where
c2Int _ = 8
instance Card D9 where
c2Int _ = 9

-- two-digit numbers are also cardinal numbers
instance (Digit d1,Digit d2) => Card (d1,d2) where
c2Int (d1,d2) = 10*(d2Int d1) + (d2Int d2)

-- three-digit numbers are also cardinal numbers
instance (Digit d1,Digit d2,Digit d3) => Card (d1,d2,d3) where
c2Int (d1,d2,d3) = 100*(d2Int d1) + 10*(d2Int d2) + (d2Int d3)

-- BitVectors of statically-specified width
-- The value of the bitvector is an Integer
type Value = Integer
data (Card c) => BitVector c = BV c Value

instance (Card c) => Show (BitVector c) where
show (BV c val) = "Bitvector of width " ++ (show $ c2Int c) ++
" and value " ++ show val

bv_val (BV c v) = v
bv_width (BV c v) = c2Int c

-- BV (D1,D2,D0) 10

-- The class of abstract bit fields
class Bitable c where
b_and:: c -> c -> c
b_or:: c -> c -> c
b_not:: c -> c
b2Int:: (Num a) => c -> a
b_width:: c -> Int

-- A boolean is a bit field of width 1
instance Bitable Bool where
b_and = (&&)
b_or = (||)
b_not = not
b2Int True = 1
b2Int False = 0
b_width _ = 1

-- Another concrete representation for bit fields: a list of booleans

newtype BoolL = BoolL [Bool] deriving Show -- a list of booleans

instance Bitable BoolL where
b_and (BoolL l1) (BoolL l2) = BoolL $ zipWith (&&) l1 l2
b_or (BoolL l1) (BoolL l2) = BoolL $ zipWith (||) l1 l2
b_not (BoolL l) = BoolL $ map not l
b2Int (BoolL l) = foldl (\accum x -> 2*accum + b2Int x) 0 l
b_width (BoolL l) = length l

-- Convert a BitVector to a list of booleans
bv_to_booll (BV c v) = bvtb (c2Int c) v []
where
-- bvtb rem_width rem_val bitl
bvtb 0 _ bitl = bitl
bvtb rem_width 0 bitl = (replicate rem_width False) ++ bitl
bvtb rem_width rem_val bitl =
bvtb (rem_width-1) (rem_val `div` 2) (odd rem_val:bitl)

-- Our BitVector is also an implementation of abstract Bit Field:

instance (Card c) => Bitable (BitVector c) where
b_and bv1@(BV c1 v1) bv2@(BV c2 v2) =
BV c1 $ b2Int $ b_and (BoolL $ bv_to_booll bv1)
(BoolL $ bv_to_booll bv2)
b2Int = fromInteger . bv_val
b_width = bv_width

-- Examples:
ex1 = bv_to_booll $ BV (D1,D0) 10
ex1'= b2Int $ BoolL $ bv_to_booll $ BV (D1,D0) 10

ex2 = (BoolL [False,True,True]) `b_and` (BoolL [False,False,True])

-- It's OK to operate on BoolL of different length:
ex2'= (BoolL [False,True,True]) `b_and` (BoolL [True])

ex3 = (BV (D1,D0) 10) `b_and` (BV (D1,D0) 9)

-- But the following gives the type error:
-- ex3' = (BV (D1,D0) 10) `b_and` (BV (D1,D1) 7)
-- ERROR - Type error in application
-- *** Expression : b_and (BV (D1,D0) 10) (BV (D1,D1) 7)
-- *** Term : BV (D1,D0) 10
-- *** Type : BitVector (D1,D0)
-- *** Does not match : BitVector (D1,D1)

Ketil Malde

unread,

Feb 22, 2002, 3:48:04 AM2/22/02

to

ol...@pobox.com (ol...@pobox.com) writes:

> The problem is to define a datatype of bitvectors of a
> statically-specified and *statically-checked* width. That is, we would
> like to ensure that both
> BitVector (-1) ...
> and (BitVector 10 v) `b_and` (BitVector 11 v)
>
> were _type_ errors.

This sounds very similar to (or a specialization of?) (I forget whose)
a lisp-list implementation, also in Haskell, and announced, I think,
in this forum. Tuples together with clever use of classes allowed
sequences with differently typed elements, with the type of the
sequence being unique for the sequence of types in the sequence, if
you follow me.

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Matthias Blume

unread,

Feb 22, 2002, 9:44:34 PM2/22/02

to

ol...@pobox.com wrote:
> The problem is to define a datatype of bitvectors of a
> statically-specified and *statically-checked* width.

[ ... more problem description snipped ... ]

> The present article shows the solution. The idea belongs to Matthias
> Blume, who wrote a similar code in SML. The present article is a rough
> translation of that code (of whatever I remember from that code, to be
> precise).

Good thing you added the parethetical remark. Your encoding is weaker than
the one I had because (as far as I can tell) it restricts widths to
at most three digits. Of course, you can add more instances to your Card
class, but that is more work (and needs to be redone every time you need
another place) than just writing, e.g.,

dec d1 d2 d3 d4 d5 d6 d7 d8 d9 d0 d9 d8 d7 d6 d5 d4 d3 d2 d1 dim

to specify the type for the insanely large number 1234567890987654321
and have it "just work" within the framework that has been set up once
and for all times at the beginning.

So one can write the type for any natural number that way, and there
is also an expression counterpart that lets you construct 'a dim values.
(Interestingly, I was able to make the value expression syntactically
equal to its own type expression, which is kind of neat -- although not
essential. :-)

I have used my encoding with good success in an implementation of a new FFI for
SML/NJ (where it is used to represent C array dimensions). You can read
the BABEL'01 paper on further details:

http://cm.bell-labs.com/cm/cs/who/blume/papers/nlffi-entcs.pdf

Also, check out the latest working version of SML/NJ which ships with this stuff.
(But beware: The details are still in great flux.)
While I am advertising: Remember Allen Leung's recent announcement of an
alpha version of gtk+ bindings. Those were done using the above mentioned new
FFI (although I believe that dimension types do not play an overly prominent
role in this particular application).

[ ... much more code snipped ... ]

Matthias

Louis Madon

unread,

Feb 23, 2002, 3:11:32 AM2/23/02

to

Ketil Malde wrote:

>

> This sounds very similar to (or a specialization of?) (I forget whose)
> a lisp-list implementation, also in Haskell, and announced, I think,
> in this forum. Tuples together with clever use of classes allowed
> sequences with differently typed elements, with the type of the
> sequence being unique for the sequence of types in the sequence, if
> you follow me.
>

Could you explain this a little? Do you mean that ordinary tuples are
given sequence-like operations (eg. cons, head, tail, length) by using
type classes?

(I tried doing a search on google groups for a "lisp-list" announcement
in c.l.f but your message was the only hit).

Louis.

ol...@pobox.com

unread,

Feb 23, 2002, 5:21:28 AM2/23/02

to

Matthias Blume <matt...@shimizu-blume.com> wrote in message news:<3C770293...@shimizu-blume.com>...
[regarding encoding of decimal numbers as types]

> Good thing you added the parethetical remark. Your encoding is weaker than
> the one I had because (as far as I can tell) it restricts widths to
> at most three digits. Of course, you can add more instances to your Card
> class, but that is more work (and needs to be redone every time you need
> another place) than just writing, e.g.,
>
> dec d1 d2 d3 d4 d5 d6 d7 d8 d9 d0 d9 d8 d7 d6 d5 d4 d3 d2 d1 dim
>
> to specify the type for the insanely large number 1234567890987654321
> and have it "just work" within the framework that has been set up once
> and for all times at the beginning.

The previous framework is pretty easy to generalize:

-- Class of cardinal numbers
class Card c where
c2Int:: (Num a) => c -> a -- convert to a number

c2pos:: (Num a) => c -> a -- decimal position: 100 for a
two-digit number

data Dim = Dim -- dimensionality specifier

instance Card Dim where
c2Int _ = 0
c2pos _ = 1

data (Card a) => D0 a = D0 a
data (Card a) => D1 a = D1 a
data (Card a) => D2 a = D2 a
data (Card a) => D3 a = D3 a
data (Card a) => D4 a = D4 a
data (Card a) => D5 a = D5 a
data (Card a) => D6 a = D6 a
data (Card a) => D7 a = D7 a
data (Card a) => D8 a = D8 a
data (Card a) => D9 a = D9 a

instance (Card c) => Card (D0 c) where
c2Int (D0 x) = c2Int x
c2pos (D0 x) = 10 * c2pos x
instance (Card c) => Card (D1 c) where
c2Int (D1 x) = 1*(c2pos x) + c2Int x
c2pos (D1 x) = 10 * c2pos x
instance (Card c) => Card (D2 c) where
c2Int (D2 x) = 2*(c2pos x) + c2Int x
c2pos (D2 x) = 10 * c2pos x
...
instance (Card c) => Card (D9 c) where
c2Int (D9 x) = 9*(c2pos x) + c2Int x
c2pos (D9 x) = 10 * c2pos x

And that's it. The more general notation is actually shorter than the
one restricted to cardinals less than 1000.

Note the definition of the bitvector is _exactly_ the same as in the
previous article. Actually, the rest of that article can be used as it
is. Including the examples:
Main> (BV (D1 $ D0 $ Dim) 10) `b_and` (BV (D1 $ D0 $ Dim) 9)

Bitvector of width 10 and value 8

Main> (BV (D1 $ D0 $ Dim) 10) `b_and` (BV (D1 $ D1 $ Dim) 7)

ERROR - Type error in application

*** Expression : b_and (BV (D1 $ D0 $ Dim) 10) (BV (D1 $ D1 $
Dim) 7)
*** Term : BV (D1 $ D0 $ Dim) 10
*** Type : BitVector (D1 (D0 Dim))
*** Does not match : BitVector (D1 (D1 Dim))

And now, your example:
Main> BV (D1 $ D2 $ D3 $ D4 $ D5 $ D6 $ D7 $ D8 $ D9 $ D0 $ D9 $ D8 $
D7 $ D6 $ D5 $ D4 $ D3 $ D2 $ D1 $ Dim) 1234567879
Bitvector of width 1234567890987654321 and value 1234567879

Bitvectors of that size ought to suffice for the original poster
(unless he works for NSA).

> So one can write the type for any natural number that way, and there
> is also an expression counterpart that lets you construct 'a dim values.
> (Interestingly, I was able to make the value expression syntactically
> equal to its own type expression, which is kind of neat -- although not
> essential. :-)

The same here:
Main> :type BV (D1 $ D2 $ D3 $ D4 $ Dim) 1213
BV (D1 $ D2 $ D3 $ D4 $ Dim) 1213 :: BitVector (D1 (D2 (D3 (D4
Dim))))

(D1 $ D2 $ D3 $ D4 $ Dim) is a value and the type at the same time. To
be precise, the value expression is spelled the same as the type
expression). Actually it is an essential property: this makes type
errors (as the one above) comprehensible. You can see exactly the
mis-matched dimensions.

But the restricted cardinals have their uses too. It is relatively
straightforward to define _addition_ of such cardinals (which is, at
the
same time, is the addition of their types). The next article provides
the details.