Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Caml-list] Eliminating array bounds check

14 views
Skip to first unread message

ol...@pobox.com

unread,
Sep 3, 2006, 9:06:32 PM9/3/06
to caml...@inria.fr

On Aug 31, John Skaller wrote:
> Typing is always an abbreviation (abstraction) and sometimes
> stronger or weaker than desired: for example array bounds
> checks at run time, because the type system doesn't cope
> with array sizes as part of the type.

Although that is true that making the type system track the size of a
(dynamically allocated) array is too much of a hassle, array bounds
checks at run-time can be entirely and safely avoided, in OCaml as it
is.

For example:
http://pobox.com/~oleg/ftp/ML/eliminating-array-bound-check-literally.ml

shows how to implement the bsearch (the standard Dependent ML example
from the famous Xi and Pfenning's PLDI98 paper) in the current
OCaml. The above code has exactly the same number of checks as the DML
code; there are no array bound checks -- and yet the code has the same
static assurances of the absence of out-of-bounds array access
errors. The code (given at the end of that file) even looks quite like
the original DML code (quoted at the beginning of that file), only
without any type annotations.

A more interesting example is the textbook KMP string search, which
uses mutable arrays, general recursion, and creative index expressions
(with mutable arrays storing array indices).

http://pobox.com/~oleg/ftp/Computation/lightweight-dependent-typing.html

(well, the referenced code is actually in Haskell, but it is easy to
re-write it in OCaml; I can do that if called to).

The above page points to the PLPV talk that contains formalization and
the proof method (as well some proofs in Twelf). Briefly, we rely on
the type system to propagate assurances made in a small `security
kernel' through the rest of the code. The security kernel does have to
be verified. Our examples show that the kernel is far simpler than the
rest of the code: the KMP kernel, for example, is made of
non-recursive functions performing additions and subtraction of
integers.

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Alain Frisch

unread,
Sep 4, 2006, 3:26:20 AM9/4/06
to ol...@pobox.com
Hello Oleg,

ol...@pobox.com wrote:
> For example:
> http://pobox.com/~oleg/ftp/ML/eliminating-array-bound-check-literally.ml

>From your code:
=======================================
(* First, we, on off-chance, check if we can obtain type
eigen-variables via the module system.
*)

module GenT : sig type t val v : t end =
struct type t = int let v = 1 end
;;
let module M1 = GenT in
let module M2 = GenT in
M1.v = M2.v
;;

(* Alas, the latter succeeds and reports no type error. What did we
expect: OCaml functors are applicative.
Fortunately, OCaml supports higher-rank types.
*)
=======================================

What about making GenT a functor and passing it unnamed structures as
arguments? (Ok, you must then trust the client not to apply GenT with
named structures.)

module GenT(X:sig end) : sig type t val v : t end =
struct type t = int let v = 1 end
;;
let module M1 = GenT(struct end) in
let module M2 = GenT(struct end) in
M1.v = M2.v
;;

You could then simplify the TrustedKernel so as not to use polymorphic
record fields (and also to use a direct style instead of a continuation
style for brand).


-- Alain

Alain Frisch

unread,
Sep 4, 2006, 3:44:29 AM9/4/06
to ol...@pobox.com
Alain Frisch wrote:
> What about making GenT a functor and passing it unnamed structures as
> arguments? (Ok, you must then trust the client not to apply GenT with
> named structures.)

Actually, GenT should be a functor anyway, since the abstract types in
the resulting structure must encode invariants which depends on the
(length of the) array. Unfortunaly, you cannot make a polymorphic
functor such as:

module GenT(A : sig val a : 'a array end) ...

As a work-around, you can take only the length of the array as an argument:

module TrustedKernel(L: sig val len: int end) : sig
type 'a barray
type bindex
val unbi : bindex -> int

type bindexL
type bindexH
...
end = struct
let brand a =
assert(Array.length a = L.len);
(a,0,L.len - 1)
end

On the one hand, this adds one run-time check, but on the other hand it
makes the abstract index types depend only on the length (so the trusted
kernel could be used in algorithms which work with several arrays of the
same size simultaneously).

skaller

unread,
Sep 4, 2006, 5:48:41 AM9/4/06
to ol...@pobox.com
On Sun, 2006-09-03 at 18:00 -0700, ol...@pobox.com wrote:

> For example:
> http://pobox.com/~oleg/ftp/ML/eliminating-array-bound-check-literally.ml

This is very interesting .. thanks!!

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

ol...@pobox.com

unread,
Sep 6, 2006, 1:10:38 AM9/6/06
to Alain....@inria.fr

Hello!

> module GenT(X:sig end) : sig type t val v : t end =
> struct type t = int let v = 1 end
> ;;
> let module M1 = GenT(struct end) in
> let module M2 = GenT(struct end) in
> M1.v = M2.v
> ;;

That is very nice! Thank you.

> (Ok, you must then trust the client not to apply GenT with
> named structures.)

I can hide this code in a trusted kernel: I merely need to be able to
generate unique types. Frankly, what bsearch code really needed is
local generative modules -- something like (in SML notation)

val test = let local structure A = Kernel(val a = ...)
open A in ... end in ...

According to my reading of the Definition of the Standard ML, this is
allowed. Alas, neither SML/NJ nor Poly/ML support this pattern.


> Unfortunaly, you cannot make a polymorphic functor such as:
> module GenT(A : sig val a : 'a array end) ...

But the following close approximation works:

module GenT(A : sig type e val a : e array end)
: sig
type barray
type bindex
type bindexL
type bindexH
val init_indexL : bindexL
val init_indexH : bindexH
val the_arr : barray
val cmp : bindexL -> bindexH -> (bindex * bindex) option
val get : barray -> bindex -> A.e
end = struct
type barray = A.e array
type bindex = int
type bindexL = int
type bindexH = int
let the_arr = A.a
let init_indexL = 0
let init_indexH = Array.length A.a - 1
let cmp i j = if i <= j then Some (i,j) else None
let get = Array.get
end;;

let test1 = let module X = GenT(struct type e = int let a = [|1;2|] end) in
match X.cmp X.init_indexL X.init_indexH with
Some (i,j) -> X.get X.the_arr j
;;

let test2 = let a = [|1;2|] in
let module X = GenT(struct type e = int let a = a end) in
let module Y = GenT(struct type e = int let a = a end) in
match X.cmp X.init_indexL X.init_indexH with
Some (i,j) -> X.get Y.the_arr j
;;
(* expected error:
This expression has type Y.barray but is here used with type X.barray
*)

(* But this is OK *)
let test3 = let module N = struct type e = int let a = [|1;2|] end in
let module X = GenT(N) in
let module Y = GenT(N) in
match X.cmp X.init_indexL X.init_indexH with
Some (i,j) -> X.get Y.the_arr j;;


For formalization, the higher-rank types seem better (System F is
easier to formalize than lambda-calculus plus the module system). For
real work, the module system is obviously nicer. Thank you for the
suggestions!


> As a work-around, you can take only the length of the array as an argument:
> module TrustedKernel(L: sig val len: int end) : sig

Hmm, that means that modular arithmetics and implicit configurations
described in
http://www.eecs.harvard.edu/~ccshan/prepose/prepose.pdf
http://www.eecs.harvard.edu/~ccshan/prepose/talk.ps.gz

are immediately implementable in OCaml. That is very neat.

module ModularNum (M : sig val modulus : int end)
: sig
type t
val of_int : int -> t
val to_int : t -> int
val ( +$ ) : t -> t -> t
val ( -$ ) : t -> t -> t
val ( *$ ) : t -> t -> t
val normalize : int -> t


end = struct
type t = int

let normalize a = a mod M.modulus
let of_int x = normalize x
let to_int x = x
let ( +$ ) x y = normalize (x + y)
let ( -$ ) x y = normalize (x - y)
let ( *$ ) x y = normalize (x * y)
end;;

(* (3*3 + 5*5) modulo 4 *)
let test3 = let module XXX = struct
module M = ModularNum(struct let modulus = 4 end)
open M
let a = of_int 3 and b = of_int 5
let v = a *$ a +$ b *$ b
let result = to_int v
end in XXX.result
;;
(* The modulus is implicit, just as desired. *)
(* It is clear that with a bit of camlp4 sugar, the boilerplate
can be removed and we can write something like
let test3 = with_modulus 4 (let a = of_int 3 and b = of_int 5 in
a *$ a +$ b *$ b)
*)


(* testing non-interference: can't mix computations with different
moduli
The error message,
This expression has type M.t but is here used with type M.t
could be more helpful, though...
*)

(* A computation polymorphic over the modulus *)
module Foo(M : sig val modulus : int end) = struct
module M = ModularNum(M)
open M
let a = of_int 1000
let v = a *$ a *$ of_int 5 +$ of_int 2000
let result = to_int v
end
;;

(* Run the Foo computation over the series of moduli *)
let test4 = List.map
(fun m -> let module M = Foo(struct let modulus = m end) in M.result)
[1;2;3;4;5;6;7;8;9]
;;

Jonathan Roewen

unread,
Sep 6, 2006, 1:17:47 AM9/6/06
to ol...@pobox.com
> I can hide this code in a trusted kernel: I merely need to be able to
> generate unique types. Frankly, what bsearch code really needed is
> local generative modules -- something like (in SML notation)
>
> val test = let local structure A = Kernel(val a = ...)
> open A in ... end in ...
>
> According to my reading of the Definition of the Standard ML, this is
> allowed. Alas, neither SML/NJ nor Poly/ML support this pattern.

I don't understand what the 'generative' part means, but are modules
defined inside a let binding equivalent to the above?

Jonathan

ol...@pobox.com

unread,
Sep 7, 2006, 5:55:00 AM9/7/06
to jonatha...@gmail.com

> > val test = let local structure A = Kernel(val a = ...)
> > open A in ... end in ...
> >
>
> I don't understand what the 'generative' part means, but are modules
> defined inside a let binding equivalent to the above?

In OCaml, functors behave like pure functions: an application of a
functor to identical arguments yields structures with compatible
types. Not so in SML (with strong sealing):

http://www.cs.cmu.edu/~crary/papers/2003/thoms/thoms.pdf
Please see Figs 6 and 7 in that paper.

Here's a sample demonstration:

In Ocaml:

module type S = sig type tt end;;
module SI : S = struct type tt = unit end;;

module Foo (X : S) : sig type t val v : t end


= struct type t = int let v = 1 end;;

let module M = struct
module M1 = Foo(SI)
let x = M1.v
module M2 = Foo(SI)
let y = M2.v
let res = x == y
end in M.res
;;


In SML (poly/ML):

signature S = sig type tt end;
structure SI : S = struct type tt = unit end;

functor Foo (X : S) :> sig type t val v : t end
= struct type t = int val v = 1 end;

(* A function that takes two arguments of the same type *)
fun cmp (x:'t) (y:'t) = true;

local structure M1 = Foo(SI) val x = M1.v
structure M2 = Foo(SI) val y = M2.v
in val t = cmp x y end;

# Error:
Can't unify M1.t with M2.t (Different type constructors) Found near cmp(x)(y)


In retrospect, for bsearch application, either functor would have
sufficed. However, a generative functor seems cleaner (and, in a
higher-ranked emulation) more easily formalizable.

0 new messages