Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

generic programming in ML? (iterators, etc.)

243 views
Skip to first unread message

Jeremy Siek

unread,
Sep 13, 2001, 1:08:41 PM9/13/01
to
Hi,

I am new to ML (coming from the C++ world). I am wondering if anyone is
doing STL-style generic programming in ML? For example, are there any
libraries of generic algorithms out there that can be applied to say, both
a list and an array? It looks like the ML modules are flexible enough to
create algorithm functors that use an iterator signature... has anyone
tried this?

Cheers,
Jeremy

----------------------------------------------------------------------
Jeremy Siek http://php.indiana.edu/~jsiek/
Ph.D. Student, Indiana Univ. B'ton email: js...@osl.iu.edu
C++ Booster (http://www.boost.org) office phone: (812) 855-9761
----------------------------------------------------------------------

Thant Tessman

unread,
Sep 13, 2001, 1:28:43 PM9/13/01
to
Jeremy Siek wrote:

> I am new to ML (coming from the C++ world). I am wondering
> if anyone is doing STL-style generic programming in ML? For
> example, are there any libraries of generic algorithms out
> there that can be applied to say, both a list and an array?
> It looks like the ML modules are flexible enough to
> create algorithm functors that use an iterator signature...
> has anyone tried this?

I did it once just to prove that it could be done (and quite elegantly),
but you're better off spending your time getting comfortable with
higher-order functions, map, fold, and the like. ML is more suited to
this kind of programming--which is ultimately far more powerful anyway.

-thant


Henning Makholm

unread,
Sep 13, 2001, 3:15:12 PM9/13/01
to
Scripsit Thant Tessman <th...@acm.org>
> Jeremy Siek wrote:

> > if anyone is doing STL-style generic programming in ML? For
> > example, are there any libraries of generic algorithms out
> > there that can be applied to say, both a list and an array?

> I did it once just to prove that it could be done (and quite elegantly),

> but you're better off spending your time getting comfortable with
> higher-order functions, map, fold, and the like.

I'm not sure that is a "but" ... I'd say that the functional
abstraction that correspond to C++/STL-style iterators is precisely
a type constructor equipped with map and fold functions.
Witness, e.g., the symmetry between the List, Vector, Word8Vector
and CharVector structures from the SML Basis Library.

It would be quite possible to write functors parameterized by
either List or Vector, if only they had a common name for their
type constructor (as it is, you cannot find a nontrivial common
signature for them .. but that could be solved with a few lines
of code) - but I'm not sure that it would be often that it's
worth the trouble to do so. You need to venture out into the
modules language, and there are not that many generally interesting
algorithms that can be expressed efficiently using only a sequential
structure on the data structures.

The utility would properly be greater if one could use it with
less heavy notation. Haskellish typeclasses would (here's my
favorite hobby horse once again) be great - all they miss is
a precise specification of what they do.

--
Henning Makholm "What the hedgehog sang is not evidence."

Jeremy Siek

unread,
Sep 13, 2001, 3:16:01 PM9/13/01
to
On 13 Sep 2001, Thant Tessman wrote:

thant> Jeremy Siek wrote:
thant>
thant> > I am new to ML (coming from the C++ world). I am wondering
thant> > if anyone is doing STL-style generic programming in ML? For
thant> > example, are there any libraries of generic algorithms out
thant> > there that can be applied to say, both a list and an array?
thant> > It looks like the ML modules are flexible enough to
thant> > create algorithm functors that use an iterator signature...
thant> > has anyone tried this?
thant>
thant> I did it once just to prove that it could be done (and quite elegantly),

Great! could you post some examples?

thant> but you're better off spending your time getting comfortable with
thant> higher-order functions, map, fold, and the like. ML is more suited to

Thanks, I'm comfortable with higher-order functions.

thant> this kind of programming--which is ultimately far more powerful anyway.

Higher-order functions are indeed powerful. In generic programming, HOF's
are considered one of the main aspects of genericity, but they are not the
only aspect. The other main aspect is data-structure interchangeability. I
would claim that polymorphism over the data-structures themselves is
extremely important for code reuse. For example, suppose you have two
applications, one that needs a dynamic graph representation (lots of
insertions/deletions) while the other can use a more static,
space-efficient representation. Now suppose in both applications,
somewhere in the middle, you need to do a depth-first search. Wouldn't it
be nice to reuse the code for the DFS?

Benjamin Ylvisaker

unread,
Sep 13, 2001, 3:45:30 PM9/13/01
to
Jeremy Siek wrote:
> For example, suppose you have two applications, one that needs a
> dynamic graph representation (lots of insertions/deletions) while the
> other can use a more static, space-efficient representation. Now
> suppose in both applications, somewhere in the middle, you need to do
> a depth-first search. Wouldn't it be nice to reuse the code for the
> DFS?

This particular problem is solved by SML's module system. All you have
to do is create a signature for a graph structure (say GRAPH) that
exports a type for graphs (say graph). Then your DFS code lives in a
functor that is parametric on a structure of type GRAPH,

functor graph_algorithms (structure Graph : GRAPH) = structure

type graph = Graph.graph
fun dfs ...
...

end

I am not sure if I got all of the bits and pieces of syntax right, but
the general idea is correct. Later you can instantiate graph_algorithms
with either your dynamic graph structure or your static graph
structure. I think that there should be a simpler way to accomplish
this sort of thing, but this works and achieves your goal, I believe.

If you are interested in graph structures in particular, I highly
encourage you to take a look at Martin Erwig's Functional Graph Library
(http://www.cs.orst.edu/~erwig/fgl/).

Benjamin

Jeremy Siek

unread,
Sep 13, 2001, 8:09:58 PM9/13/01
to
On 13 Sep 2001, Benjamin Ylvisaker wrote:
benjam> Jeremy Siek wrote:
benjam> > For example, suppose you have two applications, one that needs a
benjam> > dynamic graph representation (lots of insertions/deletions) while the
benjam> > other can use a more static, space-efficient representation. Now
benjam> > suppose in both applications, somewhere in the middle, you need to do
benjam> > a depth-first search. Wouldn't it be nice to reuse the code for the
benjam> > DFS?
benjam>
benjam> This particular problem is solved by SML's module system. All you have
benjam> to do is create a signature for a graph structure (say GRAPH) that
benjam> exports a type for graphs (say graph). Then your DFS code lives in a
benjam> functor that is parametric on a structure of type GRAPH,
benjam>
benjam> functor graph_algorithms (structure Graph : GRAPH) = structure
benjam>
benjam> type graph = Graph.graph
benjam> fun dfs ...
benjam> ...
benjam>
benjam> end
benjam>
benjam> I am not sure if I got all of the bits and pieces of syntax right, but
benjam> the general idea is correct. Later you can instantiate graph_algorithms
benjam> with either your dynamic graph structure or your static graph
benjam> structure. I think that there should be a simpler way to accomplish
benjam> this sort of thing, but this works and achieves your goal, I believe.

Right, this is the kind of thing I'm looking for.

benjam> If you are interested in graph structures in particular, I highly
benjam> encourage you to take a look at Martin Erwig's Functional Graph Library
benjam> (http://www.cs.orst.edu/~erwig/fgl/).

I've looked at FGL. It's definitely a step in the right direction.
However, because it is not built on top of an iterator abstraction for
dealing with sequences, there are some efficiency concerns. For example,
the GRAPH signature specifies that functions like fwd and suc return a
list. Unless I'm missing something, the returned list is a *copy* of the
underlying adjacent nodes sequence (which may be a list or array depending
on the graph representation). It would be much more efficient to provide
an interface for accessing the adjacency information in-place, through
some sort of iterator, instead of creating a new list to pass to the user.

Thant Tessman

unread,
Sep 13, 2001, 8:13:47 PM9/13/01
to

Jeremy Siek wrote:

> [...] It looks like the ML modules are flexible enough to


> create algorithm functors that use an iterator signature...

> has anyone tried this?

I wrote:

> I did it once just to prove that it could be done (and

> quite elegantly), [...]

Jeremy Siek:

> Great! could you post some examples?

I wrote it many years ago only to refute some claims Alexander Stepanov
was making about C++ being the only language that supported the kind of
generic programming he wanted to do. I'll append a portion of my
response to Stepanov to the end of this post. (The original thread was
titled "C++ briar patch" if you want to look it up in deja for yourself.)


> [...] In generic programming, HOF's


> are considered one of the main aspects of
> genericity, but they are not the
> only aspect.

HOF's and genericity seem to me to be orthogonal concepts.

> The other main aspect is data-structure interchangeability. I
> would claim that polymorphism over the data-structures themselves is

> extremely important for code reuse. [...]

Function polymorphism versus object polymorphism has begun to smell to
me like an argument about whether photons are waves or particles.
Regardless, I think SML's modules support the kind of data-structure
interchangeability you suspect they do. I'm just not convinced iterators
in particular are a good idea. I doubt they promote reuse better than
design that leverages HOF's, and I know they're conceptually less safe.
Witness, for example, the tauntingly-named "MemoryFault" exception I had
to introduce in the code below...

-thant


---begin post---

Alexander Stepanov wrote:

> But if I was succesful, it was only because
> Bjarne Stroustrup provided me with the right tool.

I wrote:

> This is the claim I reject. It doesn't surprise me at all
> that the STL couldn't be done in Eifel or Java. But SML or
> CLOS or...you're gonna have to substantiate that one.

[...stuff...]

At the end of this post is an implementation of doubly-linked lists
implemented in SML but in the style of the STL. There's also
a "replace" function like the one in the STL just to show
how it would work.

I think it satisfies all your criteria for a genericity.

1) You can mutate what you find in the data structure

Iterators are separate from containers so that:

2) You can restart searches

3) You can have multiple search paths

4) You can consider subranges

5) You can compare two positions for equality

Also, this is fully generic, with no run-time type dispatching.

[...]

(**************)


signature DATASTRUCTURE =
sig
exception MemoryFault
eqtype ''a iterator
eqtype ''a container
val null : ''a iterator
val new : unit -> ''a container
val push_back : ''a container * ''a -> unit
val push_front : ''a container * ''a -> unit
val first : ''a container -> ''a iterator
val last : ''a container -> ''a iterator
val prev : ''a iterator -> ''a iterator
val next : ''a iterator -> ''a iterator
val set : ''a iterator * ''a -> unit
val get : ''a iterator -> ''a
end


structure Chain : DATASTRUCTURE =
struct

exception MemoryFault

datatype ''a iterator =
LNull
| Link of ''a ref * {prev: ''a iterator ref, next: ''a iterator ref}


datatype ''a container = Chain of {head: ''a iterator ref,
tail: ''a iterator ref}

fun new () = Chain {head = ref LNull, tail = ref LNull}

fun push_back (Chain {head, tail}, item) =
case tail of
ref LNull => let
val link = Link (ref item, {prev=ref LNull,
next=ref LNull})
in
tail := link;
head := link
end
| ref (Link (_, {next, ...})) => let
val link = Link(ref item,
{prev = ref (!tail),
next = ref LNull})
in
next := link;
tail := link
end

fun push_front (Chain {head, tail}, item) =
case head of
ref LNull => let
val link = Link (ref item, {prev=ref LNull,
next=ref LNull})
in
tail := link;
head := link
end
| ref (Link (_, {prev, ...})) => let
val link = Link(ref item,
{prev = ref LNull,
next = ref (!head)})
in
prev := link;
head := link
end

fun first (Chain {head as ref link, ...}) = link
fun last (Chain {tail as ref link, ...}) = link

fun next LNull = raise MemoryFault
| next (Link (_, {next as ref n, ...})) = n

fun prev LNull = raise MemoryFault
| prev (Link (_, {prev as ref p, ...})) = p

fun set (LNull, _) = raise MemoryFault
| set (Link (item, _), newValue) = (item := newValue)

fun get LNull = raise MemoryFault
| get (Link (item, _)) = !item

val null = LNull

end

functor SomeAlgos(DataStructure:DATASTRUCTURE) =
struct

(* for now all we have is replace, but you get the point *)

fun replace(first, last, oldValue, newValue) =
let
fun f i =
if (i=DataStructure.null) then ()
else
(if ((DataStructure.get i) = oldValue)
then (DataStructure.set (i, newValue))
else ();
if (i<>last) then f(DataStructure.next i) else ())
in
f(first)
end

end

(* an example of their use *)


open Chain;


val chain : int container = new ();

push_back(chain, 1);
push_back(chain, 2);
push_front(chain, ~1);

val i = first chain;

(* specialize the algorithm to chains, like instantiating a template *)

structure SomeChainAlgos = SomeAlgos(Chain);

SomeChainAlgos.replace(first chain, last chain, 1, 99);


---end post---


Benjamin Ylvisaker

unread,
Sep 13, 2001, 9:56:19 PM9/13/01
to
Jeremy Siek wrote:
>
> I've looked at FGL. It's definitely a step in the right direction.
> However, because it is not built on top of an iterator abstraction for
> dealing with sequences, there are some efficiency concerns. For
> example, the GRAPH signature specifies that functions like fwd and suc
> return a list. Unless I'm missing something, the returned list is a
> *copy* of the underlying adjacent nodes sequence (which may be a list
> or array depending on the graph representation). It would be much more
> efficient to provide an interface for accessing the adjacency
> information in-place, through some sort of iterator, instead of
> creating a new list to pass to the user.

IMPORTANT DISCLAIMER: I have never worked on or even seen the internals
of a real functional language compiler.

It is my understanding that advanced functional language compilers do a
really good job of not copying structures when they don't absolutely
have to. So if you just want to inspect an edges adjacent nodes, the
list that the library functions return is just a list of pointers into
the main graph data structure. This is what an iterator is, no? One of
the important conceptual leaps you have to make when dealing with
functional languages is that lots of independent persistant objects in
your source code do not necessarily correspond to lots of objects in
memory.

Having said that, there are still algorithms and data structures that no
one has figured out how to represent in a purely functional context as
efficiently as in an imperative context. I recommend Chris Okasaki's
book "Purely Functional Data Structures" for a good look at efficiency
of data structures in functional languages.

Benjamin

Jeremy Siek

unread,
Sep 14, 2001, 11:46:50 AM9/14/01
to
On 14 Sep 2001, Thant Tessman wrote:
thant> > Great! could you post some examples?

Thanks!

thant> > [...] In generic programming, HOF's
thant> > are considered one of the main aspects of
thant> > genericity, but they are not the
thant> > only aspect.
thant>
thant> HOF's and genericity seem to me to be orthogonal concepts.

I agree that HOF's and data-structure parameterization are orthogonal
concepts, except I would bundle both ideas under the heading of
"genericity" ;)

thant> > The other main aspect is data-structure interchangeability. I
thant> > would claim that polymorphism over the data-structures themselves is
thant> > extremely important for code reuse. [...]
thant>
thant> Function polymorphism versus object polymorphism has begun
thant> to smell to me like an argument about whether photons are
thant> waves or particles. Regardless, I think SML's modules
thant> support the kind of data-structure interchangeability you
thant> suspect they do. I'm just not convinced iterators in
thant> particular are a good idea. I doubt they promote reuse
thant> better than design that leverages HOF's, and I know they're
thant> conceptually less safe. Witness, for example, the
thant> tauntingly-named "MemoryFault" exception I had to introduce
thant> in the code below...

Hmm, previously you said HOF's and iterators are orthogonal concepts, but
your above statement makes it sound like one has to choose between
iterators and HOF's. Again, my claim is that both are vitally important.

Didn't you really mean to compare using iterators to just always using the
builtin lists of ML? (which don't have the memory fault problem)

I agree that the safety of lists in ML is nice. What would it take to
extend this kind of safety to user-defined types? Is this safety really a
characteristic of lists, or could it be extended to other data structures?

BTW, I don't think that the particulars of how iterators work in C++ is
all that important in the grand scheme... what is important is that a
common interface can be found for traversing data structures.

Cheers,
Jeremy

thant> ---begin post---
thant>
thant> Alexander Stepanov wrote:
thant>
thant> > But if I was succesful, it was only because
thant> > Bjarne Stroustrup provided me with the right tool.
thant>
thant> I wrote:
thant>
thant> > This is the claim I reject. It doesn't surprise me at all
thant> > that the STL couldn't be done in Eifel or Java. But SML or
thant> > CLOS or...you're gonna have to substantiate that one.
thant>
thant> [...stuff...]
thant>
thant> At the end of this post is an implementation of doubly-linked lists
thant> implemented in SML but in the style of the STL. There's also
thant> a "replace" function like the one in the STL just to show
thant> how it would work.
thant>
thant> I think it satisfies all your criteria for a genericity.
thant>
thant> 1) You can mutate what you find in the data structure
thant>
thant> Iterators are separate from containers so that:
thant>
thant> 2) You can restart searches
thant>
thant> 3) You can have multiple search paths
thant>
thant> 4) You can consider subranges
thant>
thant> 5) You can compare two positions for equality
thant>
thant> Also, this is fully generic, with no run-time type dispatching.
thant>
thant> [...]
thant>
thant> (**************)
thant>
thant>
thant> signature DATASTRUCTURE =
thant> sig
thant> exception MemoryFault
thant> eqtype ''a iterator
thant> eqtype ''a container
thant> val null : ''a iterator
thant> val new : unit -> ''a container
thant> val push_back : ''a container * ''a -> unit
thant> val push_front : ''a container * ''a -> unit
thant> val first : ''a container -> ''a iterator
thant> val last : ''a container -> ''a iterator
thant> val prev : ''a iterator -> ''a iterator
thant> val next : ''a iterator -> ''a iterator
thant> val set : ''a iterator * ''a -> unit
thant> val get : ''a iterator -> ''a
thant> end
thant>
thant>
thant> structure Chain : DATASTRUCTURE =
thant> struct
thant>
thant> exception MemoryFault
thant>
thant> datatype ''a iterator =
thant> LNull
thant> | Link of ''a ref * {prev: ''a iterator ref, next: ''a iterator ref}
thant>
thant>
thant> datatype ''a container = Chain of {head: ''a iterator ref,
thant> tail: ''a iterator ref}
thant>
thant> fun new () = Chain {head = ref LNull, tail = ref LNull}
thant>
thant> fun push_back (Chain {head, tail}, item) =
thant> case tail of
thant> ref LNull => let
thant> val link = Link (ref item, {prev=ref LNull,
thant> next=ref LNull})
thant> in
thant> tail := link;
thant> head := link
thant> end
thant> | ref (Link (_, {next, ...})) => let
thant> val link = Link(ref item,
thant> {prev = ref (!tail),
thant> next = ref LNull})
thant> in
thant> next := link;
thant> tail := link
thant> end
thant>
thant> fun push_front (Chain {head, tail}, item) =
thant> case head of
thant> ref LNull => let
thant> val link = Link (ref item, {prev=ref LNull,
thant> next=ref LNull})
thant> in
thant> tail := link;
thant> head := link
thant> end
thant> | ref (Link (_, {prev, ...})) => let
thant> val link = Link(ref item,
thant> {prev = ref LNull,
thant> next = ref (!head)})
thant> in
thant> prev := link;
thant> head := link
thant> end
thant>
thant> fun first (Chain {head as ref link, ...}) = link
thant> fun last (Chain {tail as ref link, ...}) = link
thant>
thant> fun next LNull = raise MemoryFault
thant> | next (Link (_, {next as ref n, ...})) = n
thant>
thant> fun prev LNull = raise MemoryFault
thant> | prev (Link (_, {prev as ref p, ...})) = p
thant>
thant> fun set (LNull, _) = raise MemoryFault
thant> | set (Link (item, _), newValue) = (item := newValue)
thant>
thant> fun get LNull = raise MemoryFault
thant> | get (Link (item, _)) = !item
thant>
thant> val null = LNull
thant>
thant> end
thant>
thant>
thant>
thant> functor SomeAlgos(DataStructure:DATASTRUCTURE) =
thant> struct
thant>
thant> (* for now all we have is replace, but you get the point *)
thant>
thant> fun replace(first, last, oldValue, newValue) =
thant> let
thant> fun f i =
thant> if (i=DataStructure.null) then ()
thant> else
thant> (if ((DataStructure.get i) = oldValue)
thant> then (DataStructure.set (i, newValue))
thant> else ();
thant> if (i<>last) then f(DataStructure.next i) else ())
thant> in
thant> f(first)
thant> end
thant>
thant> end
thant>
thant> (* an example of their use *)
thant>
thant>
thant> open Chain;
thant>
thant>
thant> val chain : int container = new ();
thant>
thant> push_back(chain, 1);
thant> push_back(chain, 2);
thant> push_front(chain, ~1);
thant>
thant> val i = first chain;
thant>
thant> (* specialize the algorithm to chains, like instantiating a template *)
thant>
thant> structure SomeChainAlgos = SomeAlgos(Chain);
thant>
thant> SomeChainAlgos.replace(first chain, last chain, 1, 99);
thant>
thant>
thant> ---end post---
thant>
thant>
thant>

Simon Helsen

unread,
Sep 14, 2001, 11:47:22 AM9/14/01
to
On 13 Sep 2001, Benjamin Ylvisaker wrote:

>This particular problem is solved by SML's module system. All you have
>to do is create a signature for a graph structure (say GRAPH) that
>exports a type for graphs (say graph). Then your DFS code lives in a
>functor that is parametric on a structure of type GRAPH,
>
>functor graph_algorithms (structure Graph : GRAPH) = structure
>
> type graph = Graph.graph
> fun dfs ...
>...
>
>end
>
>I am not sure if I got all of the bits and pieces of syntax right, but
>the general idea is correct. Later you can instantiate graph_algorithms
>with either your dynamic graph structure or your static graph
>structure. I think that there should be a simpler way to accomplish
>this sort of thing, but this works and achieves your goal, I believe.
>
>If you are interested in graph structures in particular, I highly
>encourage you to take a look at Martin Erwig's Functional Graph Library
>(http://www.cs.orst.edu/~erwig/fgl/).

As a side-note: this is all related to how you abstract, that is, over
data or functionality or both (also known as the extensibility problem).
Functional Programming is good at extending functionality, OO is good at
extending data and the combination is well-handled by so-called "mixins"
(see a.o. http://citeseer.nj.nec.com/bracha92programming.html and
http://citeseer.nj.nec.com/findler98modular.html) Now, mixins roughly
parametrise over their superclass. In some points, this is very much what
ML-functors do, although in other points it is not, in particular
ML-functors do not have dynamic method lookup, mixins do. I don't know
what consequences this has, but what I do know is that it shows that OO is
not the end in programming/modelling...

regards,

Simon


Thant Tessman

unread,
Sep 14, 2001, 3:05:01 PM9/14/01
to
Jeremy Siek wrote:

[...]

> Hmm, previously you said HOF's and iterators are orthogonal
> concepts, but your above statement makes it sound like one
> has to choose between iterators and HOF's. Again, my claim
> is that both are vitally important.

I said higher-order functions and genericity are orthogonal concepts.
Cross-cultural use of terminology is always fraught with the danger of
misunderstanding. My understanding is that "genericity" is synonymous
with parameterized types--a concept supported by both SML and C++.

Iterators, on the other hand, are a concept that developed as part of
object-oriented programming methodology. The C++ STL happens to heavily
leverage C++'s support for parameterized types to promote code reuse
between data structures, algorithms and iterators, but (it is my
understanding that) iterators were in use long before templates were
supported in C++.


> BTW, I don't think that the particulars of how iterators
> work in C++ is all that important in the grand scheme...
> what is important is that a common interface can be found
> for traversing data structures.

My claim is that support for higher-order functions makes it easy to
glue a wide variety of algorithms to a wide variety of data structures,
and it is their absense in languages like C++ that make a formalized
interface between data structures and algorithms in the form of
iterators appear necessary. (This is stated a bit strongly. But it comes
from someone who has been prejudiced against C++ out of years of
experience with it.)

-thant


Jeremy Siek

unread,
Sep 16, 2001, 3:09:07 PM9/16/01
to
On 14 Sep 2001, Thant Tessman wrote:
thant>
thant> My claim is that support for higher-order functions makes it easy to
thant> glue a wide variety of algorithms to a wide variety of data structures,

I've never seen HOF's used to glue algorithms to a variety of data
structures. Could you give an example of how this is done?

Cheers,
Jeremy

Thant Tessman

unread,
Sep 17, 2001, 2:30:54 PM9/17/01
to

I wrote:

> My claim is that support for higher-order functions makes

> it easy to glue a wide variety of algorithms to a wide
> variety of data structures,

Jeremy Siek wrote:

> I've never seen HOF's used to glue algorithms to a
> variety of data structures. Could you give an example
> of how this is done?

The SML Standard Basis Library includes support for vectors and lists.
Both data structures include things like 'map', 'foldl', 'foldr',
'tabulate', etc. All take as arguments functions which are typically
built dynamically by the program. And the function passed to the list's
version of map will work when passed to the vector's version of map. And
it will work with your own version of map designed to traverse, for
example, a tree data structure. Plenty of reuseability derived from
nothing more than the fact that the concept of 'map' is clear and offers
an obvious implementation for a given data structure. No need for iterators.

But this is just the beginning. It has been my experience that
higher-order functions offer opportunities to refactor that go way
beyond merely iterating through data structures. I once wrote a regular
expression pattern matcher in something like two pages of SML code. You
would build pattern matching functions by composing other pattern
matching functions. It was scary. (Sorry, I don't have the code any
more, but I'm sure I'm not the only one to have done this.)

-thant


Jeremy Siek

unread,
Sep 18, 2001, 9:26:20 AM9/18/01
to
On 17 Sep 2001, Thant Tessman wrote:
thant>
thant> The SML Standard Basis Library includes support for vectors
thant> and lists. Both data structures include things like 'map',
thant> 'foldl', 'foldr', 'tabulate', etc. All take as arguments
thant> functions which are typically built dynamically by the
thant> program. And the function passed to the list's version of
thant> map will work when passed to the vector's version of map.
thant> And it will work with your own version of map designed to
thant> traverse, for example, a tree data structure. Plenty of
thant> reuseability derived from nothing more than the fact that
thant> the concept of 'map' is clear and offers an obvious
thant> implementation for a given data structure. No need for
thant> iterators.

Ahh, I see what you are getting at now. Basically, any iterator style loop
that looks like this:

for (iter = c.begin(); iter != c.end(); ++iter) {
value_type x = *i
// do something
}

gets transformed into something like this:

Container.app (fn (x) => (* do something *) ) c;


So all these containers in the SML Basis have a similar interface... is
there any reason that a signature for this has not been created? (or has
it, and where would I find it)

Thant Tessman

unread,
Sep 18, 2001, 1:16:32 PM9/18/01
to
Jeremy Siek wrote:

[...]

> So all these containers in the SML Basis have a
> similar interface... is there any reason that a
> signature for this has not been created? (or has
> it, and where would I find it)

The purpose of the SML Standard Basis Library is different than that of
the C++ Standard Template Library. Implementing data structures is much
harder in C++ than it is in SML. (Note the difference in my SML
implementation of doubly-linked lists compared to that of the STL.)
Consequently, off-the-shelf components are far more valuable in C++ than
they are in SML--as is the effort attempting to guarantee their
interoperability via standards embodied in the STL.

In contrast, the purpose of SML's SBL is only partially to guarantee
consistency. Its main purpose is to provide features that, because of
type safety and performance concerns, are best implemented as part of
the language itself. (I hope I'm not being too presumptuous speaking for
the designers of the language here.)

-thant


Jeremy Siek

unread,
Sep 19, 2001, 4:23:07 PM9/19/01
to
On 17 Sep 2001, Thant Tessman wrote:
thant> The SML Standard Basis Library includes support for vectors and lists.
thant> Both data structures include things like 'map', 'foldl', 'foldr',
thant> 'tabulate', etc. All take as arguments functions which are typically
thant> built dynamically by the program. And the function passed to the list's
thant> version of map will work when passed to the vector's version of map. And
thant> it will work with your own version of map designed to traverse, for
thant> example, a tree data structure. Plenty of reuseability derived from
thant> nothing more than the fact that the concept of 'map' is clear and offers
thant> an obvious implementation for a given data structure. No need for iterators.

Upon further inspection of the SML Basis, List was the only structure with
'map'. I suppose the reason has something to do with deciding how to
represent the output. Here's what I created for a "common" signature;

signature CONTAINER =
sig
type elem
type container
val length : container -> int
val app : (elem -> unit) -> container -> unit
val foldl : ((elem * 'a) -> 'a) -> 'a -> container -> 'a
val foldr : ((elem * 'a) -> 'a) -> 'a -> container -> 'a
val tabulate : (int * (int -> elem)) -> container
end;

Note that you'd need to have functors for each Basis struct to set the
"container" type.

thant> But this is just the beginning. It has been my experience that
thant> higher-order functions offer opportunities to refactor that go way
thant> beyond merely iterating through data structures. I once wrote a regular
thant> expression pattern matcher in something like two pages of SML code. You
thant> would build pattern matching functions by composing other pattern
thant> matching functions. It was scary. (Sorry, I don't have the code any
thant> more, but I'm sure I'm not the only one to have done this.)

Ok... to continue in the exploration of the power of HOF's versus
iterators, how about a generic implementation of inner product? By generic
here I mean that the two input sequences can be arbitrary container
data-structures, perhaps one is a list and the other is an array. Can you
implement a generic inner product using ML container HOF's? I don't think
so (but I would be happy to be proved wrong!). However, it is quite
straightforward to write a generic inner product using some kind of
iterator.

Henning Makholm

unread,
Sep 20, 2001, 8:35:54 AM9/20/01
to
Scripsit Jeremy Siek <js...@osl.iu.edu>

> Upon further inspection of the SML Basis, List was the only structure with
> 'map'.

You need more careful inspection. Vector, String, Word8Vector, and
Option all also have 'map'.

> Ok... to continue in the exploration of the power of HOF's versus
> iterators, how about a generic implementation of inner product? By generic
> here I mean that the two input sequences can be arbitrary container
> data-structures, perhaps one is a list and the other is an array.

functor inner(structure X: CONTAINER where type elem = real
structure Y: CONTAINER where type elem = real)
= struct
fun xcont * ycont = case B.foldl (fn (y,([],acc)) => raise TooManyYs
| (y,(x::xs,acc)) => (xs,x*y+acc))
(A.foldr op:: [] xcont, 0.0) ycont
of ([],result) => result
| _ => raise TooMayXs
end

This solution does make a temporary list for storing one of the
operands, but it is exactly temporary, and it is not unreasonable to
expect an aggressive compiler to optimize it out of existense once
once it knows the implementations of X and Y.

--
Henning Makholm "Jeg kber intet af Sulla, og selv om uordenen griber
planmssigt om sig, s er vi endnu ikke net dertil hvor
ordentlige mennesker kan tillade sig at stjle slaver fra
hinanden. S er det ligegyldigt, hvor strke, politiske modstandere vi er."

Andreas Rossberg

unread,
Sep 20, 2001, 8:40:56 AM9/20/01
to
Jeremy Siek wrote:
>
> Upon further inspection of the SML Basis, List was the only structure with
> 'map'. I suppose the reason has something to do with deciding how to
> represent the output. Here's what I created for a "common" signature;

No, option and vectors provide map and variants, too. Arrays don't,
because they are supposed to be stateful data structures which you don't
map but modify in place. There are no more collection types in the basis
library, but if you look at the SML/NJ lib for example, all functional
containers provide map functions.

> signature CONTAINER =
> sig
> type elem
> type container
> val length : container -> int
> val app : (elem -> unit) -> container -> unit
> val foldl : ((elem * 'a) -> 'a) -> 'a -> container -> 'a
> val foldr : ((elem * 'a) -> 'a) -> 'a -> container -> 'a
> val tabulate : (int * (int -> elem)) -> container
> end;
>
> Note that you'd need to have functors for each Basis struct to set the
> "container" type.

Yes, and that is a fundamental problem. Because the signature does not
suite polymorphic containers you essentially loose the advantage of
polymorphic types if you try to use it as a collection framework.

It is easier (but not easy) to have a unified framework of collection
types within Haskell's type class system. See Chris Okasaki's design of
the `Edison' library for example:

http://citeseer.nj.nec.com/318544.html

The introduction of the paper mentions the plan to port it to SML in the
future, but I am a bit sceptical.

- Andreas

--
Andreas Rossberg, ross...@ps.uni-sb.de

"Computer games don't affect kids; I mean if Pac Man affected us
as kids, we would all be running around in darkened rooms, munching
magic pills, and listening to repetitive electronic music."
- Kristian Wilson, Nintendo Inc.

0 new messages