
Offering abstract data structures (with simple and efficient implementation)


Robert Will

Dec 6, 2003, 2:09:38 PM
hi,

Everyone knows that abstract data structures are cool. I mean that
they make it much, much easier to write good software. Incidentally,
languages like Haskell also make it possible to offer several data
structure implementations behind the same interface: using type
classes.
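
To make the type-class point concrete, here is a minimal sketch -- all
names are invented for illustration, not taken from any library -- of
one Set interface with two interchangeable implementations (the tree
variant is unbalanced, for brevity):

  class SetLike s where
    empty  :: s a
    insert :: Ord a => a -> s a -> s a
    member :: Ord a => a -> s a -> Bool

  newtype ListSet a = ListSet [a]

  instance SetLike ListSet where
    empty = ListSet []
    insert x s@(ListSet xs)
      | x `elem` xs = s                    -- keep elements unique
      | otherwise   = ListSet (x : xs)
    member x (ListSet xs) = x `elem` xs

  data TreeSet a = Leaf | Node (TreeSet a) a (TreeSet a)

  instance SetLike TreeSet where
    empty = Leaf
    insert x Leaf = Node Leaf x Leaf
    insert x n@(Node l y r)
      | x < y     = Node (insert x l) y r
      | x > y     = Node l y (insert x r)
      | otherwise = n                      -- already present
    member _ Leaf = False
    member x (Node l y r)
      | x < y     = member x l
      | x > y     = member x r
      | otherwise = True

Code written against SetLike works with either representation, which
is exactly the "several implementations behind one interface" point.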

In spite of this, reality is far behind:
1. there are very few implementations of such bread-and-butter
data structures as Sets, (Finite) Maps and Sequences available for,
e.g., languages like Haskell.

2. most of Haskell's standard functions are centered around one concrete
data structure without any abstraction.

3. there is no agreed standard on which features functional versions
of the common abstract data structures should have.

Note how the first point differs significantly from the other two in
scale: if we had some good implementations, I could just download what
I need and use it for my project, adapting it where needed. But of
course, if several people do this, their programs will not work
together in a nice way -- one can't just pass a set from one program
to the other. And one can't just swap one concrete data structure
for another.

Recently I have implemented some concrete data structures -- they
implement all three abstract structures (Set, Seq, Map) and even offer
several alternatives. (Other structures, such as Bags and Relations,
can also be implemented in this framework.) I also attempted to
specify nice type classes which could serve as a standard for other
implementations. However, this work proved to be very difficult:
Haskell tradition is tightly bound to its built-in implementation of
Sequences, usually called Lists. I have to think about this some more,
and maybe I can't do without help. That's why I am offering my work
already, although it is not in a very presentable state:

www.stud.tu-ilmenau.de/~robertw/dessy/fun

You can have a look at the proposed abstract structures and learn
about the issues surrounding them. Furthermore, you can grab the
implementations and use them in the not-nice way mentioned above. I
say this because my implementation of tree balancing is simpler by
orders of magnitude than everything on the market.

My data structure specifications put a great emphasis on abstraction;
they are meant for general-purpose programming, prototyping and
(executable) specifications, not for the implementation of algorithms.
(The latter distinguishes them from Okasaki's Edison library, which
focuses on algorithmic data structures and has not a single balanced
tree on board.) My approach to data structures differs from
traditional ones in this emphasis on abstraction. More on this
can be found on the page for the imperative version of my little
library:

www.stud.tu-ilmenau.de/~robertw/dessy

Finally, Abstract Data Structures begin a new era in functional
programming, not only because they become much more powerful, but also
because they need new methodology. Books like Bird's "Introduction to
Functional Programming", for example, present a theory of lists with
axioms and theorems which is now obsolete. Sequences have a different
theory. And different data structures even have common axioms and
laws. We need a new book. But I will explain this in other messages
and articles.

For the time being, have fun with the code.

Bob.

Feuer

Dec 6, 2003, 3:51:54 PM
Robert Will wrote:

> In spite of this, reality is far behind:
> 1. there are very few implementations of such bread-and-butter
> data structures as Sets, (Finite) Maps and Sequences available for,
> e.g., languages like Haskell.

True. But have you looked into Okasaki's Edison library? If not,
you should. Dr. Okasaki may be a good person to talk to if you would
like to work on making such a standard library, as he has already
carefully considered many of the issues involved.

David

Joachim Durchholz

Dec 6, 2003, 4:46:59 PM
Robert Will wrote:
>
> www.stud.tu-ilmenau.de/~robertw/dessy/fun

Got it. Quite interesting (but avoid terminology like "ground-breaking"
when describing your work *g*).

> You can have a look at the proposed abstract structures and learn
> about the issues surrounding them. Furthermore, you can grab the
> implementations and use them in the not-nice way mentioned above. I
> say this because my implementation of tree balancing is simpler by
> orders of magnitude than everything on the market.
>
> My data structure specifications put a great emphasis on abstraction,
> they are meant for general-purpose programming, prototyping and
> (executable) specifications, not for the implementation of algorithms.

This reminds me of the "taxonomical approach" used for the Eiffel base
libraries.
Since Eiffel is an imperative language, it must make distinctions that
are irrelevant to a functional one, so the Eiffel classes aren't too
interesting for Haskell, but the approach as such has proved to be quite
effective.


On your treatment of Maps: they don't just share properties with Sets,
they /are/ sets (in a "is a proper subtype" sense), namely (please
forgive my rusty Haskell)
Map a b = Set (Pair a b)
(the set of pairs from a domain and a codomain, aka the set of key-value
pairs). This implies multi-parameter type classes.
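
For illustration, a hedged sketch of what that could look like
(multi-parameter type classes and functional dependencies are GHC
extensions; all names here are made up):

  {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

  -- "a Map is a Set of pairs": the set operations are indexed by both
  -- the container type and its element type, hence two class
  -- parameters; the dependency s -> e keeps sEmpty unambiguous
  class SetOf s e | s -> e where
    sEmpty  :: s
    sInsert :: e -> s -> s
    sMember :: e -> s -> Bool

  -- a map type would then be an instance at pair type, e.g.
  --   instance Ord k => SetOf (SomeMap k v) (k, v) where ...
  -- with the instance maintaining key uniqueness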

Here's why I advocate this model:
The alternative model that one would come up with was used in the
abstract Eiffel class TABLE (equivalent to your Map): a set of "keys",
with additional functions mapping these keys to associated values.
This approach, while intuitive, has very unfortunate consequences for
equality. I have worked with it, at both the library-designer and
application-programmer level, and I seriously recommend avoiding the
hassles and sticking with the simple model of "a Map is a Set of Pairs".

Since iteration and membership tests are relevant for both domain and
codomain of a Map, this implies that the Map type class must define two
functions:
keys :: Map a b -> Set a
values :: Map a b -> Set b
"keys" must be written so that iteration and membership tests are
efficient, for "values", this would be a nice extra but isn't required.

Regards,
Jo

Aaron Denney

Dec 6, 2003, 7:18:12 PM
On 2003-12-06, Joachim Durchholz <joachim....@web.de> wrote:
>
> Since iteration and membership tests are relevant for both domain and
> codomain of a Map, this implies that the Map type class must define two
> functions:
> keys :: Map a b -> Set a
> values :: Map a b -> Set b
> "keys" must be written so that iteration and membership tests are
> efficient, for "values", this would be a nice extra but isn't required.

I think in most cases, having values return a Bag, or MultiSet, of b
would be preferable.

--
Aaron Denney
-><-

Joachim Durchholz

Dec 7, 2003, 5:31:08 AM

Ah, right, indeed.

Regards,
Jo

Robert Will

Dec 7, 2003, 4:14:59 PM
Joachim Durchholz <joachim....@web.de> wrote in message news:<bqtino$o8m$1...@news.oberberg.net>...

>
> This reminds me of the "taxonomical approach" used for the Eiffel base
> libraries.

Incidentally, the imperative version of Dessy is written in Eiffel.

And the reason to start Dessy was that I was utterly disappointed by
the lack of abstraction in EiffelBase and the lack of discussion of
higher-level design issues in the accompanying book "Reusable
Software".

On the other hand, the use of contracts alone, and the fact that it
was designed at all (not hacked together like the libraries of a
language that is very popular today) -- yes, this makes EiffelBase a
very good library. But nevertheless, EiffelBase captures all the good
stuff of the 1980s, and that's some time ago. In my opinion it is just
too complex, and it offers too many low-level and outdated features.
(But I'll stop bashing what may be the best DS lib on the market ;-)
Technical arguments are on www.stud.tu-ilmenau.de/~robertw/dessy.)

Also, the functional Dessy is a subset of the imperative one. When
the functional classes are stable, I will port them to OO. On the
page above there is also a short paper on why functional (aka
immutable) data structures are important even in imperative languages.
One reason is persistence: several versions can share memory.
Another reason is use in contracts. Another is that functional style
is simply higher-level, also in OO languages.


> On your treatment of Maps: they don't just share properties with Sets,
> they /are/ sets (in a "is a proper subtype" sense)

Attention! I have studied mathematics and the theory of programming.
;-)

Precisely: any statement of the form "A is B" must be read "A is
isomorphic to B", which means that both structures share the same
laws. In the use of mathematics we can exploit this by using the
same terminology for both things and otherwise treating them as the
same. (Well, I was not a very good student at the math part.)

Dessy (and by the way Eiffel) is based on the specification language
Z, where indeed, Maps (called Functions there) are a special case of
sets of pairs. This has proved to be very practical.

(I had the luck to learn about Z at the faculty. If you're
interested, you can read a book on B (Z's successor).
http://www3.inrets.fr/B@INRETS/B-Bibliography/B-Bibliography002.html
This page lists some. I cannot recommend one, but can only advise
against Abrial's book. Although he is the inventor of Z and B and one
of the greatest computer scientists around, his book is too
"mathematics" to be read as an introduction.)

However, in spite of Abrial's glory, Z and Eiffel are old, and there
are already better approaches. Philip Wadler gave a talk on "19th
century logic for 21st century software", and that's the point: most
theories just recycle old mathematics, which is very interesting and
also useful, but not the best possible thing to do. On the other hand,
we can design a theory that includes just the stuff that is needed for
programming, and such a theory is that of Eric Hehner, explained in
his book "A Practical Theory of Programming", whose second edition is
to appear soon.

To make a long story short: in Hehner's theory Functions are not Sets,
and that has inspired me to take an approach different from Z and
classical math. But in the last few days I have reconsidered that, and
in some work I did today, I went back to:

type OrdMap a b = OrdSet (Maplet a b)

data Maplet a b = a :> b

-- equality and ordering look only at the key, so an OrdSet of
-- Maplets cannot hold two bindings for the same key
instance Eq a => Eq (Maplet a b) where
  (a :> _) == (o :> _) = a == o

instance Ord a => Ord (Maplet a b) where
  (a :> _) <= (o :> _) = a <= o

(Only that in Dessy OrdMap and OrdSet are typeclasses, so there is
still some work to do...)
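
To illustrate how lookup falls out of this encoding, here is a hedged
sketch using today's Data.Set in place of Dessy's OrdSet (it assumes a
containers version providing Data.Set.lookupLE; insertM and lookupM
are invented names):

  import qualified Data.Set as S

  type OrdMap' k v = S.Set (Maplet k v)

  -- inserting replaces any existing binding, because Data.Set treats
  -- two Maplets with equal keys as the same element
  insertM :: Ord k => k -> v -> OrdMap' k v -> OrdMap' k v
  insertM k v = S.insert (k :> v)

  -- the probe's value field is never evaluated: Eq and Ord on Maplet
  -- look only at the key, which is the whole point of the encoding
  lookupM :: Ord k => k -> OrdMap' k v -> Maybe v
  lookupM k m =
    case S.lookupLE (k :> error "probe value, never forced") m of
      Just (k' :> v) | k' == k -> Just v
      _                        -> Nothing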

Since I'm already fantasising a lot, let me make another blatant
statement: the best imperative programming language of today is based
on Z, and thereby on Russell's theory of sets and so on. The
programming language of tomorrow will be based on Hehner's theory.
(But don't tell him I said so, since he would put the quote on
his homepage.)


> Since iteration and membership tests are relevant for both domain and
> codomain of a Map, this implies that the Map type class must define two
> functions:
> keys :: Map a b -> Set a
> values :: Map a b -> Set b

I used the Z names "domain" and "range". And "range" gives a set,
like in Z (and in mathematics).

> "keys" must be written so that iteration and membership tests are
> efficient,

There is no "iteration" in a functional language. I am currently
writing a book about that ;-) At present, Dessy has fold (aka reduce)
and map (aka apply) and filter, and a lot of operations to dissect the
thing.

I also have Z's 'image' and 'compose' operations, which are very
useful in specifications, and I expect them to be superb in
programming, too.

Anyway, the specification language of today is the programming
language of tomorrow.

Have a good time digesting this. (And have a drink, if you feel
strange.) And don't blame me if the future should be different from
what I predicted. (We know that industry is satisfied with the worst.
And researchers are satisfied with early 20th-century math...)


Ciao, Bobby

ke...@ii.uib.no

Dec 7, 2003, 4:29:21 PM
Joachim Durchholz <joachim....@web.de> writes:

> This reminds me of the "taxonomical approach" used for the Eiffel base
> libraries.

Would you mind elaborating a bit?

> On your treatment of Maps: they don't just share properties with Sets,
> they /are/ sets (in a "is a proper subtype" sense), namely (please
> forgive my rusty Haskell)

> Map a b = Set (Pair a b)
> (the set of pairs from a domain and a codomain, aka the set of
> key-value pairs). This implies multi-parameter type classes.

With the restriction that the a's also constitute a set; i.e. each a
is unique?

I.e. {(1,a),(1,b)} is a set of pairs, but not a map. Or if you really
mean to have no distinction, what would the result of 'lookup 1' be?

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Joachim Durchholz

Dec 7, 2003, 8:28:25 PM
ke...@ii.uib.no wrote:

> Joachim Durchholz <joachim....@web.de> writes:
>
>>This reminds me of the "taxonomical approach" used for the Eiffel base
>>libraries.
>
> Would you mind elaborating a bit?

Eiffel organizes types in a hierarchy (actually a directed acyclic
graph) of subtypes.
The base library exploits this to set up taxonomies, such as
Numeric -> Complex -> Real -> Fractional -> Integer
or
Container 'a -> Bag 'a -> Set 'a -> Table 'a 'b

Let me focus on the containers, since these provide the best example.
Basically, there are three basic distinctions made here:
a) whether a container can have the same element once (Set) or more than
once (Bag)
b) whether the container has a fixed maximum capacity (Fixed), an
adjustable capacity (Bounded), or can always accept additional data
(Unbounded).
c) what kinds of iteration are possible (OneWay, TwoWay, no iterators)
d) random access: none, keyed (Map) or positional (Indexed).

For all interesting combinations of the above (and other) properties,
there's an implementation class: LinkedList, DoublyLinkedList,
HashTable, BinaryTree, etc.
For example, a HashTable (seen as a set of pairs, not its actual
incarnation in the Eiffel libraries which exhibit some infelicities) is:
* a Set (namely of Pairs)
* Unbounded
* OneWay iterable (backwards iteration happens to be unavailable)
* a Map

In other words, the taxonomical types (which define just a few
signatures and laws) are just building blocks for the semantics of the
various concrete container classes.
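
Transliterated roughly into Haskell, such microtypes might look like
this sketch (names invented; the laws can only remain comments, since
Haskell cannot check them):

  -- tiny "taxonomical" classes, each contributing a few signatures
  -- and (informal) laws
  class Container f where
    toListC :: f a -> [a]             -- every container can be enumerated

  class Container f => Bag f where
    count :: Eq a => a -> f a -> Int
    -- law: count x b == length (filter (== x) (toListC b))

  class Bag f => Set' f where
    -- law: count x s <= 1, i.e. elements occur at most once
    memberC :: Eq a => a -> f a -> Bool
    memberC x = (> 0) . count x       -- default in terms of count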

Taxonomy is nice, but there's one caveat: it's better to first design
the concrete types, then the taxonomy. If you do it the other way round,
you end up with a gazillion tiny taxonomical types which don't have
much of a semantics. The Eiffel base libraries have fallen into this
trap, at least to some extent.
On the other hand, the result was still one of the better container
libraries that I have seen. Carrying taxonomy through, in a less
overengineered manner, could lead to stunning results. (Maybe Dessy is
it. I don't know, I haven't seen it.)

>>On your treatment of Maps: they don't just share properties with Sets,
>>they /are/ sets (in a "is a proper subtype" sense), namely (please
>>forgive my rusty Haskell)
>>
>> Map a b = Set (Pair a b)
>>(the set of pairs from a domain and a codomain, aka the set of
>>key-value pairs). This implies multi-parameter type classes.
>
> With the restriction that the a's also constitute a set; i.e. each a
> is unique?

Right. I overlooked that.

Regards,
Jo

Joachim Durchholz

Dec 7, 2003, 8:28:30 PM
Robert Will wrote:
> Joachim Durchholz <joachim....@web.de> wrote in message
> news:<bqtino$o8m$1...@news.oberberg.net>...
>
>> This reminds me of the "taxonomical approach" used for the Eiffel
>> base libraries.
>
> [Praise&Critique of EiffelBase snipped]

That's why I recommended the approach, not the library :-)

> But nevertheless, EiffelBase captures all the good stuff of the 1980s
> and that's some time ago. In my opinion it is just too complex, and
> it offers too many low-level and outdated features. (But I'll stop
> bashing what may be the best DS lib on the market ;-) Technical
> arguments are on www.stud.tu-ilmenau.de/~robertw/dessy.)

Basically, there are two things that I didn't like about EiffelBase:
a) Lack of a good container that could efficiently iterate, access by
key, and update.
b) Having containers manage the state of a single iteration running over
them.

(b) was just the worst of many misdecisions - the original author of
EiffelBase is good at Grand Sweeping Designs, but detail work is clearly
not his strong point.

> Also, the functional Dessy is a subset of the imperative one. When
> the functional classes are stable, I will port them to OO. On the
> page above is also a short paper on why functional (aka immutable)
> data structures are important even in imperative languages. One
> reason is persistency: you can have several versions share memory.
> Another reason is use in contracts. Another is that functional style
> is simply higher-level, also in OO languages.

Fully agreed.

>> On your treatment of Maps: they don't just share properties with
>> Sets, they /are/ sets (in a "is a proper subtype" sense)
>
> Attention! I have studied mathematics and the theory of progamming.
> ;-)

:-)))

> Precisely, any statement of the form A is B, must be read "A is
> isomorphic to B", which means that both structures share the same
> laws. In the use of mathematics we can exploit this, by using the
> same terminology for both things and otherwise treat them as the
> same. (Well, I was not very good student at the math part.)

Right.

> Dessy (and by the way Eiffel) is based on the specification language
> Z, where indeed, Maps (called Functions there) are a special case of
> sets of pairs. This has proved to be very practical.
>
> (I had the luck to learn about Z at the faculty. If you're
> interested,

You can bet I am!

> you can read a book on B (Z's successor).
> http://www3.inrets.fr/B@INRETS/B-Bibliography/B-Bibliography002.html
> This pages lists some. I cannot recommend one, but only dissuade
> from Abrial's book. Although he is the inventor of Z and B and one
> of the greatest computer scientists around, his book is too
> "mathematics" to be read as an introduction.)

Unfortunately, there doesn't seem to be an online version of any book/paper.

> However, in spite of Abrial's glory, Z and Eiffel are old, and there
> are already better approaches. Phillip Wadler gave a talk on "19th
> century logic for 21th software" and that's the point: Most theories
> just recycle old mathematics, which is very interesting and also
> useful, but not the best possible thing to do. On the other hand, we
> can design a theory that includes just the stuff that is needed for
> programming, and such a theory is that of Eric Hehner, explained in
> his book "A practical theory of Programming", whose second edition is
> to appear soon.

I'm going to read up on this.
Is Eric's book available online somewhere? ("Available" as in "freely
downloadable" -- my budget is currently sorely limited.)

> To make a long story short: In Hehner's theory Functions are not
> Sets, and that has inspired me to take an approach different from Z
> and classical math.

Well, tables aren't best described as functions either, so it's OK if
you don't follow modern mathematics here :-)

> Since I'm already phantasising a lot, let me make another blatant
> statement: The best imperative programming language of today is based
> on Z

If you mean Eiffel: if it's indeed the best one, then this says a lot
about the others :-(
Or, in yet other words: Eiffel has some extremely nice ideas, but it
also has some very serious blunders.

> and thereby on Russell's theory of sets and so on. The programming
> language of tomorrow will be based on Hehner's theory. (But don't
> tell him I have said so, since he would put the quote on his
> homepage.)

Don't fear, I don't even know him :-)

>> Since iteration and membership tests are relevant for both domain
>> and codomain of a Map, this implies that the Map type class must
>> define two functions:
>>   keys :: Map a b -> Set a
>>   values :: Map a b -> Set b
>
> I used the Z names "domain" and "range". And "range" gives a set,
> like in Z (and in mathematics).

Hmm... "range" seems misleading to me... it implies, among other
things, a total order on the result set.

>> "keys" must be written so that iteration and membership tests are
>> efficient,
>
> There is no "iteration" in a functional language.

Yes, of course. When I say "iteration" in a functional context, I mean
things like fold and map -- or, even better, something that returns
something equivalent to a list (in Haskell, that would be a lazy
list) so that all the higher-order machinery can be applied to it.
Yet it has to be efficient, whatever the means of implementation :-)

> I am currently writing a book about that ;-) At present, Dessy has
> fold (aka reduce) and map (aka apply) and filter, and a lot of
> operations to dissect the thing.

Good.

> I also have Z's 'image' and 'compose' operations, which are very
> useful in specifications, and I expect them to be superb in
> programming, too.

You lost me here - I know next to nothing about Z.

> Anyway, the specification language of today is the programming
> language of tomorrow.

Yup... it's another instance of the old "let's do programmable
specifications" cycle.

> Have a good time digesting this. (And have a drink, if you feel
> strange.) And don't blame me, if the future should be different from
> what I predicted. (We know that industry is satisfied with the
> worst. And researchers are satisfied with early 20th century math...)

No worry.

Regards,
Jo

Alaric B Snell

Dec 8, 2003, 5:26:38 AM
Joachim Durchholz wrote:
> Robert Will wrote:
>
>>
>> www.stud.tu-ilmenau.de/~robertw/dessy/fun
>
>
> Got it. Quite interesting (but avoid terminology like "ground-breaking"
> when describing your work *g*).
>
>> You can have a look at the proposed abstract structures and learn
>> about the issues surrounding them. Furthermore, you can grab the
>> implementations and use them in the not-nice way mentioned above. I
>> say this because my implementation of tree balancing is simpler by
>> orders of magnitude than everything on the market.
>>
>> My data structure specifications put a great emphasis on abstraction,
>> they are meant for general-purpose programming, prototyping and
>> (executable) specifications, not for the implementation of algorithms.

What's always annoyed me with Haskell pattern matching is that the
fundamental matching operators are defined for inbuilt types like lists
and so on - like (x:xs) - and (last time I looked; stuff may have
changed since) you can't define your own for abstract data types.

Pattern-matching operators would need to be defined as clusters of
functions that can deal with various combinations of parameters being
'missing', and that return a list of possible values for the missing
parameters, which may be empty. Or something like that.

So one might define (I forget the syntax, so am making up my own here):

SequenceCons to be of type (Sequence x) x -> (Sequence x)

%SequenceCons to be of type
    (Maybe (Sequence x)) (Maybe x) (Maybe (Sequence x))
    -> [((Maybe (Sequence x)), (Maybe x), (Maybe (Sequence x)))]

If a pattern of the form:

... SequenceCons x xs ...

were to be matched against the sequence s, the system would call the
%SequenceCons function with arguments:
%SequenceCons function with arguments:

%SequenceCons Nothing Nothing (Just s)

The function would return:

[(Just a, Just b, Nothing)]

...where a is the first element in the sequence and b is the rest.

Or something of the sort.

Does this sound sensible, or has the problem already been solved while I
wasn't looking?

<public service announcement>


> keys :: Map a b -> Set a
> values :: Map a b -> Set b

Awooga! The values may contain duplicates (unlike the keys) so the
values should be a Bag.
</public service announcement>

> Regards,
> Jo

ABS

ke...@ii.uib.no

Dec 8, 2003, 5:39:23 AM
Joachim Durchholz <joachim....@web.de> writes:

> Container 'a -> Bag 'a -> Set 'a -> Table 'a 'b

I've occasionally felt Haskell was lacking a more structured approach
to its containers (List, Array, Set, FiniteMap, ...). OTOH, I've toyed
a bit with it without getting anywhere useful. Perhaps this is a good
starting point?

> Let me focus on the containers, since these provide the best example.
> Basically, there are three basic distinctions made here:
> a) whether a container can have the same element once (Set) or more
> than once (Bag)

I suppose that with referential transparency this would be equivalent
to a (finite)map from 'a to integer counts?

> b) whether the container has a fixed maximum capacity (Fixed), an
> adjustable capacity (Bounded), or can always accept additional data
> (Unbounded).

> c) what kinds of iteration are possible (OneWay, TwoWay, no iterators)

I guess we'd use maps and folds for iteration.

> d) random access: none, keyed (Map) or positional (Indexed).

Isn't positional just a special case of keyed, i.e. a list is kind of
a map from Int to 'a?
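
A one-liner makes the correspondence explicit (just an illustration,
not from any library):

  -- a list viewed as a partial map from Int positions to elements
  lookupIdx :: [a] -> Int -> Maybe a
  lookupIdx xs i = case drop i xs of
                     (x:_) | i >= 0 -> Just x
                     _              -> Nothing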

Vincenzo aka Nick Name

Dec 8, 2003, 5:45:06 AM
Alaric B Snell wrote:

> What's always annoyed me with Haskell pattern matching is that the
> fundamental matching operators are defined for inbuilt types like
> lists and so on - like (x:xs) - and (last time I looked; stuff may
> have changed since) you can't define your own for abstract data types.

I don't follow you here, since you can pattern-match any constructor of
any datatype, even infix constructors, as in:

data Rlist a = Nil | Rlist a ::: a
-- note: with this declaration, the left field of ::: is the initial
-- Rlist and the right field is the last element

infixl 5 :::

head (hd ::: tl) = hd
tail (hd ::: tl) = tl

len Nil = 0
len (hd ::: tl) = 1 + len hd

and as you see you can even define the associativity and precedence of
an infix datatype constructor.

Bye

Vincenzo

Darius

Dec 8, 2003, 6:45:52 AM
On Mon, 08 Dec 2003 10:26:38 +0000
Alaric B Snell <ala...@alaric-snell.com> wrote:

> What's always annoyed me with Haskell pattern matching is that the
> fundamental matching operators are defined for inbuilt types like
> lists and so on - like (x:xs) - and (last time I looked; stuff may
> have changed since) you can't define your own for abstract data types.

You can for your own data types, but not for abstract data types.
However, there is/was an extension, views, that allowed you to define
what pattern matching meant. It apparently wasn't too popular. I think
there was a semantics issue, and there's obviously the issue that pattern
matching could then imply large amounts of computation which wouldn't be
obvious from the source. Google (with suitable refining words like
Haskell and pattern matching) should turn up something, or I think
there's a link to the paper on the Haskell site. Pattern guards are a
half-way solution that is more general and usable in some respects,
but doesn't have the transparency (for better and worse) of views. The
paper describing pattern guards has some commentary about views.

Joachim Durchholz

Dec 8, 2003, 7:56:29 AM
ke...@ii.uib.no wrote:

> Joachim Durchholz <joachim....@web.de> writes:
>
>> Container 'a -> Bag 'a -> Set 'a -> Table 'a 'b
>
> I've occasionally felt Haskell was lacking a more structured approach
> to its containers (List, Array, Set, FiniteMap,..) OTOH, I've toyed
> a bit with it without getting anywhere useful. Perhaps this is a
> good starting point?

I don't know. If I were to seriously consider applying the approach to a
functional language, I'd transfer most of the Eiffel classes to a
functional subtype hierarchy, adapting the types to the absence of effects
(possibly eliminating some types, and most certainly eliminating a whole
lot of questionable decisions -- mutability has led to some serious
misdecisions in EiffelBase).
But I don't know whether the result would be any good. Writing a library
requires that you know a whole lot about how it might be used, and my
experience with functional idioms is simply not sufficient to make the
right guesses.
Besides, people seem to have been content with ubiquitous lists for
decades, so I'm not even sure I'd be addressing a serious need here.

Which is why I never seriously attempted anything in this direction.

>> Let me focus on the containers, since these provide the best
>> example. Basically, there are three basic distinctions made here:
>> a) whether a container can have the same element once (Set) or more
>> than once (Bag)
>
> I suppose that with referential transparency this would be equivalent
> to a (finite)map from 'a to integer counts?

A bag can be represented by a 'a->Integer mapping, right.
It's still not the same. For example, a Queue is (among other things) a
Bag (since there is a sensible answer to the question of "how many
instances of X are in this queue?"), but the map would destroy the
ordering properties of the Queue.

Bag is just an abstract type. Essentially, a set of a few functions and
associated laws, similar to the definition of Monad with its associated
monad laws.
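
For illustration, the count-map representation mentioned above could
be sketched like this (using today's Data.Map, which is neither
Eiffel's nor Dessy's; note how it indeed forgets arrival order, as
with the Queue example):

  import qualified Data.Map as M

  newtype CountBag a = CountBag (M.Map a Int)

  emptyBag :: CountBag a
  emptyBag = CountBag M.empty

  insertBag :: Ord a => a -> CountBag a -> CountBag a
  insertBag x (CountBag m) = CountBag (M.insertWith (+) x 1 m)

  -- "how many instances of x are in this bag?"
  occurrences :: Ord a => a -> CountBag a -> Int
  occurrences x (CountBag m) = M.findWithDefault 0 x m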


Here's the last reason why this kind of taxonomy doesn't make too much
sense here: the lack of assertions as part of the language.

In Eiffel, I can spell out the Bag laws and have them checked (if only
at runtime during testing, but even that helps tremendously). Further,
Eiffel enforces that the laws are also checked in subtypes, so an Eiffel
supertype is not just a loose and arbitrary bag of function signatures;
there's real semantics behind such a subtype relationship. (Eiffel is a
bit overoptimistic in what constitutes a subtype, which has resulted in
an unsound type system. However, the basic approach is still valuable,
and I regret that no other language has ever picked up on that lead.)

A taxonomy that imposes semantic constraints on subtypes is really
helpful. It can even make sense to split the thing up into microtypes,
each describing just a tiny aspect of the whole -- it's essentially a kit
of reusable semantics building blocks that can help shape any type
that's being constructed during design.
Also, having these building blocks helps shape one's thoughts. A known
set of Container laws is like a canned set of decisions that are known
to work well together; this avoids redesigning the wheel whenever
somebody builds a new container type.
For the users, the resulting uniformity helps because programmers don't
have to look up the functions for each kind of use - they are the same
(and have the same semantics) anyway.

Without language support, it's enough to simply stick with some
conventions about functions and their standard usage.

>> b) whether the container has a fixed maximum capacity (Fixed), an
>> adjustable capacity (Bounded), or can always accept additional data
>> (Unbounded).
>>
>> c) what kinds of iteration are possible (OneWay, TwoWay, no
>> iterators)
>
> I guess we'd use maps and folds for iteration.

Right - but the differences are still relevant.
For example, OneWay would have just foldl, while TwoWay would have foldl
and foldr.
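
As classes, that distinction can be as small as this sketch (invented
names):

  class OneWay f where
    foldlC :: (b -> a -> b) -> b -> f a -> b

  class OneWay f => TwoWay f where
    foldrC :: (a -> b -> b) -> b -> f a -> b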

>> d) random access: none, keyed (Map) or positional (Indexed).
>
> Isn't positional just a special case of keyed, i.e. a list is kind of
> a map from Int to a'?

It is. It's still an important special case (else nobody would talk
about arrays).

One way to approach this taxonomically would be to first look at the
concrete container types one would want to see implemented, then factor
out the recurring but not-quite-ubiquitous interfaces into abstract
supertypes.

Note that such an approach can have some surprising effects on
terminology. For example, it turns out that the push and pop functions
from Stack are just a special case of the put and get functions of the
more general Dispenser type (which includes stacks, queues, and, if you
will, all data structures that one can iterate over).

Regards,
Jo

Joachim Durchholz

Dec 8, 2003, 8:06:50 AM
Darius wrote:

> On Mon, 08 Dec 2003 10:26:38 +0000 Alaric B Snell
> <ala...@alaric-snell.com> wrote:
>
>
>> What's always annoyed me with Haskell pattern matching is that the
>> fundamental matching operators are defined for inbuilt types like
>> lists and so on - like (x:xs) - and (last time I looked; stuff may
>> have changed since) you can't define your own for abstract data
>> types.
>
> You can for your own data types, but not for abstract data types.
> However, there is/was an extension, views, that allowed you to define
> what pattern matching meant. It apparently wasn't too popular. I
> think there was a semantics issue

There are several issues; pattern matching is quite different from
ordinary computation:
* For lazy languages, it forces some degree of evaluation (in fact
pattern matching seems to be one of the main driving forces for
evaluation in a lazy language).
* Constructor patterns are inferred automatically from the constructors;
no such automatism is possible for user-defined patterns (views). This
creates an asymmetry between predefined patterns and views, which is
undesirable.

Personally, I don't think that any of these problems are insurmountable,
but I fear that it's all too easy to create something that has all sorts
of semantic gaps and strange interactions. I hope that somebody with a
good grasp of the concrete issues involved and strong language design
skills will settle the issue :-)
(I'd consider that more interesting than yet another even more
complicated, even more powerful type system. Type systems are
interesting, but having a well-thought-out user-defined pattern matching
would give me more immediate mileage out of any language. YMMV.)

> and there's obviously the issue that pattern matching could then
> imply large amounts of computation which wouldn't be obvious from the
> source.

That's what led to the exclusion of views from Erlang.
(I don't buy the reasoning: the issue is the same with function calling.
And if you exclude pattern matching from costly computations, this
essentially means that the computations will move elsewhere, so this
doesn't make much of a difference for performance.)

Regards,
Jo

Darius

Dec 8, 2003, 9:11:18 AM

Indeed, they even have guaranteed terminating guards.

> (I don't buy the reasoning: the issue is the same with function
> calling. And if you exclude pattern matching from costly computations,
> this essentially means that the computations will move elsewhere, so
> this doesn't make much of a difference for performance.)

I don't think the problem was that it would lower performance directly,
but rather that it was a non-obvious cost. The "elsewhere" that the
computation would move to is someplace where you would be able to see it.
The problem may be largely sociological; namely, that Haskellers have
grown accustomed to pattern matching having a relatively trivial cost and
programmers in general tend to identify short language constructs with
simple code. Thus, the creators of pattern guards probably consider the
following to be more beneficial than detrimental:
-- a view solution
fOfTop f (a ::: as) = f a ::: as

-- pattern guards
fOfTop f stk | (a, as) <- pop stk = push (f a) as

Alaric B Snell

Dec 8, 2003, 9:30:21 AM

Yep, but that's referring to the concrete structure of the thing, not an
abstract interface. Take the good fellow's Sequence type class, which
could have instances including:

MySequence a ::= MyCons a (MySequence a) | Empty

OtherSequence a ::= Singleton a
                  | Append (OtherSequence a) (OtherSequence a)
                  | Empty

...which would require pattern matching on different constructors.

>
> Vincenzo
>

ABS

Joachim Durchholz

Dec 8, 2003, 10:00:21 AM
Darius wrote:
> The "elsewhere" that the computation would move is someplace where
> you would be able to see it.

That's exactly my point: that "elsewhere" would be in a function call
(or, in Erlang, an asynchronous message send), where you don't "see" the
overhead either, just as with a pattern match.

> The problem may be largely sociological; namely, that Haskellers have
> grown accustom to pattern matching having a relatively trivial cost

Agreed.

> and programmers in general tend to identify short language constructs
> with simple code.

Pattern matching is actually a longer, syntactically and semantically
more complex construct than a function call. I think it's more being
accustomed to a certain behaviour than a matter of direct perception.

Regards,
Jo

Chris Okasaki

Dec 8, 2003, 1:26:25 PM
Darius <dda...@hotpop.com> wrote:
> Alaric B Snell <ala...@alaric-snell.com> wrote:
>
> > What's always annoyed me with Haskell pattern matching is that the
> > fundamental matching operators are defined for inbuilt types like
> > lists and so on - like (x:xs) - and (last time I looked; stuff may
> > have changed since) you can't define your own for abstract data types.
>
> You can for your own data types, but not for abstract data types.
> However, there is/was an extension, views, that allowed you to define
> what pattern matching meant. It apparently wasn't too popular. I think
> there was a semantics issue and there's obviously the issue that pattern
> matching could then imply large amounts of computation which wouldn't be
> obvious from the source. [...]

I've heard this last point many times in my crusading for views, but
I've never understood it. When you see a function call foo(x), you
know there's the potential for arbitrary amounts of computation there.
So why should it be any different when you see a pattern Foo(x)?
Many times, you know something about
foo that lets you conclude something about how long that function call
will take, and the same would be true about Foo. I agree that it
could be confusing for people who are expecting old-style pattern
matching to accidentally run across code using views, but that kind of
reasoning would forbid any language changes at all.

I particularly don't understand this argument in the context of
Haskell, where a pattern like (x:xs) *already* has the potential for
causing arbitrary amounts of computation because of lazy evaluation.

-- Chris

ke...@ii.uib.no

Dec 8, 2003, 1:40:20 PM
Alaric B Snell <ala...@alaric-snell.com> writes:

> What's always annoyed me with Haskell pattern matching is that the
> fundamental matching operators are defined for inbuilt types like
> lists and so on - like (x:xs) - and (last time I looked; stuff may
> have changed since) you can't define your own for abstract data types.

[...]

> Does this sound sensible, or has the problem already been solved while
> I wasn't looking?

I don't quite follow your sketch, but if you need to "pattern match"
on the "outside" of the ADT, isn't it natural to provide the means to
do so as part of the ADT interface? So you could have e.g.

module Sequence (cons, nil, ...) where ...

module Main ...
  :
  map f s | s == cons x xs = cons (f x) (map f xs)
          | s == nil       = nil

Not quite pattern matching, but fairly close, and makes it explicit
that there may be larger computations involved.
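
A variant that actually type-checks, assuming the ADT also exports a
destructor (the name 'uncons' and its Maybe result are assumptions
here, not part of the interface above):

  -- assumed interface:
  --   module Sequence (Sequence, nil, cons, uncons) where ...
  --   uncons :: Sequence a -> Maybe (a, Sequence a)

  mapSeq :: (a -> b) -> Sequence a -> Sequence b
  mapSeq f s = case uncons s of
                 Just (x, xs) -> cons (f x) (mapSeq f xs)
                 Nothing      -> nil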

> Awooga! The values may contain duplicates (unlike the keys) so the
> values should be a Bag.

Depends what you want, doesn't it? If you just want to know which
values are stored, it's a set. If you want individual counts, it's a
bag.

Marcin 'Qrczak' Kowalczyk

Dec 8, 2003, 3:49:29 PM
On Mon, 08 Dec 2003 02:28:25 +0100, Joachim Durchholz wrote:

> Basically, there are three basic distinctions made here:

That's four.

> a) whether a container can have the same element once (Set) or more than
> once (Bag)
> b) whether the container has a fixed maximum capacity (Fixed), an
> adjustable capacity (Bounded), or can always accept additional data
> (Unbounded).
> c) what kinds of iteration are possible (OneWay, TwoWay, no iterators)
> d) random access: none, keyed (Map) or positional (Indexed).

Interestingly, this taxonomy is very different from the one I use
(probably inspired by Edison).

The first split is into
a) sequences
b) sets and bags
c) dictionaries
which are disjoint, i.e. a dictionary is not a set, nor vice versa.
They have some functions in common, though. In Haskell it's hard to
give the different kinds common functions because they are
parametrized differently by types.

Next, each collection type is immutable or mutable. In non-pure languages
the interface of a mutable collection is a superset of the immutable
interface; in Haskell they would need to be different (and there is an
additional problem that some collections have immutable size but mutable
contents).

A set is isomorphic to a dictionary with dummy values, but its interface
is customized to sets.

When a dictionary is converted to a sequence, it yields a sequence of
(key,value) pairs. Such a sequence is also used to specify the contents
of a dictionary. A set converted to a sequence obviously yields a
sequence of values. There is no direct conversion between sets and
dictionaries, but there is a function which gives the set of keys of a
dictionary.

Each collection has a forward iterator. There is a subtype of sequences
which informs that indexing by integers is efficient. The combination of
forward iterators and the knowledge which sequences can be efficiently
indexed by integers covers most of iteration needs.

A multidimensional array is considered a dictionary.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Darius

Dec 8, 2003, 1:58:03 PM
On Mon, 08 Dec 2003 16:00:21 +0100
Joachim Durchholz <joachim....@web.de> wrote:

> Darius wrote:
> > The "elsewhere" that the computation would move is someplace where
> > you would be able to see it.
>
> That's exactly my point: that "elsewhere" would be in a function call
> (or, in Erlang, an asynchronous message send), where you don't "see"
> the overhead just as with a pattern match.

But it's obvious when you call a function that it could take
arbitrarily long. With views, it's not obvious whether a pattern match
will be a cheap fixed cost or an arbitrarily large cost, yet it looks
like something that's cheap.

A C++ example that comes to mind is that the more common pattern (at
least in my experience) of
  for (i = 0; i < max; i++) ...;
is becoming
  for (i = 0; i < max; ++i) ...;
because it can have a significant performance difference when ++ is
overloaded. There are also the issues with intermediate values with
the overloaded arithmetic operators.

> > The problem may be largely sociological; namely, that Haskellers
> > have grown accustom to pattern matching having a relatively trivial
> > cost
>
> Agreed.
>
> > and programmers in general tend to identify short language
> > constructs with simple code.
>
> Pattern matching is actually a longer, syntactically and semantically
> more complex construct than a function call. I think it's more being
> accustomed to a certain behaviour than a matter of direct perception.

That's why I threw in "language construct". Function -application- is a
language construct, and cheap syntactically and semantically in Haskell.
The actual function called is not a language construct; that it takes an
unknown amount of time is obvious.

Anyways, the paper "Pattern Guards and Transformational Patterns" goes
over these issues and others and makes a compelling case for
pattern guards and transformational patterns (the latter isn't
implemented; in GHC at least). I agree with most of their analysis and
conclusions.

Darius

Dec 8, 2003, 3:32:25 PM
On 8 Dec 2003 10:26:25 -0800
coka...@acm.org (Chris Okasaki) wrote:

> Darius <dda...@hotpop.com> wrote:
> > Alaric B Snell <ala...@alaric-snell.com> wrote:
> >
> > > What's always annoyed me with Haskell pattern matching is that the
> > > fundamental matching operators are defined for inbuilt types like
> > > lists and so on - like (x:xs) - and (last time I looked; stuff may
> > > have changed since) you can't define your own for abstract data
> > > types.
> >
> > You can for your own data types, but not for abstract data types.
> > However, there is/was an extension, views, that allowed you to
> > define what pattern matching meant. It apparently wasn't too
> > popular. I think there was a semantics issue and there's obviously
> > the issue that pattern matching could then imply large amounts of
> > computation which wouldn't be obvious from the source. [...]
>
> I've heard this last point many times in my crusading for views, but
> I've never understood it. When you see a function call foo(x), you
> know there's the potential for arbitrary amounts of computation there.
> So why should it be any different when you see a pattern Foo(x)?

Because functions always mean arbitrary amounts of computation, while
pattern matching means primitive destructuring... except when it's a
view; then it means arbitrary amounts of computation.

> Many times, you know something about
> foo that lets you conclude something about how long that function call
> will take, and the same would be true about Foo. I agree that it
> could be confusing for people who are expecting old-style pattern
> matching to accidentally run across code using views, but that kind of
> reasoning would forbid any language changes at all.

Is it surprising that changing something so primitive in a language
(especially in Haskell's case), with no indication, brings up some
resistance? Especially when (now, at least) there are alternatives that
are less mutative and more additive?



> I particularly don't understand this argument in the context of
> Haskell, where a pattern like (x:xs) *already* has the potential for
> causing arbitrary amounts of computation because of lazy evaluation.

There's a difference between -triggering- computation and -adding- it.
Though laziness does bring up another issue: when I pattern match, say,
(Cons a as), I know that I'm not triggering evaluation of a and as; with
views I don't.

At any rate, I like the explicit indication (naming conventions for
free!) in pattern guards/transformational patterns, as well as the fact
that I can just use them with no preparation. Transformational patterns
also look like they capture much of the convenience of views. The

  f env k (Just val)!(lookup k env) = val

example is also pretty attractive.

Ralph Becket

Dec 8, 2003, 5:35:12 PM
coka...@acm.org (Chris Okasaki) wrote in message news:<75ab7aaf.03120...@posting.google.com>...

>
> I've heard this last point many times in my crusading for views, but
> I've never understood it. When you see a function call foo(x), you
> know there's the potential for arbitrary amounts of computation there.
> So why should it be any different when you see a pattern Foo(x)?

I agree. If you need to look at a value in a particular way,
you might as well have a convenient notation. The work is going to
be done *somewhere*.

There is an argument that says there should be a relatively simple
mapping from code to performance model, but that's a complete
non-starter with lazy languages anyway.
-- Ralph

Neelakantan Krishnaswami

Dec 8, 2003, 5:46:10 PM
In article <75ab7aaf.03120...@posting.google.com>, Chris Okasaki
wrote:

>
> I've heard this last point many times in my crusading for views, but
> I've never understood it. When you see a function call foo(x), you
> know there's the potential for arbitrary amounts of computation
> there. So why should it be any different when you see a pattern
> Foo(x)?

I don't understand that objection either, and I agree that views or
something like them are a pressing need in ML and Haskell. But what
bothers me about them is that it's not clear how to make
non-exhaustiveness and redundant-match tests integrate properly with them.

It's a tricky problem; obviously in the limit of views doing arbitrary
computation it's undecidable, but I wonder if there are practical
hacks that will preserve coverage-test-ability without giving up /too/
much expressiveness....

--
Neel Krishnaswami
ne...@cs.cmu.edu

Joachim Durchholz

Dec 8, 2003, 8:49:55 PM
Darius wrote:

> Joachim Durchholz <joachim....@web.de> wrote:
>
>>That's exactly my point: that "elsewhere" would be in a function call
>>(or, in Erlang, an asynchronous message send), where you don't "see"
>>the overhead just as with a pattern match.
>
> But, it's obvious when you call a function that it could take
> arbitrarily long.

That's just as (non-)obvious as the overhead for a view.
Seriously, I see no real difference, other than programmer expectations.

> With views it's not obvious whether a pattern match
> will be a cheap fixed cost or an arbitrarily large cost and it looks
> like something that's cheap.

Same with functions: you don't see whether it's a cheap or an expensive one.

>>>The problem may be largely sociological; namely, that Haskellers
>>>have grown accustom to pattern matching having a relatively trivial
>>>cost
>>
>>Agreed.
>>
>>
>>>and programmers in general tend to identify short language
>>>constructs with simple code.
>>
>>Pattern matching is actually a longer, syntactically and semantically
>>more complex construct than a function call. I think it's more being
>>accustomed to a certain behaviour than a matter of direct perception.
>
> That's why I threw in "language construct". Function -application- is a
> language construct and cheap syntactically and semantically in Haskell.
> The actual function called is not a language construct that it takes an
> unknown amount of time is obvious.

If you want to split hairs, I can do likewise: the pattern match itself
would be just as simple, it's the evaluation of the selecting
expressions that takes time.
No, this is not really the issue.

Reordering pattern matches may have serious impact on performance with
views (it pays off to move the expensive views towards the end of a
pattern match).
Or, a standard pattern match is O(1) and a view-based pattern match
could be expected to be O(N), where N is the number of patterns. I agree
that's a relevant difference.
However, I'm quite confident that this isn't serious. If the
transformation rule for a view is "split the view into a pattern and a
guard expression", efficiency will be the same as with just patterns.
And having views would allow far better handling of user-defined types.
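
Concretely, the desugaring I have in mind could look like this,
reusing the hypothetical Stack operations (isEmpty, pop, push) from
earlier in the thread:

  -- view form:
  --   fOfTop f (a ::: as) = push (f a) as
  -- desugared into pattern + guard:
  fOfTop f stk | not (isEmpty stk) = push (f a) as
    where (a, as) = pop stk    -- only evaluated if the guard succeeds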

> Anyways, the paper "Pattern Guards and Transformational Patterns" goes
> over these issues and others and makes a compelling case for
> pattern guards and transformational patterns (the latter isn't
> implemented; in GHC at least). I agree with most of their analysis and
> conclusions.

The paper is available online at
http://www.elsevier.nl/gej-ng/31/29/23/76/33/38/41.1.012.pdf (241 K).

Regards,
Jo

Joachim Durchholz

Dec 8, 2003, 8:54:45 PM
Neelakantan Krishnaswami wrote:

> But what bothers me about [views] is that it's not clear how to make
> non-exhaustiveness and redundant match tests integrate properly with
> them.

There are two options (I think one could have both at the same time, but
I'm just thinking aloud right now):

1) Annotate the views so that the compiler knows which are exhaustive.
(Since exhaustiveness is undecidable, some amount of declaration is
necessary. Unless you're prepared to add a general programmer-assisted
theorem prover to the language, that is.)
2) Let programmers add a _ pattern whenever the compiler can't infer
exhaustiveness, just as before.

Regards,
Jo

Daniel C. Wang

Dec 8, 2003, 10:31:09 PM
I have probably plugged this paper to death in the past, but it really
does address most if not all of the issues raised:

http://www.cs.princeton.edu/~danwang/drafts/recursion-schemes.pdf

I think it offers a very elegant solution, without any language
extensions. It also provides a nice semantics for views in an eager
language, as well as polytypic programming operators to boot. I really
need to revise it and resubmit it.

Any feedback on how to improve the presentation for the readers of this
group would be appreciated.

(Note: the email address above is just somewhere for spam to go.
See the webpage for my real address.)

Joachim Durchholz

Dec 9, 2003, 7:02:41 AM
Darius wrote:
>
> Anyways, the paper "Pattern Guards and Transformational Patterns" goes
> over these issues and others and makes a compelling case for
> pattern guards and transformational patterns (the latter isn't
> implemented; in GHC at least). I agree with most of their analysis and
> conclusions.

As the authors say: neither do pattern guards cover views, nor do views
cover pattern guards. In other words, they address different issues than
those addressed by views.
Their critique of views seems to address a particular way of doing
views. Having to define extra data types just for getting views into the
language is indeed clunky. This doesn't mean it's impossible to have a
good idea of views (but it's entirely possible that a good view
mechanism cannot be properly integrated /into Haskell/).

When reading this article, I had another idea: pattern guards are
pushing towards integrating pattern matching more into normal
evaluation. How about making pattern matching a part of the normal
evaluation machinery?
Patterns would simply be normal expressions; they'd augment the
environment with bindings instead of (or in addition to) returning a value.
There are many issues that would have to be addressed, syntactically and
semantically, and I'm unsure whether the result would bear any
resemblance to Haskell. However, I'm pretty sure that I'm not the
first one to have this idea; does anybody know of references to
literature or a programming language that does this?
(Actually, I know a programming language that does it: Lisp. It does so
by introducing association lists into the language, and it would be easy
to give access to association lists of functions and closures if that
isn't already the case. However, I'm more after a language that gives
pattern matching first-class status while keeping it a separate thing.)

Regards,
Jo

Darius

Dec 9, 2003, 7:48:32 AM
On Tue, 09 Dec 2003 13:02:41 +0100
Joachim Durchholz <joachim....@web.de> wrote:

> Darius wrote:
> >
> > Anyways, the paper "Pattern Guards and Transformational Patterns"
> > goes over these issues and others and makes a compelling case for
> > pattern guards and transformational patterns (the latter isn't
> > implemented; in GHC at least). I agree with most of their analysis
> > and conclusions.
>
> As they authors say: neither do pattern guards cover views, nor do
> views cover pattern guards. In other words, they address different
> issues than those addressed by views.

True, but, especially with transformational patterns, there's
significant overlap. I wouldn't want to see both transformational
patterns and views in the same language, because given one there isn't
much reason to add the other.

> Their critique of views seems to address a particular way of doing
> views. Having to define extra data types just for getting views into
> the language is indeed clunky.

Yes. Transformational patterns look like they do a decent job of being
"lightweight" views. E.g.
view List a of Stack a = Nil | a ::: Stack a
  where list stk | isEmpty stk = Nil
                 | otherwise   = let (a, as) = pop stk in a ::: as

fOfTop f (a:::as) = push (f a) as
and with transformational patterns,
fOfTop f (a,as)!pop = push (f a) as

I haven't seen a views proposal that doesn't require declaring an
auxiliary "datatype", and I'm not sure how that would work.

> This doesn't mean it's impossible to
> have a good idea of views (but it's entirely possible that a good view
> mechanism cannot be properly integrated /into Haskell/).

I'm not sure what you mean by "having a good idea of views".

> When reading this article, I had another idea: pattern guards are
> pushing towards integrating pattern matching more into normal
> evaluation. How about making pattern matching a part of the normal
> evaluation machinery?
> Patterns would simply be normal expressions; they'd augment the
> environment with bindings instead of (or in addition to) returning a
> value. There are many issues that would have to be addressed,
> syntactically and semantically, and I'm unsure whether the result
> would have any resemblance with Haskell. However, I'm pretty sure that
> I'm not the first one to have this idea; does anybody know references
> to literature or a programming language that does this?

There were some references to "first-class" pattern matching... uh,
things, in the above paper's bibliography.

> (Actually I know a programming language that does it: Lisp. It does so
> by introducing association lists into the language, and it would be
> easy to give access to association lists of functions and closures if
> it isn't already the case. However, I'm more after a language that
> gives pattern matching first-class status while keeping it a
> separate construct.)

? Current Lisps?

Joachim Durchholz

unread,
Dec 9, 2003, 4:40:42 PM12/9/03
to
Darius wrote:

> Joachim Durchholz <joachim....@web.de> wrote:
>
>>As the authors say: neither do pattern guards cover views, nor do
>>views cover pattern guards. In other words, they address different
>>issues than those addressed by views.
>
> True, but, especially with transformational patterns, there's
> significant overlap.

Agreed.
I was thinking more about pattern guards.

>>Their critique of views seems to address a particular way of doing
>>views. Having to define extra data types just for getting views into
>>the language is indeed clunky.
>
> Yes. Transformational patterns look like they do a decent job of being
> "lightweight" views. E.g.
> view List a of Stack a = Nil | a ::: Stack a
>   where list stk | isEmpty stk = Nil
>                  | otherwise   = let (a,as) = pop stk in a:::as
>
> fOfTop f (a:::as) = push (f a) as
> and with transformational patterns,
> fOfTop f (a,as)!pop = push (f a) as
>
> I haven't seen a views proposal that doesn't require declaring an
> auxiliary "datatype", and I'm not sure how that would work.

Make the views part of the data type's definition.

The main issue is how to give the compiler enough
information to infer which sets of views cover all cases.
For example, the programmer might have to specify "view groups"
explicitly (i.e. groups of "pseudo constructors", with a complete
mapping of all existing constructors into cases for the view's pseudo
constructors).
(I'm typing along as I'm thinking, so I'm pretty sure there are lots of
holes in this reasoning.)
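
To make this a bit more concrete, here is a rough Haskell-98 emulation
of such a "view group" (a sketch only - all names are made up, and a
list stands in for the abstract sequence type): the pseudo constructors
become constructors of an ordinary datatype, and totality of the viewing
function is what yields complete coverage.

-- toy stand-in for an abstract sequence type
newtype Seq a = Seq [a]

isEmpty :: Seq a -> Bool
isEmpty (Seq xs) = null xs

front :: Seq a -> a
front (Seq (x:_)) = x

rest :: Seq a -> Seq a
rest (Seq (_:xs)) = Seq xs

-- the "view group": pseudo constructors as an ordinary datatype,
-- with a total viewing function
data SeqView a = Nil' | Cons' a (Seq a)

seqView :: Seq a -> SeqView a
seqView s | isEmpty s = Nil'
          | otherwise = Cons' (front s) (rest s)

-- clients match on the view type; the compiler's exhaustiveness check
-- on SeqView plays the role of the explicit "complete mapping"
headOrDefault :: a -> Seq a -> a
headOrDefault d s = case seqView s of
                      Nil'      -> d
                      Cons' x _ -> x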

>>This doesn't mean it's impossible to
>>have a good idea of views (but it's entirely possible that a good view
>>mechanism cannot be properly integrated /into Haskell/).
>
> I'm not sure what you mean by "having a good idea of views".

"Having a good idea of how to design a good view mechanism for a language".

Regards,
Jo

Robert Will

unread,
Dec 10, 2003, 7:23:29 AM12/10/03
to
Joachim Durchholz <joachim....@web.de> wrote ...

>
> Unfortunately, there doesn't seem to be an online version of any book/paper.
> [on B or Z]

At least to give you an idea:
http://spivey.oriel.ox.ac.uk/~mike/zrm/

I hope they will really open-source the B tools soon...

> > "A practical theory of Programming", whose second edition is
> > to appear soon.

He said that he wants the 2nd ed to be much cheaper than the first.
(Btw: German translation next year, which means 2005, since by "this
year" I already mean 2004...)

> Well, tables aren't best described as functions either, so it's OK if
> you don't follow modern mathematics here :-)

Functions are really not that bad. Anyway, I want to allow both views
as far as possible.

> Or, in yet other words: Eiffel has some extremely nice ideas, but it
> also has some very serious blunders.

It has contracts and that compensates for a lot of weaknesses.
(From my post to the other thread, you know that I don't expect so
much from a language.)

Bob.

Robert Will

unread,
Dec 10, 2003, 7:28:57 AM12/10/03
to
Alaric B Snell <ala...@alaric-snell.com> wrote
>
> What's always annoyed me with Haskell pattern matching is that the
> fundamental matching operators are defined for inbuilt types like lists
> and so on - like (x:xs) - and (last time I looked; stuff may have
> changed since) you can't define your own for abstract data types.

try www.haskell.org/development/views.html
or the approach I took in Dessy with 'split'.
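
Roughly, the idea looks like this (a simplified sketch - the names and
types here are illustrative, not Dessy's actual interface; a list stands
in for the sequence to keep the example self-contained):

newtype Seq a = Seq [a]   -- stand-in; the real ADT hides a balanced tree

data Split a = Empty | Split a (Seq a)

split :: Seq a -> Split a
split (Seq [])     = Empty
split (Seq (x:xs)) = Split x (Seq xs)

-- pattern matching on the abstract type goes through 'split':
mapSeq :: (a -> b) -> Seq a -> Seq b
mapSeq f s = case split s of
               Empty      -> Seq []
               Split x xs -> let Seq ys = mapSeq f xs
                             in Seq (f x : ys)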

Joachim Durchholz

unread,
Dec 10, 2003, 8:06:29 AM12/10/03
to
Daniel C. Wang wrote:
> I probably have plugged this paper to death in the past, but it
> really does address most if not all of the issues raised.
>
> http://www.cs.princeton.edu/~danwang/drafts/recursion-schemes.pdf

Actually I didn't know about that paper, so the plug actually had an
effect :-)

> I think it offers a very elegant solution, without any language
> extensions. It also provides for a nice semantics for views in an
> eager language, as well as polytypic programming operators to boot. I
> really need to revise it and resubmit it.
>
> Any feedback on how to improve the presentation for the readers of
> this group would be appreciated.

My problem is: there seems to be so much related work that I'm having
trouble judging relative merits. (I'm not in academia, so I simply don't
know enough about type theory, refinement types, etc...)
Oh, by the way: what's a refinement type?

Regards,
Jo

Alaric B Snell

unread,
Dec 10, 2003, 8:16:23 AM12/10/03
to
ke...@ii.uib.no wrote:
> I don't quite follow your sketch, but if you need to "pattern match"
> on the "outside" of the ADT, isn't it natural to provide the means to
> do so as part of the ADT interface? So you could have e.g.

Yep, the idea was that at the same time as expressing a function like
cons (that acts like a constructor of an algebraic type, but actually
invokes Bob's Balanced Tree Algorithm to update the underlying set), you
would also express some code that does pattern matching on sets to
'undo' the cons, the same way you can use an actual algebraic type
constructor as a DE-constructor.

> module Sequence (cons,nil,...) where ...
>
> module Main ...
> :
> map f s | s == cons x xs = cons (f x) (map f xs)
>         | s == nil       = nil
>
> Not quite pattern matching, but fairly close, and makes it explicit
> that there may be larger computations involved.

I'm not sure what you're aiming at there...

>>Awooga! The values may contain duplicates (unlike the keys) so the
>>values should be a Bag.
>
> Depends what you want, doesn't it? If you just want to know which
> values are stored, it's a set. If you want individual counts, it's a
> bag.

This is true. What was concerning me was an intuitive idea that the
sizes of the keys and values 'collections' should be the same since they
are meant to 'match up' in some way :-) But thought of as domain and
range, indeed, there is no need for them to be the same size.

Then again, perhaps that is better handled by a Map type that is even
more general - both keys and values can be non-unique. Then subtypes for
surjective, injective, and bijective versions.

>
> -kzm
>

ABS

Daniel C. Wang

unread,
Dec 10, 2003, 10:08:45 AM12/10/03
to
Joachim Durchholz wrote:
{stuff deleted}

>
> My problem is: there seems to be so much related work that I'm having
> trouble judging relative merits. (I'm not in academia, so I simply don't
> know enough about type theory, refinement types, etc...)
> Oh, by the way: what's a refinement type?

Yeah, originally I was very space-limited and tried to squeeze too
much into it. I need to either expand it a lot and present a big
detailed example, or shrink it to a small subset.

http://www.cs.princeton.edu/~danwang/lam.sml

Has a full lambda-calculus compiler written using the techniques
described, so you can see how they all fit together. There is a lot more
code reuse in the example above than you would normally find, while
still using types to enforce various invariants. Also, if you look
carefully, I don't think I ever explicitly recurse on the
data structures at all. Almost everything is defined in terms of fold and
unfold operators. So I also have an informal termination proof for the
entire system.

Darius

unread,
Dec 10, 2003, 10:52:11 AM12/10/03
to
On Wed, 10 Dec 2003 14:06:29 +0100
Joachim Durchholz <joachim....@web.de> wrote:

> Daniel C. Wang wrote:
> > I probably have plugged this paper to death in the past, but it
> > really does address most if not all of the issues raised.
> >
> > http://www.cs.princeton.edu/~danwang/drafts/recursion-schemes.pdf
>
> Actually I didn't know about that paper, so the plug actually had an
> effect :-)

I did, but I had forgotten about it. ;)



> > I think it offers a very elegant solution, without any language
> > extensions. It also provides for a nice semantics for views in an
> > eager language, as well as polytypic programming operators to boot. I
> > really need to revise it and resubmit it.
> >
> > Any feedback on how to improve the presentation for the readers of
> > this group would be appreciated.
>
> My problem is: there seems to be so much related work that I'm having
> trouble judging relative merits. (I'm not in academia, so I simply
> don't know enough about type theory, refinement types, etc...)
> Oh, by the way: what's a refinement type?

I'm not exactly in academia either, let alone working on this in
particular, but I think I have a handle on the basics.

I'll use a similar but simpler example: we have a functional
language and we want letrecs, but it's a strict language and we don't
want to have to deal with circular structures, so we want to restrict
the rhs expressions to be functional values. Let's say the following is
the datatype representing the AST for the language:
data Term
  = Var    Identifier
  | App    Term Term
  | Lambda [Identifier] Term
  | If     Term Term Term
  | Let    [(Identifier,Term)] Term
  | Letrec [(Identifier,Term)] Term

We want to allow
letrec f = \n -> if (n == 0) then 1 else n*f (n-1) in f 5
but not,
letrec ones = 1 : ones in g ones
whether typed in or produced by program transformations. One thing we
could do is add an isFunctionalTerm check to make sure that this holds;
it would look like this:
isFunctionalTerm (Var _)      = False
isFunctionalTerm (App _ _)    = False
isFunctionalTerm (Lambda _ _) = True
isFunctionalTerm (If _ t e)   = isFunctionalTerm t && isFunctionalTerm e
isFunctionalTerm (Let _ e)    = isFunctionalTerm e
isFunctionalTerm (Letrec bs e)
  = all (isFunctionalTerm . snd) bs && isFunctionalTerm e

and in the midst of transformations we'd have guards like,
transform (Letrec bindings body)
  | all (isFunctionalTerm . snd) bindings = ...
  | otherwise = error "we did something wrong"

But where's the bondage? Where's the discipline? Suffice it to say, no
self-respecting static typer would be happy with this. What we could do
today is the following:
data Term = ... | Letrec [(Identifier,FunctionalTerm)] Term

data FunctionalTerm
  = FLambda [Identifier] Term
  | FIf     Term FunctionalTerm FunctionalTerm
  | FLet    [(Identifier,Term)] FunctionalTerm
  | FLetrec [(Identifier,FunctionalTerm)] FunctionalTerm

Now it's right by construction and mistakes will be caught statically.
Of course, we now need two freeVars functions, two pretty-print
functions, etc., even though they will be exactly the same. And of
course, if we later decide we want to allow circular values, we have to
go through and change everything back to the old constructors. With the
earlier solution we could simply change isFunctionalTerm to const True
as a stopgap measure and then later remove the checks from the code.

With (some kinds of) refinement types you can avoid the replication.
You'd have something like,
data Term = ... | Letrec [(Identifier,FunctionalTerm)] Term

datasort FunctionalTerm
  = Lambda [Identifier] Term
  | If     Term FunctionalTerm FunctionalTerm
  | Let    [(Identifier,Term)] FunctionalTerm
  | Letrec [(Identifier,FunctionalTerm)] FunctionalTerm

The datasort declaration only puts restrictions on how you can use the
Term datatype when the expected type is FunctionalTerm, but otherwise
all Term code works on FunctionalTerms. If we change our minds, then we
only have to change the types. You can kind of see how this
(loosely) relates to views in that we can have multiple datasort
declarations that restrict the type in different ways.

Joachim Durchholz

unread,
Dec 11, 2003, 2:35:32 AM12/11/03
to
Robert Will wrote:

> Joachim Durchholz <joachim....@web.de> wrote ...
>
>>Unfortunately, there doesn't seem to be an online version of any book/paper.
>>[on B or Z]
>
> At least to give you an idea:
> http://spivey.oriel.ox.ac.uk/~mike/zrm/

Thanks, got it.

>>Or, in yet other words: Eiffel has some extremely nice ideas, but it
>>also has some very serious blunders.
>
> It has contracts and that compensates for a lot of weaknesses.

I may be over-critical here, but an inconsistent type system is bad
enough. Given that you're also pretty much left out in the cold when it
comes to living with these inconsistencies (since Eiffel isn't supposed
to have type inconsistencies), I'd flatly give a "disqualified" mark.
(The alternatives aren't better though, at least when it comes to
mainstream OO languages with a static type system.)

Regards,
Jo

Joachim Durchholz

unread,
Dec 11, 2003, 2:47:10 AM12/11/03
to
Alaric B Snell wrote:
>>> Awooga! The values may contain duplicates (unlike the keys) so the
>>> values should be a Bag.
>>
>> Depends what you want, doesn't it? If you just want to know which
>> values are stored, it's a set. If you want individual counts, it's a
>> bag.
>
> This is true. What was concerning me was an intuitive idea that the
> sizes of the keys and values 'collections' should be the same since they
> are meant to 'match up' in some way :-) But thought of as domain and
> range, indeed, there is no need for them to be the same size.
>
> Then again, perhaps that is better handled by a Map type that is even
> more general - both keys and values can be non-unique. Then subtypes for
> surjective, injective, and bijective versions.

Trying to be symmetrical is a good design guideline, but it's not an end
in itself. So if I were to judge such an approach, my first question
would be: what's to be gained by this form of symmetry, under the
constraint that key-to-value lookups must remain efficient and convenient?
Personally, I think there is a place for a data structure like the one
you propose, but it's different from Map. If the programmer knows
there's at most one value for each key, he'll want to use Map (and use
pattern matching to distinguish between "found" and "not found"), if he
doesn't know how many values are associated with a key, he expects to be
returned a list and is prepared to iterate over it.
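
To spell out the two contracts - a sketch over plain association lists,
just to fix the types; real implementations would differ:

-- Map-style lookup: at most one value per key
lookupOne :: Eq k => k -> [(k,v)] -> Maybe v
lookupOne k kvs = case lookupAll k kvs of
                    (v:_) -> Just v
                    []    -> Nothing

-- multimap-style lookup: any number of values per key
lookupAll :: Eq k => k -> [(k,v)] -> [v]
lookupAll k kvs = [ v | (k',v) <- kvs, k' == k ]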

Regards,
Jo

Andreas Rossberg

unread,
Dec 11, 2003, 5:48:39 AM12/11/03
to
Darius wrote:
>
> We want to allow
> letrec f = \n -> if (n == 0) then 1 else n*f (n-1) in f 5
> but not,
> letrec ones = 1 : ones in g ones
> whether typed in or produced by program transformations. One thing we
> could do is add an isFunctionalTerm check to make sure that this holds;
> it would look like this:
> isFunctionalTerm (Var _)      = False
> isFunctionalTerm (App _ _)    = False
> isFunctionalTerm (Lambda _ _) = True
> isFunctionalTerm (If _ t e)   = isFunctionalTerm t && isFunctionalTerm e
> isFunctionalTerm (Let _ e)    = isFunctionalTerm e
> isFunctionalTerm (Letrec bs e)
>   = all (isFunctionalTerm . snd) bs && isFunctionalTerm e

Just an aside: I don't think the cases for If and Let are that simple.
Consider:

letrec f = (let ones = 1 : f() in \().ones) in f()

or

letrec b = (if b() then \().not b() else \().b()) in ...

On the other hand, you can simplify the Letrec case - its bindings' RHSs
should be functional by induction.

- Andreas

--
Andreas Rossberg, ross...@ps.uni-sb.de

"Computer games don't affect kids; I mean if Pac Man affected us
as kids, we would all be running around in darkened rooms, munching
magic pills, and listening to repetitive electronic music."
- Kristian Wilson, Nintendo Inc.

Darius

unread,
Dec 11, 2003, 12:43:50 PM12/11/03
to
On Thu, 11 Dec 2003 11:48:39 +0100
Andreas Rossberg <ross...@ps.uni-sb.de> wrote:

> Darius wrote:
> >
> > We want to allow
> > letrec f = \n -> if (n == 0) then 1 else n*f (n-1) in f 5
> > but not,
> > letrec ones = 1 : ones in g ones
> > whether typed in or produced by program transformations. One thing we
> > could do is add an isFunctionalTerm check to make sure that this
> > holds; it would look like this:
> > isFunctionalTerm (Var _)      = False
> > isFunctionalTerm (App _ _)    = False
> > isFunctionalTerm (Lambda _ _) = True
> > isFunctionalTerm (If _ t e)   = isFunctionalTerm t && isFunctionalTerm e
> > isFunctionalTerm (Let _ e)    = isFunctionalTerm e
> > isFunctionalTerm (Letrec bs e)
> >   = all (isFunctionalTerm . snd) bs && isFunctionalTerm e
>
> Just an aside: I don't think the cases for If and Let are that simple.
> Consider:
>
> letrec f = (let ones = 1 : f() in \().ones) in f()
>
> or
>
> letrec b = (if b() then \().not b() else \().b()) in ...

I'd thought about this a little bit, but decided the idea was clear
enough. I didn't want something completely trivial like

isFunctionalTerm (Lambda _ _) = True
isFunctionalTerm _            = False

Anyways, I decided not to be too picky about it, because being precise
would have led to the kind of qualifying paragraphs that I wrote in
reply to this post and then decided to erase.

> On the other hand, you can simplify the Letrec case - its bindings'
> RHSs should be functional by induction.

I had it the simpler way originally, then I changed it, then I changed
it back. Apparently gremlins then changed it back again... or maybe I
did to be symmetrical with the latter part.

Joachim Durchholz

unread,
Dec 11, 2003, 3:55:26 PM12/11/03
to
I hate to admit to not understanding a post which shows some effort,
but, to be honest: no, I didn't understand the points.
Well, the arguments were clear enough, but I'm entirely unsure where on
the spectrum of offered solutions one would say "this is what
constitutes a refinement type".

So, maybe, somebody can come up with an explanation of refinement types that
doesn't work by example, yet doesn't use too much type-theoretic
terminology. Something analogous to defining a parametric type as "a
type with a parameter that can be substituted into the type's
definition; usually that parameter will be a type as well".

Once I have that definition, I will (hopefully) be able to put the
various options of the previous post into a context...

Regards,
Jo

Daniel C. Wang

unread,
Dec 11, 2003, 6:03:58 PM12/11/03
to

A refinement type is nothing more than a fancy name for a "subtype" of a
type. The fancy bits come in how it interacts with the rest of the
system, in particular type inference and how functions are typed.

E.g., given the natural numbers: the even numbers and odd numbers are
refinements of the natural numbers.

Refinement types are typically defined inductively, e.g.:

datatype nat = Zero | Succ of nat

(* refinements of nat *)
datasort even <: nat = Zero | Succ of odd
and      odd  <: nat = Succ of even

fun add (Zero, m)   = m
  | add (Succ n, m) = add (n, Succ m)
withtype (odd  * odd)  -> even
and      (even * even) -> even
and      (even * odd)  -> odd
and      (odd  * even) -> odd
and      (nat  * nat)  -> nat

The refinement type system is smart enough to infer/check that those
declarations are used sensibly. In particular it can verify that the
function add has all those five different type signatures.
Depending on the context of its use, it infers the right signature to use.

Gory details and more examples at
http://www-2.cs.cmu.edu/afs/cs/user/rowan/www/sorts.html
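
For contrast, plain Haskell (without datasorts) can only approximate
this with separate types, at the cost of exactly the duplication
discussed earlier in the thread - a sketch, with Int standing in for
nat:

newtype Even = Even Int
newtype Odd  = Odd  Int

addEE :: Even -> Even -> Even
addEE (Even m) (Even n) = Even (m + n)

addEO :: Even -> Odd -> Odd
addEO (Even m) (Odd n) = Odd (m + n)

addOO :: Odd -> Odd -> Even
addOO (Odd m) (Odd n) = Even (m + n)
-- ...and so on: one copy of the same body per sort combination,
-- which is what the refinement checker gives us for free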

Joachim Durchholz

unread,
Dec 12, 2003, 9:55:44 AM12/12/03
to
Thanks - that was an *extremely* useful answer!

I still have to check out the details (including the gory ones), but now
I know what to look for.

Regards,
Jo

Albert Lai

unread,
Dec 14, 2003, 4:07:54 PM12/14/03
to
Alaric B Snell <ala...@alaric-snell.com> writes:

> Then again, perhaps that is better handled by a Map type that is even
> more general - both keys and values can be non-unique. Then subtypes
> for surjective, injective, and bijective versions.

At that point you no longer call it Map; you call it Rel.
A relation ADT will allow me to implement directed graphs nicely.
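
A sketch of that use (illustrative only - a list of pairs models the
relation here, to keep the example self-contained):

-- a directed graph as a binary relation on its vertices
type Rel a = [(a,a)]

successors :: Eq a => a -> Rel a -> [a]
successors v r = [ y | (x,y) <- r, x == v ]

predecessors :: Eq a => a -> Rel a -> [a]
predecessors v r = [ x | (x,y) <- r, y == v ]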

Albert Lai

unread,
Dec 14, 2003, 6:21:11 PM12/14/03
to
Joachim Durchholz <joachim....@web.de> writes:

> On your treatment of Maps: they don't just share properties with Sets,
> they /are/ sets (in a "is a proper subtype" sense), namely (please
> forgive my rusty Haskell)
> Map a b = Set (Pair a b)
> (the set of pairs from a domain and a codomain, aka the set of
> key-value pairs). This implies multi-parameter type classes.

Map "is a subtype" of set, in the exact same sense that pair "is a
subtype" of set:

Pair a b = Set (Either a (Set (Either a b)))
mkpair :: a -> b -> Pair a b
mkpair x y = {left x, right {left x, right y}}

These "is-a" relations simply give one way of implementing maps and
pairs in a seldom-used programming language called ZF, and they are
not very helpful in conveying and understanding what maps and pairs
are about. If maps and pairs are thought of as abstract data types,
an abstract specification is called for, and chances are it is more
enlightening, e.g.,

(x,y)=(a,b) iff x=a and y=b

is infinitely more readable, useful, and to the point than any
set-theoretic encoding.

Furthermore, sets are not always more primitive, and maps are not
always derived from sets in implementations. In PalmOS, maps or
relations are primitive, and if you want a set, you ought to derive
one from relations.

A specification for maps, whether in the form of a bunch of axioms
concerning lookups, or as an equation with a certain set, must be
judged on its merit as a specification (is it understandable, does it
help reasoning about programs using it), not on how faithfully it
copies from the foundations of mathematics. No one specifies the
integer data type as equivalence classes of pairs of natural numbers
under the relation (m,n)~(j,k) iff m+k=n+j. (On the other hand, when
specifying stacks, I still find it easier to encourage the audience to
imagine consing and tailing a cons list.)

The reason explained below may or may not be a good reason to say maps
are special sets; I don't know, because I don't understand it:

> Here's why I advocate this model:
> The alternative model that one would come up with was used in the
> abstract Eiffel class TABLE (equivalent to your Map): a set of "keys",
> with additional functions for mapping these keys to associated values.
> This approach, while intuitive, has very unfortunate consequences for
> equality.

What are the unfortunate consequences for equality?

Are they general shortcomings of all abstract axiomatizations of
lookups, or just a failing of the particular Eiffel specification?

Joachim Durchholz

unread,
Dec 15, 2003, 2:56:41 AM12/15/03
to
Albert Lai wrote:
> Joachim Durchholz <joachim....@web.de> writes:
>
>>On your treatment of Maps: they don't just share properties with Sets,
>>they /are/ sets (in a "is a proper subtype" sense), namely (please
>>forgive my rusty Haskell)
>> Map a b = Set (Pair a b)
>>(the set of pairs from a domain and a codomain, aka the set of
>>key-value pairs). This implies multi-parameter type classes.
>
> Map "is a subtype" of set, in the exact same sense that pair "is a
> subtype" of set:

No, not at all.
The properties of a set don't carry over to pairs: it doesn't make sense
to ask whether some value foo is in a pair (well it does, but set
membership will not give you the right answer).
Set properties do carry over to a map: it does make sense to ask whether
a given key-value pair is in a map.

I'm not sure what model theory has to say on the issue, but I suspect
there is some difference here as well. (Anybody with a firm background
on that topic is welcome to comment.)

> If maps and pairs are thought of as abstract data types,
> an abstract specification is called for, and chances are it is more
> enlightening, e.g.,
>
> (x,y)=(a,b) iff x=a and y=b
>
> is infinitely more readable, useful, and to the point than any
> set-theoretic encoding.

Right.
However, I didn't mean to encode maps as sets in a set-theoretic way; I
found this quite a natural interpretation.
The base question to ask when establishing this kind of relationship
between types is: does it provide for some interface that I want to have?
In the case of maps and sets, it turns out yes (at least for me, YMMV):
occasionally, I want to know whether a given key-value pair is in the
map. In itself, it's not an overwhelmingly important aspect of maps, but
it's relevant, and it does become important when you want to define
equality between maps.

> Furthermore, sets are not always more primitive, and maps are not
> always derived from sets in implementations. In PalmOS, maps or
> relations are primitive, and if you want a set, you ought to derive
> one from relations.

Right. Given a sufficiently rich structure, you can derive everything
from it. "Sufficiently rich" being mere sets, which have a quite
primitive definition (the only operation defined on them is set
membership) (I know that I'm doing injustice to mathematical sets; sets
in a Turing machine are, however, indeed quite primitive since
everything is countable).
However, I think one should derive complicated structures from simple
ones: if done the other way round, the simple structures will have a
complicated interface.

> A specification for maps, whether in the form of a bunch of axioms
> concerning lookups, or as an equation with a certain set, must be
> judged on its merit as a specification (is it understandable, does it
> help reasoning about programs using it), not on how faithfully it
> copies from the foundations of mathematics.

Fully agreed.
My motive of deriving maps from sets wasn't to copy mathematical
tradition, though I have to admit I stole the idea from mathematics - I
have reviewed it for usefulness in a computing background though, and I
still consider it a good idea to do it that way. YMMV.

> The reason explained below may or may not be a good reason to say maps
> are special sets; I don't know, because I don't understand it:
>
>>Here's why I advocate this model:
>>The alternative model that one would come up with was used in the
>>abstract Eiffel class TABLE (equivalent to your Map): a set of "keys",
>>with additional functions for mapping these keys to associated values.
>>This approach, while intuitive, has very unfortunate consequences for
>>equality.
>
> What are the unfortunate consequences for equality?

If a map is a set of pairs, equality can be defined in a very simple
fashion: two maps are equal if they obey set equality.
If a map is a set of keys plus a function, you need an equality on
functions. Yuck.
Or you ignore the function, but then you end up assuming that two maps
are equal if the sets of keys are equal. Double yuck. (And if the
library designer avoids this trap, there will always be library users
who don't.)
Well, of course one can allow just special functions, and define
equalities on these. But that avenue is getting ugly and complicated
quickly.
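
The simple definition, in code (a sketch - the set of pairs is modeled
as a list, and the distinct-keys invariant is assumed rather than
enforced):

import Data.List (sort)

newtype MapSP k v = MapSP [(k,v)]   -- invariant: keys pairwise distinct

-- two maps are equal iff they are equal as sets of key-value pairs
instance (Ord k, Ord v) => Eq (MapSP k v) where
  MapSP xs == MapSP ys = sort xs == sort ys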

In Eiffel (or, rather, in that specific container library), equality was
defined just on the keys. Later, the library designers found that
equalities on values and on key-value pairs were also warranted, and
added special functions. Worse, I found bugs in client code where one
form of equality was assumed but another one was called (those bugs
never manifested themselves because the set of keys was fixed, but a
maintenance change here might have activated that bug any time).

> Are they general shortcomings of all abstract axiomatizations of
> lookups, or just a failing of the particular Eiffel specification?

Not a shortcoming of axiomatization, but a shortcoming of
misinterpreting a map as a set of keys and not thinking enough about the
key-to-value mapping function.

Actually, when I think about it, I see three ways to define a map:
1. Via axioms.
2. As a set of keys plus a function that maps the keys to values.
3. As a set of key-value pairs.

I think we're talking a bit at cross-purposes here: I was advocating 3
over 2 (and not thinking much about 1), while you were advocating 1 over
3 (and maybe not taking 2 into consideration). Apples and oranges :-)

When it comes to judging the axiomatic approach: it's nice, but it has
the undesirable property that most programmers will not understand a set
of axioms or their ramifications. It works well for simple data
structures like pairs and tuples, but I don't think that it scales.

Defining structures by composing them from more primitive constructs is
easier to understand. I assume that library designers know enough about
the axioms that they can use them to double-check their definitions, and
to distinguish between essential and ephemeral parts of a specification.
And that they are smart enough to push the essential part into a more
abstract type, so that others have a chance to build a different type
that can stand in for the original type (the latter is mostly wishful
thinking though, even the most insightful library designers tend to
overlook the full abstraction potential until the library has undergone
at least three revisions).

Regards,
Jo

Tomasz Zielonka

unread,
Dec 15, 2003, 4:55:52 AM12/15/03
to
>> Joachim Durchholz <joachim....@web.de> writes:
>>
>>>On your treatment of Maps: they don't just share properties with Sets,
>>>they /are/ sets (in a "is a proper subtype" sense), namely (please
>>>forgive my rusty Haskell)
>>> Map a b = Set (Pair a b)

Ironically, in Haskell's most popular implementation of sets and maps
Set a = FiniteMap a ()

() is a trivial type with one element - ()
(well, there is also bottom in it).
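
Spelled out (a sketch against GHC's FiniteMap library of the time; I
believe the signatures are as below, but check the Data.FiniteMap
module):

import Data.FiniteMap (FiniteMap, emptyFM, addToFM, elemFM, keysFM)

type Set a = FiniteMap a ()

empty :: Set a
empty = emptyFM

insert :: Ord a => a -> Set a -> Set a
insert x s = addToFM s x ()   -- the () value carries no information

member :: Ord a => a -> Set a -> Bool
member = elemFM

toList :: Set a -> [a]
toList = keysFM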

> However, I think one should derive complicated structures from simple
> ones: if done the other way round, the simple structures will have a
> complicated interface.

It doesn't have to - you can hide these details.

> If a map is a set of pairs, equality can be defined in a very simple
> fashion: two maps are equal if they obey set equality.

Is it going to be a multimap or you are adding an invariant that the map
doesn't contain two pairs with equal first elements?

Well, you can't really represent a multimap as a set of pairs, because
that won't allow duplicated key-value pairs.

I think that in practice it's easier to define Sets using Maps. You
immediately get all desired properties. If you want to define maps
using sets of pairs you have to do something with duplicated keys.

> If a map is a set of keys plus a function, you need an equality on
> functions. Yuck.

Perhaps I misunderstood something, but...

Because you think that comparing these functions is difficult I
suppose you mean functions with a (possibly infinite) domain larger
than the (always finite) set of keys. What for? Why would you want to
compare these functions for arguments outside the set of keys?

Take two maps with the same keys, whose functions have the same values
for arguments from the key-set but differ elsewhere. Should these
maps be considered different?

If you limit the function's domain to the set of keys you end up
representing a map by a set and a map.

> 3. As a set of key-value pairs.

That's a broken multimap.

> Defining structures by composing them from more primitive constructs is
> easier to understand.

I know that's not what you meant, but comparisons, conditional
expressions, functions and datatypes can be more primitive than sets
from a programming-language perspective (it's reversed in maths) ;)

Best regards,
Tom

--
.signature: Too many levels of symbolic links

Tomasz Zielonka

unread,
Dec 15, 2003, 4:57:31 AM12/15/03
to
Tomasz Zielonka wrote:
> Is it going to be a multimap or you are adding an invariant that the map
^^^
set

> doesn't contain two pairs with equal first elements?

--

Joachim Durchholz

unread,
Dec 15, 2003, 6:50:14 AM12/15/03
to
Tomasz Zielonka wrote:

>>However, I think one should derive complicated structures from simple
>>ones: if done the other way round, the simple structures will have a
>>complicated interface.
>
> It doesn't have to - you can hide these details.

Agreed.
Since you're the second person to publicly misunderstand this, I
probably didn't make clear enough that I'm talking about derivation of
interfaces, i.e. subtype relationships.
Implementations can merrily derive and reuse in any direction they want.

>>If a map is a set of pairs, equality can be defined in a very simple
>>fashion: two maps are equal if they obey set equality.
>
> Is it going to be a multimap or you are adding an invariant that the map
> doesn't contain two pairs with equal first elements?
>
> Well, you can't really represent a multimap as a set of pairs, because
> that won't allow duplicated key-value pairs.
>
> I think that in practice it's easier to define Sets using Maps. You
> immediately get all desired properties. If you want to define maps
> using sets of pairs you have to do something with duplicated keys.

Hmm... well, if your definition of sets includes functions like
intersection and union, then the relationship between sets and maps
indeed becomes brittle.

>>If a map is a set of keys plus a function, you need an equality on
>>functions. Yuck.
>
> Perhaps I misunderstood something, but...
>
> Because you think that comparing these functions is difficult I
> suppose you mean functions with the (possibly infinite) domain greater
> than the (always finite) set of keys. What for? Why would you want to
> compare these functions for arguments outside the set of keys?

I didn't mean that the problem is in principle unsolvable. Since the
domain is finite, it's actually easy to define an equality.
The problem is that most languages don't give an easy way to work with
equalities over functions. I'd expect that a set-of-keys-plus-a-function
definition requires some hackery to get around the language restrictions.
The set-of-pairs construction avoids this.
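
Concretely, the finite-domain equality is just pointwise comparison over
the key set - a sketch, assuming the two key sets have already been
compared for set equality:

eqAsFunctions :: Eq b => [a] -> (a -> b) -> (a -> b) -> Bool
eqAsFunctions keys f g = all (\k -> f k == g k) keys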

> Take two maps with the same keys and functions which have the same values
> for arguments from the key-set but differing elsewhere? Should these
> maps be considered different?
>
> If you limit the function's domain to the set of keys

That seems indeed reasonable with that approach.

> you end up representing map by a set and a map.

That sounds a bit circular to me.
Not that I'd consider the basic concept flaky. It's just that if a
construction triggers such sentences, I tend to scrap the construction
on the grounds that it will be difficult to explain to the programmers
who're going to use this all. Things must be conceptually simple, do
avoid misunderstandings and misuse by application programmers - that's a
prime guideline. (I have been doing a lot of library work lately, and it
shows - assuming that application programmers are dumb is a great mental
model. It's also doing them an injustice: they aren't dumb, they are
usually occupied with understanding the customer's grandest ideas and
simply don't have the energy or time left to understand the library
designer's grand ideas.)

Regards,
Jo

Aatu Koskensilta

unread,
Dec 16, 2003, 9:22:42 AM12/16/03
to
Joachim Durchholz wrote:

> Albert Lai wrote:
>
>> Joachim Durchholz <joachim....@web.de> writes:
>>
>>> On your treatment of Maps: they don't just share properties with Sets,
>>> they /are/ sets (in a "is a proper subtype" sense), namely (please
>>> forgive my rusty Haskell)
>>> Map a b = Set (Pair a b)
>>> (the set of pairs from a domain and a codomain, aka the set of
>>> key-value pairs). This implies multi-parameter type classes.
>>
>>
>> Map "is a subtype" of set, in the exact same sense that pair "is a
>> subtype" of set:
>
>
> No, not at all.
> The properties of a set don't carry over to pairs: it doesn't make sense
> to ask whether some value foo is in a pair (well it does, but set
> membership will not give you the right answer).
> Set properties do carry over to a map: it does make sense to ask whether
> a given key-value pair is in a map.
>
> I'm not sure what model theory has to say on the issue, but I suspect
> there is some difference here as well. (Anybody with a firm background
> on that topic is welcome to comment.)

Pretty much anything can be considered a 'subtype' of set in mathematics
- in the sense that whatever operations are defined for sets are also
defined for objects of these types - due to the standard reduction of
the various mathematical concepts into set-theoretic concepts. A map is
a special set, namely a set of ordered pairs, s.t. for any x, there is
at most one (x,y) in the map. A pair is similarly a set, but the
membership relation will indeed give you "wrong" answers. The only cases
in which you have non-set entities are those which deal with 'too big'
collections, such as the class of all sets or the class of ordinal
numbers. Perhaps you're thinking of something more interesting?

--
Aatu Koskensilta (aatu.kos...@xortec.fi)

"Wovon man nicht sprechen kann, daruber muss man schweigen"
- Ludwig Wittgenstein, Tractatus Logico-Philosophicus
