How do you implement the set monad using constructor classes?
This is a good question, and one that I have been pondering myself
for some time. Indeed, if you'd visited me here any time over the
last year and a half, you might have noticed the words `Monad Set'
written in one corner of my board as a reminder of this very topic.
The short answer is that it depends on which `set monad' you are
referring to. If you're not interested in a more detailed response,
now's the time to move to the next message.
The long answer starts with the observation that we must be careful
to distinguish between conventional mathematics and the `mathematics
of computation', for want of a better phrase. Let's consider a
simpler example, before we get on to the subject of set monads:
Equality:
---------
In mathematics, if x,y are elements of a set A, then we can use
the expression x=y to represent the assertion that these two elements
are equal. It doesn't matter what `type' of values are in A, or
whether A is finite or infinite. Equality is an example of an
equivalence relation, and satisfies useful laws; for example, for
any value x, we can be sure that x=x.
In computing, we have to settle for an approximation, because
equality is not computable. We cannot, for example, test for bottom
values (i.e. non-terminating computations) or for equality of
infinite structures. Simple laws, that hold in mathematics, fail
when we restrict ourself to this computable approximation. None
of the following Haskell expressions evaluate to True:
error "foo" == error "bar"
[1..] == [1..]
(\x -> x) == (\y -> y)
Sets:
-----
Now, back to sets. In mathematics, for *any* set A, we can construct
the powerset, Set A, whose members are precisely the subsets of A.
We can define familiar operations on the elements of Set A such as
union, intersection, membership etc., and prove that they satisfy
the usual laws. In particular, we can show that the Set constructor
forms a monad with a join function given by:
join :: Set (Set a) -> Set a -- `bigunion'
join xss = { x | xs in xss, x in xs }
It is very easy to implement a powerset type constructor ... in
fact, there are a several different approaches that we can use:
data Set a = Set [a] -- lists, no repeated elems
data Set a = Empty | Node a (Set a) (Set a) -- search trees
data Set a = Empty -- set expressions
| Singleton a
| Union (Set a) (Set a)
| ...
But it is more difficult to define the set operations. For example,
to test for membership in a set of type (Set a), we need to assume
that values of type a can be tested for equality (or, in the second
case, using search trees, a total ordering). Even then, our
implementation will only be a computable approximation of the
membership relation that is used in mathematics.
How about the join function described above? For the three set
constructors above, the best we can do is to define functions with
the following types (using the notation of Haskell type classes):
Eq a => Set (Set a) -> Set a
Ord a => Set (Set a) -> Set a
Set (Set a) -> Set a
Only the third of these types is sufficiently general to be used
as a monad join ... the functional programmers definition of monads
requires a polymorphic function of type M (M a) -> M a, with no
restrictions on the choice of a.
So, unlike conventional mathematics, we do not really have a powerset
type constructor. Instead, we have several different ways to
implement approximations of the powerset constructor. The choice
of a set constructor in any particular application is motivated by
concerns about the operations that are available, but also by
assumptions about performance -- another non-issue in standard
mathematics. A question like `is the Set type constructor a monad?'
cannot be answered until we've decide *which* Set constructor we're
talking about.
Constructor classes:
--------------------
How does all this fit in with constructor classes? The first
example, equality, should make it clear that there are some strong
parallels, since that was one of the main motivations for introducing
type classes in Haskell.
The standard constructor class definition of monads looks something
like:
class Monad m where
unit :: a -> m a
join :: m (m a) -> m a
If we use the third representation for sets (i.e. set expressions),
then we can define a monad of Sets using:
instance Monad Set where
unit = Singleton
join = foldr Union Empty
But we cannot define either of the other two set constructors as
instances of Monad because the types of the corresponding join
functions are not sufficiently general. This, incidentally, is
not something new for constructor classes or type classes ... the
same thing happens if we give a type signature that is too general:
notId :: a -> a
notId xs = xs ++ xs -- a type error, lists expected ...
Would it solve the problem if member functions in classes could
include extra class constraints? The soon-to-be-released version
2.30 of Gofer does permit definitions like:
class Monad m where
unit :: a -> m a
join :: Eq a => m (m a) -> m a
Alas, this *doesn't* really do what we want! Restricting the choice
of a in the join function to equality types works ok for the first
representation of sets (lists with no repeats), but:
o In many cases, it is unnecessary. The third implementation of
sets, and many of the standard examples of monads (e.g. I/O,
state, exceptions, continuations) do not require any restrictions
on the element types.
o In some cases, it's not strong enough. For the second
implementation of sets, we need an ordering on the element
types, not just an equality.
Extra constraints in member function types are useful, for example,
to study pseudomonads (that's why I decided to implement them :-),
but they don't help here.
Maybe class constraints in the definition of datatypes would help?
Again, the soon-to-be-released version 2.30 of Gofer does permit
definitions like:
data Eq a => Set a = Set [a] -- lists, no repeated elements
data Ord a => Set a = ... -- search trees ...
Sadly, the answer is still no. Gofer follows the definition of
Haskell where the only effect of this is to add add extra class
constraints in the types of constructor functions. Thus, in the
first example, Set :: Eq a => [a] -> Set a. However, if we
instantiate a polymorphic type m (m a) -> m a with the constructor
m = Set, we just get Set (Set a) -> Set a, without the Eq
constraint, so the problem remains.
The observation that I've made several times in the past is that
it might still be possible to define each of the set constructors
above as instances of Monad if we change the semantics of the
datatype definitions above so that, for example, the definition:
data Eq a => Set a = ...
introduces a new constructor Set with kind Eq -> * (rather than
* -> *). Now, when we instantiate m (m a) -> m a with Set, we
get Eq a => Set (Set a) -> Set a as required. I believe that this
was probably the semantics that the Haskell committee may have had
in mind when they decided to allow class constraints in datatype
definitions. There are some unresolved problems that need to be
investigated before we can hope to see an implementation of this.
For example, there are some awkward interactions between kinds and
types, and some concerns that the resulting system may be a tad
too complicated for practical use ... It remains an interesting
topic for future research.
All the best,
Mark