Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Haskell-cafe] Monad Set via GADT

1 view
Skip to first unread message

Roberto Zunino

unread,
Jan 3, 2007, 6:39:42 PM1/3/07
to haskel...@haskell.org
To improve my understanding of GADT, I tried to define a Set datatype,
with the usual operations, so that it can be made a member of the
standard Monad class. Here I report on my experiments.

First, I recap the problem. Data.Set.Set can not be made a Monad because
of the Ord constraint on its parameter, while e.g. (return :: a -> m a)
allows any type inside the monad m. This problem can be solved using an
alternative monad class (restricted monad) so that the Ord context is
actually provided for the monad operations. Rather, I aimed for the
standard Monad typeclass.

To avoid reinventing Data.Set.Set, my datatype is based on that.
Basically, when we know of an Ord context, we use a Set. Otherwise, we
use a simple list representation. Using a list is just enough to allow
the implementation of return (i.e. (:[])) and (>>=) (i.e. map and (++)),
so while not being very efficient, it is simple. Other non-monadic
operators require an Ord context, so that we can turn lists into Set.

My first shot at this was:

data SetM a where
L :: [a] -> SetM a
SM :: Ord a => Set.Set a -> SetM a

However, this was not enough to convince the type checker to use the Ord
context stored in SM. Specifically, performing union with

union (SM m1) (SM m2) = SM (m1 `Set.union` m2)

causes GHC to report "No instance for (Ord a)". After some experiments,
I found the following, using a type equality witness:

data Teq a b where Teq :: Teq a a
data SetM a where
L :: [a] -> SetM a
SM :: Ord w => Teq a w -> Set.Set w -> SetM a

Now if I use

union (SM p1 m1) (SM p2 m2) =
case (p1,p2) of
(Teq,Teq) -> SM Teq (m1 `Set.union` m2)

it typechecks! This rises some questions I cannot answer:

1) Why the first version did not typececk?
2) Why the second one does?
3) If I replace (Teq a w) with (Teq w a), as in
SM :: Ord w => Teq w a -> Set.Set w -> SetM a
then union above does not typecheck! Why? I guess the type variable
unification deriving from matching Teq is not symmetric as I expect it
to be...

Below, I attach the working version. Monad and MonadPlus instances are
provided for SetM. Conversions from/to Set are also provided, requiring
an Ord context. "Efficient" return and mzero are included, forcing the
Set representation to be used, and requiring Ord (these could also be
derived from fromSet/toSet, however).

Comments are very welcome, of course, as well as non-GADT related
alternative approaches.

Regards,
Roberto Zunino.

============================================================
\begin{code}
{-# OPTIONS_GHC -Wall -fglasgow-exts #-}
module SetMonad
( SetM()
, toSet, fromSet
, union, unions
, return', mzero'
) where

import qualified Data.Set as S
import Data.List hiding (union)
import Control.Monad

-- Type equality witness
data Teq a b where Teq :: Teq a a

-- Either a list or a real Set
data SetM a where
L :: [a] -> SetM a
SM :: Ord w => Teq a w -> S.Set w -> SetM a

instance Monad SetM where
return = L . (:[])
m >>= f = case m of
L l -> unions (map f l)
SM Teq s -> unions (map f (S.toList s))

instance MonadPlus SetM where
mzero = L []
mplus = union

-- Efficient variants for Ord types
return' :: Ord a => a -> SetM a
return' = SM Teq . S.singleton

mzero' :: Ord a => SetM a
mzero' = SM Teq S.empty

-- Set union: use the best representation
union :: SetM a -> SetM a -> SetM a
union (L l1) (L l2) = L (l1 ++ l2)
union (SM p1 m1) (SM p2 m2) = case (p1,p2) of
(Teq,Teq) -> SM Teq (m1 `S.union` m2)
union (L l1) (SM p m2) = case p of
Teq -> SM Teq (m2 `S.union` S.fromList l1)
union s1 s2 = union s2 s1

-- Try to put a SM first before folding, to improve performance
unions :: [SetM a] -> SetM a
unions = let isSM (SM _ _) = True
isSM _ = False
in foldl' union (L []) . uncurry (++) . break isSM

-- Conversion from/to Set requires Ord
toSet :: Ord a => SetM a -> S.Set a
toSet (L l) = S.fromList l
toSet (SM p m) = case p of Teq -> m

fromSet :: Ord a => S.Set a -> SetM a
fromSet = SM Teq

-- Tests
test :: IO ()
test =
do let l = [1..3] :: [Int]
s = fromSet (S.fromList l)
g x = return' x `mplus` return' (x+100)
print $ S.toList $ toSet $
do x <- s
y <- s
return' (x+y)
-- [2,3,4,5,6]
print $ S.toList $ toSet $
do x <- s
g x
-- [1,2,3,101,102,103]
print $ S.toList $ toSet $
do x <- s
y <- g x
return' y
-- [1,2,3,101,102,103]
print $ S.toList $ toSet $
do x <- s
y <- g x
g y
-- [1,2,3,101,102,103,201,202,203]
print $ S.toList $ toSet $
do x <- s
y <- return (const x) -- no Ord!
return' (y ())
-- [1,2,3]
\end{code}
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Jim Apple

unread,
Jan 8, 2007, 8:52:13 PM1/8/07
to haskel...@haskell.org
On 1/3/07, Roberto Zunino <zun...@di.unipi.it> wrote:
> 1) Why the first version did not typececk?
> 2) Why the second one does?
> 3) If I replace (Teq a w) with (Teq w a), as in
> SM :: Ord w => Teq w a -> Set.Set w -> SetM a
> then union above does not typecheck! Why? I guess the type variable
> unification deriving from matching Teq is not symmetric as I expect it
> to be...

These are very interesting questions that I forgot about until
reminded by Haskell Weekly News. Thanks, HWN!

1) Class constraints can't be used on pattern matching. They ARE
restrictive on construction, however. This is arguably bug in the
Haskell standard. It is fixed in GHC HEAD for datatypes declared in
the GADT way, so as not to break H98 code:

http://article.gmane.org/gmane.comp.lang.haskell.cvs.all/29458/match=gadt+class+context

2) The second one works because Class constraints can be used when
pattern matching existentials.

3) I imagine this might have something to do with the coercions that
System FC uses. With one ordering, a coercion might occur that in
another one is unnecessary. This coercion might allow the use of Ord w
by using it before the coercion from S.Set a to S.Set w.

#3 is just a guess.

Jim

Jim Apple

unread,
Jan 8, 2007, 9:00:57 PM1/8/07
to haskel...@haskell.org
On 1/3/07, Roberto Zunino <zun...@di.unipi.it> wrote:
> I tried to define a Set datatype,
> with the usual operations, so that it can be made a member of the
> standard Monad class.

Also, we can do this with oleg's technique of "Restricted Data Types Now":

http://article.gmane.org/gmane.comp.lang.haskell.prime/483

Jim

Simon Peyton-Jones

unread,
Jan 9, 2007, 8:55:58 AM1/9/07
to Jim Apple, haskel...@haskell.org
Concerning (1) yes, this is a recent and substantial improvement in GHC (HEAD only).

Concerning (3), this is a plain bug. I've just fixed it in the HEAD. I very much doubt I'll fix it in 6.6, because type inference in the HEAD now works rather differently.

Thanks for the example; I've added it to the test suite!

Simon

Benjamin Franksen

unread,
Jan 11, 2007, 4:07:35 PM1/11/07
to haskel...@haskell.org
Jim Apple wrote:
> On 1/3/07, Roberto Zunino <zun...@di.unipi.it> wrote:
>> 1) Why the first version did not typececk?
> 1) Class constraints can't be used on pattern matching. They ARE
> restrictive on construction, however. This is arguably bug in the
> Haskell standard. It is fixed in GHC HEAD for datatypes declared in
> the GADT way, so as not to break H98 code:
>
http://article.gmane.org/gmane.comp.lang.haskell.cvs.all/29458/match=gadt+class+context

To quote from there: "I think this is stupid, but it's what H98 says."

Maybe it is time to consider it deprecated to follow the Haskell 98
standard /to the letter/. The above is an example where the default
(without flags) should (arguably) be the 'fixed' standard. We would need an
equivalent of gcc's -pedantic flag, meaning "Follow the Haskell 98 standard
to the letter, even on issues where the standard is generally considered
bad".

I hope this will be handled in a better way with Haskell'. It should be
possible to revise the standard (every few years or so, /very/
conservatively i.e. no extensions, etc) so that we can eliminate 'bugs'
from the language spec.

Cheers
Ben

Simon Peyton-Jones

unread,
Jan 12, 2007, 3:04:43 AM1/12/07
to Benjamin Franksen, haskel...@haskell.org
| > On 1/3/07, Roberto Zunino <zun...@di.unipi.it> wrote:
| >> 1) Why the first version did not typececk?
| > 1) Class constraints can't be used on pattern matching. They ARE
| > restrictive on construction, however. This is arguably bug in the
| > Haskell standard. It is fixed in GHC HEAD for datatypes declared in
| > the GADT way, so as not to break H98 code:
| >
| http://article.gmane.org/gmane.comp.lang.haskell.cvs.all/29458/match=gadt+class+context
|
| To quote from there: "I think this is stupid, but it's what H98 says."
|
| Maybe it is time to consider it deprecated to follow the Haskell 98
| standard /to the letter/.

GHC follows this strange standard when you write data type decls in H98 form

data Eq a => T a = C a Int | D

Here, pattern-matching on either C or D will cause an (Eq a) constraint to be required.

However, GHC does *not* follow this convention when you write the data type decl in GADT style syntax:

data T a where
C :: Eq a => a -> Int -> T a
D :: T a

Here, (a) you can be selective; in this case, C has the context but D does not.
And (b) GHC treats the constraints sensibly:
- (Eq a) is *required* when C is used to construct a value
- (Eq a) is *provided* when C is used in pattern matching

In short, in GADT-style syntax, GHC is free to do as it pleases, so it does the "right" thing. In this case, then, you can avoid the H98 "bug" by using GADT-style syntax.

All of this is documented in the user manual. If it's not clear, please help me improve it.

Simon

Benjamin Franksen

unread,
Jan 12, 2007, 5:57:39 AM1/12/07
to haskel...@haskell.org

Crystal clear. My remark was meant merely as a general observation.

Cheers
Ben

0 new messages