[Haskell-cafe] A question on dependent types in Haskell

43 views
Skip to first unread message

Todd Wilson

unread,
Sep 26, 2016, 1:49:22 AM9/26/16
to Haskell
I haven't yet had the pleasure of exploring the subtleties of
dependently-typed programming in Haskell, but I'm finding myself in
need of a little bit of it in advance of that exploration and was
hoping for some suggestions.

I have a type of regular expressions:

data RegExp = Empty
| Letter Char
| Plus RegExp RegExp
| Cat RegExp RegExp
| Star RegExp

and a type of (deterministic) finite state machines that is
polymorphic in the state type:

data FSM a = FSM {
states :: [a],
start :: a,
finals :: [a],
delta :: [(a,Char,a)]
}

I've fixed an alphabet sigma :: [Char], and I want to write the
function that converts a regular expression to its associated FSM. The
machines associated with Empty and Letter are given by

data EmptyFSM = Etrap
emptyFSM :: FSM EmptyFSM
emptyFSM = FSM {
states = [Etrap],
start = Etrap,
finals = [],
delta = [(Etrap, c, Etrap) | c <- sigma]
}

data LetterFSM = Lstart | Lfinal | Ltrap
letterFSM :: Char -> FSM LetterFSM
letterFSM c = FSM {
states = [Lstart, Lfinal, Ltrap],
start = Lstart,
finals = [Lfinal],
delta = [(Lstart, c', if c' == c then Lfinal else Ltrap) | c' <- sigma] ++
[(q, c', Ltrap) | q <- [Lfinal, Ltrap], c' <- sigma]
}

Suppose I can code the constructions of the union machine,
concatenation machine, and star machine so that they have the types

unionFSM :: FSM a -> FSM b -> FSM (a,b)
catFSM :: FSM a -> FSM b -> FSM (a,[b])
starFSM :: FSM a -> FSM [a]

Now what I want to do is to put all of this together into a function
that takes a regular expression and returns the associated FSM. In
effect, my function should have a dependent type like

reg2fsm :: {r : RegExp} -> FSM (f r)

where f is the function

f :: RegExp -> *
f Empty = EmptyFSM
f (Letter a) = LetterFSM
f (Plus r1 r2) = (f r1, f r2)
f (Cat r1 r2) = (f r1, [f r2])
f (Star r) = [f r]

and be given by

reg2fsm Empty = emptyFSM
reg2fsm (Letter c) = letterFSM c
reg2fsm (Plus r1 r2) = unionFSM (reg2fsm r1) (reg2fsm r2)
reg2fsm (Cat r1 r2) = catFSM (reg2fsm r1) (reg2fsm r2)
reg2fsm (Star r) = starFSM (reg2fsm r)

What is the suggested approach to achieving this in Haskell?

--
Todd Wilson A smile is not an individual
Computer Science Department product; it is a co-product.
California State University, Fresno -- Thich Nhat Hanh
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

Guillaume Bouchard

unread,
Sep 26, 2016, 4:23:28 AM9/26/16
to Todd Wilson, Haskell
Hi Todd,

Sorry, I will not answer your question, but discuss some points.

On Mon, Sep 26, 2016 at 7:48 AM, Todd Wilson <twi...@csufresno.edu> wrote:

> Suppose I can code the constructions of the union machine,
> concatenation machine, and star machine so that they have the types
>
> unionFSM :: FSM a -> FSM b -> FSM (a,b)
> catFSM :: FSM a -> FSM b -> FSM (a,[b])
> starFSM :: FSM a -> FSM [a]

What is the type of the union of an union ? For example :

union fsmA (union fsmB fsmC)

where fsmX are of type FSM x.

Are you waiting for a triple (i.e. FSM (a, b, c)) or does a recursive
tuple structures is ok ? (Such as FSM (a, (b, c))) such as your
example depict ?

> Now what I want to do is to put all of this together into a function
> that takes a regular expression and returns the associated FSM. In
> effect, my function should have a dependent type like
>
> reg2fsm :: {r : RegExp} -> FSM (f r)

As far as I understand depend type in Haskell, you cannot do that.

You need to "pack" your "typed" FSM inside a existentially qualified
generic fsm, such as :

data AnyFSM = forall t. AnyFsm (FSM t)

And then, later, "unpack" it using a case. However I think you also
need a constraint on `t` else you will not be able to recover anything
from it.

Sorry, my knowledge stops here.

> where f is the function
>
> f :: RegExp -> *
> f Empty = EmptyFSM
> f (Letter a) = LetterFSM
> f (Plus r1 r2) = (f r1, f r2)
> f (Cat r1 r2) = (f r1, [f r2])
> f (Star r) = [f r]

As far as I know, and I'll be happy to be proven wrong here, you
cannot write a function from a value to a type. However you can
"promote" your value level RegExp to a type level 'RegExp (using the
DataKind extension), and in this case, you will need to use tick in
from of you promoted kind (such as 'Empty, or 'Plus), but as far as I
know it is still impossible to write function this way, you need to
write type families, something such as :

type family f (t :: RegexExp) = * where
f 'Empty = EmptyFSM
f ('Letter a) = LetterFSM

I apologies for this partial answer, but I'm interested by your
question so I wanted to contribute at my level.

--
Guillaume.

Li-yao Xia

unread,
Sep 26, 2016, 8:50:28 AM9/26/16
to Todd Wilson, haskel...@haskell.org
Hello Todd,

This seems like a neat place to use GADTs!

Instead of trying to define f :: RegExp -> * by fully lifting RegExp to
the type level, you can use GADTs to reflect the structure of a RegExp
in its type:


data RegExp a where
-- a is the state type of the FSM associated with the regexp
Empty :: RegExp EmptyFSM
Letter :: RegExp LetterFSM
Plus :: RegExp a -> RegExp b -> RegExp (a, b)
Cat :: RegExp a -> RegExp b -> RegExp (a, [b])
Star :: RegExp a -> RegExp [a]

reg2fsm :: Eq a => RegExp a -> FSM a
-- ^ I assume you'll need an Eq constraint here to normalize
-- the "non-deterministic states".


It may be much less flexible than what you originally intended, because
the RegExp type is now tied to your FSM implementation. RegExps might
also become painful to pass around because of their too explicit type.
If that happens to be the case, you can clean up an API which uses this
regexp with an existential wrapper.


data SomeRegExp where
SomeRegExp :: Eq a => RegExp a -> SomeRegExp

data SomeFSM where
SomeFSM :: Eq a => FSM a -> SomeFSM


Another simpler solution is to only use such a wrapper around FSM (i.e.,
your output), keeping the RegExp type as you first wrote it. You'd have:


reg2fsm :: RegExp -> SomeFSM


Here you totally lose the relationship between the RegExp and the FSM,
so it seems to move away from what you were looking for. But if
ultimately you don't care about the actual type of the state of a FSM,
this gets the job done.

Regards,
Li-yao

Richard Eisenberg

unread,
Sep 27, 2016, 11:28:35 AM9/27/16
to Todd Wilson, Haskell
Hi Todd,

My `singletons` library kludges dependent types into Haskell via much dirty Template Haskell hackery. But it works in your use case. I was able to write your reg2fsm function much as you desired.

See attached.

Note that my type signature for reg2fsm (`Sing (r :: RegExp) -> FSM (F r)`) is essentially a dependent type: you can view `Sing (r :: RegExp)` to be like an argument `(r : RegExp)` in a proper dependently-typed language. Alternatively, you can pronounce `Sing` as `Π`.

Richard

Regex.hs

Todd Wilson

unread,
Sep 28, 2016, 2:41:10 AM9/28/16
to Li-yao Xia, Haskell Cafe
I actually tried the approach of indexing the RegExp type before, but
I didn't like it for the reason you indicated: I was also using RegExp
for many other purposes that have nothing to do with FSMs and I didn't
want to be forced to include the superfluous type indices in those
situations. On the other hand, for the reg2fsm function, I wanted to
use the extra type information to limit the space of solutions: the
existence of my function f :: RegExp -> * was meant to guide the
constructions that students might write.

--Todd

Todd Wilson

unread,
Sep 28, 2016, 2:50:02 AM9/28/16
to Richard Eisenberg, Haskell
Richard,

Thanks for this suggestion. I wasn't aware of the singletons library;
I'll have to take a look.

To you and the other readers of this thread: are there other
approaches that have not yet been suggested? Since I'm using this for
a series of assignments for students that have seen Haskell before but
are still relative beginners, I'm looking for something that will be
relatively easy to explain, won't get in the way of them writing the
constructions in the straightforward way, and yet will provide some
additional type-checking to help catch their errors.

--Todd

> -=-=-=-=-=-=-=-=-=-=-
> Richard A. Eisenberg
> Asst. Prof. of Computer Science
> Bryn Mawr College
> Bryn Mawr, PA, USA
> cs.brynmawr.edu/~rae

Richard Eisenberg

unread,
Sep 28, 2016, 12:39:13 PM9/28/16
to Todd Wilson, Haskell
Hi Todd,

I’m afraid dependent types in today’s Haskell are not for the faint-of-heart, or any but the most advanced students. Indeed, as I’ve been thinking about my new course on typed functional programming for the spring, I’ve debated on whether I should skip Haskell and go straight to Idris/Agda because of this. I won’t do that, I’ve decided, but this means I have to give up on some of the dependent-type material I’d like to cover.

To be fair, *some* slices of dependently typed programming are accessible. Stephanie Weirich’s red-black tree GADT has much of the flavor of dependently typed programming without as much of the Haskell-induced awkwardness. There is a decent description of this example in https://themonadreader.files.wordpress.com/2013/08/issue221.pdf

I hope this helps!
Richard

-=-=-=-=-=-=-=-=-=-=-
Richard A. Eisenberg
Asst. Prof. of Computer Science
Bryn Mawr College
Bryn Mawr, PA, USA
cs.brynmawr.edu/~rae

Reply all
Reply to author
Forward
0 new messages