[Haskell-cafe] Text.Regex.Base throws exceptions with makeRegexOptsM

Omari Norman

Dec 29, 2011, 5:52:46 PM
to haskel...@haskell.org
Hi folks,

I'm using Text.Regex.Base with the TDFA and PCRE backends. I want to
compile regular expressions first and make sure the patterns were
actually valid, so I used makeRegexOptsM, which indicates a bad regular
expression by calling fail. That allows you to use makeRegexOptsM with
Maybe or with (Either String) (assuming that Either String is an
instance of Monad, which of course is defined in Control.Monad.Error.)

Doing this with Maybe Regex works like it should--bad pattern gives you
a Nothing. But if you want to see the error message by using Either
String, an exception gets thrown with the bad pattern, rather than
getting a Left String.

Why is this? Seems like an odd bug somewhere. I am a Haskell novice, but
I looked at the code for Text.Regex.Base and for the TDFA and PCRE
backends and there's nothing in there to suggest this kind of
behavior--it should work with Either String.

The attached code snippet demonstrates the problem. I'm on GHC 7.0.3
(though I also got the problem with 6.12.3) and regex-base-0.93.2 and
regex-tdfa-1.1.8 and regex-pcre-0.94.2. Thanks very much for any tips or
ideas. --Omari

RegexTest.hs

Daniel Fischer

Dec 29, 2011, 7:24:02 PM
to haskel...@haskell.org
On Thursday 29 December 2011, 23:52:46, Omari Norman wrote:
> Hi folks,
>
> I'm using Text.Regex.Base with the TDFA and PCRE backends. I want to
> compile regular expressions first and make sure the patterns were
> actually valid, so I used makeRegexOptsM, which indicates a bad regular
> expression by calling fail. That allows you to use makeRegexOptsM with
> Maybe or with (Either String) (assuming that Either String is an
> instance of Monad, which of course is defined in Control.Monad.Error.)
>
> Doing this with Maybe Regex works like it should--bad pattern gives you
> a Nothing. But if you want to see the error message by using Either
> String, an exception gets thrown with the bad pattern, rather than
> getting a Left String.
>
> Why is this?

The cause is that a pattern-match failure in a do-block or equivalent
causes the Monad's 'fail' method to be invoked.

For Maybe, we have

    fail _ = Nothing

For Either, there used to be

    instance Error e => Monad (Either e) where
        ...
        fail s = Left (strMsg s)

in mtl's Control.Monad.Error, and all was fine if one used the regex
functions with e.g. (Either String) as the Monad.

Recently, however, it was decided to have

    instance Monad (Either e) where
        ...
        fail s = error s -- not explicitly, but by Monad's default method

in Control.Monad.Instances. So now, if you have a pattern-match failure
using (Either String), you don't get a nice 'Left message' but an error.
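
The dependence on the chosen monad can be sketched without the regex
packages. Below, compilePattern is a hypothetical stand-in for
makeRegexOptsM (the name and the parenthesis check are assumptions made for
illustration, not part of regex-base). The sketch also assumes a current
GHC (8.8 or later), where 'fail' has moved from Monad into the separate
MonadFail class:

```haskell
-- Hypothetical stand-in for makeRegexOptsM: it "compiles" a pattern and
-- reports a bad one through 'fail', leaving the choice of monad to the caller.
compilePattern :: MonadFail m => String -> m String
compilePattern pat
  | balanced 0 pat = return pat
  | otherwise      = fail ("unbalanced parentheses in " ++ show pat)
  where
    -- True when every ')' closes an earlier '(' and none is left open.
    balanced :: Int -> String -> Bool
    balanced depth []       = depth == 0
    balanced depth (c : cs)
      | c == '('  = balanced (depth + 1) cs
      | c == ')'  = depth > 0 && balanced (depth - 1) cs
      | otherwise = balanced depth cs

-- At Maybe, 'fail' discards the message and yields Nothing.
goodPattern, badPattern :: Maybe String
goodPattern = compilePattern "(ab)*"   -- Just "(ab)*"
badPattern  = compilePattern "(ab"     -- Nothing
```

With current versions of base there is no MonadFail instance for
(Either String), so asking for compilePattern "(ab" at that type is rejected
at compile time instead of throwing at run time as described here.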

So why was it decided to have that change?

'fail' doesn't properly belong in the Monad class. It was added for the
purpose of dealing with pattern-match failures, but most monads can't do
anything better than abort with an error in such cases.
'fail' is widely considered a wart.

On the other hand, restricting Either's first parameter to members of the
Error class is artificial: mathematically, (Either e) is a Monad for
every type e. And (Either e) has use cases as a Monad for types which
aren't Error members.

So the general consensus was that it was better to get rid of the arbitrary
(Error e) restriction.

Now, what can you do to get the equivalent of the old (Either String)?

Use 'ErrorT String Identity'.

It's a bit more cumbersome to get at the result,

    foo = runIdentity . runErrorT $ bar

but it's clean.

> Seems like an odd bug somewhere.

A change in behaviour that was accepted as the price of fixing what was
widely considered a mistake.

> I am a Haskell novice, but
> I looked at the code for Text.Regex.Base and for the TDFA and PCRE
> backends and there's nothing in there to suggest this kind of
> behavior--it should work with Either String.

It used to.

>
> The attached code snippet demonstrates the problem. I'm on GHC 7.0.3
> (though I also got the problem with 6.12.3) and regex-base-0.93.2 and
> regex-tdfa-1.1.8 and regex-pcre-0.94.2. Thanks very much for any tips or
> ideas. --Omari


_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Chris Wong

Dec 29, 2011, 11:20:18 PM
to Daniel Fischer, haskel...@haskell.org
On Fri, Dec 30, 2011 at 1:24 PM, Daniel Fischer
<daniel.i...@googlemail.com> wrote:
> On Thursday 29 December 2011, 23:52:46, Omari Norman wrote:
>> [...]

>
> 'fail' doesn't properly belong in the Monad class. It was added for the
> purpose of dealing with pattern-match failures, but most monads can't do
> anything better than abort with an error in such cases.
> 'fail' is widely considered a wart.

I thought I'd add my own reason why I don't like fail.

Take these two functions, for example:

    test :: Maybe Int
    test = do
        (Right v) <- Just (Left 1)
        return v

    test' :: Maybe Int
    test' = do
        let (Right v) = Left 1
        return v

The first returns Nothing. The second crashes with a pattern match failure.

Why should a pattern failure cause a crash everywhere *except* a do
binding? It makes no sense. It violates the principle of least
surprise by behaving differently to every other occurrence of pattern
matching in the whole language.
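
The difference comes from how the bind is translated. Here is a rough
sketch of what the two functions above turn into, assuming a current GHC
where 'fail' lives in MonadFail. The names testDesugared and testLet and
the type annotations are mine, and the message string only approximates
what the compiler generates:

```haskell
-- Roughly what the compiler generates for test: the failable pattern in
-- the bind becomes a case expression whose fall-through branch calls 'fail'.
testDesugared :: Maybe Int
testDesugared =
  Just (Left 1 :: Either Int Int) >>= \x ->
    case x of
      Right v -> return v
      _       -> fail "pattern match failure in do expression"

-- A let binding gets no such fall-through branch. The pattern is lazy,
-- so the error only fires when v itself is demanded.
testLet :: Maybe Int
testLet =
  let Right v = (Left 1 :: Either Int Int)
  in return v
```

So testDesugared is Nothing, while testLet is a Just wrapping a value that
blows up once something (printing it, for example) forces it.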

As for custom failures, I'd recommend either Michael Snoyman's Failure
class or MonadPlus, which were both designed for this sort of thing.
But I'd stay away from using fail, since as Daniel Fischer said, it's a
wart.
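
A minimal sketch of the MonadPlus route, using only base. The function
safeDiv is a made-up example, not something from the thread or from any
library:

```haskell
import Control.Monad (MonadPlus, mzero)

-- Hypothetical example: signal failure with mzero instead of fail.
safeDiv :: MonadPlus m => Int -> Int -> m Int
safeDiv _ 0 = mzero
safeDiv n d = return (n `div` d)

-- The caller picks the monad, and each one has its own notion of "no result".
asMaybe :: Maybe Int
asMaybe = safeDiv 10 0      -- Nothing

asList :: [Int]
asList = safeDiv 10 2       -- [5]
```

The trade-off is that mzero carries no message, so this fits cases where
knowing that something failed is enough.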

Omari Norman

Dec 29, 2011, 11:20:40 PM
to haskel...@haskell.org
On Fri, Dec 30, 2011 at 01:24:02AM +0100, Daniel Fischer wrote:
> For Either, there used to be
>
>     instance Error e => Monad (Either e) where
>         ...
>         fail s = Left (strMsg s)
>
> in mtl's Control.Monad.Error, and all was fine if one used the regex
> functions with e.g. (Either String) as the Monad.
>
> Recently, however, it was decided to have
>
>     instance Monad (Either e) where
>         ...
>         fail s = error s -- not explicitly, but by Monad's default method
>
> in Control.Monad.Instances. So now, if you have a pattern-match failure
> using (Either String), you don't get a nice 'Left message' but an error.

Thanks so much, I would never have figured all this out. Spent a lot of
time tonight rummaging through mtl and transformers and
Control.Monad.Instances.

> Now, what can you do to get the equivalent of the old (Either String)?
>
> Use 'ErrorT String Identity'.

This I tried. It turned out that it didn't work though and I had the
same problem. I am guessing it is because my module has some imports at
the top that are bringing the instances in Control.Monad.Instances into
scope. Then it seems the Monad instance in Control.Monad.Instances
(which is using the default "fail", which calls "error") is being used,
rather than the instance from Control.Monad.Trans.Error. Only now do I
really understand why orphan instances are bad:

http://www.haskell.org/haskellwiki/Orphan_instance

A simple fix for it all was to wrap Either in a newtype and then define
a Monad instance for the newtype. --Omari
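
A sketch of such a wrapper, written for a current GHC rather than the 7.0.3
used in this thread: Functor and Applicative instances are now required,
and 'fail' is defined in a MonadFail instance instead of in Monad. The
names Result, runResult and firstWord are illustrative only:

```haskell
-- Hypothetical wrapper around Either String; the names are illustrative.
newtype Result a = Result { runResult :: Either String a }

instance Functor Result where
  fmap f (Result e) = Result (fmap f e)

instance Applicative Result where
  pure = Result . Right
  Result f <*> Result x = Result (f <*> x)

instance Monad Result where
  Result (Left msg) >>= _ = Result (Left msg)
  Result (Right x)  >>= k = k x

-- The point of the wrapper: 'fail' keeps the message in a Left.
instance MonadFail Result where
  fail = Result . Left

-- Small usage example: the failable pattern below goes through 'fail'.
firstWord :: String -> Result String
firstWord s = do
  (w : _) <- return (words s)
  return w
```

Here runResult (firstWord "hello world") is Right "hello", and
runResult (firstWord "") is a Left holding the pattern-match failure
message. Any function that reports errors through 'fail' can be run at
Result the same way.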
