I'm trying to implement the FixStyle[1] program of PLEAC[2] in Haskell.
(It's missing from the Haskell version of PLEAC.) For this purpose, I
needed regular expressions. (Now I have two problems!) Although my
code is, IMHO, quite purely functional, the Regex.PCRE.String functions
totally mess up the design with the impure functions they introduce, even
for really simple string matching.
If you check out the sources[3], you'll see the impure functions below.
transDictRegex :: IO PCRE.Regex
matchRegex :: String -> IO (String, String, String)
translate :: String -> IO String
However, if I'm not mistaken, the above functions _should_ be absolutely
pure. Is it possible to encapsulate impure PCRE functions into their
pure equivalents?
BTW, why are PCRE functions impure?
Regards.
[1] http://pleac.sourceforge.net/pleac_perl/strings.html
[2] http://pleac.sourceforge.net/
[3] http://hpaste.org/fastcgi/hpaste.fcgi/view?id=14422
_______________________________________________
Beginners mailing list
Begi...@haskell.org
http://www.haskell.org/mailman/listinfo/beginners
Under the hood, PCRE.regex uses withCStringLen, which has the type:
withCStringLen :: String -> (CStringLen -> IO a) -> IO a
You can see the code here:
http://hackage.haskell.org/packages/archive/regex-pcre-builtin/0.94.2.1.7.7/doc/html/src/Text-Regex-PCRE-String.html
So the implementation is impure - it is built on impure code.
In the Real World Haskell book there is another implementation of a
PCRE interface; there the authors decided that their use of PCRE is
pure. The functions that use side effects (allocating memory and
freeing it) use them entirely locally inside each function body, and
the side effects never leak out of any function.
http://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html
The implementation from the book is here:
http://hackage.haskell.org/package/pcre-light
Potentially the author of Text.Regex.PCRE could have decided that the
use of PCRE was pure and gone some way to hiding the IO type of
withCStringLen and similar functions (withCStringLen already
encapsulates its side effects). But as a design decision - if you are
using IO it is entirely reasonable to keep within IO even if what you
are doing appears pure (and personally I would choose to use the IO
version as I like to know when a library I'm using is performing side
effects).
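As a rough illustration of the "local side effects" idea described above,
here is a minimal sketch (not taken from regex-pcre or pcre-light) that
wraps a withCStringLen call in unsafePerformIO to expose a pure function.
The name byteLength is made up for this example, and the wrapper is only
defensible because the buffer is allocated, read and freed inside the call,
so nothing observable escapes.

```haskell
import Foreign.C.String (withCStringLen)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper: number of bytes in the marshalled C string.
-- withCStringLen allocates a temporary buffer, runs the callback and
-- frees the buffer again, so the effect stays local to this function.
byteLength :: String -> Int
byteLength s = unsafePerformIO $
  withCStringLen s $ \(_ptr, len) -> return len
```

Whether hiding IO like this is a good idea is exactly the design decision
discussed above; the sketch only shows that it is mechanically possible.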
Best wishes
Stephen
2009/12/19 Volkan YAZICI <yazi...@ttmail.com>:
Mainly because they _are_ the core that _is_ encapsulated as you
suggested: regex-pcre, like many other regex libraries, provides the same
basic, mostly pure interface, which is defined in regex-base.
That interface comes with many specific pure functions (matchText,
matchOnce, matchAll and so on, as well as makeRegex and others) and
some "magic" operators like =~, which decide what kind of matching you
want to do depending on the type of the result.
See this tutorial for some examples of usage:
http://www.serpentine.com/blog/2007/02/27/a-haskell-regular-expression-tutorial/
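A minimal sketch of how the pure =~ operator could replace the IO-based
matchRegex/translate pair from the original question. It assumes the
regex-posix package is installed (Text.Regex.PCRE from regex-pcre exports
an operator of the same name); the extra parameters transWord and pat are
stand-ins for the poster's own word translation and pattern, which are not
shown in this thread.

```haskell
import Text.Regex.Posix ((=~))

-- Pure variant of translate. Asking =~ for a (String, String, String)
-- result yields (text before the match, the match, text after the match).
-- When nothing matches, the middle component is empty and the input is
-- returned unchanged; an empty match is treated the same way so that the
-- recursion always terminates on finite input.
translate :: (String -> String) -> String -> String -> String
translate transWord pat input =
  case input =~ pat :: (String, String, String) of
    (_, "", _)            -> input
    (before, word, after) ->
      before ++ transWord word ++ translate transWord pat after
```

For example, translate (\w -> "<" ++ w ++ ">") "[0-9]+" "a1b22c" should
wrap each run of digits in angle brackets.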
--
Jedaï
OTOH, I'm looking at my "translate" and it fails on infinite strings. Is
it because of something lacking in my implementation, or a limitation of
Regex.PCRE?
Regards.
I don't know whether the regex library can deal well with infinite input, but in
translate input = do
  (head, word, tail) <- matchRegex input
  tailTrans <- (translate tail)
  return $ head ++ (transWord word) ++ tailTrans
the IO-semantics require that the whole input is processed before anything is returned
(translating tail might throw an exception, after all). Maybe it'll work with a little
unsafeInterleaveIO magic:
import System.IO.Unsafe
translate input = do
  (head, word, tail) <- matchRegex input
  tailTrans <- unsafeInterleaveIO (translate tail)
  return $ head ++ (transWord word) ++ tailTrans
No, it can't. It uses withCStringLen, which is not capable of dealing with infinite input.
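The laziness point made above (a plain IO recursion must finish before it
returns anything, while unsafeInterleaveIO defers the recursive call) can
be seen without any regex library. This is only a sketch: countStrict and
countLazy are made-up names that stand in for the two versions of
translate.

```haskell
import System.IO.Unsafe (unsafeInterleaveIO)

-- Endless recursion in plain IO: the recursive action is sequenced before
-- `return`, so this never produces a result.
countStrict :: Int -> IO [Int]
countStrict n = do
  rest <- countStrict (n + 1)
  return (n : rest)

-- Same shape, but the recursive call is deferred until `rest` is demanded,
-- so the head of the list is available immediately.
countLazy :: Int -> IO [Int]
countLazy n = do
  rest <- unsafeInterleaveIO (countLazy (n + 1))
  return (n : rest)
```

Taking a finite prefix of the result of countLazy 0 terminates, whereas
countStrict 0 loops. Note that this only addresses the IO sequencing; it
does not make withCStringLen able to marshal an infinite string.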
You didn't miss anything (I think it was even true with 6.10); that is
because the GHC developers have decided to get out of the library
business. In other words, they decided to restrict themselves to the
absolute core of the Haskell libraries necessary to get GHC running
and nothing else. So, what should a Haskell developer do when he
doesn't absolutely need the _latest_ GHC release, the _latest_ libraries
in every domain, and so on? The Haskell Platform is the project that is
supposed to replace the big GHC releases of the past, with
regular releases of an integrated package with a stable and useful
library set and a compiler version that are supposed to work nicely
together:
http://hackage.haskell.org/platform/
_This_ is what most Haskell developers should use, especially if
they're beginners!
It contains regex-posix and regex-base (which is the common interface
of most regex-* packages, regex-pcre included).