[1, 2, 3]
However, instead of a comma, you can also use an EOL:
[1, 2
3]
Anywhere else, EOL is considered ignorable whitespace. So it's not as simple as
just making EOL a token and looking for (comma | eol).
I've implemented this functionality in a hand-written parser (basically a hack
that keeps track of whether the last read token was preceded by an EOL,
without making EOL itself a token). Does anybody have ideas about how to
do this with Parsec?
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Untested, but hopefully enough so you get an idea of where to start:
> -- End of line parser. Consumes the carriage return, if present.
> eol :: Parser ()
> eol = eof <|> char '\n'
> -- list-element separator.
> listSep :: Parser ()
> listSep = eol <|> (char ',' >> spaces)
> -- list parser. The list may be empty - denoted by "[]"
> myListOf :: Parser a -> Parser [a]
> myListOf p =
> char '[' >>
> sepBy p listSep >>= \vals ->
> char ']' >>
> return vals
This would probably be better off with a custom version of the
'spaces' parser that didn't parse newlines.
Antoine
You can do exactly the same with Parsec:
* create a lexer that yields a [Token], including EOL tokens;
* write a function of type [Token] -> [(Token, Bool)] that discards EOLs
and tells for each token whether it was preceded by a (now discarded) EOL;
* write your pToken :: Token -> Parsec Token function (I omitted some
type variables there) that recognises one (Token, Bool)-tuple from the
input stream.
Or, perhaps easier:
* create a lexer that yields a [Token], including EOL tokens;
* write a function of type [Token] -> [Token] that discards only those
EOL tokens that aren't needed -- for example, those EOL tokens that
occur when there are no open ['s, then parse those EOL's explicitly in
your parser.
Hope this helps,
Martijn.
Hi Kannan,
let's construct the parser top-down. On the top level, you have opening
and closing characters, '[' and ']'. Parsec has a function for that:
between (char '[') (char '])
And what's in between? A list of elements separated by something. Parsec
provides a sepBy function for that:
element `sepBy` separator
which parses a list of elements separated by separator. What's your
separator? Well it's either ',' or a new line and spaces before and
after that:
mySpaces >> (newline <|> char ',') >> mySpaces -- [1]
Let's combine what we've got:
myListOf :: (Parsec String () a) -> Parsec String () [a]
myListOf elem = between
(char '[')
(char ']')
(elem `sepBy` (mySpaces >> (newline <|> char ',') >> mySpaces))
where
mySpaces = many (oneOf (" \t"))
And test it in ghci:
*Main> parseTest (myListOf anyChar) "[a , b, d ,d\np]"
"abddp"
Hope this helps!
Stephan
PS: The important thing is that there are a lot solutions for tricky
situations (like yours) in Text.Parsec.Combinator (especially the sepBy
and many families). Knowing them can save a lot of work :)
[1] I don't use parsec's spaces function because it also accepts newline
characters.
>
> I've implemented this functionality in a hand-written parser (basically a hack
> that keeps track of whether the last read token was preceded by an EOL,
> without making EOL itself a token). Does anybody have ideas about how to
> do this with Parsec?
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskel...@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
--
Früher hieß es ja: Ich denke, also bin ich.
Heute weiß man: Es geht auch so.
- Dieter Nuhr