Match Any Except Pattern?

Sean Woods

Feb 9, 2010, 12:38:50 PM
to sprache
I have been playing around with Sprache, getting the hang of it on
some files that I previously parsed with a PEG-style parser. So far it
has been going well and I am enjoying it, but I ran into a situation
that has me stumped.

A file I was trying to parse allows C# style multi-line comments:

/*
* A Comment
*/

If I was writing a PEG style grammar for this I might write something
like:

comment: '/*' (!'*/' .)* '*/'

Basically, consume /*, optionally consume any number of characters
except for the pattern */, and finally consume */.

Is there currently an easy or elegant solution to this in Sprache or
will I need to write an extension method that keeps track of the input
position, compares the parsed input with the exception pattern, and
upon matching the exception pattern rolls back the input to the stored
position (when we first started to see a match) and returns success?
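
For reference, here is a minimal hand-written sketch of what the PEG rule above means, independent of Sprache. The names PegCommentSketch, StartsAt and TryMatchComment are made up for illustration. The not-predicate !'*/' is just a lookahead that consumes nothing, so there is no rollback to manage:

```csharp
using System;

static class PegCommentSketch
{
    // Hypothetical helper: true when `token` occurs in `s` exactly at index `i`.
    static bool StartsAt(string s, int i, string token) =>
        i + token.Length <= s.Length &&
        string.CompareOrdinal(s, i, token, 0, token.Length) == 0;

    // Mirrors the rule  comment: '/*' (!'*/' .)* '*/'
    // On success, `body` is the text between the delimiters and `next`
    // is the index just past the closing "*/".
    public static bool TryMatchComment(string input, int pos, out string body, out int next)
    {
        body = "";
        next = pos;

        if (!StartsAt(input, pos, "/*"))
            return false;

        int i = pos + 2;

        // (!'*/' .)* : look ahead for "*/" without consuming; otherwise take one char.
        while (i < input.Length && !StartsAt(input, i, "*/"))
            i++;

        // The final '*/' is mandatory, so an unterminated comment does not match.
        if (!StartsAt(input, i, "*/"))
            return false;

        body = input.Substring(pos + 2, i - (pos + 2));
        next = i + 2;
        return true;
    }
}
```

Because the lookahead is tested before every character, a body ending in an extra star (as in /* a **/) still stops at the right place.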

Thanks
Sean

Nicholas Blumhardt

Feb 10, 2010, 6:07:15 PM
to sprache...@googlegroups.com
Hi Sean,

Sprache doesn't have a built-in for this, but one would be welcome.

Something along these lines _might_ work:

Parse.CharExcept('*').Or(Parse.Char('*').Then(c => Parse.CharExcept('/'))).Many()
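
One edge case worth checking with a character-level alternation like this is a comment that ends in **/. The sketch below does not call Sprache; it emulates the two alternatives with a plain loop, on the assumption that Or tries the second alternative only when the first fails and that Many stops at the first failure. AlternationSketch and ConsumeBody are made-up names:

```csharp
using System;

static class AlternationSketch
{
    // Emulates: CharExcept('*') or ('*' followed by CharExcept('/')), repeated.
    // Returns the index at which the repetition stops.
    public static int ConsumeBody(string s, int pos)
    {
        while (pos < s.Length)
        {
            if (s[pos] != '*')
                pos += 1;   // first alternative: any character except '*'
            else if (pos + 1 < s.Length && s[pos + 1] != '/')
                pos += 2;   // second alternative: '*' then any character except '/'
            else
                break;      // neither alternative matches, so the repetition ends
        }
        return pos;
    }
}
```

On " a */ b" the loop stops at the '*' of the terminator, as intended. On " a **/ b" the second alternative consumes both stars as a pair, the '/' is then accepted as an ordinary character, and the loop runs to the end of the input.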

Good luck!

Nick

Sean Woods

Feb 11, 2010, 4:21:26 PM
to sprache
I came up with a solution that seems to work. I added Until (which
includes the stop condition) and XUntil (which excludes the stop
condition). I am not certain if Until and the first version of XUntil
are truly useful. They were just stepping stones that got me to the
second version of XUntil. The second version of XUntil allows me to
do the following:

static readonly Parser<string> CommentParser =
    from cs in Parse.Char('/').Then(v => Parse.Char('*'))
    from c in Parse.Any.XUntil(Parse.Char('*').Then(v => Parse.Char('/')))
    from ce in Parse.Char('*').Then(v => Parse.Char('/'))
    select new string(c.ToArray());

Currently Until and XUntil always return Success (like Many), though my
gut says that if the end of file is reached without ever matching the
stop condition then they should probably return Failure.

Anyways... if you find any of the below useful then you are more than
welcome to include it in future releases of Sprache.

// Because sometimes you just don't care.
public static readonly Parser<char> Any = Char(c => true, "any");

public static Parser<IEnumerable<T>> Until<T>(this Parser<T> parser, T stop)
{
    Enforce.ArgumentNotNull(parser, "parser");

    return i =>
    {
        var remainder = i;
        var result = new List<T>();
        var r = parser(i);
        while (r is Success<T>)
        {
            var s = r as Success<T>;
            if (remainder == s.Remainder)
                break;

            result.Add(s.Result);
            remainder = s.Remainder;

            if (s.Result.Equals(stop))
                break;

            r = parser(remainder);
        }

        return new Success<IEnumerable<T>>(result, remainder);
    };
}


public static Parser<IEnumerable<T>> XUntil<T>(this Parser<T> parser, T stop)
{
    Enforce.ArgumentNotNull(parser, "parser");

    return i =>
    {
        var remainder = i;
        var result = new List<T>();
        var r = parser(i);
        while (r is Success<T>)
        {
            var s = r as Success<T>;
            if (remainder == s.Remainder)
                break;

            if (s.Result.Equals(stop))
                break;

            result.Add(s.Result);
            remainder = s.Remainder;

            r = parser(remainder);
        }

        return new Success<IEnumerable<T>>(result, remainder);
    };
}

public static Parser<IEnumerable<T>> XUntil<T, U>(this Parser<T> parser, Parser<U> stop)
{
    Enforce.ArgumentNotNull(parser, "parser");
    Enforce.ArgumentNotNull(stop, "stop");

    return i =>
    {
        var remainder = i;
        var result = new List<T>();
        var r = parser(i);
        while (r is Success<T>)
        {
            var s = r as Success<T>;
            if (remainder == s.Remainder)
                break;

            if (stop(remainder) is Success<U>)
                break;

            result.Add(s.Result);
            remainder = s.Remainder;

            r = parser(remainder);
        }

        return new Success<IEnumerable<T>>(result, remainder);
    };
}

Sean Woods

Feb 11, 2010, 6:33:36 PM
to sprache
Here is a more correct version of XUntil. It fails unless it hits the
stop condition. All it needs is a meaningful failure message for the
if (remainder == s.Remainder) condition.

public static Parser<IEnumerable<T>> XUntil<T, U>(this Parser<T> parser, Parser<U> stop)
{
    Enforce.ArgumentNotNull(parser, "parser");
    Enforce.ArgumentNotNull(stop, "stop");

    return i =>
    {
        var remainder = i;
        var result = new List<T>();

        // The stop condition matches immediately: succeed with an empty result.
        if (stop(remainder) is Success<U>)
            return new Success<IEnumerable<T>>(result, remainder);

        var r = parser(i);

        while (r is Success<T>)
        {
            var s = r as Success<T>;

            // The parser succeeded without consuming input: fail instead of looping forever.
            if (remainder == s.Remainder)
                return new Failure<IEnumerable<T>>(s.Remainder, () => "", () => new[] { "" });

            result.Add(s.Result);
            remainder = s.Remainder;

            // The stop condition matches here; it is left unconsumed.
            if (stop(remainder) is Success<U>)
                return new Success<IEnumerable<T>>(result, remainder);

            r = parser(remainder);
        }

        // The parser failed before the stop condition was ever seen.
        var f = r as Failure<T>;
        return new Failure<IEnumerable<T>>(f.FailedInput, () => f.Message, () => f.Expectations);
    };
}

Nicholas Blumhardt

Feb 12, 2010, 6:36:32 AM
to sprache...@googlegroups.com
Thanks Sean, this is awesome. I'd most certainly like to add this to the project, and should have a chance in the next few days. Thanks for sharing the improvements!

Nick

Nicholas Blumhardt

Feb 20, 2010, 7:35:06 PM
to sprache...@googlegroups.com
Took some time to refactor and experiment. There are two new combinators in trunk, Except() and Until():
http://code.google.com/p/sprache/source/detail?r=fe0124cd2d76ec8033a71711a94f0ea58f91e4ea

The Until parser is implemented in terms of Except:

        public static Parser<IEnumerable<T>> Until<T, U>(this Parser<T> parser, Parser<U> until)
        {
            return parser.Except(until).Many().Then(r => until.Return(r));
        }
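
To make the composition concrete, here is a self-contained sketch of how Except, Many and Until can fit together. It does not use Sprache's real types: Result, P, Mini, Str and CommentBody are hypothetical stand-ins for illustration, and the real combinators may differ in detail (error messages and expectations, for instance).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins, not Sprache's types: a parser takes the input and a
// position, and reports success plus the position where it stopped.
public sealed class Result<T>
{
    public bool Ok;
    public T Value;
    public int Next;
}

public delegate Result<T> P<T>(string input, int pos);

public static class Mini
{
    static Result<T> Success<T>(T value, int next) => new Result<T> { Ok = true, Value = value, Next = next };
    static Result<T> Failure<T>(int at) => new Result<T> { Ok = false, Next = at };

    // Matches any single character; fails only at end of input.
    public static readonly P<char> AnyChar =
        (s, i) => i < s.Length ? Success(s[i], i + 1) : Failure<char>(i);

    // Matches a literal token at the current position.
    public static P<string> Str(string token) => (s, i) =>
        i + token.Length <= s.Length && string.CompareOrdinal(s, i, token, 0, token.Length) == 0
            ? Success(token, i + token.Length)
            : Failure<string>(i);

    // Fails, consuming nothing, wherever `except` would succeed; otherwise runs `p`.
    public static P<T> Except<T, U>(this P<T> p, P<U> except) => (s, i) =>
        except(s, i).Ok ? Failure<T>(i) : p(s, i);

    // Zero or more repetitions; stops at the first failure or non-consuming success.
    public static P<List<T>> Many<T>(this P<T> p) => (s, i) =>
    {
        var items = new List<T>();
        for (var r = p(s, i); r.Ok && r.Next > i; r = p(s, i))
        {
            items.Add(r.Value);
            i = r.Next;
        }
        return Success(items, i);
    };

    // Same shape as the Until above: collect p.Except(until) repeatedly,
    // then require and consume `until`, returning the collected items.
    public static P<List<T>> Until<T, U>(this P<T> p, P<U> until) => (s, i) =>
    {
        var body = p.Except(until).Many()(s, i);
        var end = until(s, body.Next);
        return end.Ok ? Success(body.Value, end.Next) : Failure<List<T>>(end.Next);
    };

    // Usage: text between "/*" and "*/" at the start of `s`, or null when there is no match.
    public static string CommentBody(string s)
    {
        var open = Str("/*")(s, 0);
        if (!open.Ok) return null;
        var body = AnyChar.Until(Str("*/"))(s, open.Next);
        return body.Ok ? new string(body.Value.ToArray()) : null;
    }
}
```

In this sketch Until consumes the terminator and fails when the input ends before the terminator is found, which matches the failure behaviour Sean suggested for the end-of-file case.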

I've made some notes in Except, which I also think could be expressed by combining other parsers.

I haven't implemented an X (exclusive) version of either yet.

Let me know what you think - is this going in the right direction?

NB