getting all string literals available at point


Brannon King

Sep 18, 2013, 1:31:33 PM
to eto-...@googlegroups.com
I need a list of the string literals that could come next at any point in the input, to populate my IntelliSense window. It appears that GrammarMatch.Errors is updated whether or not the parse succeeded, which is good. However, the error index seems to be in the wrong spot: it should be farther along than it is. The error list also looks incorrect when the input stops partway through and the parse fails. I'm using this code to extract my list:
 
private HashSet<string> ExtractTerminals(GrammarMatch parent)
{
 var ret = new HashSet<string>();
 foreach (var parser in parent.Errors)
  foreach (var t in ExtractTerminals(parser))
   ret.Add(t);
 return ret;
}
private IList<string> ExtractTerminals(Parser parent)
{
 var ret = new List<string>();
 if (parent is LiteralTerminal)
  ret.Add(((LiteralTerminal)parent).Value);
 foreach (var parser in parent.Children())
  ret.AddRange(ExtractTerminals(parser));
 return ret;
}

Curtis Wensley

Sep 18, 2013, 2:24:44 PM
to eto-...@googlegroups.com
The ErrorIndex is actually invalid when there was a successful match. It will be the index of the last error it found (e.g. an alternative or optional that didn't match).

For repeating parsers, it does not add an error if another repeat is not found, because technically it isn't an error unless there are fewer repeats than the specified Minimum value.

When it is a successful match, what sort of errors are you expecting it to output?

It sounds like you are trying to get a list of parsers tested (but not required) at the end of the match.

Brannon King

Sep 18, 2013, 2:30:12 PM
to eto-...@googlegroups.com
> It sounds like you are trying to get a list of parsers tested (but not required) at the end of the match.
 
Uh, sure. How do I do that? And it seems that they could be required if the match was not successful.
 
I also think that ErrorIndex is incorrect when the match was not successful. If I send in a partial match where the only problem is incompleteness, ErrorIndex should point to the end, true?

Curtis Wensley

Sep 18, 2013, 9:10:27 PM
to eto-...@googlegroups.com
Eto.Parse doesn't keep track of that, so there's no way to get those directly, unless you change your grammar a little to include an optional parser at the end for the ones you're interested in. Then the error list should include the ones at the end of the match, even though they will never match.

ErrorIndex is the index of the rightmost parser that caused an error (with AddError set to true). If you're not getting the position you expect, it's probably because your grammar doesn't match those characters. It gives you the starting position of the parser that failed, not the ending position.

For example, if you have a literal terminal "blah" in your grammar and the input only matches "bl", the error index will be at 'b' (where the literal starts), not at 'a'.
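To make the "blah" case concrete, here is a toy sketch of a literal matcher that reports the index where the literal started rather than where the comparison broke down. It is only a model of the rule described above, not Eto.Parse's own code; `ToyLiteral.Match` is a made-up helper.

```csharp
using System;

static class ToyLiteral
{
    // Hypothetical helper, not part of Eto.Parse: tries to match `literal`
    // in `input` at offset `start`. Returns the matched length, or -1 on
    // failure. On failure, errorIndex is the offset where the literal
    // STARTED, not the character where the comparison stopped matching.
    public static int Match(string input, int start, string literal, out int errorIndex)
    {
        errorIndex = -1;
        bool ok = start + literal.Length <= input.Length
            && string.CompareOrdinal(input, start, literal, 0, literal.Length) == 0;
        if (ok)
            return literal.Length;
        errorIndex = start;
        return -1;
    }
}

static class LiteralDemo
{
    static void Main()
    {
        int errorIndex;
        // Only "bl" of "blah" is present, starting at offset 4.
        int length = ToyLiteral.Match("say bl", 4, "blah", out errorIndex);
        Console.WriteLine("length={0}, errorIndex={1}", length, errorIndex);
        // prints "length=-1, errorIndex=4"
    }
}
```

The reported index is 4 (the 'b'), even though the input actually runs out two characters later.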

Cheers,
Curtis.

Brannon King

Sep 18, 2013, 10:57:28 PM
to eto-...@googlegroups.com
> Eto.Parse doesn't keep track of that, so there's no way to get those directly.. unless you change your grammar a little to include an optional parser at the end for the ones you're interested in - then the error list (should) include the ones at the end of the match, even though they will never match..

I care about the named parsers. I can go through the tree and stick an optional something on the end of each. How will that not mess with my grammar? I don't understand "they will never match". Do we have a "void" parser?
 
> ErrorIndex is the last index of the right most parser that caused the error (with AddError as true). If you're not getting the position you're expecting, it's probably because your grammar doesn't match those characters.. It'll give you the starting position of the parser that failed, not the ending position.

It gives the starting position and my Errors list has 15 items in it. How do I know which Error was the "right-most", aka, the best match? Are the Errors ordered?

Curtis Wensley

Sep 18, 2013, 11:24:15 PM
to eto-...@googlegroups.com

> I care about the named parsers. I can go through the tree and stick an optional something on the end of each. How will that not mess with my grammar? I don't understand "they will never match". Do we have a "void" parser?

If only your named parsers have AddError = true, then you'll only get errors at the beginning of where they tried to match but failed. It won't give you the index 'inside' unless the child parsers also have AddError set to true.

I didn't mean adding an optional version at the end of each one. You mentioned that, for a partial match, you want to know which parsers could come next. So if you had:

var myparsers = statement1 | statement2 |  statement3;

and your main grammar was:

Inner = +myparsers;

you could do:

Inner = +myparsers & ~myparsers;

This should give you errors at the end of the match telling you what could come next (I haven't tried it, but in theory it should).

> It gives the starting position and my Errors list has 15 items in it. How do I know which Error was the "right-most", aka, the best match? Are the Errors ordered?

The errors are listed in the order they were attempted; however, all errors listed occurred at the same ErrorIndex. If an error happens after the ErrorIndex, the list is cleared and the ErrorIndex is set to that new position.
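That bookkeeping can be sketched roughly as follows. This is a toy model for illustration, not Eto.Parse's actual implementation: `ToyErrorTracker` and `Report` are invented names, and dropping failures that lie to the left of the current index is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the bookkeeping described above; not Eto.Parse's own code.
sealed class ToyErrorTracker
{
    public int ErrorIndex = -1;
    public readonly List<string> Errors = new List<string>();

    public void Report(string parserName, int index)
    {
        if (index > ErrorIndex)
        {
            // A failure further right discards everything recorded so far.
            Errors.Clear();
            ErrorIndex = index;
        }
        // Assumption: a failure left of ErrorIndex is simply ignored.
        if (index == ErrorIndex)
            Errors.Add(parserName);
    }
}

static class TrackerDemo
{
    static void Main()
    {
        var tracker = new ToyErrorTracker();
        tracker.Report("action", 0);
        tracker.Report("telemetry", 4); // further right: list is reset
        tracker.Report("comment", 2);   // left of index 4: ignored
        tracker.Report("value", 4);     // same index: appended in attempt order
        Console.WriteLine("{0}: {1}", tracker.ErrorIndex, string.Join(", ", tracker.Errors));
        // prints "4: telemetry, value"
    }
}
```

Under this model every entry in the list belongs to the single rightmost position, so there is no "best" entry to pick out by order alone.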


Brannon King

Sep 20, 2013, 9:32:29 AM
to eto-...@googlegroups.com
I tried the approach of adding in optional items. That doesn't seem to work, as I still have way too many items in the errors list. I put together a small sample project showing my issues. It prints the results I expect alongside the actual ones. Any ideas toward achieving those desired outputs would be helpful.
 
using Eto.Parse;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace TestParser
{
 class Program
 {
  class CarGrammar : Grammar
  {
   private static readonly Parser WS = +Terminals.SingleLineWhiteSpace;
   public CarGrammar():base("procedure")
   {
    var writables = ((Parser)"[brake]" | "[throttle]").Named("telemetry");
    var readables = (writables | "[velocity]").Named("telemetry");
    var number = new Eto.Parse.Parsers.NumberParser{AllowDecimal = true, AllowExponent = true, AllowSign = false};
    // (it sure would be nice to specify a range on that number)
    var writer = ((Parser)"set").Named("action") & WS & writables & WS & "at" & WS & number.Named("value") & ~WS & ~(Parser)"%";
    var op = ((Parser)"<"|">"|"=").Named("operator");
    var unit = ((Parser)"m/s"|"kph"|"mph").Named("unit");
    var reader = ((Parser)"until").Named("action") & WS & readables & WS & op & ~WS & ~unit;
    var row = (~(writer | reader) & Terminals.Eol).Named("row");
    Inner = +row;
   }
  }
  static void Main(string[] args)
  {
   var parser = new CarGrammar();
   var match = parser.Match("set [throttle] at 15 %\r\n");
   Console.WriteLine("Running match on 'set [throttle] at 15 %nl'");
   Console.WriteLine("Expected success at {0} but it was {1}", true, match.Success);
   Console.WriteLine("Expected {0} errors but had {1} errors", 2, match.Errors.Count());
   Console.WriteLine("Expected error index at {0} but it was {1}", 23, match.ErrorIndex);
   Console.WriteLine("Expected possibilities of {0} but had {1}", "set, until", FindPossibilities(match));
   Console.WriteLine("Reported error: {0}", match.ErrorMessage);
   Console.WriteLine();
   match = parser.Match("");
   Console.WriteLine("Running match on ''");
   Console.WriteLine("Expected success at {0} but it was {1}", false, match.Success);
   Console.WriteLine("Expected {0} errors but had {1} errors", 2, match.Errors.Count());
   Console.WriteLine("Expected error index at {0} but it was {1}", 0, match.ErrorIndex);
   Console.WriteLine("Expected possibilities of {0} but had {1}", "set, until", FindPossibilities(match));
   Console.WriteLine("Reported error: {0}", match.ErrorMessage);
   Console.WriteLine();
   match = parser.Match("set ");
   Console.WriteLine("Running match on 'set '");
   Console.WriteLine("Expected success at {0} but it was {1}", false, match.Success);
   Console.WriteLine("Expected {0} errors but had {1} errors", 2, match.Errors.Count());
   Console.WriteLine("Expected error index at {0} but it was {1}", 4, match.ErrorIndex);
   Console.WriteLine("Expected possibilities of '{0}' but had '{1}'", "[brake], [throttle]", FindPossibilities(match));
   Console.WriteLine("Reported error: {0}", match.ErrorMessage);
   Console.WriteLine();
   match = parser.Match("set [throttle] ");
   Console.WriteLine("Running match on 'set [throttle] '");
   Console.WriteLine("Expected success at {0} but it was {1}", false, match.Success);
   Console.WriteLine("Expected {0} errors but had {1} errors", 1, match.Errors.Count());
   Console.WriteLine("Expected error index at {0} but it was {1}", 14, match.ErrorIndex);
   Console.WriteLine("Expected possibilities of '{0}' but had '{1}'", "at", FindPossibilities(match));
   Console.WriteLine();
   if (Debugger.IsAttached)
    Console.ReadKey();
  }
  private static string FindPossibilities(GrammarMatch match)
  {
   var literals = new List<string>();
   foreach (var child in match.Errors)
   {
    literals.AddRange(FindPossibilities(child));
   }
   return string.Join(", ", literals.Distinct().OrderBy(l => l));
  }
  private static List<string> FindPossibilities(Parser match)
  {
   var literals = new List<string>();
   if (match is Eto.Parse.Parsers.LiteralTerminal)
    literals.Add(((Eto.Parse.Parsers.LiteralTerminal)match).Value);
   foreach (var child in match.Children())
   {
    literals.AddRange(FindPossibilities(child));
   }
   return literals;
  }
 }
}

TestParser.zip

Curtis Wensley

Sep 20, 2013, 10:36:51 AM
to eto-...@googlegroups.com
OK, so I've taken a look. The reason you get the wrong error index is that the parser causing the error (e.g. "at") is not named, and everything up to that point matched successfully. Giving it a name, or setting AddError to true on it, makes it give you the right index.

The error index is only set when a parser with AddError set to true fails to match.

There is also another problem with the way you are trying to get the list of expected matches. For example, it reports every element of a sequence as expected, when in fact you only want the first element of the sequence back.
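The sequence issue can be illustrated with a small stand-alone model. The node types below (`Lit`, `Seq`, `Alt`, `Opt`) are invented for this sketch and are not Eto.Parse classes, and this is not the code from the attached sample. The walk stops descending into a sequence as soon as one of its items must consume input, so only literals that can appear first are collected.

```csharp
using System;
using System.Collections.Generic;

// Invented node types for this sketch; these are not Eto.Parse classes.
abstract class Node { }
sealed class Lit : Node { public readonly string Value; public Lit(string value) { Value = value; } }
sealed class Opt : Node { public readonly Node Inner; public Opt(Node inner) { Inner = inner; } }
sealed class Seq : Node { public readonly Node[] Items; public Seq(params Node[] items) { Items = items; } }
sealed class Alt : Node { public readonly Node[] Items; public Alt(params Node[] items) { Items = items; } }

static class FirstLiterals
{
    // Adds every literal that could appear first under `node` to `found`.
    // Returns true when `node` may match nothing, which tells an enclosing
    // sequence to keep looking at its next item.
    public static bool Collect(Node node, ISet<string> found)
    {
        var lit = node as Lit;
        if (lit != null) { found.Add(lit.Value); return false; }

        var opt = node as Opt;
        if (opt != null) { Collect(opt.Inner, found); return true; }

        var alt = node as Alt;
        if (alt != null)
        {
            bool canBeEmpty = false;
            foreach (var item in alt.Items)
                canBeEmpty |= Collect(item, found); // every branch is a candidate
            return canBeEmpty;
        }

        foreach (var item in ((Seq)node).Items)
        {
            if (!Collect(item, found))
                return false; // this item must consume input: later items are not "next"
        }
        return true;
    }
}

static class FirstLiteralsDemo
{
    static void Main()
    {
        var writer = new Seq(new Lit("set"), new Lit(" "),
            new Alt(new Lit("[brake]"), new Lit("[throttle]")), new Lit(" at "));
        var reader = new Seq(new Lit("until"), new Lit(" "), new Lit("[velocity]"));
        var found = new SortedSet<string>(StringComparer.Ordinal);
        FirstLiterals.Collect(new Alt(writer, reader), found);
        Console.WriteLine(string.Join(", ", found)); // prints "set, until"
    }
}
```

A walk that visits every child, as the original FindPossibilities does, would also return "[brake]", "[throttle]" and the rest of each sequence.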

There is one bug in Eto.Parse, though: it reports errors for all parent parsers even when their position does not match the ErrorIndex. This will be fixed in the next version (which should make the parsers with errors match exactly what you're expecting).

See attached sample on how to get the expected parsers a little more accurately.

Hope this helps!
Curtis.
TestParser-v2.zip

Brannon King

Sep 20, 2013, 11:18:06 AM
to eto-...@googlegroups.com
> There is also another problem with the way you are trying to get the list of expected matches.. For example it'd report all of the elements of a sequence as expected which in fact it was only the first element of the sequence that you want back.
 
Good call.
 
> There is one bug with Eto.Parse though, where it reports errors for all parents even if the ErrorIndex does not match the position of it. This will be fixed next version (which would make the parsers with errors match exactly what you're expecting).
>
> See attached sample on how to get the expected parsers a little more accurately.
 
Yes, that is very close! I'm looking forward to your next release. Please respond here when you release it. 

Curtis Wensley

Sep 21, 2013, 10:37:55 AM
to eto-...@googlegroups.com
It's released now. (;

Cheers,
Curtis.

Brannon King

Sep 23, 2013, 3:23:10 PM
to eto-...@googlegroups.com
Thanks for the release. I have one follow-up question on this IntelliSense problem: suppose I want the FindPossibilities method not to search beyond the current line. From our earlier conversation, I expected that leaving out the "-op & ~op" part would be enough, but that doesn't seem to change anything. I also named my EndTerminal instance, which made it show up (deep) in the Errors list. However, the literals for the next line are still peers in the main Errors list, so the ordering does not let me break out of FindPossibilities early. It seems to have something to do with having several optional items at the end of the line. Any ideas on that one?

Curtis Wensley

Sep 23, 2013, 3:44:29 PM
to eto-...@googlegroups.com
Hey Brannon,

I'm not too sure I follow what you're trying to do. Would you be able to update your sample to show what you're expecting?

Thanks!
Curtis.
Message has been deleted

Brannon King

Sep 23, 2013, 4:09:18 PM
to eto-...@googlegroups.com
I had one error in that code I just posted. It is missing "WS &" before the val operator when setting inner.

 

 

Curtis Wensley

Sep 23, 2013, 6:10:37 PM
to eto-...@googlegroups.com
Hm, okay.

Try the following. A few notes:
  • You really should use a repeating parser instead of a loop that creates strings like "AAA", "AA", "A", etc. The loop can be replaced by ((Parser)"A").Repeat(1,3). You can then update FindPossibilities to detect a repeating literal and combine the repeats together.
  • You don't want each row variation to repeat (as far as I understand it). You want multiple rows, where each row can be one of many variations. For example, (+rowAAA_to_JJJ | -rowAA_to_JJ | -rowA_to_J) is rather cumbersome; you could instead write +(rowAAA_to_JJJ | rowAA_to_JJ | rowA_to_J), or even better, +(rowArepeating_to_Jrepeating).
  • To get rid of the last test for 'set', you need to parse out the end of line before testing the final repeat. See the sample:
// Fragment: assumes "using System.Linq;", "using Eto.Parse;" and "using Eto.Parse.Parsers;" at the top of the file.
private class TestGrammar : Grammar
{
    private static readonly Parser WS = +Terminals.SingleLineWhiteSpace;

    public TestGrammar() : base("procedure")
    {
        var eol = (Terminals.Eol | Terminals.End).Named("eol");
        var hash = ((Parser)"#");
        hash.AddError = true;
        var comment = ~(~WS & (hash & (-Terminals.AnyChar ^ eol)));

        var set = (Parser)"set";
        set.Name = "operator";
        var at = (Parser)"at";
        at.Name = "at";
        var val = new NumberParser();
        val.Name = "value";
        var perc = (Parser)"%";
        perc.Name = "percent";
        var row = new AlternativeParser();
        for (int i = 3; i > 0; i--) // curious to know how to do this with i counting up; can I specify a lookahead amount on my group strings?
        {
            // should really use a repeating parser instead, then update your FindPossibilities to
            // deal with them and extract the possibilities from the repeat.
            var strings = new string[10];
            for (int j = 0; j < strings.Length; j++)
            {
                // 0x41 is 'A': builds "A".."J", each repeated i times
                strings[j] = new string((char)(j + 0x41), i);
            }

            // cleaner way
            var group = new AlternativeParser(strings.Select(r => (Parser)r));
            group.Name = "action";
            var inner = set & WS & group & WS & at & WS & val & ~(~WS & perc) & comment & ~WS & eol;

            // much cleaner.. don't need each row to repeat themselves, we repeat the entire alternate below
            row.Add(inner);
        }
        // don't check the row unless there's something there by gobbling up the eol's
        Inner = +(eol | row);
    }
}
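
As a rough idea of what "combining the repeats" in FindPossibilities could look like, the sketch below expands a literal with a minimum and maximum repeat count into the strings it could produce. `RepeatExpansion.Expand` is a hypothetical stand-alone helper; it does not inspect Eto.Parse's repeat parser, so wiring it up to the real parser tree is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RepeatExpansion
{
    // Hypothetical helper: lists the strings that `literal`, repeated
    // between `min` and `max` times, could produce.
    public static List<string> Expand(string literal, int min, int max)
    {
        var result = new List<string>();
        for (int count = min; count <= max; count++)
            result.Add(string.Concat(Enumerable.Repeat(literal, count)));
        return result;
    }
}

static class RepeatDemo
{
    static void Main()
    {
        // Mirrors ((Parser)"A").Repeat(1,3) from the notes above.
        Console.WriteLine(string.Join(", ", RepeatExpansion.Expand("A", 1, 3)));
        // prints "A, AA, AAA"
    }
}
```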

Hope this helps!
Curtis.