From: Bill Lee <lee@utexas-11>
Date: 3 Jun 1982 19:39 CDT
Someone else may have answered this for you by now (our Unix-Wizard mail
is slower than snail mail these days) but I'll give it a shot. Lex and
yacc can be obscure but you can parse lots of stuff with minimal effort.
If you are only concerned with parsing strings, then lex will handle that
for you. Your strings can be whatever you define them to be and they can
include whitespace. Lex always picks the longest match on its own, and
when two rules tie on length it takes the one listed first.
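
To make the longest-match rule concrete, here is a rough sketch in Python
(not lex itself, and not how lex is implemented). The rule names and
patterns are made up for illustration. It tries every rule at the current
position, keeps the longest match, and lets the earlier rule win a tie,
which is the behavior the lex documentation describes.

```python
import re

# Hypothetical rules, listed in the order they would appear in a lex spec.
RULES = [
    ("KEYWORD", r"if|else"),
    ("IDENT",   r"[a-z]+"),
    ("STRING",  r'"[^"\n]*"'),   # a quoted string may contain blanks
    ("SPACE",   r"[ \t\n]+"),
]

def tokenize(text):
    """Return (name, lexeme) pairs using a longest-match rule."""
    compiled = [(name, re.compile(pat)) for name, pat in RULES]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, rx in compiled:
            m = rx.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the rule listed first is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError("no rule matches at offset %d" % pos)
        if best_name != "SPACE":          # whitespace between tokens is dropped
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

With these rules, "if" comes back as KEYWORD (a tie, so the first rule
wins), "iffy" comes back as IDENT (the longer match wins), and a quoted
string with a blank in it comes back as one STRING token.
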
Now if you want some context information with your string, then you will
probably need to use yacc. Set your lex program up to pick up whatever
you want your tokens to be and return a defined value that matches a terminal
declared in your yacc grammar (or you can let yacc do the defines for the
token names if you don't need to assign specific values). You need to
specify a correct grammar (or at least close to correct; shift/reduce
conflicts are usually harmless, though reduce/reduce conflicts generally
mean the grammar deserves a second look). Yacc will match the longest
production that it can, because by default it resolves a shift/reduce
conflict by shifting. A reduce/reduce conflict is resolved in favor of the
production listed first in the grammar, so make sure that you specify the
most desirable production before the less desirable one. This is easier than it
sounds but you might have to work at it a while to understand how it works.
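
For what it's worth, yacc's documented default on a shift/reduce conflict
is to shift. The classic case is the dangling else. The sketch below is a
hand-written recursive-descent parser in Python, not yacc output and not
an LALR parser, and the token names IF, ELSE and S are made up. It only
shows what "prefer the shift" does at the point where the conflict would
arise, for the assumed grammar

    stmt : IF stmt | IF stmt ELSE stmt | S

```python
def parse_stmt(tokens, pos=0):
    """Parse one stmt from a list of token names; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "S":
        return "S", pos + 1
    if tok == "IF":
        then_part, pos = parse_stmt(tokens, pos + 1)
        # Conflict point: with ELSE as the next token, a parser could
        # either finish the short "IF stmt" (reduce) or take the ELSE
        # (shift). Like yacc's default, this sketch always takes it.
        if pos < len(tokens) and tokens[pos] == "ELSE":
            else_part, pos = parse_stmt(tokens, pos + 1)
            return ("if", then_part, else_part), pos
        return ("if", then_part), pos
    raise SyntaxError("unexpected token %r" % tok)
```

Given IF IF S ELSE S, the ELSE ends up attached to the inner IF, which is
the longest-match behavior described above.
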
You probably have no need to reparse the same string if you get an error.
You will only get an error if the input cannot be matched to any of the