--
Jerry Nettleton
email: ne...@technix.mn.org
...!uunet!cs.umn.edu!kksys!edgar!technix!nett
[Bison has some extra hackery to associate a line and column number with
every token, though it still won't give you the whole line. See the next
message for some suggestions. -John]
--
Send compilers articles to comp...@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
Yacc promises to complain as soon as it sees a token it can't parse, so it's
mostly a matter of having the current line available when the error occurs.
Basically, you have to buffer the line yourself. In AT&T lex one
possibility is to rewrite the input() macro to do line buffering, but
here's a sketch of how to do it in a portable way that should work with
flex, the version of lex that all sensible people use. This should extend
in a straightforward way to multi-line records so long as the lexer can
tell where the record boundaries are.
---begin untested lex code---
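%{
#include <stdio.h>
#include <string.h>
#include "y.tab.h"          /* token codes EOL and FOO, from yacc -d */

char linebuf[500];          /* line buffer for tokens */
int curoffs;                /* start of current token */
int clear_on_next = 0;      /* set once a whole line has been buffered */

void clr_linebuf(void);
void add_linebuf(void);
%}
%%
    /* clear buffer after end of line */
\n      { add_linebuf(); clear_on_next = 1; return(EOL); }
    /* real token */
foo     { add_linebuf(); return(FOO); }
    /* ignored white space still needs to go in the buffer */
[ \t]+  { add_linebuf(); }
%%
/* initialize the line buffer */
void clr_linebuf(void)
{
    linebuf[0] = '\0';
    curoffs = 0;
    clear_on_next = 0;
}

/* add the current token to the current line */
void add_linebuf(void)
{
    if(clear_on_next)
        clr_linebuf();
    curoffs = strlen(linebuf);      /* start of current */
    /* append current; starting at the end saves strcat's rescan, */
    /* and the length limit keeps a long line from overrunning linebuf */
    strncat(linebuf+curoffs, yytext, sizeof(linebuf)-curoffs-1);
}

/* report an error */
void yyerror(char *errmsg)
{
    char *curend = linebuf+strlen(linebuf);     /* current buf end */
    char *lim = linebuf+sizeof(linebuf)-1;      /* leave room for the NUL */
    char *p;
    size_t len;

    /* get the rest of the line if not at end */
    if(!clear_on_next) {
        for(p = curend; p < lim; ) {
            int c = input();

            if(c <= 0)      /* end of input: 0 or EOF, depending on the lex */
                break;
            *p++ = c;
            if(c == '\n')
                break;
        }
        *p = 0;
        /* now give it back so lex can scan it later */
        while(p > curend)
            unput(*--p);
    }
    /* linebuf[] now has the whole line, with the current token */
    /* at curoffs */
    /* print error message and current line */
    printf("%s\n%s", errmsg, linebuf);
    len = strlen(linebuf);
    if(len == 0 || linebuf[len-1] != '\n')
        putchar('\n');      /* line was cut short by EOF or a full buffer */
    /* print an X under the most recent token */
    printf("%*sX\n", curoffs, "");  /* curoffs spaces, then X */
    /* drop the pre-read text so later tokens aren't appended after it */
    *curend = 0;
}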
---end untested lex code---
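One wrinkle with the marker line: curoffs counts characters, so if the line contains tabs the X printed after curoffs spaces won't sit under the token. Here is an equally untested sketch in plain C that copies tabs through and turns everything else into a space. The name make_marker and its arguments are made up for illustration; nothing in lex or yacc provides them.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Build the marker line for a token that starts at byte offset offs
 * in line.  Tabs before the token are copied through so the X lands in
 * the same column however the terminal expands them; every other
 * character becomes a space.
 * make_marker is a hypothetical helper, not part of lex or yacc.
 */
void make_marker(const char *line, int offs, char *out, size_t outsize)
{
    size_t i = 0;

    assert(outsize >= 2);       /* room for at least "X" and the NUL */
    while (i < (size_t)offs && line[i] != '\0' && i < outsize - 2) {
        out[i] = (line[i] == '\t') ? '\t' : ' ';
        i++;
    }
    out[i++] = 'X';
    out[i] = '\0';
}
```

In yyerror() you would call make_marker(linebuf, curoffs, marker, sizeof marker) and print marker in place of the "%*sX" line.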
Here's a slightly easier approach which pre-reads the next line after
every newline, but doesn't keep track of the location of the current
token.
---begin more untested lex code---
%%
\n.*    { /* save the next line, truncating if it is too long */
          strncpy(linebuf, yytext+1, sizeof(linebuf)-1);
          linebuf[sizeof(linebuf)-1] = '\0';
          yyless(1);    /* give back all but the \n to rescan */
        }
---end more untested lex code---
Regards,
John Levine, jo...@iecc.cambridge.ma.us, {spdcc|ima|world}!iecc!johnl
As soon as yacc detects an error it calls yyerror(). Change yyerror() so
that, instead of printing, it adds the error code or text and the character
position to a linked list or array. When an end of line is detected, check
whether the error list is empty; if it isn't, print the line and then print
each of the errors. The character position allows you to position a marker
on the line.
The downside of this is error cascading, where one error causes a large
number of spurious errors to be reported while the parser resynchronizes
with the input.
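A minimal, untested sketch of that bookkeeping in plain C, assuming a fixed-size array rather than a linked list. The names save_error, flush_errors and MAXERRS are invented for illustration, and the caller is assumed to supply the character position and a copy of the line without its trailing newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXERRS 20              /* arbitrary cap per line (an assumption) */

struct lineerr {
    int pos;                    /* character position of the bad token */
    char text[80];              /* copy of the message given to yyerror() */
};

static struct lineerr errs[MAXERRS];
static int nerrs = 0;

/* call this from yyerror(): remember the error instead of printing it */
void save_error(int pos, const char *text)
{
    assert(pos >= 0);
    if (nerrs >= MAXERRS)
        return;                 /* drop the overflow */
    errs[nerrs].pos = pos;
    strncpy(errs[nerrs].text, text, sizeof(errs[nerrs].text) - 1);
    errs[nerrs].text[sizeof(errs[nerrs].text) - 1] = '\0';
    nerrs++;
}

/*
 * Call this when the lexer sees end of line.  If any errors were saved,
 * print the line, then one marker line per error, and empty the list.
 * Returns the number of errors reported.
 */
int flush_errors(const char *line, FILE *out)
{
    int i, n = nerrs;

    if (n == 0)
        return 0;
    fprintf(out, "%s\n", line);
    for (i = 0; i < n; i++)
        fprintf(out, "%*s^ %s\n", errs[i].pos, "", errs[i].text);
    nerrs = 0;
    return n;
}
```

The cap on saved errors per line also takes some of the sting out of cascading, since only the first MAXERRS complaints for a line are kept.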
Craig.
[This certainly works, though the original question was how to get a copy
of the entire input line from lex. -John]