Converting a lex scanner to flex, help needed

Aharon Robbins

Dec 29, 2021, 5:36:48 PM
Hi.

I am trying to convert a V7 Unix vintage lex scanner to flex.

The rule

#.* {fixval(); xxbp = -1; return(xxcom); }

seems to be consuming as much as it can instead of stopping at
the first newline. When I look at the collected buffer, it
has multiple lines in it:

(gdb) p xxbuf
$7 = "# ========== ratfor in fortran for bootstrap ==========\n#\n# block data - initialize global variables\n#\nblock data\ncommon /cchar/ extdig(10), intdig(10), extlet(26), intlet(26), extbig(26), intbig(26"...

The program I am trying to modernize is 'struct', which reads Fortran and
produces Ratfor. The lex scanner is in the 'beautify' part. The whole
thing is at https://github.com/arnoldrobbins/struct. If you clone the
repo, check out the 'modernize' branch, and fix the makefile to compile
with gcc -m32, you will get working binaries. (64 bit and cleaning up
the warnings is work in progress.)

What am I doing wrong?

Thanks,

Arnold
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
[In flex a . doesn't match a newline. What do you see when you look at yytext, which
is the token it matched? The input buffer doesn't tell you anything very useful about
individual matched tokens. -John]
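
A minimal sketch of the point above, in plain C rather than flex: since `.` matches any character except newline, the pattern `#.*` stops just before the first `\n`, even when the surrounding buffer holds many lines. The helper `comment_match_len` is a hypothetical name invented for this illustration; it is not part of flex or of the 'struct' sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a flex function): returns how many characters
 * the pattern  #.*  would match at the start of buf.  Because '.' never
 * matches a newline, the match ends just before the first '\n', or at the
 * end of the string if there is none.  Returns 0 when buf does not start
 * with '#'.
 */
size_t comment_match_len(const char *buf)
{
    assert(buf != NULL);
    if (buf[0] != '#')
        return 0;
    /* strcspn counts the leading characters that are not in "\n" */
    return strcspn(buf, "\n");
}
```

Given a buffer like the one in the gdb output, this returns only the length of the first comment line. Inside a flex action, the comparable check is to print `yytext` and `yyleng` rather than the scanner's input buffer.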

Aharon Robbins

Dec 30, 2021, 1:23:46 PM

In article <21-1...@comp.compilers>,
Aharon Robbins <arn...@skeeve.com> wrote:
>I am trying to convert a V7 Unix vintage lex scanner to flex.
> ....
>[In flex a . doesn't match a newline. What do you see when you look at
>yytext, which is the token it matched? The input buffer doesn't tell you
>anything very useful about individual matched tokens. -John]

You're right, it looks like yytext is fine. There seems to be
other stuff going on between the grammar and the scanner, with the
grammar poking around inside the input buffer and expecting things to
work the way they did in lex.

I will probably have to dive into the code more deeply, instead of
just mechanically fixing compilation warnings, which is mostly
what I've been doing so far.

As an aside, the original code very cavalierly converted int to pointer
and back, all over. Over 40 years later, it's really hard to have
to mess with code like this.

Interestingly enough, though, when compiled in 32 bit, where int
and pointer are the same size, things seem to actually work!
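
To illustrate why the int/pointer conversions happen to work in 32 bit, here is a small sketch. It assumes a hosted C99 compiler that provides `intptr_t`; the helper names are hypothetical and do not come from the 'struct' sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: round-trip a pointer through an integer type that
 * is guaranteed to be wide enough for it (intptr_t), instead of int. */
intptr_t ptr_to_int(void *p)
{
    return (intptr_t)p;
}

void *int_to_ptr(intptr_t v)
{
    return (void *)v;
}

/* Returns 1 when a plain int is at least as wide as a pointer on this
 * platform, which is the situation in a typical 32-bit build. */
int int_holds_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}
```

On a typical 64-bit Linux build `int_holds_pointer()` returns 0, which is where storing a pointer in an `int` starts losing bits; the `intptr_t` round trip is safe in both cases.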

Thanks,

Arnold
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
[Urrgh. The file handling in flex is quite different from lex. In lex
it's very simple; I think it just reads a line at a time into a buffer.
Flex reads large blocks and uses pointers to keep track of where
it is, with some cleverness if a token spans a block boundary. In lex
yytext is an array; in flex it's normally a pointer. -John]
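
A simplified model of the array-versus-pointer difference, as a sketch only. The `token_ref` struct, `token_copy`, and `TOKEN_MAX` are hypothetical names made up for this example; real flex does more than this (for instance, it NUL-terminates `yytext` while an action runs).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_MAX 256   /* hypothetical limit, chosen for this sketch */

/* Pointer-style token: it refers into a larger input block, roughly the
 * way flex's default yytext points into the scanner's buffer. */
struct token_ref {
    const char *text;   /* start of the token inside the block */
    size_t len;         /* plays the role of yyleng */
};

/*
 * Copy the token into a private array, which is what code written for an
 * array-style yytext effectively relies on.  Returns 0 on success, or -1
 * if the token does not fit in TOKEN_MAX bytes including the terminator.
 */
int token_copy(char dst[TOKEN_MAX], struct token_ref tok)
{
    assert(dst != NULL && tok.text != NULL);
    if (tok.len >= TOKEN_MAX)
        return -1;
    memcpy(dst, tok.text, tok.len);
    dst[tok.len] = '\0';
    return 0;
}
```

Code that walks past the end of a pointer-style token lands in whatever follows it in the block, not in a private per-token array. Flex also has a `%array` directive that makes `yytext` an array, which may help when porting code that depends on the lex behavior.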