What's the correct interpretation of readline?

luser.droog

Oct 12, 2012, 12:11:42 AM

I've spent the past few days porting my Mandelbrot Explorer to run on xpost.
And I've run into a bizarre snag. Here's a typescript that illustrates
the problem. [I've inserted '[^D]' to indicate where I hit ctrl-D.]

515(1)10:56 PM:ps 0> cat interact.ps

{
    { (%lineedit) (r) file }
    stopped { clear exit } if
    dup bytesavailable
    string readline pop
    print (\n) print
} loop

516(1)10:59 PM:ps 0> gsnd interact.ps
GPL Ghostscript 8.62 (2008-02-29)
Copyright (C) 2008 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
a
b
c
[^D]a
b
c
GS>[^D]517(1)11:00 PM:ps 0> xpost -nd interact.ps
197 opcodes
227 definitions in systemdict
mcp interact.ps
a
%%[ Error: rangecheck; OffendingCommand: readline ]%%
Stack:
-filestream-
(a)
Exec Stack:
--quit--
{ { handleerror } --if----quit--}
false
-filestream-
--loop--
--cvx--
[ { (%lineedit)(r)file } stopped { clear exit } if dup bytesavailable
string readline pop print (
)print ]
{ pop print (
)print }
stop
mcp launching executive
Xpost Version x2vG
PS<FS>[^D]518(1)11:00 PM:ps 0>

But here's the description from the PLRM 1ed.

readline

reads a line of characters (terminated by newline character) from file
and stores them into successive elements of string. readline then
returns the substring of string that was actually filled and a boolean
indicating the outcome (true normally, false if end-of-file was
encountered before a newline character was read).

The terminating newline character is not stored into string or included
at the end of the returned substring. If readline completely fills string
before encountering a newline character, it executes the error
rangecheck.

So I've implemented this in xpost with the following C function.

OPFN_ void FSreadline(state *st, object f, object s) {
    if (!f_status(st, f)) error(st,ioerror);
    int n, c = 0;
    if (!f.flags.read) error(st,invalidaccess);
    for (n = 0; n < s.u.c.n; n++) {
        c = getc( *(FILE **)VM(f.u.c.a));
        if (c == EOF || c == '\n') break;
        STR(s)[n] = c;
    }
    //if (c != EOF)
    if (n == s.u.c.n && c != '\n') error(st,rangecheck);
    s.u.c.n = n;
    push(s);
    push(consbool(c != EOF));
}

So it seems I'm doing *what it says*. But I much prefer
ghostscript's behavior of not signaling this error here.

I toyed with adding the 'if (c != EOF)' part to suppress
the error, but I'm not sure I'm following the standard
by doing that.

Any thoughts?

luser.droog

Oct 12, 2012, 10:35:47 PM

But the 'if (c != EOF)' check won't fix the problem anyway. Since the
string *is* filled, the loop exits without reading one more char, so it
never discovers the EOF condition.

I found ghostscript's code for %lineedit
http://git.ghostscript.com/?p=ghostpdl.git;a=blob;f=gs/psi/ziodev.c;h=9099475afb1439d95905d6f8e787b85f7310d0ca;hb=HEAD

and readline:
http://git.ghostscript.com/?p=ghostpdl.git;a=blob;f=gs/psi/zfileio.c;h=aea01345874b95436cc9cad39a3afcd4febfc756;hb=HEAD

But I haven't found the answer yet. The whole approach is
much more involved, looping with callbacks on the execstack.

sjprou...@gmail.com

Oct 15, 2012, 3:01:27 PM

Is "bytesavailable" returning a valid value?

luser.droog

Oct 15, 2012, 11:34:54 PM

sjprou...@gmail.com wrote:

> Is "bytesavailable" returning a valid value?

It appears to be returning the correct result.
Xpost's %lineedit uses tmpfile(3), and bytesavailable uses fstat(2),
which both seem to work together well.
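
For illustration, one way fstat(2) could back bytesavailable is to subtract
the stream position from the file size. This is only a sketch under
assumptions: bytes_available is a hypothetical name, not xpost's actual
function, and it presumes a regular, seekable file (such as one from
tmpfile(3)) whose pending writes have already been flushed.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: bytes left to read on a regular file,
 * or -1 if that cannot be determined.
 * The caller must have flushed any pending writes on fp.
 */
static long bytes_available(FILE *fp)
{
    struct stat sb;
    long pos;

    assert(fp != NULL);
    if (fstat(fileno(fp), &sb) != 0)
        return -1;                      /* fstat failed */
    pos = ftell(fp);                    /* logical read position */
    if (pos < 0 || sb.st_size < pos)
        return -1;                      /* not seekable, or inconsistent */
    return (long)(sb.st_size - pos);
}
```

Note that if the file holds a line with no trailing newline, this count is
exactly the line length, so a string allocated from it gets filled
completely.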

sjprou...@gmail.com

Oct 16, 2012, 3:43:09 PM

On Monday, October 15, 2012 8:39:13 PM UTC-7, luser- -droog wrote:
> sjprou...@gmail.com wrote:
>> Is "bytesavailable" returning a valid value?
> It appears to be returning the correct result.
> Xpost's %lineedit uses tmpfile(3), and bytesavailable uses fstat(2),
> which both seem to work together well.

My only other idea: what newline char(s) is "readline" trying to read?

luser.droog

Oct 16, 2012, 9:22:10 PM

I'm just targeting unix (so I can make as many simplifying assumptions
as possible) so it's just looking for '\n'.

But the problem is that when the file comes from %lineedit, there's no
'\n' in the file! (Can't find that in the PLRM, I just copied ghostscript's
behavior.) So what I need is a "loophole" in this sentence:

> If readline completely fills string
> before encountering a newline character, it executes the error
> rangecheck.

Hmm. Maybe just reversing the tests in the loop. Something like

for (n=0; (c=fgetc()), c!='\n' && c!=EOF; n++) {
    if (n == str.length) error(rangecheck);
    str[n] = c;
}

So, have it read the next char _before_ checking how full the
string is. This might work!
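
A self-contained sketch of that reordering, for illustration only. The
names read_line and rl_status are hypothetical stand-ins (xpost's state,
object, and error() machinery is not reproduced), and the error is returned
as a status code rather than raised.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes standing in for the interpreter's outcomes. */
enum rl_status { RL_LINE, RL_EOF, RL_RANGECHECK };

/*
 * Read one line from fp into buf (capacity cap, not NUL-terminated).
 * The next character is fetched *before* the capacity check, so a file
 * that ends exactly when buf is full reports RL_EOF, not RL_RANGECHECK.
 * *len receives the number of bytes stored.
 */
static enum rl_status read_line(FILE *fp, char *buf, size_t cap, size_t *len)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL && len != NULL);
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (n == cap) {         /* a char arrived with nowhere to put it */
            *len = n;
            return RL_RANGECHECK;
        }
        buf[n++] = (char)c;
    }
    *len = n;
    return c == EOF ? RL_EOF : RL_LINE;
}
```

With a one-byte buffer, a file containing just "a" comes back as RL_EOF
with length 1, while "ab" followed by a newline still gives RL_RANGECHECK.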

Thanks for the sounding board!

sjprou...@gmail.com

Oct 17, 2012, 3:06:05 PM

On Tuesday, October 16, 2012 6:22:35 PM UTC-7, luser- -droog wrote:
> sjprou...@gmail.com wrote:
>> On Monday, October 15, 2012 8:39:13 PM UTC-7, luser- -droog wrote:
>>> sjprou...@gmail.com wrote:
>>>> Is "bytesavailable" returning a valid value?
>>> It appears to be returning the correct result. Xpost's %lineedit
>>> uses tmpfile(3), and bytesavailable uses fstat(2), which both seem to
>>> work together well.
>> My only other idea is what newline char(s) is "readline" trying to read?
> I'm just targeting unix (so I can make as many simplifying assumptions
> as possible) so it's just looking for '\n'.

Actually, my point was that '\n' can be interpreted as 0x0D, 0x0A, or 0x0D0A. If your implementation only looks for 0x0D0A, then it will never find a termination in files that only use 0x0A.
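
To make that concrete, here is a rough sketch of a character reader that
folds all three forms into a single '\n'. The name get_folded is made up
for illustration, and this assumes the interpreter wants the three forms
treated as equivalent; it is not taken from any existing implementation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the next character from fp, folding
 * CR (0x0D), LF (0x0A), and CR LF (0x0D 0x0A) into a single '\n'.
 * EOF is passed through unchanged.
 */
static int get_folded(FILE *fp)
{
    int c;

    assert(fp != NULL);
    c = getc(fp);
    if (c == '\r') {
        int next = getc(fp);            /* look ahead for a paired LF */
        if (next != '\n' && next != EOF)
            ungetc(next, fp);           /* lone CR: put the lookahead back */
        return '\n';
    }
    return c;
}
```

A line reader built on this only ever has to test for '\n', whichever
convention the file uses.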

luser.droog

Oct 18, 2012, 12:43:43 AM

Gotcha. The ghostscript comments mention something about the 0x0D0A case
leading to the complexity of its -readline- code.

My problem was persuading readline to "be ok" with not finding a newline
at all, since *we* know there must have been one in order for %lineedit
to terminate, but %lineedit suppressed it.

And then I had to persuade myself to "be ok" with not doing what the PLRM
says. I think I found a nice grey spot to hide in. Readline *doesn't have
to know* that it's filled the string until it has the (n+1)th character
and nowhere to put it. Thus if it encounters EOF, it needn't check
the 'filled' status at all, at that point; it checked every char going into
the string, and now it's done, EOF, finito. Why go looking for errors to
throw?