
Get_Line problem (GNAT bug?)

617 views

Maciej Sobczak

Dec 6, 2006, 9:25:31 AM
Hi,

Consider:

with Ada.Text_IO;

procedure Hello is
   use Ada.Text_IO;

   Input_Line : String(1..100);
   Last_Index : Integer range 0..100;
begin
   loop
      Put("What's your name? ");
      exit when End_Of_File;
      Get_Line(Input_Line, Last_Index);
      if Last_Index >= Input_Line'First then
         Put("Hi, ");
         Put(Input_Line(1..Last_Index));
         New_Line;
      else
         Put("You have funny empty name.");
         New_Line;
      end if;
   end loop;
end Hello;

(please focus on the Get_Line problem here)

It should be obvious what the program does, except that the behaviour in
the case of an empty input line is a bit strange.
Below, in the right-hand column I describe what keys were pressed:

$ ./hello
What's your name? Maciek -- M a c i e k ENTER
Hi, Maciek
What's your name? -- ENTER
-- ENTER
You have funny empty name.
What's your name? -- ENTER
You have funny empty name.
What's your name? -- EOF
$

As you can see, the first ENTER was somehow "swallowed": it produced an empty
line in the console (that's the echo of what the user typed), yet the program
stayed blocked in Get_Line. All subsequent ENTERs seem to be handled
correctly, which means that Get_Line returns with Last_Index <
Input_Line'First.

It has already been suggested to me that this might be a GNAT feature. If
so, it seems to be persistent, because I see it with two different versions.

Of course, I expect that empty lines are handled uniformly.

Any thoughts?

--
Maciej Sobczak : http://www.msobczak.com/
Programming : http://www.msobczak.com/prog/

Adam Beneschan

Dec 6, 2006, 1:06:25 PM

It has something to do with the End_Of_File call. I got the same
behavior that you did, but when I commented out the line "exit when
End_Of_File;" [and used another method to determine when to exit the
loop---specifically, I exited when the input was equal to "quit"], I
did not get that behavior. Note that when "standard input" is a
terminal or the equivalent, the End_Of_File call has to perform an
input operation because it has to wait for the user to press control-D
(on Unix-like systems) before End_Of_File knows what to return. It may
be that that isn't being handled quite correctly.

I also found that if I took out "exit when End_Of_File" and added an
exception handler for End_Error, the erroneous behavior didn't occur.
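
For illustration, here is a minimal sketch of that second variant: the loop
from the original post with the End_Of_File test removed and an End_Error
handler added (a reconstruction for this write-up, not Adam's actual code):

with Ada.Text_IO;

procedure Hello is
   use Ada.Text_IO;

   Input_Line : String(1..100);
   Last_Index : Integer range 0..100;
begin
   loop
      Put("What's your name? ");
      Get_Line(Input_Line, Last_Index);  -- raises End_Error at end of input
      if Last_Index >= Input_Line'First then
         Put("Hi, ");
         Put(Input_Line(1..Last_Index));
         New_Line;
      else
         Put("You have funny empty name.");
         New_Line;
      end if;
   end loop;
exception
   when End_Error =>
      null;  -- control-D (EOF) on standard input simply ends the program
end Hello;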

-- Adam

Gautier

Dec 6, 2006, 3:34:49 PM
Same behaviour on ObjectAda.
But Adam's solution works on both GNAT and ObjectAda.
Note that when you enter a non-empty string again after
an empty one, you also get something strange with the End_of_File exit.
______________________________________________________________
Gautier -- http://www.mysunrise.ch/users/gdm/index.htm
Ada programming -- http://www.mysunrise.ch/users/gdm/gsoft.htm

NB: For a direct answer, e-mail address on the Web site!

Dmitry A. Kazakov

Dec 6, 2006, 4:47:19 PM

I think this is a correct behavior. Here is my explanation of what's going
on:

When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
the end of the current line or both the line end and the file end. This is
because a file can end with a CR, and this trailing CR is not counted as
an empty line. So eventually End_Of_File should attempt to read the
following character, i.e. block. This is what happens when you start your
program and promptly hit ENTER. When you hit ENTER again, End_Of_File sees,
aha, that CR wasn't the end, and unblocks. This lets Get_Line read the
*first* CR. The second one remains in the buffer. The next round will
block, but this time not on the ENTER you will hit (that would be a third
CR), but on the *second* one. So you will observe what appears to be a
"correct" behavior, which in fact is "incorrect", because Get_Line gives
you the *previous* empty string. Who cares, all empty strings are empty.
(:-)) Now when the entered strings aren't empty, everything works because
End_Of_File reacts to the first character of each line and happily returns.

The rules of thumb I suppose I and many other Ada programmers are using:

1. Never ever use End_Of_File with text files;
2. If you yet use End_Of_File then do it for *each* character of the file;
3. As Adam has suggested, End_Error exception is the right design;
4. End_Error is not only cleaner and correct, but also more efficient.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Adam Beneschan

Dec 6, 2006, 6:40:34 PM
Dmitry A. Kazakov wrote:

> I think this is a correct behavior. Here is my explanation of what's going
> on:
>
> When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
> the end of the current line or both the line end and the file end. This is
> because a file can end with a CR, and this trailing CR is not counted as
> an empty line. So eventually End_Of_File should attempt to read the
> following character, i.e. block.

No, I don't think this makes sense. If a disk file contained the
characters M a c i e k <CR> <CR> at the end of the file, then this
should be read in as two lines: one line containing the string
"Maciek", and one empty line. You're right that this should not be
followed by yet another empty line, but that isn't relevant. After the
first Get_Line has consumed the characters M a c i e k <CR>, then, when
End_Of_Line looks ahead and sees that the next character is <CR>, it
should be able to return False immediately without needing to look
ahead to see whether <CR> is immediately followed by EOF. And I don't
see why this wouldn't apply to an interactive file as well. (We
shouldn't be saying CR in any case; Unix uses LF (Ctrl-J) as the line
separator, and Windows uses a CR-LF combination. We really should just
be talking about "end-of-line" or EOL without referring to any specific
ASCII character.)

And, in any case, if somehow "a trailing CR is not counted as an empty
line" were true, it would be true regardless of whether the program
used End_Of_File to test for end of file, or simply used Get_Line and
relied on End_Error being raised. So the behavior would have to be the
same in both cases. But that's not what I observed.

-- Adam

Björn Persson

Dec 6, 2006, 7:02:04 PM
Adam Beneschan wrote:
> After the
> first Get_Line has consumed the characters M a c i e k <CR>, then, when
> End_Of_Line looks ahead and sees that the next character is <CR>, it
> should be able to return False immediately without needing to look
> ahead to see whether <CR> is immediately followed by EOF. And I don't
> see why this wouldn't apply to an interactive file as well.

I agree that that's how it should be, but it's not how End_Of_Line is
defined: "Returns True if a file terminator is next, or if the combination
of a line, a page, and a file terminator is next; otherwise returns False."

--
Björn Persson PGP key A88682FD
omb jor ers @sv ge.
r o.b n.p son eri nu

Adam Beneschan

Dec 6, 2006, 8:09:42 PM
Björn Persson wrote:
> Adam Beneschan wrote:
> > After the
> > first Get_Line has consumed the characters M a c i e k <CR>, then, when
> > End_Of_Line looks ahead and sees that the next character is <CR>, it
> > should be able to return False immediately without needing to look
> > ahead to see whether <CR> is immediately followed by EOF. And I don't
> > see why this wouldn't apply to an interactive file as well.
>
> I agree that that's how it should be, but it's not how End_Of_Line is
> defined: "Returns True if a file terminator is next, or if the combination
> of a line, a page, and a file terminator is next; otherwise returns False."

I meant to type End_Of_File. But End_Of_File is defined the same way:
it returns True if the combination of a line, a page, and a file
terminator is next. (A.10.5(25))

However, I still don't think it should return True here. The concepts
of line, page, and file terminator are logical concepts whose actual
representation is defined by the implementation (A.10(7-8)); it's not
correct to simply assume that <CR> (or <LF> or <CR><LF>) is equivalent
to a line terminator in all cases. A.10(7) says that the end of a file
is marked by the combination of a line terminator, a page terminator,
and a file terminator. I would hope that in the above case, the
logical end-of-file is considered to be a line terminator followed by a
page terminator followed by a file terminator; the line terminator and
page terminator here would be implicit, and would not be represented by
any actual bits in the disk file, but would be logically considered to
be present. Thus, after Get_Line reads the first string, "Maciek", the
next things would be a line terminator (represented by the second <CR>)
followed by *another* line terminator (implicit, not represented by
anything in the disk file) followed by a page terminator (also
implicit) followed by a file terminator (also implicit); thus,
End_Of_File would return false.

This is all implementation-dependent. However, I'd consider that if an
implementation would return two strings for Get_Line ("Maciek" and "")
and raise End_Error only the third time Get_Line is called, but would
return True for End_Of_File after just the *first* Get_Line, then the
implementation would be in error since it isn't consistent. You may be
able to interpret the RM in a way that would make this permissible, but
the unwritten rule is that implementations are supposed to act
sensibly, and this would not make sense. (By the way, the behavior
that I believe makes no sense is also how GNAT is acting when reading
from a file opened with Open, rather than standard input.)

Somebody please correct me if I'm wrong. If I'm wrong, though, and the
RM requires the behavior I'm seeing, then the RM doesn't seem to make
sense.

-- Adam

Björn Persson

Dec 6, 2006, 8:28:00 PM
Adam Beneschan wrote:
> Björn Persson wrote:
>> Adam Beneschan wrote:
>> > After the
>> > first Get_Line has consumed the characters M a c i e k <CR>, then, when
>> > End_Of_Line looks ahead and sees that the next character is <CR>, it
>> > should be able to return False immediately without needing to look
>> > ahead to see whether <CR> is immediately followed by EOF. And I don't
>> > see why this wouldn't apply to an interactive file as well.
>>
>> I agree that that's how it should be, but it's not how End_Of_Line is
>> defined: "Returns True if a file terminator is next, or if the
>> combination of a line, a page, and a file terminator is next; otherwise
>> returns False."
>
> I meant to type End_Of_File.

So did I. I didn't even notice your mistake, and what I quoted was the
definition of End_Of_File. Oops! :-)

Steve

Dec 6, 2006, 10:34:42 PM
"Maciej Sobczak" <no....@no.spam.com> wrote in message
news:el6jss$268$1...@cernne03.cern.ch...
[snip]
>
> Any thoughts?
>
You have run across what I consider to be a weakness in the standard Text_IO
library; it is one of the few things I find really annoying about Ada, which
I really like otherwise.

In Ada you can't write a simple program to read and process lines in a text
file that looks something like:

while not End_Of_File( inFile ) loop
Get_Line( inFile, Input_Buffer, Count );
Process_Line( Input_Buffer( 1 .. Count ) );
end loop;

The problem is that Get_Line reads a line of text into a string buffer and
returns the count of characters read. The "line of text" is defined by the
next sequential characters from the current position of the file up to the
next line terminator. If the last character in the file doesn't happen to
be a line terminator, an End_Error exception is raised.

It would be much more useful if Get_Line read from the current position to
the next line terminator OR end of file. This is the way the other
languages I have used work. Because of the way the Get_Line function works
you have to jump through a bunch of inefficient hoops like reading one
character at a time in order to get the desired behavior. In my opinion
raising an exception when the end of file doesn't have an end of line
terminator should be the exceptional case and not the rule.
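
For what it's worth, one workaround in that spirit is a thin wrapper that
maps End_Error onto a "no more lines" flag, so the calling loop stays
simple. A minimal sketch (Next_Line is a made-up name, and how an
unterminated final line is reported remains implementation-dependent, as
discussed elsewhere in this thread):

with Ada.Text_IO;

procedure Next_Line
  (File : in  Ada.Text_IO.File_Type;
   Item : out String;
   Last : out Natural;
   Done : out Boolean) is
begin
   Ada.Text_IO.Get_Line (File, Item, Last);
   Done := False;
exception
   when Ada.Text_IO.End_Error =>
      Last := Item'First - 1;  -- nothing was read
      Done := True;
end Next_Line;

-- Usage, with the names from the loop above:
--
--    loop
--       Next_Line (inFile, Input_Buffer, Count, Done);
--       exit when Done;
--       Process_Line (Input_Buffer (1 .. Count));
--    end loop;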

I suspect this is the same problem you're running into with your interactive
program.

Steve
(The Duck)

Jeffrey R. Carter

Dec 7, 2006, 12:00:40 AM
Adam Beneschan wrote:
>
> However, I still don't think it should return True here. The concepts
> of line, page, and file terminator are logical concepts whose actual
> representation is defined by the implementation (A.10(7-8)); it's not
> correct to simply assume that <CR> (or <LF> or <CR><LF>) is equivalent
> to a line terminator in all cases. A.10(7) says that the end of a file
> is marked by the combination of a line terminator, a page terminator,
> and a file terminator. I would hope that in the above case, the
> logical end-of-file is considered to be a line terminator followed by a
> page terminator followed by a file terminator; the line terminator and
> page terminator here would be implicit, and would not be represented by
> any actual bits in the disk file, but would be logically considered to
> be present. Thus, after Get_Line reads the first string, "Maciek", the
> next things would be a line terminator (represented by the second <CR>)
> followed by *another* line terminator (implicit, not represented by
> anything in the disk file) followed by a page terminator (also
> implicit) followed by a file terminator (also implicit); thus,
> End_Of_File would return false.

I believe there's a permission somewhere in an AI that any or all of the
final line, page, and file terminators may be missing, in order to handle
text files not written by Ada. Many non-Ada programs that write text
files will end them with a line terminator; Ada is allowed to (and should)
interpret that as both an EOL and an EOF, just as it interprets
<EOL><EOP><EOF>.

Things are especially messy when reading standard input, because Text_IO
assumes that look ahead is possible, and with standard input, it isn't.

So, if a text file (not standard input) contained

xyz<EOL><EOL>

after a Get_Line you'd have eaten

xyz<EOL>

and both End_Of_Line and End_Of_File should return True.

However, for standard input, as Kazakov said, End_Of_File doesn't know
if this 2nd <EOL> is the last thing in the "file". So it has to wait for
more input to be buffered before it can return. There are three cases:

1. The input is <EOF>. End_Of_File returns True.

2. The input is <EOP>. End_Of_File has to wait for more input.

3. The input is anything else. End_Of_File returns False.

> Somebody please correct me if I'm wrong. If I'm wrong, though, and the
> RM requires the behavior I'm seeing, then the RM doesn't seem to make
> sense.

It comes from the need to handle files created by Text_IO (<EOL> is part
of the file terminator) and by non-Ada programs (no file terminator).

--
Jeff Carter
"People called Romanes, they go the house?"
Monty Python's Life of Brian
79

Maciej Sobczak

Dec 7, 2006, 3:26:07 AM
Dmitry A. Kazakov wrote:

> When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
> the end of the current line or both the line end and the file end.

I understand. But then I consider the specs to be broken, see below. :-)

> The rules of thumb I suppose I and many other Ada programmers are using:
>
> 1. Never ever use End_Of_File with text files;

This is broken. For me, End_Of_File is a concept that is completely
orthogonal to what the file contains and how it is interpreted.
It is true that it is defined in Text_IO and therefore it might be a
hint that actually EOF is somehow entangled with interpretation of text
markers, but that's not what I would expect.

> 2. If you yet use End_Of_File then do it for *each* character of the file;

I don't see how it might solve this problem - End_Of_File would block
after first <CR> anyway.

> 3. As Adam has suggested, End_Error exception is the right design;

I don't find it to be "right". For me, exception is something that is
unexpected. An error, usually. With files, EOF happens - on average -
almost once per file and it is something that I expect. There is nothing
unusual in the fact that there is no more data in a file, *unless* in
the given environment we really want to work with conceptually infinite
data sources. But this is an exception (no pun intended) and is rather
domain-specific, so should not be imposed by the general-purpose library.

EOF might be an error if, for example, with some known file format the
parser expected some more data and there isn't any (like with the XML
file which was truncated somewhere in the middle). *That* would be the
reason to raise an exception, but then it's the business of the parser,
not IO library. The library should not interfere with my interpretation
of what I want to see in the file.

> 4. End_Error is not only cleaner and correct, but also more efficient.

It's not cleaner for me (see above), it is very likely correct (in the
sense that the real solution is broken) and I don't see on what basis it
is expected to be more efficient. :-)

Jean-Pierre Rosen

Dec 7, 2006, 4:14:05 AM
Dmitry A. Kazakov a écrit :

> The rules of thumb I suppose I and many other Ada programmers are using:
>
> 1. Never ever use End_Of_File with text files;
> 2. If you yet use End_Of_File then do it for *each* character of the file;
> 3. As Adam has suggested, End_Error exception is the right design;
> 4. End_Error is not only cleaner and correct, but also more efficient.
>
And I'll add another one:
5. End_Error works correctly with badly formed files, i.e. files whose
last line does not end with an end-of-line mark. You can get strange
behaviour in that case with any other solution.

--
---------------------------------------------------------
J-P. Rosen (ro...@adalog.fr)
Visit Adalog's web site at http://www.adalog.fr

Jean-Pierre Rosen

Dec 7, 2006, 4:21:36 AM
Maciej Sobczak a écrit :

>> 3. As Adam has suggested, End_Error exception is the right design;
>
> I don't find it to be "right". For me, exception is something that is
> unexpected. An error, usually.
I beg to differ. Otherwise, they would be called "errors" or "abnormal".
An exception corresponds to an "exceptional" event, i.e. something that
prevents usual processing, and forces you to an exceptional handling. If
you consider that normal processing (normally the body of a loop) is the
rule, it is an exception to the rule. And EOF certainly matches this
definition (as is an error, of course, but it is not the only case).

Exceptions are a useful tool in the programmers tool box. The narrow
view of exceptions as just a way to handle errors is a mistake that
prevents elegant solutions to some problems, in my view, and I've been
constantly fighting against that.

Dmitry A. Kazakov

Dec 7, 2006, 5:22:19 AM
On Thu, 07 Dec 2006 09:26:07 +0100, Maciej Sobczak wrote:

> Dmitry A. Kazakov wrote:
>
>> When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
>> the end of the current line or both the line end and the file end.
>
> I understand. But then I consider the specs to be broken, see below. :-)

No. It is the concept, which is broken. And that wasn't Ada, who broke it,
but crippled operating systems like Windows and Unix. In a proper OS the
line terminator is not a character.

>> The rules of thumb I suppose I and many other Ada programmers are using:
>>
>> 1. Never ever use End_Of_File with text files;
>
> This is broken. For me, End_Of_File is a concept that is completely
> orthogonal to what the file contains and how it is interpreted.

Right, so see above. You need a file system which has EOF state
determinable without look ahead. It is not the language's business.

[Though I don't defend End_Of_File. I would simply remove it from Text_IO.]

> It is true that it is defined in Text_IO and therefore it might be a
> hint that actually EOF is somehow entangled with interpretation of text
> markers, but that's not what I would expect.

Huh, what did you expect buying Windows? (:-))

>> 2. If you yet use End_Of_File then do it for *each* character of the file;
>
> I don't see how it might solve this problem - End_Of_File would block
> after first <CR> anyway.

Yes, but then at least you would know what's going on. End_Of_File happened
to be lower level (in OSI hierarchy terms) than Get_Line. Mixing levels is
asking for trouble. Because End_Of_File drags you into character-oriented
encoding issues, you have to face them head-on.

>> 3. As Adam has suggested, End_Error exception is the right design;
>
> I don't find it to be "right". For me, exception is something that is

> unexpected. An error, usually. [...]

OK, this is a bit "theological" issue... (:-))

My answer is no. Exception is not an error. It indicates an exceptional
state. Note that an exceptional state is a *valid* state. While an error
(bug) has no corresponding program state at all.

Sure, an exception may indicate an error, but this error is *never* an
error of the program in which it is handled. For example, Segmentation
Fault is an error of an application, but a valid state for the OS which has
spawned that erroneous application and where the Segmentation Fault is
handled.

To summarize: we should distinguish errors (and states) of the problem and
solution spaces. Exceptions live in the latter; what you meant lives in the
former.

>> 4. End_Error is not only cleaner and correct, but also more efficient.
>
> It's not cleaner for me (see above), it is very likely correct (in the
> sense that the real solution is broken) and I don't see on what basis it
> is expected to be more efficient. :-)

This is because you consider it from the C++ stand point. In Ada exceptions
are efficient. They are highly recommended for use in place of return
codes. This is a *good* design. End_Of_File in your program serves the
purpose of return code. What is even worse, from the software design
perspective, is that one operation "give me next line" is split into two,
so that the side effect of one determines the outcome of another and
reverse. It is a very fragile (and wrong) design. Just notice how much
effort it requires to analyse. And what for? To defend a myth, that each
loop should have only one exit? Your code didn't manage that either!
Neither does it manage inputs longer than 99 characters. Do you call it clean?
Further, even in C you wouldn't use it either. You probably would turn to
fgetc (or even to fread):

int Char;
// Do we all know that characters are integers? char is just for fun,
// real programmers are using int for all datatypes! (:-))

while (EOF != (Char = fgetc (File)))
{
...

Ludovic Brenta

Dec 7, 2006, 8:35:06 AM
Jean-Pierre Rosen <ro...@adalog.fr> writes:
> I beg to differ. Otherwise, they would be called "errors" or "abnormal".
> An exception corresponds to an "exceptional" event, i.e. something
> that prevents usual processing, and forces you to an exceptional
> handling. If you consider that normal processing (normally the body of
> a loop) is the rule, it is an exception to the rule. And EOF certainly
> matches this definition (as is an error, of course, but it is not the
> only case).
>
> Exceptions are a useful tool in the programmers tool box. The narrow
> view of exceptions as just a way to handle errors is a mistake that
> prevents elegant solutions to some problems, in my view, and I've been
> constantly fighting against that.

I agree with you in the general case where you have an Ada run-time
library that propagates exceptions. But there are cases where your
Ada run-time library does not propagate exceptions, or where you don't
have a run-time library at all. In those particular cases, you have
to use the "narrow view" whereby an exception is something so
exceptional that it "should never happen" (tm).

In the OP's case, handling EOF as an exception is just fine, since
file I/O implies the existence of a suitable run-time library.

--
Ludovic Brenta.

Maciej Sobczak

Dec 7, 2006, 9:51:50 AM
Dmitry A. Kazakov wrote:

>>> When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
>>> the end of the current line or both the line end and the file end.
>> I understand. But then I consider the specs to be broken, see below. :-)
>
> No. It is the concept, which is broken. And that wasn't Ada, who broke it,
> but crippled operating systems like Windows and Unix. In a proper OS the
> line terminator is not a character.

That's a brave concept. ;-)
Why do you assign the special meaning to the line terminator? Is it
*that* special in, for example, your previous post? Why not assign the
special meaning to word or sentence terminator? Or anything else for
that matter. Isn't it domain-specific? Do you want support from "proper
OS" for all this stuff?

>>> 1. Never ever use End_Of_File with text files;
>> This is broken. For me, End_Of_File is a concept that is completely
>> orthogonal to what the file contains and how it is interpreted.
>
> Right, so see above. You need a file system which has EOF state
> determinable without look ahead.

No. I might be using pipes or fifos or sockets or whatever else where
EOF is not really determinable by position. It's not file system issue.

> [Though I don't defend End_Of_File. I would simply remove it from Text_IO.]

But then it would be somewhere else. There would be an opportunity to
specify it correctly, without messing with interpretation of the file
structure.

>> It is true that it is defined in Text_IO and therefore it might be a
>> hint that actually EOF is somehow entangled with interpretation of text
>> markers, but that's not what I would expect.
>
> Huh, what did you expect buying Windows? (:-))

I don't understand this - Windows is not involved in this discussion at
all (not mentioning buying it).

>>> 2. If you yet use End_Of_File then do it for *each* character of the file;
>> I don't see how it might solve this problem - End_Of_File would block
>> after first <CR> anyway.
>
> Yes, but then at least you would know what's going on. End_Of_File happened
> to be lower level (in OSI hierarchy terms) than Get_Line.

This is exactly what I would expect. End_Of_File should not mess with
file structure.

> Mixing levels is
> asking for trouble.

Agreed.

>>> 3. As Adam has suggested, End_Error exception is the right design;
>> I don't find it to be "right". For me, exception is something that is
>> unexpected. An error, usually. [...]
>
> OK, this is a bit "theological" issue... (:-))

Indeed. :-)

> My answer is no. Exception is not an error. It indicates an exceptional
> state. Note that an exceptional state is a *valid* state. While an error
> (bug) has no corresponding program state at all.

It's not about bugs. I have presented an example of truncated XML file -
there's no bug in a program that happened to be given a broken file to
digest. It's an error in a sense that the program cannot read the data
that it genuinely expects. Still, the program should handle this case
reasonably, so we have valid state.

Now, if the program specs says: "read the lines from input until EOF",
then this for me immediately translates into a loop with some exit
condition. A while loop, probably, or something in this area. "Read
until" - you have a regular end-of-sequence condition here. Close to
iterators. How do you write iteration routines? Do you use exceptions
for the end-of-sequence condition to break the loop? In what way is
iteration over the container different from reading lines from input?

(Probably the best thing would be to just have "line iterators".)

Sorry, I'm not convinced that exception might be a correct design choice
for breaking the loop that reads data from well formatted file.

> This is because you consider it from the C++ stand point.

Which is, of course, evil by definition. ;-)

> In Ada exceptions
> are efficient.

So how do you write iteration routines?

> They are highly recommended for use in place of return
> codes.

Good point. There is no return code here. Everything is managed at the
same level. The red-light here is that with exceptions I would need to
use empty "where" clause. Empty clause at the same level? Looks like
goto in disguise.

> End_Of_File in your program serves the
> purpose of return code.

Nope. It's the end-of-sequence condition. Just like with iterators.

> What is even worse, from the software design
> perspective, is that one operation "give me next line" is split into two,
> so that the side effect of one determines the outcome of another and
> reverse.

Same with iterators.

> It is a very fragile (and wrong) design. Just notice how much
> effort it requires to analyse.

Frankly, I don't find it difficult. Ada is very readable, you know. ;-)

> And what for? To defend a myth, that each
> loop should have only one exit?

If the specs says "read until end", then this means single exit
condition to me.

> Your code didn't manage that either!

Why?

> Neither does it manage inputs longer than 99 characters.

Good point. How should I solve this?

> Do you call it clean?

Yes.

> Further, even in C you wouldn't use it either.

You're right, I wouldn't. Why would I use C if C++ gets it right? ;-)

string line;
while (getline(cin, line))
{
// play with line here
}

Dmitry A. Kazakov

Dec 7, 2006, 11:29:27 AM
On Thu, 07 Dec 2006 15:51:50 +0100, Maciej Sobczak wrote:

> Dmitry A. Kazakov wrote:
>
>>>> When End_Of_File meets a CR (Ctrl-M) it cannot know if this CR manifests
>>>> the end of the current line or both the line end and the file end.
>>> I understand. But then I consider the specs to be broken, see below. :-)
>>
>> No. It is the concept, which is broken. And that wasn't Ada, who broke it,
>> but crippled operating systems like Windows and Unix. In a proper OS the
>> line terminator is not a character.
>
> That's a brave concept. ;-)
> Why do you assign the special meaning to the line terminator?

Because a line can contain any character. Purely mathematically referential
recursion is known to be flawed beyond repair. "All Cretans are liars."

> Is it
> *that* special in, for example, your previous post? Why not assign the
> special meaning to word or sentence terminator? Or anything else for
> that matter. Isn't it domain-specific? Do you want support from "proper
> OS" for all this stuff?

Yes. A properly designed OS would have a container object to represent
formatted things.

>>>> 1. Never ever use End_Of_File with text files;
>>> This is broken. For me, End_Of_File is a concept that is completely
>>> orthogonal to what the file contains and how it is interpreted.
>>
>> Right, so see above. You need a file system which has EOF state
>> determinable without look ahead.
>
> No. I might be using pipes or fifos or sockets or whatever else where
> EOF is not really determinable by position. It's not file system issue.

But an OS doesn't need pipes or sockets, any more than it needs files. It
needs objects with clearly defined behavior. If the behavior is defined to
support iteration and string items, that's called a container of strings.

>> [Though I don't defend End_Of_File. I would simply remove it from Text_IO.]
>
> But then it would be somewhere else. There would be an opportunity to
> specify it correctly, without messing with interpretation of the file
> structure.

Yes, it should be a container.

>>>> 2. If you yet use End_Of_File then do it for *each* character of the file;
>>> I don't see how it might solve this problem - End_Of_File would block
>>> after first <CR> anyway.
>>
>> Yes, but then at least you would know what's going on. End_Of_File happened
>> to be lower level (in OSI hierarchy terms) than Get_Line.
>
> This is exactly what I would expect. End_Of_File should not mess with
> file structure.

But the package is called Text_IO. Texts have a structure. Hence the problem.

>> My answer is no. Exception is not an error. It indicates an exceptional
>> state. Note that an exceptional state is a *valid* state. While an error
>> (bug) has no corresponding program state at all.
>
> It's not about bugs. I have presented an example of truncated XML file -
> there's no bug in a program that happened to be given a broken file to
> digest. It's an error in a sense that the program cannot read the data
> that it genuinely expects. Still, the program should handle this case
> reasonably, so we have valid state.

It is an error in a file; it is not an error in the program. Consider a
defective HDD. Would an exception be appropriate here?

> Now, if the program specs says: "read the lines from input until EOF",
> then this for me immediately translates into a loop with some exit
> condition. A while loop, probably, or something in this area. "Read
> until" - you have a regular end-of-sequence condition here. Close to
> iterators. How do you write iteration routines? Do you use exceptions
> for the end-of-sequence condition to break the loop? In what way is
> iteration over the container different from reading lines from input?
>
> (Probably the best thing would be to just have "line iterators".)

See below.

> Sorry, I'm not convinced that exception might be a correct design choice
> for breaking the loop that reads data from well formatted file.

Not only that. I am using exceptions for parsing sources. It fits very
nicely for recursive descent parsing, makes things a lot cleaner and
easier.



>> This is because you consider it from the C++ stand point.
>
> Which is, of course, evil by definition. ;-)
>
>> In Ada exceptions
>> are efficient.
>
> So how do you write iteration routines?

If you mean the case when the number of iterations is statically
indeterminable, then yes, using exceptions. Especially when iteration is
mixed with recursion.

>> They are highly recommended for use in place of return
>> codes.
>
> Good point. There is no return code here. Everything is managed at the
> same level. The red-light here is that with exceptions I would need to
> use empty "where" clause. Empty clause at the same level?

Umm, I didn't understand it, but the skeleton code looks like:

Protocol_Error : exception;

begin
loop
Line := Get_Line (Source);
-- do something. This may raise an exception as well
end loop;
exception
when End_Error =>
-- done due to file end
when Data_Error =>
-- due to I/O error
when Protocol_Error =>
-- due to protocol error
...
end;

> Looks like goto in disguise.

Any execution flow control is. So exceptions are as well.

>> End_Of_File in your program serves the
>> purpose of return code.
>
> Nope. It's the end-of-sequence condition. Just like with iterators.

But Get_Line already has a result, which is a string. String is not a
condition.

>> What is even worse, from the software design
>> perspective, is that one operation "give me next line" is split into two,
>> so that the side effect of one determines the outcome of another and
>> reverse.
>
> Same with iterators.

That's a different idiom. Iterators assume an indexed container. You could
use iterators for dealing with a container of strings. But a stream isn't
one. It is again about mixing abstraction levels. You can convert a
character stream into a sequence of strings, but the stream itself is a
container of characters, not lines. While a text file is a third thing.

>> And what for? To defend a myth, that each
>> loop should have only one exit?
>
> If the specs says "read until end", then this means single exit
> condition to me.

No. This is mixing problem and solution spaces. What if I had a concurrent
program, which would map the file into virtual memory. Then I could split
that memory into 10 pieces and let 10 tasks "read it until end."

>> Your code didn't manage that either!
>
> Why?

Because it contained a hidden goto: "exit when!" (:-))

>> Neither does it manage inputs longer than 99 characters.
>
> Good point. How should I solve this?

By making the main loop deal with lines instead of reads. This is
another reason why it is not a clean iteration idiom.

>> Further, even in C you wouldn't use it either.
>
> You're right, I wouldn't. Why would I use C if C++ gets it right? ;-)
>
> string line;
> while (getline(cin, line))
> {
> // play with line here
> }

That's OK to me. However, it is not that clean. line outlives the loop. But
it is not equivalent to your Ada code, because you chose fixed-length
strings. An Ada equivalent of your C++ example would use Unbounded_String.
Then what happens upon read error, reading the system paging file?
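
For reference, a rough sketch of what such an Unbounded_String version
might look like, assuming an Ada 2005 library with
Ada.Text_IO.Unbounded_IO (illustrative only; the procedure name Echo_Lines
is made up and no such code appears in the thread):

with Ada.Text_IO;            use Ada.Text_IO;
with Ada.Strings.Unbounded;  use Ada.Strings.Unbounded;
with Ada.Text_IO.Unbounded_IO;

procedure Echo_Lines is
   Line : Unbounded_String;
begin
   loop
      --  Reads a whole line of any length; raises End_Error at end of input.
      Line := Ada.Text_IO.Unbounded_IO.Get_Line;
      --  play with Line here
      Put_Line ("Read: " & To_String (Line));
   end loop;
exception
   when End_Error =>
      null;  -- no more input
end Echo_Lines;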

Adam Beneschan

Dec 7, 2006, 12:42:13 PM
Steve wrote:
> "Maciej Sobczak" <no....@no.spam.com> wrote in message
> news:el6jss$268$1...@cernne03.cern.ch...
> [snip]
> >
> > Any thoughts?
> >
> You have run across what I consider to be a weakness in the standard Text_IO
> library; it is one of the few things I find really annoying about Ada, which
> I really like otherwise.
>
> In Ada you can't write a simple program to read and process lines in a text
> file that looks something like:
>
> while not End_Of_File( inFile ) loop
> Get_Line( inFile, Input_Buffer, Count );
> Process_Line( Input_Buffer( 1 .. Count ) );
> end loop;
>
> The problem is that Get_Line reads a line of text into a string buffer and
> returns the count of characters read.

Nitpick: Get_Line returns the *index* of the last character it has read
into the buffer (or Input_Buffer'First-1 if none), not the count. The
two are equivalent if the 'First of the buffer is 1. But it doesn't
have to be. You can say something like

Get_Line (InFile, Some_Buffer(10..50), Last);

and if it reads three characters, the three characters will be read
into Some_Buffer(10..12) and Last will be set to 12.

> The "line of text" is defined by the
> next sequential characters from the current position of the file up to the
> next line terminator. If the last character in the file doesn't happen to
> be a line terminator, an End_Error exception is raised.

This, I think, is implementation-dependent. Ada makes it clear that a
"line terminator" is a logical concept, not a character. There is
nothing stopping a Unix-type implementation from saying that if a file
does not end with an LF character, then the implementation will consider
the end-of-file to be accompanied by an *implicit* line terminator that
is not represented by any bits in the file, but is conceptually
considered to be there anyway. Ada allows this interpretation.

It's a mistake to think of a line terminator as a "character"; this is
probably a common mistake among those used to using C and Unix (or
Linux or other OS's ending in "x") (or Solaris), but it's not even
correct to say that a line terminator is a character on other OS's. On
Windows, it's two characters (CR-LF). VMS, as I recall, doesn't use
any character as a line terminator, but keeps information on where
lines start and how long they are in an index or somewhere. Ada is a
portable language that needs to be implementable on all these systems
and others, so therefore it left the definition of "line terminator" up
to the implementation and did not tie us to the concept that there has
to be a "character" involved.

-- Adam

Robert A Duff

Dec 7, 2006, 5:23:34 PM
Jean-Pierre Rosen <ro...@adalog.fr> writes:

> Maciej Sobczak a écrit :
>>> 3. As Adam has suggested, End_Error exception is the right design;
>> I don't find it to be "right". For me, exception is something that is
>> unexpected. An error, usually.
> I beg to differ. Otherwise, they would be called "errors" or "abnormal".
> An exception corresponds to an "exceptional" event, i.e. something that
> prevents usual processing, and forces you to an exceptional handling.

I basically agree with Jean-Pierre here. I like to put it this way:
exceptions are for separating different parts of the software that have
different views of what's an "error" and/or what to do about it.

For example, extracting something from a container, when that
"something" does not exist (because the container is empty, or
the lookup fails, or whatever). From the point of view of the container
package, this could be considered an "error". From the point of view of
the client, it could be an error/bug, or it could be perfectly normal
and recoverable. So the container raises an exception, and the client
gets to decide what to do about it (handle the exception and continue,
or write the client code so the exception won't happen, or ...).

> ... If you consider that normal processing (normally the body of a loop) is
> the rule, it is an exception to the rule.

Well, I'm not sure I agree with that. I mean, if you're looping through
an array, you normally check against 'Last, via "while", or "exit when",
or implicitly via "for I in X'Range...". You don't (normally) let your
index go past the end, and then handle the Constraint_Error outside the
loop.

It seems like looping through a file ought be similar to looping through
an array, in this regard.

> ... And EOF certainly matches this definition (as is an error, of course,
> but it is not the only case).
>
> Exceptions are a useful tool in the programmers tool box. The narrow
> view of exceptions as just a way to handle errors is a mistake that
> prevents elegant solutions to some problems, in my view, and I've been
> constantly fighting against that.

Agreed -- if one says "exceptions are only for errors", then one has to
define "error", and that's impossible, since "error" depends on the
point of view.

- Bob

Robert A Duff

Dec 7, 2006, 5:35:00 PM
"Adam Beneschan" <ad...@irvine.com> writes:

> Steve wrote:
>> "Maciej Sobczak" <no....@no.spam.com> wrote in message
>> news:el6jss$268$1...@cernne03.cern.ch...
>> [snip]
>> >
>> > Any thoughts?
>> >
>> You have run across what I consider to be a weakness in the standard Text_IO
>> library; it is one of the few things I find really annoying about Ada, which
>> I really like otherwise.
>>
>> In Ada you can't write a simple program to read and process lines in a text
>> file that looks something like:
>>
>> while not End_Of_File( inFile ) loop
>> Get_Line( inFile, Input_Buffer, Count );
>> Process_Line( Input_Buffer( 1 .. Count ) );
>> end loop;
>>
>> The problem is that Get_Line reads a line of text into a string buffer and
>> returns the count of characters read.

Another problem is that this encourages programmers to place arbitrary
limitations on things (line length in this case -- there's no sensible
way to choose the length of Input_Buffer).

In Ada 2005, you can use the Get_Line function:

Input_Buffer: constant String := Get_Line(inFile);

which returns the line, however long it is. (Well, there's still an
arbitrary limitation, if your disk can hold 100 gigabytes, but String can
hold only 2**31-1 bytes. Oh well, at least that limitation is not quite
so severe. ;-))
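
A sketch of how a whole-file loop might look with that Ada 2005 function
(Dump_Lines is a made-up name; Put_Line stands in for whatever real
processing you would do):

with Ada.Text_IO;  use Ada.Text_IO;

procedure Dump_Lines (Name : String) is
   Input : File_Type;
begin
   Open (Input, In_File, Name);
   loop
      declare
         Line : constant String := Get_Line (Input);  -- whole line, any length
      begin
         Put_Line (Line);
      end;
   end loop;
exception
   when End_Error =>
      Close (Input);  -- normal termination: no more lines
end Dump_Lines;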

There are many problems with the design of Text_IO...

> It's a mistake to think of a line terminator as a "character"; this is
> probably a common mistake among those used to using C and Unix (or
> Linux or other OS's ending in "x") (or Solaris), but it's not even
> correct to say that a line terminator is a character on other OS's. On
> Windows, it's two characters (CR-LF). VMS, as I recall, doesn't use
> any character as a line terminator, but keeps information on where
> lines start and how long they are in an index or somewhere. Ada is a
> portable language that needs to be implementable on all these systems
> and others, so therefore it left the definition of "line terminator" up
                 ^^^^^^^^^
> to the implementation and did not tie us to the concept that there has
> to be a "character" involved.

I believe the above is historically correct, but I don't buy the
"therefore" above. For portability, Ada requires the implementation to
translate from whatever OS conventions to Ada's conventions. It could
just as well have required translating to the Unix convention instead of
translating the rather record-oriented style of Text_IO. The Unix
convention has certain advantages.

- Bob

Robert A Duff

Dec 7, 2006, 5:50:50 PM
"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:

> No. It is the concept, which is broken. And that wasn't Ada, who broke it,
> but crippled operating systems like Windows and Unix. In a proper OS the
> line terminator is not a character.

Why do you say so? The concept of "sequence of characters", which
includes blanks and end-of-line chars, seems pretty good to me.
(Other control chars, such as tabs, should be banished to the far
side of the moon.)

I think the Unix idea -- "line terminator (or separator?) = a particular
character" -- is a pretty convenient model. It's certainly convenient
for parsing input text: read one char at a time, and deal with it,
treating end-of-line as one possible case. E.g., an Ada compiler
typically works that way.

How much human intellectual effort has been wasted by having to deal
with "text mode" versus "binary mode" ftp?! The unix model makes them
identical, and if all operating systems had magically agreed on that
from the dawn of time, we'd all be better off.

To represent end-of-line as TWO characters is just plain stupid.
Even a manual typewriter has a single lever that does both (returns the
carriage, and feeds the line).

Note: I don't know of any Ada compiler that uses Text_IO to read the Ada
source code to be compiled.

- Bob

Randy Brukardt

Dec 7, 2006, 7:13:41 PM
"Robert A Duff" <bob...@shell01.TheWorld.com> wrote in message
news:wccfybr...@shell01.TheWorld.com...

> Note: I don't know of any Ada compiler that uses Text_IO to read the Ada
> source code to be compiled.

Janus/Ada did originally, back in the early days when we had a partial
implementation of everything. Once we finished the complete Text_IO, though,
the whole thing became too slow. Indeed, these days all of the compiler's IO
is done directly through the lowest of our I/O layers (we called it
"Basic_IO", it's vaguely like Stream_IO).

My two cents on this silly discussion:

(1) The definition of End_of_File requires reading ahead as many as 4
characters. End_of_Line similarly requires read-ahead in some cases. This
requirement has a significant impact on the entire Text_IO (once you've read
those characters ahead, you have to save them somewhere for future use. But
regular buffering would make Standard_Input from the keyboard unusable...).
The requirement for lookahead means that it should *never* be called on
anything that can't be buffered, like the keyboard. So using End_of_File on
Standard_Input is always a mistake.

Besides, it doesn't make sense for a keyboard to even have an EOF. Systems
that allow it - like UNIX - are more likely to cause problems because of an
accidental EOF than any possible use. Way back, we had a CP/M machine which
treated <Ctrl>-Z from the keyboard as closing the keyboard - a reboot was
required to fix it. The machine actually had a <Ctrl>-Z key!! That often
caused loss of work when the keyboard input to the editor suddenly became
closed... End-of-file from Standard_Input is *the* classic example of an
exceptional condition that shouldn't clutter the "normal" code.

(2) The implementation of Text_IO is *very* complicated, especially by
things that are hardly ever used like page terminators, and line and page
counts. Some routines are especially bad; End_of_File is one of these.
Because of this substantial overhead, it's usually far more efficient to
read a file with an infinite loop terminated by an exception.

(3) Text_IO.Get_Line has to read a character at a time. This can be as much
as ten times slower than other methods of reading input. So, if performance
is critical, it's probably best to read and interpret the file another way.
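
As a rough illustration of point (3) -- reading the file in bulk and
interpreting it yourself -- something along these lines is possible with
Stream_IO (a sketch only, with a made-up name; this is not Janus/Ada's
Basic_IO):

with Ada.Streams;            use Ada.Streams;
with Ada.Streams.Stream_IO;  use Ada.Streams.Stream_IO;

procedure Slurp (Name : String) is
   File : File_Type;
begin
   Open (File, In_File, Name);
   declare
      --  Fine for modest files; a real implementation would read in chunks.
      Length : constant Stream_Element_Offset :=
        Stream_Element_Offset (Size (File));
      Buffer : Stream_Element_Array (1 .. Length);
      Last   : Stream_Element_Offset;
   begin
      Read (File, Buffer, Last);
      Close (File);
      --  Buffer (1 .. Last) now holds the raw bytes; scan them for LF or
      --  CR-LF (or whatever the platform uses) to split them into lines.
   end;
end Slurp;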

(4) All of this behavior is required by the RM and is enforced by the ACATS.
An implementation that doesn't return End_of_File = True for a file
containing just a blank line will fail the ACATS.

(5) Does this mean that the definition of Text_IO is screwy and
over-complex? Absolutely. But there is absolutely no chance that there will
be any change in the definition of Ada.Text_IO -- it would break a large
percentage of existing Ada programs. So, unless you're designing a
replacement language for Ada, there's no point whining about it. The
definition of the language is not going to change; compilers are not going
to change. Live with it. And that means that for 99% of programs, calling
End_of_File is just wrong; handle End_Error instead. Sorry if that hurts
your sensibilities.

Randy.


Larry Kilgallen

Dec 7, 2006, 11:04:14 PM
In article <wccfybr...@shell01.TheWorld.com>, Robert A Duff <bob...@shell01.TheWorld.com> writes:

> I think the Unix idea -- "line terminator (or separator?) = a particular
> character" -- is a pretty convenient model. It's certainly convenient
> for parsing input text: read one char at a time, and deal with it,
> treating end-of-line as one possible case. E.g., an Ada compiler
> typically works that way.

It constrains the ASCII values that can be in a record. That may not
be important for compilers, but I write programs where record boundaries
different from "new line" are quite useful in the output files.

> To represent end-of-line as TWO characters is just plain stupid.

I will raise (lower) your TWO characters and say that inline record
boundaries are foolish.

Dmitry A. Kazakov

Dec 8, 2006, 4:11:15 AM
On Thu, 07 Dec 2006 17:50:50 -0500, Robert A Duff wrote:

> "Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
>
>> No. It is the concept, which is broken. And that wasn't Ada, who broke it,
>> but crippled operating systems like Windows and Unix. In a proper OS the
>> line terminator is not a character.
>
> Why do you say so? The concept of "sequence of characters", which
> includes blanks and end-of-line chars, seems pretty good to me.
> (Other control chars, such as tabs, should be banished to the far
> side of the moon.)

The concept of a sequence of characters is OK, but it is not text I/O. It
is just String. What I mean is that text I/O cannot be defined in such
terms. If we did that, we would implicitly specify a certain encoding
format, which is OS/presentation specific. It would be OK if there were
only one OS and only one presentation format. But, to give an extreme
example, a text in HTML format should be readable using Text_IO without
seeing any <BR> tags.

> I think the Unix idea -- "line terminator (or separator?) = a particular
> character" -- is a pretty convenient model. It's certainly convenient
> for parsing input text: read one char at a time, and deal with it,
> treating end-of-line as one possible case. E.g., an Ada compiler
> typically works that way.

I don't think so. From the compiler construction perspective this
presentation format is very unfortunate, because you don't know in advance
how long a source line is. (Ada program validity depends on line ends.)
Further areas where this idea works quite poorly are networking (there is
no native way to block packets) and keyboard input. That is where all these
buffer overrun issues are rooted. And in general it leads nowhere. What
about EOF character? What about "abs", "declare", "loop" etc characters?
(:-))

> How much human intellectual effort has been wasted by having to deal
> with "text mode" versus "binary mode" ftp?! The unix model makes them
> identical, and if all operating systems had magically agreed on that
> from the dawn of time, we'd all be better off.

Absolutely, that is exactly my point. It is the flawed Unix model which
considers texts, executables, databases and mouse buttons as sequences of
characters. It is untyped. They should be different ADTs! (:-))

Maciej Sobczak

Dec 8, 2006, 3:22:09 AM
Dmitry A. Kazakov wrote:

>> Why do you assign the special meaning to the line terminator?
>
> Because a line can contain any character.

That's fine, but it doesn't change much from the Get_Line point of view.

>>> My answer is no. Exception is not an error. It indicates an exceptional
>>> state. Note that an exceptional state is a *valid* state. While an error
>>> (bug) has no corresponding program state at all.
>> It's not about bugs. I have presented an example of truncated XML file -
>> there's no bug in a program that happened to be given a broken file to
>> digest. It's an error in a sense that the program cannot read the data
>> that it genuinely expects. Still, the program should handle this case
>> reasonably, so we have valid state.
>
> It is an error in a file; it is not an error in the program. Consider a
> defective HDD. Would an exception be appropriate here?

Yes. As in disconnected NFS, and so on.

>> Sorry, I'm not convinced that exception might be a correct design choice
>> for breaking the loop that reads data from well formatted file.
>
> Not only that. I am using exceptions for parsing sources. It fits very
> nicely for recursive descent parsing, makes things a lot cleaner and
> easier.

That's a different kettle of fish. A recursive descent parser does not
really iterate over things - it *accepts* tokens. The accepting is what
makes the difference between parsing and iterating. There is failure
logic built into the parser that is not present in iteration.

It's interesting that you mention a recursive descent parser, because this
was actually the background for my original problem.
Just a few days ago I wanted to practice with some Ada "homework" and
decided to write a simple line-oriented calculator. There is a parser,
of course, and it has a simple grammar with just four productions. The
parser accepts tokens that it expects according to its grammar and
raises an exception when the expected token is not there. The exception
is then handled at the top level, where the user is notified that an
ill-formed expression was provided. I have absolutely no problems with
exceptions here - as noted above, there is failure logic in the parser.
But the top level (main subprogram) uses a regular loop for reading
lines of text and the End_Of_File predicate to decide whether it's OK to
finish. There is no place for exceptions; it's just pure linear
iteration with a single end-of-sequence condition.

Actually, I found this Get_Line problem while having fun with my
calculator "homework".

>> So how do you write iteration routines?
>
> If you mean the case when the number of iterations is statically
> indeterminable, then yes, using exceptions.

What do you mean "statically indeterminable"? What about iterating over
a container?

"Programming in Ada 2005", John Barnes, chapter 19.5 "Iterators".
I don't see any exceptions in there.

> Especially when iteration is
> mixed with recursion.

It's not really mixed, because if you decouple the parser from the
tokenizer, then iteration and recursion work at separate levels of the
program structure. :-)

> Protocol_Error : exception;
>
> begin
> loop
> Line := Get_Line (Source);
> -- do something. This may raise an exception as well
> end loop;
> exception
> when End_Error =>
> -- done due to file end
> when Data_Error =>
> -- due to I/O error
> when Protocol_Error =>
> -- due to protocol error
> ...
> end;
>
>> Looks like goto in disguise.
>
> Any execution flow control is. So exceptions are as well.

But you still didn't convince me why exceptions should be preferred in
this case. :-)

>>> End_Of_File in your program serves the
>>> purpose of return code.
>> Nope. It's the end-of-sequence condition. Just like with iterators.
>
> But Get_Line already has a result, which is a string. String is not a
> condition.

Same with iterators. The end-of-sequence condition is the iterator's
state, not the value it returns.

> That's a different idiom. Iterators assume an indexed container.

What about linked lists? They are not indexed.
I've also been using iterators without any containers - that's a nice
solution for function generators, for example (for Python aficionados -
think about range vs. xrange).

> You could
> use iterators for dealing with a container of strings.

That's what I want to see on input when reading consecutive lines.

> But a stream isn't
> one.

I'm not reading a stream. I'm reading lines - the structure is already
there, and I don't want to care about what is below.

> It is again about mixing abstraction levels. You can convert a
> character stream into a sequence of strings, but the stream itself is a
> container of characters, not lines. While a text file is a third thing.

How does it relate to my problem with Get_Line?

>> If the specs says "read until end", then this means single exit
>> condition to me.
>
> No. This is mixing problem and solution spaces. What if I had a concurrent
> program, which would map the file into virtual memory. Then I could split
> that memory into 10 pieces and let 10 tasks "read it until end."

Then you will have 10 tasks reading their own sequences, very likely
using loops with single exit conditions. It doesn't change the nature of
the problem at all.

>>> Your code didn't managed that either!
>> Why?
>
> Because it contained a hidden goto: "exit when!" (:-))

That is one exit point. Just what the specs says.

>>> Neither manages it inputs longer than 99 characters.
>> Good point. How should I solve this?
>
> By making the main loop deal with lines instead of reads.

Isn't Get_Line dealing with lines?

>> string line;
>> while (getline(cin, line))
>> {
>> // play with line here
>> }
>
> That's OK to me. However, it is not that clean. line outlives the loop.

It is the price we sometimes pay for more compact representations.
Exactly the same considerations apply to Ada - most loops in Ada I have
seen were written this way.

> But
> it is not equivalent to your Ada code, because you chose fixed-length
> strings. An Ada equivalent of your C++ example would use Unbounded_String.

Of course, but that doesn't change the problem. It's the Get_Line in Ada
vs. getline in C++ that shows the difference.

> Then what happens upon read error, reading the system paging file?

I'd expect std::bad_alloc. That's STORAGE_ERROR in Ada.
