
Weird behavior of Get character with trailing new lines.


Blady

Sep 22, 2023, 3:30:21 PM
Hello,

I'm reading a text file with Get character from Text_IO with a while
loop controlled by End_Of_File.

% cat test_20230922_get_char.adb
with Ada.Text_IO; use Ada.Text_IO;
procedure test_20230922_get_char is
   procedure Get is
      F : File_Type;
      Ch : Character;
   begin
      Open (F, In_File, "test_20230922_get_char.adb");
      while not End_Of_File(F) loop
         Get (F, Ch);
         Put (Ch);
      end loop;
      Close (F);
      Put_Line ("File read with get.");
   end;
begin
   Get;
end;



All should be well; unfortunately, it is not!

Despite the End_Of_File check, I get an END_ERROR exception when there are
several trailing new lines at the end of the text:

% test_20230922_get_char
with Ada.Text_IO; use Ada.Text_IO;procedure test_20230922_get_char is
procedure Get is F : File_Type; Ch : Character; begin
Open (F, In_File, "test_20230922_get_char.adb"); while not
End_Of_File(F) loop Get (F, Ch); Put (Ch); end
loop; Close (F); Put_Line ("File read with get.");
end;beginGet;end;

Execution of ../bin/test_20230922_get_char terminated by unhandled exception
raised ADA.IO_EXCEPTIONS.END_ERROR : a-textio.adb:517

The code is compiled with GNAT. Does this behavior comply with the standard?

A.10.7 Input-Output of Characters and Strings
For an item of type Character the following procedures are provided:
procedure Get(File : in File_Type; Item : out Character);
procedure Get(Item : out Character);
After skipping any line terminators and any page terminators, reads the
next character from the specified input file and returns the value of
this character in the out parameter Item.
The exception End_Error is propagated if an attempt is made to skip a
file terminator.

This seems to be the case. How, then, can I avoid the exception?

Thanks, Pascal.



Niklas Holsti

Sep 22, 2023, 3:52:25 PM
In Text_IO, a line terminator is not an ordinary character, so you must
handle it separately, for example like this:

   while not End_Of_File(F) loop
      if End_Of_Line(F) then
         -- A line terminator is next: echo it and skip it explicitly.
         New_Line;
         Skip_Line(F);
      else
         Get (F, Ch);
         Put (Ch);
      end if;
   end loop;
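
For reference, here is a minimal sketch of how that loop might fit into a complete program. The procedure name Copy_Chars and the file name "input.txt" are placeholders, not taken from the original post. It assumes the file contains no page terminators, which would need further handling.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Copy_Chars is
   --  Assumption: "input.txt" is a placeholder file name.
   F  : File_Type;
   Ch : Character;
begin
   Open (F, In_File, "input.txt");
   while not End_Of_File (F) loop
      if End_Of_Line (F) then
         --  A line terminator is next: echo it and step over it
         --  explicitly, so Get never has to skip terminators itself.
         New_Line;
         Skip_Line (F);
      else
         --  An ordinary character is next, so Get cannot hit the
         --  file terminator here.
         Get (F, Ch);
         Put (Ch);
      end if;
   end loop;
   Close (F);
end Copy_Chars;
```

Because Get is only called when End_Of_Line is False, it never has to skip over trailing line terminators in search of a character, which is what raised End_Error in the original program.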





Jeffrey R.Carter

Sep 22, 2023, 4:05:59 PM
On 2023-09-22 21:30, Blady wrote:
>
> A.10.7 Input-Output of Characters and Strings
> For an item of type Character the following procedures are provided:
> procedure Get(File : in File_Type; Item : out Character);
> procedure Get(Item : out Character);
> After skipping any line terminators and any page terminators, reads the next
> character from the specified input file and returns the value of this character
> in the out parameter Item.
> The exception End_Error is propagated if an attempt is made to skip a file
> terminator.

As you have quoted, Get (Character) skips line terminators. End_Of_File returns
True if there is a single line terminator before the file terminator, but False
if there are multiple line terminators before the file terminator. So you either
have to explicitly skip line terminators, or handle End_Error.

--
Jeff Carter
"Unix and C are the ultimate computer viruses."
Richard Gabriel
99

J-P. Rosen

Sep 23, 2023, 3:02:41 AM
And this works only if the input file is "well formed", i.e. if it has
line terminators as the compiler expects them to be (e.g., you will be
in trouble if the last line has no LF).
That's why I never check End_Of_File, but handle the End_Error
exception. It always works.
--
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
https://www.adalog.fr https://www.adacontrol.fr

Niklas Holsti

Sep 23, 2023, 4:39:29 AM
On 2023-09-23 10:02, J-P. Rosen wrote:
> On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
>> On 2023-09-22 21:30, Blady wrote:
>>>
>>> A.10.7 Input-Output of Characters and Strings
>>> For an item of type Character the following procedures are provided:
>>> procedure Get(File : in File_Type; Item : out Character);
>>> procedure Get(Item : out Character);
>>> After skipping any line terminators and any page terminators, reads
>>> the next character from the specified input file and returns the
>>> value of this character in the out parameter Item.
>>> The exception End_Error is propagated if an attempt is made to skip a
>>> file terminator.
>>
>> As you have quoted, Get (Character) skips line terminators.
>> End_Of_File returns True if there is a single line terminator before
>> the file terminator, but False if there are multiple line terminators
>> before the file terminator. So you either have to explicitly skip line
>> terminators, or handle End_Error.
>>
> And this works only if the input file is "well formed", i.e. if it has
> line terminators as the compiler expects them to be (e.g., you will be
> in trouble if the last line has no LF).


Hm. The code I suggested, which handles line terminators separately,
does work without raising End_Error even if the last line has no line
terminator, at least in the context of the OP's program.


> That's why I never check End_Of_File, but handle the End_Error
> exception. It always works.


True, but it may not be convenient for the overall logic of the program
that reads the file. That program often wants to do something with the
contents, after reading the whole file, and having to enter that part of
the program through an exception does complicate the code a little.

On the other hand, past posts on this issue say that using End_Error
instead of the End_Of_File function is faster, probably because the
Text_IO code that implements Get cannot know that the program has
already checked for End_Of_File, so Get has to check for that case
anyway, redundantly.

My usual method for reading text files is to use Text_IO.Get_Line, and
(I admit) usually with End_Error termination.
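
A minimal sketch of that Get_Line approach, assuming the Ada 2005 function form of Get_Line and a placeholder file name "input.txt" (neither the procedure name Read_Lines nor the file name comes from this thread). The handler is local to a block, so processing continues in the normal statement sequence after the file has been read:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Read_Lines is
   --  Assumption: "input.txt" is a placeholder file name.
   F : File_Type;
begin
   Open (F, In_File, "input.txt");

   --  The block keeps the End_Error handler local to the reading loop.
   begin
      loop
         declare
            Line : constant String := Get_Line (F);
         begin
            Put_Line (Line);
         end;
      end loop;
   exception
      when End_Error =>
         null;  --  Reaching the file terminator ends the loop.
   end;

   --  Execution continues here once the whole file has been read.
   Close (F);
   Put_Line ("File read with Get_Line.");
end Read_Lines;
```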

Dmitry A. Kazakov

Sep 23, 2023, 5:25:07 AM
On 2023-09-23 10:39, Niklas Holsti wrote:
> On 2023-09-23 10:02, J-P. Rosen wrote:

>> That's why I never check End_Of_File, but handle the End_Error
>> exception. It always works.
>
> True, but it may not be convenient for the overall logic of the program
> that reads the file. That program often wants to do something with the
> contents, after reading the whole file, and having to enter that part of
> the program through an exception does complicate the code a little.

It rather simplifies the code. You exit the loop and do whatever is
necessary there.

Testing for the end of file is unreliable and non-portable. Many types of
files simply do not support that test. In other cases the test does not
leave the file unchanged: it has side effects that can change the program logic.

It is well advised to never ever use it.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Niklas Holsti

Sep 23, 2023, 10:03:18 AM
On 2023-09-23 12:25, Dmitry A. Kazakov wrote:
> On 2023-09-23 10:39, Niklas Holsti wrote:
>> On 2023-09-23 10:02, J-P. Rosen wrote:
>
>>> That's why I never check End_Of_File, but handle the End_Error
>>> exception. It always works.
>>
>> True, but it may not be convenient for the overall logic of the
>> program that reads the file. That program often wants to do something
>> with the contents, after reading the whole file, and having to enter
>> that part of the program through an exception does complicate the code
>> a little.
>
> It rather simplifies the code.


Oh?


> You exit the loop and do whatever is necessary there.

That is exactly what happens in the "while not End_Of_File" loop.

If you want to use End_Error instead, you have to add an exception
handler, and if you want to stay in the subprogram's statement sequence
without entering the subprogram-level exception handlers, you have to
add a block to contain the reading loop and make the exception handler
local to that block.

To me that looks like adding code -> more complex. Of course not much
more complex, but a little, as I said.


> Testing for the file end is unreliable and non-portable. Many types
> of files simply do not support that test. In other cases the test is
> not file immutable with the side effects that can change the program
> logic.

I suppose you are talking about the need for End_Of_File to possibly
read ahead past a line terminator? If not, please clarify.

That said, I certainly think that a program reading files should be
prepared to handle End_Error, especially if a file is read at several
places in the program (and not in a single loop as in the present program).

Dmitry A. Kazakov

Sep 24, 2023, 3:50:52 AM
On 2023-09-23 16:03, Niklas Holsti wrote:
> On 2023-09-23 12:25, Dmitry A. Kazakov wrote:

>> You exit the loop and do whatever is necessary there.
>
> That is exactly what happens in the "while not End_Of_File" loop.

It does not because you must handle I/O errors and close the file.

> If you want to use End_Error instead, you have to add an exception
> handler, and if you want to stay in the subprogram's statement sequence
> without entering the subprogram-level exception handlers, you have to
> add a block to contain the reading loop and make the exception handler
> local to that block.

You always have to in order to handle I/O errors.

> To me that looks like adding code -> more complex. Of course not much
> more complex, but a little, as I said.

No, it is simpler if the code is production code rather than an
exercise. Consider the typical case where the loop reads some
message, block, etc. You have

   loop
      read something
      read another piece
      read some count
      read a block of count bytes
      ...

You cannot do it this way if you use end of file test because you must
protect each minimal input item (e.g. byte) by the test. It is massively
obtrusive and would distort program logic. You will end up with nested
ifs or else gotos.
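
A rough sketch of that pattern, under stated assumptions: it uses Ada.Streams.Stream_IO (not mentioned in this thread), a hypothetical message layout of a one-byte kind, a one-byte count, and then that many payload bytes, and a placeholder file name "messages.bin". None of the reads is guarded by an end-of-file test; a single local handler for End_Error ends the loop.

```ada
with Ada.Text_IO;
with Ada.Streams.Stream_IO;
with Ada.IO_Exceptions;

procedure Read_Messages is
   use Ada.Streams;
   package SIO renames Ada.Streams.Stream_IO;

   --  Hypothetical message layout, assumed for illustration only:
   --  a one-byte kind, a one-byte count, then Count payload bytes.
   F     : SIO.File_Type;
   S     : SIO.Stream_Access;
   Kind  : Stream_Element;
   Count : Stream_Element;
   Total : Natural := 0;
begin
   --  Assumption: "messages.bin" is a placeholder file name.
   SIO.Open (F, SIO.In_File, "messages.bin");
   S := SIO.Stream (F);

   begin
      loop
         Stream_Element'Read (S, Kind);    --  read something
         Stream_Element'Read (S, Count);   --  read some count
         declare
            Payload : Stream_Element_Array
                        (1 .. Stream_Element_Offset (Count));
         begin
            --  Read a block of Count bytes.
            Stream_Element_Array'Read (S, Payload);
         end;
         Total := Total + 1;
      end loop;
   exception
      when Ada.IO_Exceptions.End_Error =>
         --  The end of the stream was reached. Note that a message
         --  truncated partway through also ends up here.
         null;
   end;

   SIO.Close (F);
   Ada.Text_IO.Put_Line ("Messages read:" & Natural'Image (Total));
end Read_Messages;
```

Production code would presumably add handlers for other I/O exceptions and distinguish a clean end of file from a message cut off in the middle.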

>> Testing for the file end is unreliable and non-portable. Many types
>> of files simply do not support that test. In other cases the test is
>> not file immutable with the side effects that can change the program
>> logic.
>
> I suppose you are talking about the need for End_Of_File to possibly
> read ahead past a line terminator? If not, please clarify.

Yes, reading ahead, and also issues with blocking and with race conditions
in shared files. Then there are things like sockets, which do not have an
end of file: a connection drop is indicated by an empty read.

Blady

Sep 25, 2023, 3:56:01 PM
On 24/09/2023 at 09:50, Dmitry A. Kazakov wrote:
> On 2023-09-23 16:03, Niklas Holsti wrote:
>> On 2023-09-23 10:02, J-P. Rosen wrote:
>>> On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
>>>> On 2023-09-22 21:30, Blady wrote:
>>>>
>>>> A.10.7 Input-Output of Characters and Strings
>>>> For an item of type Character the following procedures are provided:
>>>> procedure Get(File : in File_Type; Item : out Character);
>>>> procedure Get(Item : out Character);
>>>> After skipping any line terminators and any page terminators, reads
>>>> the next character from the specified input file and returns the
>>>> value of this character in the out parameter Item.
>>>> The exception End_Error is propagated if an attempt is made to skip
>>>> a file terminator.

Thanks all for your helpful answers.

It actually helps.

Especially, I was not aware of the particular behavior of End_Of_File
with a single line terminator before the file terminator.

In my case, I prefer to reserve exceptions for exceptional situations
:-) so I've taken the code from Niklas's example.

Regards, Pascal.

Randy Brukardt

Sep 26, 2023, 1:53:32 AM
"J-P. Rosen" <ro...@adalog.fr> wrote in message
news:uem2id$moia$1...@dont-email.me...
Agreed. And if the file might contain a page terminator, things get even
worse because you would have to mess around with End_of_Page in order to
avoid hitting a combination that still will raise End_Error. It's not worth
the mental energy to avoid it, especially in a program that will be used by
others. (I've sometimes used the simplest possible way of writing a
"quick&dirty" program for my own use; for such programs I skip the error
handling, as I figure I can work out what I did wrong by looking at the
exception raised. But that's often a bad idea even in that case as such
programs have a tendency to get reused years later and then the intended
usage often isn't clear.)

Randy.

