On 03/09/2022 15:47, Dmitry A. Kazakov wrote:
> On 2022-09-03 14:12, James Harris wrote:
>> On 01/09/2022 16:12, Dmitry A. Kazakov wrote:
>>> On 2022-09-01 15:57, James Harris wrote:
>>>> On 31/08/2022 10:36, Dmitry A. Kazakov wrote:
Going back to this thread as I had to make a choice for some code I
wrote recently.
>>>
>>>>> There is only one case of EOF, namely the end of file (:-))
>>>>
>>>> So you would prefer EOF to mean "there are zero more bytes available
>>>> to read at the moment"?
>>>
>>> No. EOF means the file/container ends here, like 'Z' is the last
>>> letter of the alphabet.
>>
>> How is "the file ends here" different from there being nothing left to
>> read? If the file ends here (your definition) then there's nothing
>> left, surely.
>
> Reading is an operation, EOF is a state. The semantics of read depends
> on the state.
My best guess at what you are driving at is that to you EOF is a
higher-level, logical state rather than a physical one. That's fine.
Sometimes one needs to recognise that "/at the moment/ the file ends
here" is different from "we are at the end of a complete file".
For instance, a file may currently end at byte 49 but a nanosecond later
something is going to write byte 50 and the file is not complete until
byte 50 is also present.
The problem is that the OS may well not know, so it could not tell a
program anything other than "there are currently no more bytes to read".
If the OS had some way to know that byte 50 was needed it could block
the reader until byte 50 arrived or return an indication that more data
was to follow. But that's not always possible.
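To make the byte 49 / byte 50 case concrete, here is a minimal Python sketch. It assumes a plain file on an ordinary local filesystem; the helper names (read_to_current_end, demo_growing_file) are mine and purely illustrative. The reader is told "no more bytes" after byte 49, yet the same handle yields byte 50 once a writer has appended it.

```python
import os
import tempfile

def read_to_current_end(f):
    """Read until read() returns no bytes; return everything collected."""
    chunks = []
    while True:
        chunk = f.read(4096)
        if not chunk:            # empty result: no more bytes *right now*
            break
        chunks.append(chunk)
    return b"".join(chunks)

def demo_growing_file():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as w:
            w.write(b"x" * 49)                       # bytes 1..49 exist
        # buffering=0 so every read goes straight to the OS
        with open(path, "rb", buffering=0) as reader:
            first = read_to_current_end(reader)      # "EOF" after 49 bytes
            with open(path, "ab") as w:
                w.write(b"y")                        # byte 50 arrives later
            second = read_to_current_end(reader)     # same handle, more data
        return len(first), second
    finally:
        os.remove(path)
```

Nothing in the first empty read told the program whether byte 50 was still to come.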
>
>> There's a further case, too, as follows.
>>
>> Imagine that the offset of the next byte to read is the same as the
>> file's size. Take that as the standard condition for a read to return
>> EOF.
>
> Unnecessary assumptions. There could be no bytes and the file never
> constructed as a whole as in the case with pipes.
Some streams of data (e.g. TCP and pipes) do have an out-of-band
indication of the difference between "there is nothing more to read now"
and "the stream has ended; nothing more will or can be added".
But plain files do not. That brings us back to the question of what the
best indications to return to a program are.
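As an illustration of that difference on a pipe, here is a small Python sketch. It assumes a POSIX-like system where os.set_blocking works on pipe descriptors; pipe_states is a made-up name. With the read end set non-blocking, an empty pipe whose writer is still open reports "would block", whereas after the writer is closed the read returns zero bytes.

```python
import os

def pipe_states():
    r, w = os.pipe()
    os.set_blocking(r, False)        # reads on the pipe must not block
    try:
        try:
            os.read(r, 16)
            empty_but_open = "data"
        except BlockingIOError:      # EAGAIN: nothing now, writer still open
            empty_but_open = "would block"
        os.write(w, b"hi")
        data = os.read(r, 16)        # the two bytes just written
        os.close(w)                  # the writer goes away
        w = None
        at_end = os.read(r, 16)      # b"": the stream really has ended
        return empty_but_open, data, at_end
    finally:
        os.close(r)
        if w is not None:
            os.close(w)
```

A plain file gives only the second kind of answer (zero bytes), whichever of the two situations actually applies.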
>
>> What if the offset is set to some byte /after/ where the file ends,
>> e.g. the file length is 50 and the offset is 55.
>
> Then you have a wrong offset, provided an offset exists, since that
> depends on the type of file, e.g. a random access file.
>
>> As a programmer, would you want to get EOF in that case, too, or would
>> you want some separate exception such as 'read past EOF'?
>
> You cannot read past EOF per definition of.
OK, "read requested when file pointer is past EOF", if you prefer.
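For what it's worth, here is what one existing environment does in that case. This is a sketch in Python on top of the usual POSIX-style semantics, with read_past_end being a name I made up: the offset is set to 55 in a 50-byte file and the read simply returns zero bytes, indistinguishable from a read at exactly the end.

```python
import tempfile

def read_past_end():
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 50)       # file length is 50
        f.flush()
        f.seek(55)               # offset is now beyond the end
        return f.tell(), f.read(10)
```

So this particular API picks "ordinary EOF" rather than a separate "read past EOF" indication.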
> Whether reading past the
> file end causes an exception or like in the case of the C library
> returns a special value is up to the designer of the API.
Sure. I was just wondering what a programmer would find most convenient
and useful to cover the different cases. There are two parts which must
be brought together:
* what the programmer would like to know
* what the environment (the RTS or OS) may be able to say
The programmer may want full information but the environment in which
the program runs may not be able to give that much detail. Yet the same
program will need to be able to run in different environments and on
different types of stream.
This doesn't sound easy to reconcile. For example, a program may want to
know when the logical end of the data has been reached but the
environment may only be able to say "there's nothing more just now" as
with the case of files, above.
So, trying to brainstorm what a program could be told when it tries to
read from a stream of data:
For blocking reads:
* Here's all the data you asked for.
* Here's some but less than you asked for.
* I have x amount of the data but it's less than you asked for so I am
  returning nothing.
* I have nothing more for you to read just now.
* I have nothing more for you to read just now but the stream is
  closable and is not closed so there may be more.
* The stream is closable and is closed; there will be nothing more.
* There was an unrecoverable input error.
* There was an input error which may be correctable but could not be
  corrected before the timeout expired.
Or the environment could block until enough data arrive or there's a
timeout.
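One way the responses above could be handed back to a program is as a status plus whatever data came with it. The following Python sketch is only that, a sketch: ReadStatus, ReadResult and classify are hypothetical names, not any existing API, and the two error cases are listed but not produced by classify.

```python
from dataclasses import dataclass
from enum import Enum, auto

class ReadStatus(Enum):
    COMPLETE = auto()          # all the data asked for
    PARTIAL = auto()           # some, but less than asked for
    INSUFFICIENT = auto()      # some is available but nothing is returned
    NOTHING_NOW = auto()       # nothing to read just now
    NOTHING_NOW_OPEN = auto()  # nothing now; closable stream is still open
    CLOSED = auto()            # closable stream is closed; nothing more
    ERROR_FATAL = auto()       # unrecoverable input error
    ERROR_TIMEOUT = auto()     # correctable error not fixed before timeout

@dataclass
class ReadResult:
    status: ReadStatus
    data: bytes = b""

    @property
    def may_have_more(self):
        # Only these two outcomes rule out any further data.
        return self.status not in (ReadStatus.CLOSED, ReadStatus.ERROR_FATAL)

def classify(requested, data, closed=None):
    """closed is True/False for closable streams, or None when the
    environment cannot tell (as with plain files)."""
    if len(data) >= requested:
        return ReadResult(ReadStatus.COMPLETE, data)
    if data:
        return ReadResult(ReadStatus.PARTIAL, data)
    if closed is None:
        return ReadResult(ReadStatus.NOTHING_NOW)
    if closed:
        return ReadResult(ReadStatus.CLOSED)
    return ReadResult(ReadStatus.NOTHING_NOW_OPEN)
```

The closed=None case is where the environment cannot say more than "nothing just now", which is the situation with plain files.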
For nonblocking reads the responses would probably be the same except,
of course, for the potential for blocking. Put another way, nonblocking
reads may be the same as blocking reads with a timeout of zero (?).
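A rough Python sketch of that idea, assuming a POSIX-like system where select works on pipe descriptors; read_with_timeout and demo_timeout_zero are illustrative names only. One function covers both cases: a timeout of zero gives the nonblocking behaviour and a larger timeout gives a bounded blocking read.

```python
import os
import select

def read_with_timeout(fd, n, timeout):
    """Return the bytes read, b"" at end of stream, or None if nothing
    became available within `timeout` seconds (0 means do not wait)."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return None              # nothing to read just now
    return os.read(fd, n)

def demo_timeout_zero():
    r, w = os.pipe()
    try:
        nothing = read_with_timeout(r, 16, 0)   # empty pipe, writer open
        os.write(w, b"abc")
        some = read_with_timeout(r, 16, 0)      # data is waiting
        os.close(w)
        w = None
        ended = read_with_timeout(r, 16, 0)     # writer closed: end of stream
        return nothing, some, ended
    finally:
        os.close(r)
        if w is not None:
            os.close(w)
```

Note that POSIX select always reports a regular file as ready to read, so for plain files this still cannot separate "nothing just now" from "complete".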
Basically, as I see it, an ostensibly simple 'read' call could get a reply of
any of the above responses and maybe others. The problem is to work out
how such info could be returned so as to make a programmer's life as
easy as possible, especially bearing in mind that his program may run in
different environments and on different types of stream.
Can anyone see where I am going with this - and save me a few steps?
There must already be some standard model of reads that resolves these
issues.
--
James Harris