Line 1<eor>
Line 2<eof>
where <eor> is the end of record marker, that also happens to be the
line terminator for your platform of choice, and <eof> just means end of
file (which may or may not involve some sort of explicit marker).
(Basically - I've got a text file that doesn't end in a newline on a
unix-like or windows platform).
Then consider the following program:
PROGRAM eieio_stat
  IMPLICIT NONE
  INTEGER, PARAMETER :: unit = 10
  INTEGER :: ios1, ios2, ios3
  CHARACTER(100) :: buffer
  !****
  OPEN(UNIT=unit, FILE='file3.txt')   ! the file has the contents shown above
  READ (unit,"(A)",ADVANCE="NO",iostat=ios1) buffer
  READ (unit,"(A)",ADVANCE="NO",iostat=ios2) buffer
  READ (unit,"(A)",ADVANCE="NO",iostat=ios3) buffer
  CLOSE(unit)
  WRITE (*, "(3(A,'(',I0,')',:,', '))")  &
      ioch(ios1), ios1, ioch(ios2), ios2, ioch(ios3), ios3
CONTAINS
  ! Map an IOSTAT value to a short descriptive label.
  FUNCTION ioch(iostat)
    USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: IOSTAT_END, IOSTAT_EOR
    INTEGER, INTENT(IN) :: iostat
    CHARACTER(10) :: ioch   ! long enough for the 10-character labels below
    !****
    IF (iostat == IOSTAT_END) THEN
      ioch = 'IOSTAT_END'
    ELSE IF (iostat == IOSTAT_EOR) THEN
      ioch = 'IOSTAT_EOR'
    ELSE IF (iostat < 0) THEN
      ioch = 'Other -ve '
    ELSE IF (iostat > 0) THEN
      ioch = 'Positive '
    ELSE
      ioch = 'Zero '
    END IF
  END FUNCTION ioch
END PROGRAM eieio_stat
On Intel Fortran I get:
IOSTAT_EOR(-2), IOSTAT_EOR(-2), IOSTAT_END(-1).
On very recent gfortran 4.5.0 I get:
IOSTAT_EOR(-2), Zero (0), IOSTAT_END(-1).
On g95 I get:
IOSTAT_EOR(-2), Positive (207), Positive (207).
These differences cause issues with the get procedure from
iso_varying_string to read such "missing final newline" files and I've
been trying to understand what the best/most robust way forward is
(putting aside STREAM access for now).
When attempting to rationalise the results, some of my conflicting
thoughts have been:
- Because the input file was not written by Fortran as a sequential
formatted file (the way I am trying to read it) and misses the formally
required end of record marker, then "anything goes" so I'm sunk and this
whole post is irrelevant.
- Intel's treatment is probably the most useful, the second EOR can be
justified because the record did end (though it may have been missing a
formal marker it's up to the processor to decide what the end of a
record is?) and some useful stuff can be retrieved into buffer.
- G95's treatment is not terribly useful, but still formally correct as
the lack of an end of record marker is an error condition. From that
point on the unit is hosed.
- gfortran's treatment is also useful, but a little inconsistent -
something terminated the input into buffer, if it wasn't the end of
record, end of file or some other error condition (ie. record
incomplete), then what was it?
By useful, I mean that I can get access to the text that is on the
troublesome last line.
Thoughts?
Thanks,
IanH
> (Basically - I've got a text file that doesn't end in a newline on a
> unix-like or windows platform).
...
> These differences cause issues with the get procedure from
> iso_varying_string to read such "missing final newline" files and I've
> been trying to understand what the best/most robust way forward is
> (putting aside STREAM access for now).
I wonder why you are putting aside stream? That is the only way where
the standard gives any help in the matter. As long as you open the file
as a sequential formatted file, then I think you are stuck with...
> - Because the input file was not written by Fortran as a sequential
> formatted file (the way I am trying to read it) and misses the formally
> required end of record marker, then "anything goes" so I'm sunk and this
> whole post is irrelevant.
The bit about not being written by Fortran perhaps overstates the
requirements, but the file does have to be one that is valid by whatever
rules the Fortran compiler (ok, "processor" in standard-speak) defines.
The standard is silent on what those rules are. In practice, you are
essentially guaranteed that all will be ok if the files end in the
platform's newline sequence (for the mentioned platforms, which have
one). The standard doesn't guarantee that, but practice does.
Contributing to the "anything goes" is "the standard doesn't specify
what counts as an error condition."
> By useful, I mean that I can get access to the text that is on the
> troublesome last line.
For "improperly structured" files like the ones in question, you can
debate what you feel would be useful, and perhaps try to talk vendors
into doing it, but the standard doesn't offer any support.
I would think it more constructive to pursue using what is already in
the standard (stream) instead of trying to push vendors into de facto
standardization of something that isn't formally standard and, by your
own evidence, isn't handled consistently by different existing
compilers. You are, of course, free to try, but my guess is that such a
request wouldn't be considered very high priority when a standard
conforming approach already exists. It also has the problem that you
might have to go through it again with every new vendor that becomes
relevant to you.
--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
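A minimal sketch of what the stream approach could look like, assuming
an F2003 compiler with deferred-length character support (plus the F2008
NEWUNIT= specifier) and assuming that file storage units are bytes. The
program name and the variable names are made up for illustration; only
'file3.txt' comes from the test program above. It reads the whole file
with unformatted stream access and splits it on line feeds, so a final
line without a terminator is treated like any other line.

```fortran
program read_lines_stream
  implicit none
  character(len=1), parameter :: lf = achar(10), cr = achar(13)
  character(len=:), allocatable :: text
  integer :: unit, ios, fsize, pos, nl, last

  ! 'file3.txt' is the file name used in the test program above.
  open(newunit=unit, file='file3.txt', access='stream', form='unformatted', &
       status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  ! SIZE= reports the length in file storage units (assumed to be bytes),
  ! or -1 if the processor cannot determine it.
  inquire(unit=unit, size=fsize)
  if (fsize < 0) stop 'cannot determine file size'
  allocate(character(len=fsize) :: text)
  if (fsize > 0) then
    read(unit, iostat=ios) text
    if (ios /= 0) stop 'read failed'
  end if
  close(unit)

  pos = 1
  do while (pos <= fsize)
    nl = index(text(pos:), lf)
    if (nl == 0) then
      last = fsize          ! final line has no terminator
    else
      last = pos + nl - 2   ! character just before the line feed
    end if
    ! Drop a trailing carriage return left by CR-LF terminated files.
    if (last >= pos) then
      if (text(last:last) == cr) last = last - 1
    end if
    write(*, '(a)') text(pos:last)
    if (nl == 0) exit
    pos = pos + nl
  end do
end program read_lines_stream
```

Reading the file in one piece keeps the record rules out of the picture
entirely, at the cost of holding the whole file in memory.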
> For "improperly structured" files like the ones in question, you can
> debate what you feel would be useful, and perhaps try to talk vendors
> into doing it, but the standard doesn't offer any support.
There are many unix utilities that will have problems with that last
unterminated line, not just fortran compilers and fortran i/o
libraries. I would suggest shifting the emphasis back one step and
try to fix the offending program that creates such files.
$.02 -Ron Shepard
If you're using ADVANCE='NO', might be useful to
define BUFFER as having 1 character, not 100.
Thanks for the comments.
The files are generated by a text editor, which is operated by various
human users. Many times previously I've tried to fix these users, but
with rather limited success.
I was reluctant to go down the formatted stream path as I wasn't sure
what implications it would have for other IO code that reads from that
unit. Call me a chicken, or lazy, or both. Also, some testing shows
that some of the compilers I use have problems with it, though I'm also
a common factor in that testing.
All I'm trying to do is to patch an iso_varying_string implementation
(the one at www.fortran.com/iso_varying_string.f95) to be a bit more
robust.
I'm surprised that a text editor written in any language would save a
file whose last line wasn't properly terminated. That sort of thing is
usually transparent to the user.
In any case, you might consider writing a utility program (in Fortran or
C if that's easier) to fix the file. Then your main Fortran application
wouldn't have to worry about it.
Louis
For a "text editor" for editing plain ASCII files, yes.
It seems, though, that the convention for prose files is to use
the record delimiter character as a paragraph break, and not to store
one at the end of lines. That allows for line breaks to flow
as the width changes. In that case it seems usual not to supply
a paragraph break at the end of the file.
Maybe there is an option to turn off this feature.
-- glen
How about a program like this?
! incomplete.f90 --
! Read an incomplete file
!
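program incomplete
    use, intrinsic :: iso_fortran_env, only: iostat_end, iostat_eor
    implicit none
    integer :: i, k, ierr
    character(len=80) :: line
    character(len=1) :: char

    ! Write the test file; the last line is written without advancing
    open( 10, file = 'incomplete.inp' )
    line = 'Some line of text, written without an EOL character'
    write( 10, '(a)' ) 'First line'
    write( 10, '(a)', advance = 'no' ) trim(line)
    close( 10 )
    !
    ! Try to read the file
    !
    open( 10, file = 'incomplete.inp' )
    line = ' '
    k = 0
    do i = 1,100
        k = k + 1
        read( 10, '(a)', iostat = ierr, advance = 'no' ) char
        if ( ierr == iostat_eor ) then
            ! End of record: print the line, then fetch the first
            ! character of the next record
            write(*,*) ierr, trim(line)
            read( 10, '(a)', iostat = ierr, advance = 'no' ) char
            if ( ierr == iostat_end ) exit   ! no further record
            k = 1
            line = ' '
        else if ( ierr == iostat_end ) then
            ! End of file: print whatever was collected for the last line
            write(*,*) ierr, trim(line)
            exit
        endif
        line(k:k) = char
    enddo
end program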
The trick is to read the file character by character, so that
nothing gets lost.
It is annoying that one has to go to such lengths, but
that is the real world for you.
Regards,
Arjen
> On 4/02/2010 3:58 AM, Ron Shepard wrote:
>> In article<1jdbapv.1ad8d151o18ny2N%nos...@see.signature>,
>> nos...@see.signature (Richard Maine) wrote:
>>
>>> For "improperly structured" files like the ones in question, you can
>>> debate what you feel would be useful, and perhaps try to talk vendors
>>> into doing it, but the standard doesn't offer any support.
>>
>> There are many unix utilities that will have problems with that last
>> unterminated line, not just fortran compilers and fortran i/o
>> libraries. I would suggest shifting the emphasis back one step and
>> try to fix the offending program that creates such files.
>>
>> $.02 -Ron Shepard
>
> Thanks for the comments.
>
> The files are generated by a text editor, which is operated by various
> human users. Many times previously I've tried to fix these users, but
> with rather limited success.
I seem to recall seeing an option deep in the "more obscure options
section" of some editors that they would fix the missing final line end.
All the editors I have experience with will open multiple files easily
and then equally easily close the files. This is just a special case of
finding some utility that will copy the files and ensure that the output
is well formed. The couple of editors I use all have options to use any
of the three major line end conventions. I would expect that finding a
programmer's text editor that can both convert and fix line ends is
mostly a willingness to read the fine print on options for a couple of
them. Being allowed to buy the software and use it may be a different
class of problem. :-( This is a case where the adjective in front of
editor matters. Programmer's text editors are not word processors as
they offer zero formatting and will only change the display font size
for the whole file as an indulgence for the users who want easier to
read screens. Such editors are sold on their features which help the
manual production of HTML and such so features like line end control
may be harder to determine.
> Louis Krupp <lkrupp...@indra.com.invalid> wrote:
> (snip)
>
> > I'm surprised that a text editor written in any language would save a
> > file whose last line wasn't properly terminated. That sort of thing is
> > usually transparent to the user.
>
> For a "text editor" for editing plain ASCII files, yes.
I've seen multiple editors for "plain ASCII files" that have the
mentioned problem. Emacs comes to mind. At least in the case of Emacs,
there is an option to automatically add a final terminator as needed,
but that option at least used to default to being off. I recall it
because it was part of my usual customization of Emacs to turn it on.
> glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
> > Louis Krupp <lkrupp...@indra.com.invalid> wrote:
> > (snip)
> >
> > > I'm surprised that a text editor written in any language would save a
> > > file whose last line wasn't properly terminated. That sort of thing is
> > > usually transparent to the user.
> >
> > For a "text editor" for editing plain ASCII files, yes.
>
> I've seen multiple editors for "plain ASCII files" that have the
> mentioned problem. Emacs comes to mind. At least in the case of Emacs,
> there is an option to automatically add a final terminator as needed,
> but that option at least used to default to being off. I recall it
> because it was part of my usual customization of Emacs to turn it on.
I've also used editors in the past that would print a warning
message whenever a file was written with an unterminated line. I
thought vi did this, but when I checked just a minute ago, it wrote
the file with no such warning. So maybe some option needs to be set
or something.
I normally use emacs, but this error has bitten me so often in the
distant past that I always make sure the last line is correct just
by instinct, I don't even think about it any more. I'll look for
the option to add the termination automatically, that looks like a
good idea. I can't think of any situation where I would want to
edit a file without the termination.
$.02 -Ron Shepard
You're right. I just tried emacs and resisted the temptation to hit
return at the end of the last line like I always do without thinking.
For the OP, I would try (in this order):
1. Set up a .rc file or something with an option to make the editor add
the trailing newline without the user having to worry about it.
2. Wrap the application in a shell script or something that first runs
a utility to add the newline if necessary. You could probably do this
in fifteen lines of C (and I don't think anyone will argue that,
whatever your opinion is of C, this is a good use for it).
3. Change the application to accommodate a truncated input file.
Louis
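The utility in step 2 does not have to be C. A minimal sketch in
Fortran, assuming F2003 stream access (plus the F2008 NEWUNIT=
specifier), assuming that file storage units are bytes, and assuming a
single line feed is the wanted terminator (a CR-LF platform would need
both characters). The program name and the hard-wired file name are
placeholders for illustration.

```fortran
program fix_final_newline
  implicit none
  character(len=1), parameter :: lf = achar(10)
  character(len=1) :: last
  integer :: unit, ios, fsize

  ! 'file3.txt' is a placeholder; a real tool would take the name
  ! from the command line.
  open(newunit=unit, file='file3.txt', access='stream', form='unformatted', &
       status='old', action='readwrite', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  ! SIZE= gives the length in file storage units (assumed to be bytes).
  inquire(unit=unit, size=fsize)
  if (fsize > 0) then
    ! Look at the last byte only.
    read(unit, pos=fsize, iostat=ios) last
    if (ios /= 0) stop 'read failed'
    ! Append a line feed just past the current end if it is missing.
    if (last /= lf) write(unit, pos=fsize + 1) lf
  end if
  close(unit)
end program fix_final_newline
```

An empty file is left alone, and a file that already ends in a line
feed is not modified, so running the tool twice is harmless.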
> In article <1jdds6z.bkp2ud6tbooyN%nos...@see.signature>,
> nos...@see.signature (Richard Maine) wrote:
> > At least in the case of Emacs,
> > there is an option to automatically add a final terminator as needed,
> > but that option at least used to default to being off. I recall it
> > because it was part of my usual customization of Emacs to turn it on.
...
> I normally use emacs, .... I'll look for
> the option to add the termination automatically,
Extracted from my personal customization files:
(setq require-final-newline t) ; Automatically add newline at end of file.
It is possible that might be specific to xemacs, which was the
particular emacs denomination where I worshipped at the time.
Thanks for the suggestion.
Curiously, on the few compilers I was testing when I was trying to set
up test cases, I couldn't get them to generate an appropriately dodgy
input file. A record terminator (newline) got added (using the same
code sequence as you have above) when the file was closed, even in
stream access mode. I had to use a text editor to make a broken one.
I think this char by char stuff is where I'll end up going.
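A rough sketch of how the char by char idea could be wrapped into a
get-style routine. This is not the iso_varying_string code; the module
name line_reader and the routine get_line are invented for
illustration, and it assumes F2003 deferred-length character support.
It treats whatever stops the read (end of record, end of file, or an
error status) as the end of the line whenever some text was already
collected, which is meant to cover the three behaviours reported at the
top of the thread.

```fortran
module line_reader
  use, intrinsic :: iso_fortran_env, only: iostat_eor
  implicit none
contains
  ! Read one line from unit into line. On return ios is 0 if a line
  ! (possibly empty) was read, otherwise the IOSTAT value that ended
  ! the input.
  subroutine get_line(unit, line, ios)
    integer, intent(in) :: unit
    character(len=:), allocatable, intent(out) :: line
    integer, intent(out) :: ios
    character(len=1) :: ch

    line = ''
    do
      read(unit, '(a)', advance='no', iostat=ios) ch
      if (ios /= 0) exit
      line = line // ch
    end do
    ! A normal end of record, or anything else that arrives after some
    ! text was collected, is reported as a complete line.
    if (ios == iostat_eor .or. len(line) > 0) ios = 0
  end subroutine get_line
end module line_reader

program demo_get_line
  use line_reader
  implicit none
  character(len=:), allocatable :: line
  integer :: ios

  open(10, file='file3.txt', status='old', action='read')
  do
    call get_line(10, line, ios)
    if (ios /= 0) exit      ! stop on end of file or on any error
    write(*, '(a)') line
  end do
  close(10)
end program demo_get_line
```

The caller should stop on any nonzero status rather than testing for
end of file alone, since a read issued after the end of file has
already been hit may be reported as an error on some processors.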
> Ron Shepard <ron-s...@NOSPAM.comcast.net> wrote:
>
> > In article <1jdds6z.bkp2ud6tbooyN%nos...@see.signature>,
> > nos...@see.signature (Richard Maine) wrote:
>
> > > At least in the case of Emacs,
> > > there is an option to automatically add a final terminator as needed,
> > > but that option at least used to default to being off. I recall it
> > > because it was part of my usual customization of Emacs to turn it on.
> ...
> > I normally use emacs, .... I'll look for
> > the option to add the termination automatically,
>
> Extracted from my personal customization files:
>
> (setq require-final-newline t) ; Automatically add newline at end of file.
>
> It is possible that might be specific to xemacs, which was the
> particular emacs denomination where I worshipped at the time.
Thanks. In the standard emacs version in MacOSX (22.1.1), that variable
is set to t when in f90 and text modes, but not necessarily in other
file modes.
$.02 -Ron Shepard
I remember this as a message from vi when it _reads_ the file:
Something like "incomplete last line"
Regards,
Arjen
Really? I used gfortran - an old one it turns out: 4.1.x, maybe that
has changed since then.
Regards,
Arjen
It does have that message. As far as I know, though, it won't
create files that way. If you edit one and save it then the
last line gets its "\n" added.
I just tested it with vim to be sure. Maybe not all vi though.
I haven't used emacs much, but it might be that it can edit
non-text files where the ability to create files without line
terminators was needed. Not so convenient for a default, but
sometimes useful.
-- glen
Not of Fortran origin is not relevant. For those who care, the example file can
be created using Fortran with either STREAM access or using the $ edit
descriptor. Let's move on to the important part.
Look at the F2003 standard. Seems pretty clear and well defined to me what this
should do. I have submitted a patch to gfortran to fix this. With PAD="yes",
which is the default if not specified, the EOR or EOF conditions should be set
and the list item result is not "undefined". With PAD="no", the results are
"undefined" and GFortran chooses to leave "buffer" untouched. I appreciate your
presenting this test case.
9.10.3 (1)
"If the pad mode has the value YES, the record is padded with blanks to satisfy
the input list item (9.5.3.4.2) and corresponding data edit descriptor that
requires more characters than the record contains. If the pad mode has the value
NO, the input list item becomes undefined."
9.5.3.4.2
"During nonadvancing input when the pad mode has the value NO, an end-of-record
condition (9.10) occurs if the input list and format specification require more
characters from the record than the record contains, and the record is complete
(9.2.2.3). If the record is incomplete, an end-of-file condition occurs instead
of an end-of-record condition.
During nonadvancing input when the pad mode has the value YES, blank characters
are supplied by the processor if an input item and its corresponding data edit
descriptor require more characters from the record than the record contains. If
the record is incomplete, an end-of-file condition occurs; otherwise an
end-of-record condition occurs."
Best regards,
Jerry
> For those who care, the example file can
> be created using Fortran with either STREAM access or using the $ edit
> descriptor. Let's move on to the important part.
The $ edit descriptor is nonstandard.
> Look at the F2003 standard.
Yes. But note that all the relevant material is new to f2003 and, in
particular, was introduced in conjunction with stream access. Note that
the definition of an incomplete record is in the section on stream
access (and that's the only way you can create such a thing with
standard f2003). Prior to the introduction of stream access, the Fortran
standard had no concept of an incomplete record.
Thus for f95, there is no standard-specified answer. For f2003, well...
the OP said "putting aside STREAM access for now." I'm not clear on why
he put stream access aside, as I think that the best answer. The ones
you cited are basically a side effect of trying to integrate stream
access with record concepts.
We are moving toward F2003 and even F2008. If the feature is there and it is
Standard complying, people should not hesitate to use it.
Putting semantics aside, a record that has a missing <EOR> seems very much
incomplete to me and I have no problem taking the broader interpretation. Side
effect or not, it makes good sense to me to do what it says. The read is
hitting the end of the file. I would suggest that if further clarification is
needed, then the standards committee should do so. (And the engineer said as he
approached the girl half the distance ... close enough! ... and he kissed her ;) )
Best Regards,
Jerry