INQUIRE - FORM vs. (UN)FORMATTED

Tobias Burnus

Dec 8, 2006, 4:16:17 AM
Hello,

I'm trying to understand the FORM, FORMATTED and UNFORMATTED specifiers
of INQUIRE.

Let's start with FORM:
"The scalar-default-char-variable in the FORM= specifier is assigned
the value FORMATTED if the file is connected for formatted
input/output, and is assigned the value UNFORMATTED if the file is
connected for unformatted input/output. If there is no connection, it
is assigned the value UNDEFINED."

Ok, this is clear to me. Especially, it means given a UNIT (or a FILE
name) I can determine whether a file is connected and, if yes, whether
I need to access it as unformatted or formatted. I see many uses for
procedures where the UNIT is passed.
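A minimal sketch of that use case, assuming a hypothetical helper `report_form` and an arbitrary unit number 20 (neither comes from the discussion): a procedure that receives only a unit number asks INQUIRE how the unit is currently connected.

```fortran
program inquire_form_demo
  implicit none
  integer, parameter :: u = 20   ! arbitrary unit number chosen for this sketch

  call report_form(u)            ! not connected yet: FORM= should give UNDEFINED

  open(unit=u, status='scratch', form='unformatted')
  call report_form(u)            ! now connected for unformatted I/O
  close(u)

contains

  ! Hypothetical helper: given only a unit, report how it is connected.
  subroutine report_form(unit)
    integer, intent(in) :: unit
    character(len=16)   :: frm
    logical             :: is_open

    inquire(unit=unit, opened=is_open, form=frm)
    if (is_open) then
      print *, 'unit', unit, 'is connected, FORM=', trim(frm)
    else
      print *, 'unit', unit, 'is not connected, FORM=', trim(frm)
    end if
  end subroutine report_form

end program inquire_form_demo
```

This only reports the form of the current connection; it says nothing about how else the file might be opened.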


Ok, now come the specifiers for which I don't understand (a) what
exactly the purpose is and (b) what value should be assigned when. (I talk
about FORMATTED below, but with UNFORMATTED I have exactly the same
problems.)

"The scalar-default-char-variable in the FORMATTED= specifier is
assigned the value YES if FORMATTED is included in the set of allowed
forms for the file, NO if FORMATTED is not included in the set of
allowed forms for the file, and UNKNOWN if the processor is unable to
determine whether or not FORMATTED is included in the set of allowed
forms for the file."

First case: INQUIRE(FILE=existing_file,...), where the file is not
connected/opened. Well, "UNKNOWN" can of course always be returned, but are
there scenarios where one would return FORMATTED="YES" or "NO"?
For a directory, one could e.g. return "NO". And in principle, if a file
is readable, writable and seekable, there should be no problem opening
it, either as FORMATTED or as UNFORMATTED; whether one can read
something from the file is another question.

gfortran and g95 currently return the following for non-opened files.
Inquire by UNIT: "UNKNOWN".
Inquire by file name: "UNKNOWN" if the file does not exist, "NO" for a
directory, "YES" for a file (regular, block/character device, named
pipe), and "UNKNOWN" for anything else.
NAG f95, ifort and sunf95 return "UNKNOWN", which is always right.

Is the behaviour of gfortran/g95 correct?
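The first case can be probed with a small test program along these lines. It is only a sketch: the file name `existing_file.dat` is a placeholder, and the values printed depend on the processor, as the compiler comparison above shows.

```fortran
program inquire_by_name
  implicit none
  ! Placeholder name; substitute any path (regular file, directory, ...).
  character(len=*), parameter :: fname = 'existing_file.dat'
  character(len=16) :: fmt, unf
  logical           :: there

  ! Inquire by file: the file is not connected to any unit here.
  inquire(file=fname, exist=there, formatted=fmt, unformatted=unf)

  print *, 'exists:       ', there
  print *, 'FORMATTED=    ', trim(fmt)   ! YES, NO or UNKNOWN
  print *, 'UNFORMATTED=  ', trim(unf)   ! YES, NO or UNKNOWN
end program inquire_by_name
```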


Second case: The file is opened.
What should be returned for (UN)FORMATTED? Simply FORMATTED = "YES" if
FORM="FORMATTED" and otherwise "NO"? If this is the case, what is the
purpose of the FORMATTED vs. UNFORMATTED?
Or should it return whether it is in principle possible to open the
file as (UN)FORMATTED? Then one would always return "YES" (I'm ignoring
issues like whether a file is seekable)?

gfortran/g95 return always "YES" in this case,
ifort/NAG f95/sunf95 return FORMATTED="YES", UNFORMATTED="NO" if the
file is opened as FORMATTED.
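The second case can be checked the same way. The sketch below uses an arbitrary unit number and a scratch file; it connects the file for formatted I/O and then asks for all three specifiers, so the two behaviours described above can be told apart on a given compiler.

```fortran
program inquire_opened
  implicit none
  integer, parameter :: u = 20   ! arbitrary unit number for this sketch
  character(len=16)  :: frm, fmt, unf

  open(unit=u, status='scratch', form='formatted')

  ! FORM= describes the current connection; FORMATTED= and UNFORMATTED=
  ! describe the set of allowed forms for the file.
  inquire(unit=u, form=frm, formatted=fmt, unformatted=unf)

  print *, 'FORM=         ', trim(frm)
  print *, 'FORMATTED=    ', trim(fmt)
  print *, 'UNFORMATTED=  ', trim(unf)

  close(u)
end program inquire_opened
```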

In total, my problem is: What is meant by "in the set of allowed forms
for the file"? Depending on how I understand it, I would expect either the
g* behaviour or the ifort/(sun)f95 behaviour.

In particular, for the ifort/sunf95/f95 behaviour I completely miss the
purpose of the (UN)FORMATTED specifiers, as they provide effectively the
same information as the FORM specifier.


Tobias

PS: Besides an interpretation of the standard, I'm also interested in what
would be useful to return - as "this is implementation dependent" also
has to be implemented somehow.

Richard Maine

Dec 8, 2006, 12:15:42 PM
Tobias Burnus <bur...@net-b.de> wrote:

> I'm trying to understand the FORM, FORMATTED and UNFORMATTED specifiers
> of INQUIRE.
>

> Let's start with FORM:..


> Ok, this is clear to me. Especially, it means given a UNIT (or a FILE
> name) I can determine whether a file is connected and, if yes, whether
> I need to access it as unformatted or formatted.

Well... I don't like your use of "need" there. While correct, I find it
potentially misleading. It tells you whether the file was opened (aka
connected) with formatted or unformatted. You "need" to use the way that
corresponds to the current open, but this says nothing about whether or
not you might also be able to use the other way on the same file with a
different open (at a different time).
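To illustrate the distinction, here is a minimal sketch that connects the same file first as formatted and later as unformatted. The file name `reopen_demo.tmp` and the unit number are made up for the example, and whether the second OPEN succeeds is processor dependent, which is exactly the point.

```fortran
program reopen_other_form
  implicit none
  integer, parameter :: u = 21                            ! arbitrary unit
  character(len=*), parameter :: fname = 'reopen_demo.tmp' ! made-up name
  character(len=16) :: frm
  integer           :: ios

  ! First connection: formatted.
  open(unit=u, file=fname, status='replace', form='formatted', iostat=ios)
  if (ios /= 0) stop 'first OPEN failed'
  write(u, '(a)') 'hello'
  close(u)

  ! Second connection, at a different time: unformatted.
  open(unit=u, file=fname, status='old', form='unformatted', iostat=ios)
  if (ios /= 0) then
    print *, 'processor refused the unformatted OPEN, iostat =', ios
  else
    inquire(unit=u, form=frm)
    print *, 'reopened, FORM= ', trim(frm)
    close(u, status='delete')
  end if
end program reopen_other_form
```

A successful second OPEN does not mean that unformatted READs on that file will give anything meaningful.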

> First case: INQUIRE(NAME=existing_file,...), which is not
> connected/opened. Well, "UNKNOWN" can of course always be returned, but are
> there scenarios where one would return FORMATTED="YES" or "NO"?

Certainly. I suspect you must be thinking of implementation issues for
current operating systems, because the answer seems so obvious in
general. If the file can definitely be opened as a formatted file, it
returns yes. If it definitely cannot be, it returns no. But I just
restated what the standard said without even changing the words much. I
can't think of a more obvious way to say it.

Now most current operating systems don't have a good way to tell. So I
suspect most compilers will probably return unknown. That's likely the
practical answer.

> For a directory, one could e.g. return "NO". And in principle if a file
> is readable, writable and seekable there should be no problem opening
> it - either as FORMATTED or as UNFORMATTED, whether one can read
> something from the file is another question.

Not necessarily so. You are thinking of particular implementations. It
is quite possible that a system might refuse to open a file the "wrong"
way, and even that this could be at a level below that dealt with by the
Fortran run-time library. It is also quite plausible that the Fortran
run-time library might be "smart enough" to notice the problem and make
the OPEN fail. It could be regarded as a better approach than opening it
even though nothing will work. The Fortran library might get the data
from the OS. Most OSes today don't keep such data, but some have/do. Or
the Fortran run-times might even look at the first part of the file and
deduce what's up.

> gfortran and g95 currently return for non-opened files: UNIT: "UNKNOWN"
> NAME: Does not exist "UNKNOWN", directory "NO", File (regular,
> block/character device, named pipe) "YES", else "UNKNOWN".
> NAG f95, ifort and sunf95 return "UNKNOWN" which is always right.
>
> Is the behaviour of gfortran/g95 correct?

Correct? Yes. The most user-friendly? That's more arguable. Worth doing
better? That's also arguable. Since current OSes don't tend to provide a
handy way to get this data reliably, I don't generally recommend that
users depend on it.

> Second case: The file is opened.
> What should be returned for (UN)FORMATTED? Simply FORMATTED = "YES" if
> FORM="FORMATTED" and otherwise "NO"?

That one is definitely wrong. The YES part is likely correct, although
arguable pedantically. If you succeeded in opening the file with formatted,
then presumably it is allowed. If it wasn't allowed, then one would have
hoped the OPEN would fail instead of succeeding and then claiming (via
inquire) that you weren't allowed to do that. Since error conditions are
compiler-defined, the pedant might argue differently, but that would
seem silly.

But just because the file is opened formatted says little about whether
it is capable of being opened unformatted. I suppose that if the system
was such that only one of the opens would work, you might make
that deduction, but that's not what I hear you saying. The standard says
"set of allowed forms" - not "currently used form".

> what is the purpose of the FORMATTED vs. UNFORMATTED?
> Or should be returned whether it is in principle possible to open the
> file as (UN)FORMATTED?

Yes, that's the idea. I don't know how to state it any more explicitly
than the standard.

> Then one would always return "YES" (I'm ignoring
> issues like whether a file is seekable.)?

Not necessarily so. I see you making assumptions here based on common
current systems. There is nothing inherently so about that. It might
well not be possible in principle to open a file in the wrong mode. The
standard is broader than that (and was written at a time when there was
more diversity in operating systems).



> In total, my problem is: What is meant by "in the set of allowed forms
> for the file"?

I hope I explained, but I'm not sure. You need to back off from thinking
that what current common operating systems do is the only way. The words
seem simple to me. I think the problem arises because the facility
isn't very useful with current operating systems and you are
overgeneralizing that. So you see a facility that seems useless and are
wondering why such a thing would ever be. That's because it could be
(was) useful in other contexts.

> PS: Besides an interpretation of the standard, I'm also interested what
> would be useful to return - as "this is implementation dependent" has
> also somehow to be implemented.

I'd probably just punt and return UNKNOWN for almost all cases, since
you basically can't tell without extra fuss - and users who depend on
you going to the trouble are going to have code portability problems
anyway. If you can detect some obvious cases, then it is ok to give the
data, but I wouldn't go out of your way. Just my opinion.

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain

Tobias Burnus

Dec 8, 2006, 12:31:42 PM
Hello,

Tobias Burnus wrote:
> Ok, now comes the specifiers for which I don't understand what is (a)
> exactly the purpose and (b) what value should be assigned when. (I talk
> about FORMATTED below, but with UNFORMATTED I have exactly the same
> problems.)

Ok, I found something about this from 1998:
http://groups.google.com/group/comp.lang.fortran/browse_thread/thread/661e9ecf0a74d8d/

Namely the answer by Richard Maine to the question: "Is there another
way (portable) to check if file is formatted or unformatted?"

* * *

"Closer to what you are asking for are the FORMATTED= and
UNFORMATTED= specifier in INQUIRE. They tell you whether the
file can be connected for formatted or unformatted I/O. Note
that it is possible for both of them to return "YES". It
is also allowed for them to return "UNKNOWN", which means
that you can't count on them to be definitive.

"On unix systems, there basically is not a reliable method to
tell. The operating system doesn't keep track of such things.
All you (or the compiler runtimes) can do is look at the contents
of the file to see if it looks plausible as formatted or
unformatted. But this isn't infallible. It is possible to
write *ANY* content to the file with unformatted direct access
I/O - this includes content that might look a lot like some
other form (or might start out looking like some other form
and then change half-way through the file)."

* * *

Thus, I think for a not-opened, existing file, both "YES" (can be opened
in either way, reading might give garbage) and "UNKNOWN" (the more
careful approach) make sense.

For files opened as unformatted, I'm inclined to think formatted="NO"
is the better choice, even if the file could indeed be closed and then
opened as a formatted file (at least for empty files this should work
just fine). But here too, both should be ok.

Any other suggestions on how an implementation should behave to please
users (with the loose restriction that it should be standard conforming)?

Tobias

glen herrmannsfeldt

Dec 8, 2006, 1:43:12 PM
Richard Maine <nos...@see.signature> wrote:
> Tobias Burnus <bur...@net-b.de> wrote:

>> I'm trying to understand the FORM, FORMATTED and UNFORMATTED specifiers
>> of INQUIRE.
(snip)


> Certainly. I suspect you must be thinking of implementation issues for
> current operating systems, because the answer seems so obvious in
> general. If the file can definitely be opened as a formatted file, it
> returns yes. If it definitely cannot be, it returns no. But I just
> restated what the standard said without even changing the words much. I
> can't think of a more obvious way to say it.

> Now most current operating systems don't have a good way to tell. So I
> suspect most compilers will probably return unknown. That's likely the
> practical answer.

For OS derived from OS/360, UNFORMATTED must be RECFM=VBS and
FORMATTED must not be VBS. I believe that VMS also has record
format information in its file system that could be used.

Unix and DOS/Windows only give a stream of bytes which must
contain any information needed.

It would have been nice for UNFORMATTED files to contain a unique
byte sequence at the beginning, one that would be unlikely to
occur on a FORMATTED file. Many unix file formats are designed
to be recognized by the first few bytes, and the unix file command
will read those and tell you what kind of file it is. As far
as I know, that is not usually done for unix/DOS/windows
UNFORMATTED files.

-- glen

robert....@sun.com

Dec 9, 2006, 2:56:02 AM

Tobias Burnus wrote:
> Hello,
>
> I'm trying to understand the FORM, FORMATTED and UNFORMATTED specifiers
> of INQUIRE.

One of the nastier cases arises when a file is implicitly
opened as a result of a BACKSPACE, ENDFILE, or
REWIND statement on a unit that is not yet connected
to a file. In Fortran 90/95, the file would be known to be
sequential access, but whether it was formatted or
unformatted would not be determined. In Fortran 2003,
it is no longer known to be sequential access.

Implicitly opening files is an extension to the standard,
but because it was the only way to open files in
FORTRAN 66, many programs did it. Most modern
Fortran implementations still support such usage.

Bob Corbett

Terence

Dec 9, 2006, 6:13:17 PM
I just don't agree.
You can't ask that question and get a sensible answer, EVEN if it is
now open!
It may be open wrongly.

I can open any file as SEQUENTIAL, BINARY, and read a few blocks
(sectors) of data and tell the probability of it having been written as
FORMATTED (with cr-lf code pairs but never separate cr or lf anywhere,
and no other characters of value 0-31 except a final 1A followed only
by more 1A characters).

If not the case, you probably have binary data, in which case you just
have to look for bridging codes to see if the data was written as
sequential unformatted, or else non-standard (best-to-use) sequential
binary (known as 'transparent' by some).

Then you close the file and open it as you have decided it should be
read.
Note: even sequential binary files can be written to produce a valid
sequential formatted output format (or any other format with care).
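A rough sketch of that kind of inspection follows. It swaps the non-standard BINARY form for Fortran 2003 stream access. It is also a simplified version of the rule described above: it only flags control characters other than TAB, LF, CR and 1A, and does not check cr-lf pairing. The file name `unknown.dat`, the 4096-byte sample size and the allowed-character set are assumptions made for the example.

```fortran
program guess_text
  implicit none
  integer, parameter :: u = 22, nmax = 4096            ! arbitrary unit, sample size
  character(len=*), parameter :: fname = 'unknown.dat'  ! placeholder name
  character(len=nmax) :: buf
  integer :: ios, fsize, n, i, c
  logical :: looks_text

  open(unit=u, file=fname, status='old', access='stream', &
       form='unformatted', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  inquire(unit=u, size=fsize)     ! size in file storage units, -1 if unknown
  if (fsize < 0) stop 'file size not available'
  n = min(fsize, nmax)

  looks_text = .true.
  if (n > 0) then
    read(u, iostat=ios) buf(1:n)  ! read the first n bytes as raw characters
    if (ios /= 0) stop 'read failed'
    do i = 1, n
      c = ichar(buf(i:i))
      ! Control characters other than TAB (9), LF (10), CR (13) and
      ! Ctrl-Z (26, hex 1A) suggest binary content.
      if (c < 32 .and. c /= 9 .and. c /= 10 .and. c /= 13 .and. c /= 26) then
        looks_text = .false.
        exit
      end if
    end do
  end if
  close(u)

  print *, 'sample looks like text: ', looks_text
end program guess_text
```

The result is only a guess about the sampled bytes, not a statement about how the file was written.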

Richard E Maine

Dec 9, 2006, 10:14:40 PM
Terence <tbwr...@cantv.net> wrote:

> I can open any file as SEQUENTIAL, BINARY,....


> Note: even sequential binary files can be written to produce a valid
> sequential formatted output format (or any other format with care).

This all refers to particular compilers on particular operating systems.
It is simply not true in general.

For a start, BINARY is nonstandard and is not supported on all
compilers. (And on some, there is something like that, but spelled a
different way).

Second, it is simply not true that all operating systems allow you to
open any file with "wrong" parameters. Most currently popular operating
systems are like that, but it is not true in general.

--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain

Terence

Dec 10, 2006, 2:52:02 AM
Neither the operating system nor even the compiler matters, as long as the
external storage device is capable of being written and read by one
unit of storage (usually a multiple-of-8 bit wide datum) at a time, or
an equivalence to that can be constructed in intermediary hardware or
software programming.

An equivalence for devices with physically limited access blocks (e.g.
a tape drive) uses a buffer, to or from which units of storage are
written or extracted and the buffer written or read over as applicable.
Disc drives do this with hardware buffers for sector multiples.
The operating system adds another intermediary layer of software buffering.

In which case, IF there are these four major Fortran language file
format alternatives to choose among to determine which format was used
to store the data on the device, and therefore with which format to
open the file, (to make life easier), the process I described more
simply before still applies.

Now which description do you prefer?
Isn't the object to help enquirers with common-sense descriptions that
are understandable? This wasn't a profound problem in the first place!

I'm always getting data files sent to me, to work out not only how they
were written, but what was encoded and how. Knowing that, you already
know something about the nature of the data. Assuming no encryption
of the file, a hex/ascii dump quickly tells the eye if you have text or
binary-coded numbers and values (or a mixture). For binary files,
statistics on the 256 possible bit combinations found along the data,
and spacings between recurrences, yield more information and point to
data types. Most numeric data is of positive sign on the whole, so
mantissas can generally be found where suspected, since other bytes
have even probability of a top (sign?) bit set, except for small
integers. And sequences of hex binary zeroes point to word lengths.

Now you know why I use device physical-length-blocked DIRECT
UNFORMATTED or SEQUENTIAL BINARY (where applicable) to read unknown
data files.

Dr Ivan D. Reid

Dec 10, 2006, 4:46:59 AM
On 9 Dec 2006 23:52:02 -0800, Terence <tbwr...@cantv.net>

wrote in <1165737122.2...@l12g2000cwl.googlegroups.com>:
> Nor the operating system nor even the compiler matters, as long as the
> external storage device is capable of being written and read by one
> unit of storage (usually a multiple-of-8 bit wide datum) at a time, or
> an equivalence to that can be constructed in intermediary hardware or
> software programming.

Try doing that on OpenVMS...

--
Ivan Reid, School of Engineering & Design, _____________ CMS Collaboration,
Brunel University. Ivan.Reid@[brunel.ac.uk|cern.ch] Room 40-1-B12, CERN
KotPT -- "for stupidity above and beyond the call of duty".

Jim

Dec 10, 2006, 1:23:42 PM

"Dr Ivan D. Reid" <Ivan...@brunel.ac.uk> wrote in message
news:slrnennlsj.2...@loki.brunel.ac.uk...
Just to amplify, the Record Management System (through which nearly all file
system requests pass) will not allow any program compiled with whatever
compiler to open an existing file with parameters which do not match the
parameters which were used to create the file. I should point out that
there are four different schemes for writing formatted files (VFC,
Stream-CR, Stream-LF, and Stream-CRLF). The programmer must know which of
these formats that file contains.

Nevertheless, a determined programmer can bypass RMS by appropriate use of
system services. Such a method eliminates Fortran and RMS overhead (but it
is most assuredly not portable to any other operating system).

Jim



me...@skyway.usask.ca

Dec 10, 2006, 8:07:44 PM
In a previous article, "Jim" <j...@nospam.com> wrote:
>
>"Dr Ivan D. Reid" <Ivan...@brunel.ac.uk> wrote in message
>news:slrnennlsj.2...@loki.brunel.ac.uk...
>> On 9 Dec 2006 23:52:02 -0800, Terence <tbwr...@cantv.net>
>> wrote in <1165737122.2...@l12g2000cwl.googlegroups.com>:
>>> Nor the operating system nor even the compiler matters, as long as the
>>> external storage device is capable of being written and read by one
>>> unit of storage (usually a multiple-of-8 bit wide datum) at a time, or
>>> an equivalence to that can be constructed in intermediary hardware or
>>> software programming.
>>
>> Try doing that on OpenVMS...
>>
It can be done if you specify the record length (in words)
and unformatted in the open statement
from Alpha vms: (864 byte o/p )

      open(18,file=gname,form='unformatted',
     1     access='sequential',recordtype='fixed',
     1     status='new',recl=216,err=860,iostat=ios)

Chris

Terence

Dec 11, 2006, 1:44:57 AM
Just to clarify.
Any file is bits on a device.
As far as I know few operating systems actually store in the directory,
any information as to how the data was written, and especially NOT the
language used.

It's another case of "anything a man can do, another man can undo".

glen herrmannsfeldt

Dec 11, 2006, 2:27:48 AM
Terence wrote:
> Just to clarify.
> Any file is bits on a device.
> As far as I know few operating systems actually store in the directory,
> any information as to how the data was written, and especially NOT the
> language used.

It was the great invention of unix not to store any
information other than the bits. Until unix, it was more usual
to store some extra information. Even so, it still isn't so unusual.
MacOS has the resource fork where some other information can go.
NTFS keeps the resource fork for MacOS file sharing.

IBM's MVS, as a descendant of OS/360 has RECFM=VBS (RECord ForMat),
which I believe was specifically designed for Fortran UNFORMATTED
files. You can't read/write VBS with FORMATTED I/O, and can't
read/write other than VBS for UNFORMATTED. VBS includes the record
length descriptors as part of the file structure, along with the
ability to write blocks longer than a disk track.

VMS also has a file system with record structures, though I believe
it is built on 512 byte blocks. It is possible to change the
description in the directory (SET FILE/ATTRIBUTES) and read a
file using a different structure.

-- glen

glen herrmannsfeldt

Dec 11, 2006, 2:47:53 AM
me...@skyway.usask.ca wrote:

(snip)

>>>Try doing that on OpenVMS...

> It can be done if you specify the record length (in words)
> and unformatted in the open statement
> from Alpha vms: (864 byte o/p )

> open(18,file=gname,form='unformatted',
> 1 access='sequential',recordtype='fixed',
> 1 status='new',recl=216,err=860,iostat=ios)

I would use SET FILE/ATTRIBUTES first to change the record
format attributes.

-- glen

Jan Vorbrüggen

Dec 11, 2006, 3:52:34 AM
> It would have been nice for UNFORMATTED files to contain a unique
> byte sequence at the beginning, one that would be unlikely to
> occur on a FORMATTED file.

Well, for the usual format of UNFORMATTED files on most current OSes, you
could read the first four or eight bytes; if the value is reasonable as a
record length (given, for instance, the constraint set by the file's size)
you read the record marker indicated by that length, and if they are equal,
you likely have an UNFORMATTED file. It seems highly unlikely this would ever
be true of a formatted file, but it is still just a heuristic.
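A sketch of that check, under loudly stated assumptions: 4-byte record markers in native byte order (a common layout on current systems, but not guaranteed), Fortran 2003 stream access to get at the raw bytes, and a placeholder file name `unknown.dat`. Files larger than a default integer can count, and compiler-specific subrecord schemes, are ignored.

```fortran
program guess_unformatted
  use iso_fortran_env, only: int32
  implicit none
  integer, parameter :: u = 23                          ! arbitrary unit
  character(len=*), parameter :: fname = 'unknown.dat'  ! placeholder name
  integer(int32) :: head, tail
  integer        :: ios, fsize
  logical        :: plausible

  open(unit=u, file=fname, status='old', access='stream', &
       form='unformatted', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open file'

  inquire(unit=u, size=fsize)     ! size in file storage units, -1 if unknown

  plausible = .false.
  if (fsize >= 8) then
    ! Leading marker: assumed to hold the byte length of the first record.
    read(u, pos=1, iostat=ios) head
    if (ios == 0) then
      ! The length must leave room for both 4-byte markers within the file.
      if (head >= 0 .and. head <= fsize - 8) then
        ! Trailing marker sits right after the record data.
        read(u, pos=5 + head, iostat=ios) tail
        if (ios == 0) plausible = (tail == head)
      end if
    end if
  end if
  close(u)

  print *, 'first record looks like sequential unformatted: ', plausible
end program guess_unformatted
```

As said above, a match is only a heuristic; a mismatch does not prove that the file is formatted either.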

Jan

Jan Vorbrüggen

Dec 11, 2006, 3:57:03 AM
> I should point out that there are four different schemes for writing
> formatted files (VFC, Stream-CR, Stream-LF, and Stream-CRLF).

There are even more. There are fixed-length records, variable length records
(VFC is with fixed control - that's the FC), and ISAM files, which you can
also read sequentially (been there, done that). Likely missed a few.

Jan
