Writing to a binary file ?

john.chl...@gmail.com

Aug 21, 2012, 11:03:07 AM
I've tried the code snippet below to create and write to a binary file (using gfortran):

integer, parameter :: N = 3
double precision, dimension( N, N ) :: M
...
open(unit=8,file='m.bin', action='readwrite', form='UNFORMATTED')

do i=1,N
write (unit=8, fmt=*) M(:, i)
end do

close (unit=8)

But all I get is:

At line 57 of file ex4.f95 (unit = 8, file = 'm.bin')
Fortran runtime error: Format present for UNFORMATTED data transfer

Again, I'm using gfortran.

---John


Richard Maine

Aug 21, 2012, 12:06:47 PM
Well, yes. You can use a format only for formatted files - not for
unformatted ones. It makes no sense to have a format for an unformatted
file. The fmt=* specifies to use * as a format. So delete the fmt=*
part.

You might have been misled by a common terminology error - one which I
regularly correct people about specifically because it causes such
confusions. Some people incorrectly refer to fmt=* as being unformatted
because it does not specify the details of the format. Nonetheless, it
is still a format specification. Particularly, it specifies that
list-directed formatting be used.

You did not mention the details of what you need this file to look like,
but your use of the term "binary" leads me to suspect that you might
also need another change in the code. "Binary" is not technically a term
used in Fortran files. Strictly speaking, "binary" just means "base 2",
which is sort of a pointless term as everything on current computers is
stored in base 2 at some level. The silliness of that bit of terminology
is another pet peeve of mine. But when people ask for "binary" files,
what they most often mean is a file that has just the bits of the data
items written to the file, with nothing else.

A Fortran unformatted sequential file will have the bits of the data
items, but it generally will also have header data for each
record. This header data is added automatically by the Fortran runtime
libraries. If you read the data back in the same way using the same
compiler, the compiler will know what to do with those headers and it
will be largely transparent to you.

But if you are writing the file for the use of some other program that
does not expect the header data, you won't get what you want.

What I suspect you want is what is known in Fortran as an unformatted
stream file. That's usually what people want when they say "binary". To
write an unformatted stream file, add access='stream' to the OPEN
statement (in addition to removing the fmt=* from the WRITE statement).
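
A minimal sketch of those two changes applied to the posted snippet, assuming a compiler that supports Fortran 2003 stream access (gfortran does). The program name and the values stored in M are placeholders, not part of the original code:

```fortran
program write_stream
  implicit none
  integer, parameter :: N = 3
  double precision, dimension(N, N) :: M
  integer :: i

  M = 1.0d0   ! placeholder data, not from the original post

  ! access='stream' asks for a plain byte stream without record headers
  open(unit=8, file='m.bin', action='readwrite', form='unformatted', &
       access='stream')

  do i = 1, N
     write (unit=8) M(:, i)   ! no fmt= on an unformatted write
  end do

  close (unit=8)
end program write_stream
```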

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

john.chl...@gmail.com

Aug 21, 2012, 3:10:51 PM
On Tuesday, August 21, 2012 12:06:47 PM UTC-4, Richard Maine wrote:
I'm a C coder and am translating some C++ code to Fortran'95. So I was assuming binary meant the same thing it does in C ?

BTW, when I try:

do j=1,N
write (unit=8) M(j, :)
end do

the "binary" file is 96 bytes.

When I try:

do i=1,N
do j=1,N
write (unit=8) M(j, i)
end do
end do

the "binary" file is 144 bytes.

Why?

---John

ken.fa...@gmail.com

Aug 21, 2012, 3:51:45 PM
From your previous post, M is double precision, 8 bytes per value,
and N=3.

Then each M(j,:) is 3x8=24 bytes, and the whole of M(:,:) is 9x8=72
bytes in both cases.

Now note that 96-72=24 for 3 records, and 144-72=72 for 9 records.

In other words, there appears to be an 8-byte overhead per record in
the file (see Richard Maine's reply about UNFORMATTED record-oriented
versus STREAM access files).

Richard pointed you in the correct direction: try your experiment
again adding ACCESS='STREAM' to the OPEN statement and come back
here with the results.

-Ken

P.S. For record-oriented files, it is common for there to be byte
counts at the beginning and end of each record. That is not
standardized and may be different for different compilers and on
different platforms.

Ian Harvey

Aug 21, 2012, 3:58:33 PM
On 2012-08-22 5:10 AM, john.chl...@gmail.com wrote:
> On Tuesday, August 21, 2012 12:06:47 PM UTC-4, Richard Maine wrote:
>> <...> wrote:
>>> open(unit=8,file='m.bin', action='readwrite', form='UNFORMATTED')
>
> I'm a C coder and am translating some C++ code to Fortran'95. So I
> was assuming binary meant the same thing it does in C ?
>
> BTW, when I try:
>
> do j=1,N
> write (unit=8) M(j, :)
> end do
>
> the "binary" file is 96 bytes.
>
> When I try:
>
> do i=1,N
> do j=1,N
> write (unit=8) M(j, i)
> end do
> end do
>
> the "binary" file is 144 bytes.
>
> Why?

Overhead for record book-keeping. Fortran files are record based in the
absence of specifiers to the contrary. Each write statement in your
code creates a new record.

Try again with an ACCESS='STREAM' specifier in the open statement. For
unformatted files that tells the processor that there is no particular
record format - the file is just a stream of "bytes".


Gordon Sande

Aug 21, 2012, 4:25:41 PM
N records written

> the "binary" file is 96 bytes.
>
> When I try:
>
> do i=1,N
> do j=1,N
> write (unit=8) M(j, i)
> end do
> end do


N*N records written


> the "binary" file is 144 bytes.
>
> Why?

Each record has its own overhead. The overhead enables partial record
reads, backspacing and all the things one associates with a record
oriented file system. Makes a goodly variety of things very easy.
Interfacing with C byte streams was not on the important list when the
i/o system was defined. Stream files were added later to make that more
workable. That is why Richard recommended them. Richard provides long,
good and very patient answers so I suggest you read them carefully.

> ---John


john.chl...@gmail.com

Aug 21, 2012, 4:10:33 PM
I tried:

open(unit=8, file='m.bin', action='readwrite', form='unformatted', access='stream')

with both:

do i=1,N
do j=1,N
write (unit=8) M(j, i)
end do
end do

and:

do j=1,N
write (unit=8) M(j, :)
end do

and now both m.bin files are 96 bytes.

BUT if I did this with C, I would get 72 (9x8) bytes. Why the extra 24 bytes and can I get rid of these bytes?

---John

glen herrmannsfeldt

Aug 21, 2012, 4:50:27 PM
john.chl...@gmail.com wrote:

(snip)

> I'm a C coder and am translating some C++ code to Fortran'95.
> So I was assuming binary meant the same thing it does in C ?

Binary might, but "UNFORMATTED" doesn't mean binary.

You probably want unformatted stream, which is much closer to what
C programmers are used to.

> BTW, when I try:

> do j=1,N
> write (unit=8) M(j, :)
> end do

Back to Fortran I, and a record oriented file system, each WRITE
writes one record, each READ reads one. Records were written to tape,
with an inter-record (now inter-block) gap in between.

An unformatted READ will read a whole record, even if the I/O
list isn't that long. (Don't try reading more, though.)
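
A small sketch of that behaviour, assuming default sequential unformatted access. The file name rec.bin and the values are made up for illustration. Each READ consumes a whole record even though it asks for only one value, so the loop picks up the first value of each record (11, 21 and 31 here):

```fortran
program short_read
  implicit none
  double precision :: row(3), first
  integer :: j

  ! Write three records of three values each (sequential unformatted).
  open(unit=8, file='rec.bin', form='unformatted', status='replace', &
       action='write')
  do j = 1, 3
     row = [10.0d0*j + 1, 10.0d0*j + 2, 10.0d0*j + 3]
     write (unit=8) row
  end do
  close (unit=8)

  ! Read one value per READ; the rest of each record is skipped.
  open(unit=8, file='rec.bin', form='unformatted', status='old', &
       action='read')
  do j = 1, 3
     read (unit=8) first
     print *, first   ! first value of record j
  end do
  close (unit=8)
end program short_read
```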

> the "binary" file is 96 bytes.

> When I try:

> do i=1,N
> do j=1,N
> write (unit=8) M(j, i)
> end do
> end do

> the "binary" file is 144 bytes.

> Why?

On a record-oriented file system, like VMS or IBM's OS/360 and
successors, each WRITE writes a whole record. On byte oriented file
systems, record headers (and usually trailers) with the record
length are added before (and after) each record. (The trailers
are needed to make BACKSPACE easier.)

If you think about how unix read(2) works on tapes, you will have
a better idea how Fortran UNFORMATTED works.

-- glen

Richard Maine

Aug 21, 2012, 6:13:57 PM
<john.chl...@gmail.com> wrote:

> I tried:
>
> open(unit=8, file='m.bin', action='readwrite', form='unformatted',
> access='stream')
>
> with both:
>
> do i=1,N
> do j=1,N
> write (unit=8) M(j, i)
> end do
> end do
>
> and:
>
> do j=1,N
> write (unit=8) M(j, :)
> end do
>
> and now both m.bin files are 96 bytes.
>
> BUT if I did this with C, I would get 72 (9x8) bytes. Why the extra 24
> bytes and can I get rid of these bytes?

I suspect that you did not delete the old file before rewriting it. In
that case, you are just overwriting the first 72 bytes of the old file,
leaving the last ones as they were before. This, by the way, is exactly
what C would also do in writing to an existing file. Fortran stream I/O
is modelled very much after C, including even some subtle points that
many C programmers aren't aware of. (Those points are mostly in
formatted stream). If you delete the old file before running this code,
I suspect you will find that it creates a file of 72 bytes.

Unless you actually intend to overwrite the old file, I recommend using
status='replace' in the open. That will avoid the necessity of deleting
any old file before the new run. The status='replace' in an OPEN says to
replace any previous file with a new one, or just create a new one if
there is no previous file.

I would also recommend action='write' instead of action='readwrite'
unless you also intend to read from the file in the same program. While
the readwrite is not causing you any current problem, I don't see any
point to it. Specifying the actual intended use of the file can help
catch nasty bugs. I *HAVE* seen people destroy what were supposed to be
input files by accidentally writing on them (perhaps just by using a
wrong unit number). It is less obvious what problems could occur from
specifying readwrite instead of write (perhaps sharing issues), but I'm
still a fan of saying exactly what is intended when that is just as easy
to do as saying something that ought to work, but isn't precisely what
you mean.
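
A sketch of an OPEN along those lines, assuming Fortran 2003 stream access and the size= specifier of INQUIRE. The program name and the data values are placeholders. The size is reported in file storage units, which are bytes on common compilers, so 72 would be expected when double precision occupies 8 bytes:

```fortran
program replace_stream
  implicit none
  integer, parameter :: N = 3
  double precision :: M(N, N)
  integer :: nbytes

  M = 0.0d0   ! placeholder data

  ! status='replace' discards any previous m.bin;
  ! action='write' states the intended use of the file
  open(unit=8, file='m.bin', status='replace', action='write', &
       form='unformatted', access='stream')
  write (unit=8) M          ! whole array, in column-major order
  close (unit=8)

  ! size= is a Fortran 2003 INQUIRE specifier (file storage units)
  inquire(file='m.bin', size=nbytes)
  print *, 'file size:', nbytes
end program replace_stream
```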

Oh, and as an aside in answer to a previous post, "binary" is not a
Fortran term at all, so it would be pretty hard for it to mean the same
thing in Fortran as it does in C. Note that nothing in the Fortran code
you posted uses that word. There have in the past been some nonstandard
extensions with various syntax, some of which used the term "binary",
but nothing in the standard. Yes, I suspect those extensions were
motivated by the C terminology (which I still regard as bizarre). The
unformatted stream of f2003 provides a standard way to do what those
nonstandard extensions did.

Gib Bogle

Aug 21, 2012, 6:30:52 PM
On 22/08/2012 8:10 a.m., john.chl...@gmail.com wrote:

> I tried:
>
> open(unit=8, file='m.bin', action='readwrite', form='unformatted', access='stream')
>
> with both:
>
> do i=1,N
> do j=1,N
> write (unit=8) M(j, i)
> end do
> end do
>
> and:
>
> do j=1,N
> write (unit=8) M(j, :)
> end do
>
> and now both m.bin files are 96 bytes.
>
> BUT if I did this with C, I would get 72 (9x8) bytes. Why the extra 24 bytes and can I get rid of these bytes?
>
> ---John
>

To simplify matters slightly, I tested with M as various sizes of integer
(I'm using Intel Fortran):

integer, parameter :: N = 3
integer(1), dimension( N, N ) :: M
integer :: i, j

open(unit=8,file='m.bin', action='readwrite', form='UNFORMATTED')

M = 1
!write (unit=8) ((M(i,j),j=1,N),i=1,N)
write (unit=8) M

close (unit=8)
end

(Note that in Fortran if just the array name is given the whole array is
written).

I examine the file contents with a binary editor (XVI32).

With integer(1) the file size is 9 + 8 = 17 bytes, and it contains these
(decimal) bytes:
9 0 0 0 1 1 1 1 1 1 1 1 1 9 0 0 0

With integer(2) the file size is 18 + 8 = 26 bytes, and it contains:
12 0 0 0
1 0
1 0
1 0
1 0
1 0
1 0
1 0
1 0
1 0
12 0 0 0

With integer(4) the file size is 36 + 8 = 44 bytes, and it contains:
24 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
24 0 0 0

With M real (i.e. real(4)), the file starts with 24 0 0 0 (as with
integer(4)), while in the double precision (i.e. real(8)) case it
starts with 48 0 0 0. In all cases except integer(1) the first byte is
proportional to the size of the data type. No doubt someone will
explain this. In any case, clearly the data are preceded and followed
by a 4-byte integer.

glen herrmannsfeldt

Aug 21, 2012, 6:54:01 PM
Richard Maine <nos...@see.signature> wrote:

(snip, someone wrote)
>> BUT if I did this with C, I would get 72 (9x8) bytes. Why the extra 24
>> bytes and can I get rid of these bytes?

> I suspect that you did not delete the old file before rewriting it. In
> that case, you are just overwriting the first 72 bytes of the old file,
> leaving the last ones as they were before. This, by the way, is exactly
> what C would also do in writing to an existing file.

It won't normally do that. If you open the file to write:

out=fopen(file, "w");

then any existing file is truncated.

If you open for read/write ("r+" as the second argument) then
it won't be truncated, and old data would stay. I am not sure what
C does if you open for read ("r") and start writing.

> Fortran stream I/O
> is modelled very much after C, including even some subtle points that
> many C programmers aren't aware of. (Those points are mostly in
> formatted stream). If you delete the old file before running this code,
> I suspect you will find that it creates a file of 72 bytes.

> Unless you actually intend to overwrite the old file, I recommend using
> status='replace' in the open. That will avoid the necessity of deleting
> any old file before the new run. The status='replace' in an OPEN says to
> replace any previous file with a new one, or just create a new one if
> there is no previous file.

> I would also recommend action='write' instead of action='readwrite'
> unless you also intend to read from the file in the same program. While
> the readwrite is not causing you any current problem, I don't see any
> point to it. Specifying the actual intended use of the file can help
> catch nasty bugs. I *HAVE* seen people destroy what were supposed to be
> input files by accidentally writing on them (perhaps just by using a
> wrong unit number). It is less obvious what problems could occur from
> specifying readwrite instead of write (perhaps sharing issues), but I'm
> still a fan of saying exactly what is intended when that is just as easy
> to do as saying something that ought to work, but isn't precisely what
> you mean.

Following the C convention, "write" would truncate, and "readwrite"
would not. Though I agree that status='replace' is a better choice.

> Oh, and as an aside in answer to a previous post, "binary" is not a
> Fortran term at all, so it would be pretty hard for it to mean the same
> thing in Fortran as it does in C. Note that nothing in the Fortran code
> you posted uses that word. There have in the past been some nonstandard
> extensions with various syntax, some of which used the term "binary",
> but nothing in the standard. Yes, I suspect those extensions were
> motivated by the C terminology (which I still regard as bizarre). The
> unformatted stream of f2003 provides a standard way to do what those
> nonstandard extensions did.

C almost requires binary arithmetic. Unsigned arithmetic is modulo some
power of two. It would be possible to do that on a machine using
another base, but the overhead would be high.

But note also that "binary" and "bit" have a few different meanings
in computer science and engineering. First, they can indicate values
with a radix of two. They can also describe a set of values that
can be either 0 or 1, but that don't necessarily have associated
place values. Bit is also used in information theory, where it can
have fractional values. (A decimal digit has about 3.32 bits of
information. Don't try storing it in 3.32 memory cells.)

As unix systems only allow 255 different characters within a record
of a text file, even though they are stored in bits the result, to me,
isn't a binary value. You wouldn't be happy with a Fortran system
that didn't allow INTEGER variables to have the value 10, 266, 522,
778, 1034, etc.

-- glen

glen herrmannsfeldt

Aug 21, 2012, 7:06:26 PM
Gib Bogle <g.b...@too.auckland.much.ac.spam.nz> wrote:

(snip)

> To simplify matters slightly, I tested with M various sizes of integer
> (I'm using Intel Fortran):

> integer, parameter :: N = 3
> integer(1), dimension( N, N ) :: M
> integer :: i, j

> open(unit=8,file='m.bin', action='readwrite', form='UNFORMATTED')

> M = 1
> write (unit=8) M

(snip)

> (Note that in Fortran if just the array name is given the whole
> array is written).

> I examine the file contents with a binary editor (XVI32).

> With integer(1) the file size is 9 + 8 = 17 bytes, and it contains these
> (decimal) bytes:
> 9 0 0 0 1 1 1 1 1 1 1 1 1 9 0 0 0

(snip)

> With M real (i.e. real(4)), the file starts with 24 0 0 0 (as with
> integer(4)), while in the double precision (i.e. real(8)) case it
> starts with 48 0 0 0. In all cases except integer(1) the first byte is
> proportional to the size of the data type. No doubt someone will
> explain this. In any case, clearly the data are preceded and followed
> by a 4-byte integer.

They are four byte record length indicators in little endian order.
(Each not including the length indicator itself.)

(The trailer is needed for BACKSPACE to work efficiently.)
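
One way to look at those markers without a hex editor is to reopen the record file as a byte stream. This is only a sketch: it assumes 4-byte markers before and after the record (the gfortran and Intel defaults for ordinary record sizes, but not standardized), and the file name seq.bin and the int32 kind from iso_fortran_env are choices made for the example. Under those assumptions both markers should read 72 for a 3x3 double precision array:

```fortran
program peek_marker
  use iso_fortran_env, only: int32
  implicit none
  integer(int32) :: head, tail
  double precision :: M(3, 3), copy(3, 3)

  M = 1.0d0
  ! Sequential unformatted: the runtime adds its own record markers.
  open(unit=8, file='seq.bin', form='unformatted', status='replace', &
       action='write')
  write (unit=8) M
  close (unit=8)

  ! Reopen the same file as a raw byte stream to see the markers.
  ! ASSUMPTION: one 4-byte marker before and one after the record.
  open(unit=8, file='seq.bin', form='unformatted', access='stream', &
       status='old', action='read')
  read (unit=8) head, copy, tail
  close (unit=8)

  print *, 'leading marker: ', head
  print *, 'trailing marker:', tail
end program peek_marker
```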

-- glen

glen herrmannsfeldt

Aug 21, 2012, 7:10:16 PM
Gib Bogle <g.b...@too.auckland.much.ac.spam.nz> wrote:

(snip)

> With M real (i.e. real(4)), the file starts with 24 0 0 0 (as with
> integer(4)), while in the double precision (i.e. real(8)) case it
> starts with 48 0 0 0. In all cases except integer(1) the first byte is
> proportional to the size of the data type. No doubt someone will
> explain this. In any case, clearly the data are preceded and followed
> by a 4-byte integer.

Oh, the values you print are in base 16, though that may
not be obvious. They are 9, 18, 36, and 72 which print
out as 09, 12, 24, and 48.

-- glen

Gib Bogle

Aug 21, 2012, 7:36:38 PM
Ah, my mistake.

Richard Maine

Aug 21, 2012, 9:25:20 PM
glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

> Richard Maine <nos...@see.signature> wrote:
>
> (snip, someone wrote)
> >> BUT if I did this with C, I would get 72 (9x8) bytes. Why the extra 24
> >> bytes and can I get rid of these bytes?
>
> > I suspect that you did not delete the old file before rewriting it. In
> > that case, you are just overwriting the first 72 bytes of the old file,
> > leaving the last ones as they were before. This, by the way, is exactly
> > what C would also do in writing to an existing file.
>
> It won't normally do that. If you open the file to write:
>
> out=fopen(file, "w");
>
> then any existing file is truncated.
>
> If you open for read/write ("r+" as the second argument) then
> it won't be truncated, and old data would stay. I am not sure what
> C does if you open for read ("r") and start writing.

Well, Fortran won't "normally" do that either, as long as I'm allowed to
define "normally" to mean whatever seems convenient to make the point at
the time. In particular, I would "normally" open a file with
status='replace' if I wanted to completely replace an old file.

The exact spellings of the C and Fortran statements are, of course,
different. But I repeat the contention that Fortran stream I/O was
modelled after C. I *KNOW* that to be what it is modelled after; it
isn't just a guess. I personally wrote the original proposal to add
stream I/O to the Fortran standard, and I have a pretty good idea what I
modelled it after. The formatted stream stuff was added after my
original paper, but I helped work on that also and again, I know that it
was intentionally modelled after C, particularly including the issue of
file truncation, which is messier for the formatted case.

Robin Vowels

Aug 21, 2012, 9:56:32 PM
There are overheads.
For each executed WRITE, there's a length written.
The length may be at both ends of the record.
Typically, the length information is 4 bytes.
Thus, for the second example, you have 9 x (8+4+4) = 144 bytes.
For the first example, you have 3 x (3 x 8 + 4 + 4) = 96 bytes.
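
The two cases can be reproduced with a sketch like the following, which writes each variant to its own file and asks for the sizes. The file names are invented for the example, and the size= specifier of INQUIRE is Fortran 2003. With 4-byte markers at both ends of every record the sizes should come out as 96 and 144, but the marker size is compiler dependent:

```fortran
program record_overhead
  implicit none
  integer, parameter :: N = 3
  double precision :: M(N, N)
  integer :: i, j, size_rows, size_elems

  M = 0.0d0   ! placeholder data

  open(unit=8, file='rows.bin', form='unformatted', status='replace', &
       action='write')
  do j = 1, N
     write (unit=8) M(j, :)      ! N records of N values each
  end do
  close (unit=8)

  open(unit=8, file='elems.bin', form='unformatted', status='replace', &
       action='write')
  do i = 1, N
     do j = 1, N
        write (unit=8) M(j, i)   ! N*N records of one value each
     end do
  end do
  close (unit=8)

  inquire(file='rows.bin', size=size_rows)
  inquire(file='elems.bin', size=size_elems)
  print *, 'rows.bin: ', size_rows
  print *, 'elems.bin:', size_elems
end program record_overhead
```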

john.chl...@gmail.com

Aug 21, 2012, 11:06:34 PM
You are correct, I didn't delete m.bin. And yes, when I did delete the file and ran the code, the file size is 72.

Thanks for the help!

---John

glen herrmannsfeldt

Aug 22, 2012, 2:58:08 AM
Richard Maine <nos...@see.signature> wrote:

(snip)
>> > I suspect that you did not delete the old file before rewriting it. In
>> > that case, you are just overwriting the first 72 bytes of the old file,
>> > leaving the last ones as they were before. This, by the way, is exactly
>> > what C would also do in writing to an existing file.

(snip, then I wrote)
>> It won't normally do that. If you open the file to write:

>> out=fopen(file, "w");

>> then any existing file is truncated.

(snip on open with "r+")

> Well, Fortran won't "normally" do that either, as long as I'm allowed to
> define "normally" to mean whatever seems convenient to make the point at
> the time. In particular, I would "normally" open a file with
> status='replace' if I wanted to completely replace an old file.

In the early days of Fortran, it was much more common than now
to write a file, and then read it back in the same run of a
program. (And also, before the OPEN statement.)

I remember some complications with OS/360 related to the way
Fortran opened files. It opened them in either INOUT or OUTIN
(OS/360 terms). With the appropriate JCL option, you could convince
it to open files INPUT, which was sometimes important.

Also, there is no OS/360 JCL option that says to overwrite a file if
it exists, and create one if it doesn't.

In later years, and with languages developed later, it is more
usual to either read or write a file, but not both. The usual
C open options, "r" for read, "w" for write, make that easy,
including truncating existing files with "w". I believe C isn't
supposed to let you write to files opened "r", and the system I
tested it on didn't, but I couldn't be sure that no C systems
would do it.

Yes, open with "r+" will not truncate the file, allows you to
read or write to it. This is often used for direct access files
(which aren't special in C, other than the way you use them).

> The exact spellings of the C and Fortran statements are, of course
> different. But I repeat the contention that Fortran stream I/O was
> modelled after C. I *KNOW* that to be what it is modelled after; it
> isn't just a guess. I personally wrote the original proposal to add
> stream I/O to the Fortran standard, and I have a pretty good idea what I
> modelled it after. The formatted stream stuff was added after my
> original paper, but I helped work on that also and again, I know that it
> was intentionally modelled after C, particularly including the issue of
> file truncation, which is messier for the formatted case.

What you do with a file once opened seems to work much like C,
but the OPEN options haven't changed that much. I presume that
status='replace' works about the same for STREAM or non-stream
file opens.

Anyway, both Fortran and C have ways to open files for read/write,
that don't truncate on open and allow writing over parts of files.
Don't use those options if you don't want that effect.
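
A sketch of that effect with unformatted stream access, where the file name old.bin and the array sizes are arbitrary. The second OPEN reuses the existing file, so writing 9 values over a 12-value file should leave the size at that of 12 values (96 bytes with 8-byte double precision) rather than shrinking it:

```fortran
program no_truncate
  implicit none
  double precision :: big(12), small(9)
  integer :: nbytes

  big = 1.0d0
  small = 2.0d0

  ! First pass: create a 12-value stream file.
  open(unit=8, file='old.bin', status='replace', action='write', &
       form='unformatted', access='stream')
  write (unit=8) big
  close (unit=8)

  ! Second pass: reopen the existing file without status='replace'.
  open(unit=8, file='old.bin', status='old', action='readwrite', &
       form='unformatted', access='stream')
  write (unit=8) small      ! overwrites the first 9 values only
  close (unit=8)

  inquire(file='old.bin', size=nbytes)
  print *, 'file size:', nbytes
end program no_truncate
```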

-- glen

glen herrmannsfeldt

Aug 22, 2012, 3:04:25 AM
john.chl...@gmail.com wrote:
> On Tuesday, August 21, 2012 6:13:57 PM UTC-4, Richard Maine wrote:
>> <john.chl...@gmail.com> wrote:

(snip)
>> > open(unit=8, file='m.bin', action='readwrite', form='unformatted',
>> access='stream')

(snip)
>> I suspect that you did not delete the old file before rewriting it. In
>> that case, you are just overwriting the first 72 bytes of the old file,
>> leaving the last ones as they were before. This, by the way, is exactly

Why open the file with action='readwrite'?

>> what C would also do in writing to an existing file. Fortran stream I/O
>> is modelled very much after C, including even some subtle points that
>> many C programmers aren't aware of.

I suppose not, if they never open files the C equivalent of
action='readwrite'. Only do that if you plan to both read
and write the file.


(snip)
>> I would also recommend action='write' instead of action='readwrite'
>> unless you also intend to read from the file in the same program. While
>> the readwrite is not causing you any current problem, I don't see any
>> point to it. Specifying the actual intended use of the file can help
>> catch nasty bugs. I *HAVE* seen people destroy what were supposed to be
>> input files by accidentally writing on them (perhaps just by using a
>> wrong unit number). It is less obvious what problems could occur from
>> specifying readwrite instead of write (perhaps sharing issues), but I'm
>> still a fan of saying exactly what is intended when that is just as easy
>> to do as saying something that ought to work, but isn't precisely what
>> you mean.

Yes, also causes no truncation before writing.

(snip)

> You are correct, I didn't delete m.bin. And yes, when I did delete
> the file and ran the code, the file size is 72.

Or change the action.

-- glen

Richard Maine

Aug 22, 2012, 5:50:55 AM
glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

> john.chl...@gmail.com wrote:
>
> > You are correct, I didn't delete m.bin. And yes, when I did delete
> > the file and ran the code, the file size is 72.
>
> Or change the action.

That's not likely to help with the problem of failing to truncate. While
I do (and did) suggest changing the action, it is the status - not the
action - that will help with the file size problem. Specifying
action='write' does not imply truncation. Yes, it is valid and even
sometimes useful to open an existing file with action='write' and not
necessarily truncate it. One reasonably common example involves pairing
action='write' and position='append'; that's the usual case when writing
to a log file.
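
A sketch of that log-file pattern, with the unit number, file name and message chosen only for illustration. status='unknown' is processor dependent, but in practice it opens an existing file or creates a new one:

```fortran
program append_log
  implicit none
  integer :: ios

  ! position='append' starts writing at the end of any existing content
  open(unit=9, file='run.log', status='unknown', action='write', &
       position='append', iostat=ios)
  if (ios /= 0) stop 'could not open run.log'

  write (unit=9, fmt='(a)') 'run finished'
  close (unit=9)
end program append_log
```

Running it repeatedly should add one line per run instead of replacing the file.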