Re: [SciPy-User] writing data to binary for fortran

467 views
Skip to first unread message

Kurt Smith

unread,
May 11, 2010, 3:29:05 PM5/11/10
to SciPy Users List
On Tue, May 11, 2010 at 2:09 PM, Gideon <gideon....@gmail.com> wrote:
> I've previously used the FortranFile.py to read in binary data
> generated by fortran computations, but now I'd like to write data from
> NumPy/SciPy to binary which can be read in by a fortran program.  Does
> anyone have an example of using fortranfile.py to create and write
> data to binary?  Alternatively, can anyone suggest a way to write
> numpy arrays to binary in away that permits me to specify the correct
> offset (4 bytes on my machine) for fortran to then properly read the
> data in?

I have a couple of simple fortran reading/writing routines (in python)
that work with numpy arrays. I've attached them to this email -- use
as you see fit. Hopefully they help, or at least show you how to do
what you want.

Kurt
fio.py

Gideon

unread,
May 11, 2010, 3:09:32 PM5/11/10
to scipy...@scipy.org
I've previously used the FortranFile.py to read in binary data
generated by fortran computations, but now I'd like to write data from
NumPy/SciPy to binary which can be read in by a fortran program. Does
anyone have an example of using fortranfile.py to create and write
data to binary? Alternatively, can anyone suggest a way to write
numpy arrays to binary in away that permits me to specify the correct
offset (4 bytes on my machine) for fortran to then properly read the
data in?
_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

--
You received this message because you are subscribed to the Google Groups "SciPy-user" group.
To post to this group, send email to scipy...@googlegroups.com.
To unsubscribe from this group, send email to scipy-user+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/scipy-user?hl=en.

Neil Martinsen-Burrell

unread,
May 11, 2010, 3:58:37 PM5/11/10
to SciPy Users List, Gideon
On 2010-05-11 14:09, Gideon wrote:
> I've previously used the FortranFile.py to read in binary data
> generated by fortran computations, but now I'd like to write data from
> NumPy/SciPy to binary which can be read in by a fortran program. Does
> anyone have an example of using fortranfile.py to create and write
> data to binary? Alternatively, can anyone suggest a way to write
> numpy arrays to binary in away that permits me to specify the correct
> offset (4 bytes on my machine) for fortran to then properly read the
> data in?

You can use the writeReals method of a FortranFile object:

In [1]: import fortranfile

In [2]: import numpy as np

In [3]: F = fortranfile.FortranFile('test.unf',mode='w')

In [4]: F.writeReals(np.linspace(0,1,10))

In [5]: F.close()

In [6]: !ls -l 'test.unf'
-rw-r--r-- 1 nmb nmb 48 2010-05-11 14:56 test.unf

There are also writeInts and writeString methods. Like usual,
FortranFile only writes and reads homogeneous records: all integers, all
reals, etc. To write fortran files with items of different types in a
single record, you will have to work harder, perhaps using the struct
module directly.

-Neil

Gideon

unread,
May 12, 2010, 1:56:20 PM5/12/10
to scipy...@scipy.org
I've tried the following.

In Python:
import numpy as np
from FortranFile import FortranFile

x = np.random.rand(10)
f = FortranFile('test.bin',mode='w')
f.writeReals(x)
f.close()

In Fortran:
program bintest

double precision x(10)
integer j

open(unit=80, file='test.bin', status='old', form='unformatted')

read(80) x
close(80)

do j=1,10
write(*,*) x(j)

enddo


end

then at the command line,

gfortran bintest.f -o bintest
./bintest
At line 9 of file bintest.f (unit = 80, file = 'test.bin')
Fortran runtime error: I/O past end of record on unformatted file

Note, I have no difficulty reading the test.bin file back in, while in
python, using the FortranFile.py routines.
> SciPy-U...@scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user

Neil Martinsen-Burrell

unread,
May 12, 2010, 2:00:36 PM5/12/10
to SciPy Users List, Gideon
It is likely that the problem is with the endian-ness of the file being
created by FortranFile not matching what is expected by the fortran
compiler. (There is a reason that the format of unformatted I/O is not
specified in the Fortran standard.) Try the above with different
settings of FortranFile(..., endian='<') or '>' or '='.

-Neil
_______________________________________________
SciPy-User mailing list

Gideon

unread,
May 12, 2010, 3:58:28 PM5/12/10
to scipy...@scipy.org
Tried both, but I got the same error in both cases.
> SciPy-U...@scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user

Neil Martinsen-Burrell

unread,
May 12, 2010, 4:21:17 PM5/12/10
to SciPy Users List, Gideon
On 2010-05-12 14:58, Gideon wrote:
> Tried both, but I got the same error in both cases.

If you want doubles in your file, you have to request them:

F.writeReals(x, prec='d')

makes everything work for me (Ubuntu 10.04, python 2.6.5, gfortran
4.4.3). Note that looking at the size of the file that you would expect
to have for the data you are expecting to read would have demonstrated
this: 10 doubles at eight bytes per double plus two 4-byte integers
would have given you 88 bytes for the file, rather than the 48 that were
being produced.

I use fortranfile most heavily for reading files, rather than writing
them, so I may have missed this opportunity, but do you think that the
precision used in writeReals should be auto-detected from the data type
that it is passed. That is, would

def writeReals(self, reals, prec=None):
if prec is None:
prec = reals.dtype.char
...

be better for your use? That would have made your original code work as
written.

-Neil
_______________________________________________
SciPy-User mailing list

Gideon

unread,
May 12, 2010, 6:05:55 PM5/12/10
to scipy...@scipy.org
Yea, that worked for me on my OS X machine. Thanks so much.

To be honest, in the 10 years I've been doing floating point
calculations for ODEs and PDEs, I don't think I've ever used single
precision arithmetic. So I am surprised it doesn't default to double
precision. Obviously, different people have different needs.

On May 12, 4:21 pm, Neil Martinsen-Burrell <n...@wartburg.edu> wrote:
> On 2010-05-12 14:58, Gideon wrote:
>
> > Tried both, but I got the same error in both cases.
>
> If you want doubles in your file, you have to request them:
>
> F.writeReals(x, prec='d')
>
> makes everything work for me (Ubuntu 10.04, python 2.6.5, gfortran
> 4.4.3).  Note that looking at the size of the file that you would expect
> to have for the data you are expecting to read would have demonstrated
> this: 10 doubles at eight bytes per double plus two 4-byte integers
> would have given you 88 bytes for the file, rather than the 48 that were
> being produced.
>
> I use fortranfile most heavily for reading files, rather than writing
> them, so I may have missed this opportunity, but do you think that the
> precision used in writeReals should be auto-detected from the data type
> that it is passed.  That is, would
>
> def writeReals(self, reals, prec=None):
>      if prec is None:
>          prec = reals.dtype.char
>      ...
>
> be better for your use?  That would have made your original code work as
> written.
>
> -Neil
> _______________________________________________
> SciPy-User mailing list
> SciPy-U...@scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user

Dag Sverre Seljebotn

unread,
May 13, 2010, 4:17:27 AM5/13/10
to SciPy Users List
A fully reliable way of reading such files is to wrap Fortran code
reading it with f2py (and fwrap, when that is done). Then, compile with
the Fortran compiler in question.

Dag Sverre

John Hassler

unread,
May 13, 2010, 7:38:56 AM5/13/10
to SciPy Users List
"Back in the day," double precision was MUCH slower than single precision arithmetic, so Fortran used single precision by default.  You used double precision only when absolutely necessary, and you had to call it explicitly.  Fortran even had separate "built-in" functions for single and double - eg., sin, dsin, log, dlog, etc. - that the user called explicitly.  (I haven't used Fortran for 20 years, but I think modern Fortran recognizes the type of argument, now.)

Single and double precision are about the same speed on modern processors, and double is sometimes even faster than single on 64 bit processors (because of the ancillary data shuffling, I think).  However, Fortran is dragging nearly 60 years of history along with it, so I'm not surprised that it defaults to single precision.

john
No virus found in this incoming message. Checked by AVG - www.avg.com Version: 9.0.819 / Virus Database: 271.1.1/2869 - Release Date: 05/12/10 02:26:00

Francesc Alted

unread,
May 13, 2010, 8:18:19 AM5/13/10
to SciPy Users List
A Thursday 13 May 2010 13:38:56 John Hassler escrigué:
> "Back in the day," double precision was MUCH slower than single precision
> arithmetic, so Fortran used single precision by default. You used double
> precision only when absolutely necessary, and you had to call it
> explicitly. Fortran even had separate "built-in" functions for single and
> double - eg., sin, dsin, log, dlog, etc. - that the user called
> explicitly. (I haven't used Fortran for 20 years, but I think modern
> Fortran recognizes the type of argument, now.)
>
> Single and double precision are about the same speed on modern processors,
> and double is sometimes even faster than single on 64 bit processors

Beware! This is so only for basic arithmetic operations. For computation of
transcendent functions (sin, cos, atanh, sqrt, log...), single precision is
still way faster (they require much less computations to reach the precision).

> (because of the ancillary data shuffling, I think). However, Fortran is
> dragging nearly 60 years of history along with it, so I'm not surprised
> that it defaults to single precision.
>
> john

--
Francesc Alted
_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

David

unread,
May 13, 2010, 10:51:03 PM5/13/10
to SciPy Users List
On 05/13/2010 09:18 PM, Francesc Alted wrote:
> A Thursday 13 May 2010 13:38:56 John Hassler escrigué:
>> "Back in the day," double precision was MUCH slower than single precision
>> arithmetic, so Fortran used single precision by default. You used double
>> precision only when absolutely necessary, and you had to call it
>> explicitly. Fortran even had separate "built-in" functions for single and
>> double - eg., sin, dsin, log, dlog, etc. - that the user called
>> explicitly. (I haven't used Fortran for 20 years, but I think modern
>> Fortran recognizes the type of argument, now.)
>>
>> Single and double precision are about the same speed on modern processors,
>> and double is sometimes even faster than single on 64 bit processors
>
> Beware! This is so only for basic arithmetic operations. For computation of
> transcendent functions (sin, cos, atanh, sqrt, log...), single precision is
> still way faster (they require much less computations to reach the precision).

Also, float and double operations are at the same speed only considering
everything is in the registers... So concretely, single precision is
much faster for almost any code which is memory bound (it is very easy
to check with numpy: something as simple as dot is around twice faster
for single than double precision, assuming dot uses atlas or similarly
optimized library).

cheers,

David

Francesc Alted

unread,
May 14, 2010, 3:31:15 AM5/14/10
to SciPy Users List
A Friday 14 May 2010 04:51:03 David escrigué:
> > Beware! This is so only for basic arithmetic operations. For
> > computation of transcendent functions (sin, cos, atanh, sqrt, log...),
> > single precision is still way faster (they require much less computations
> > to reach the precision).
>
> Also, float and double operations are at the same speed only considering
> everything is in the registers... So concretely, single precision is
> much faster for almost any code which is memory bound (it is very easy
> to check with numpy: something as simple as dot is around twice faster
> for single than double precision, assuming dot uses atlas or similarly
> optimized library).

True. Although you don't even need atlas to see this:

In [3]: a = np.arange(1e6, dtype=np.float64)

In [4]: b = np.arange(1e6, dtype=np.float64)

In [5]: timeit a*b
100 loops, best of 3: 5.02 ms per loop

In [6]: a = np.arange(1e6, dtype=np.float32)

In [7]: b = np.arange(1e6, dtype=np.float32)

In [8]: timeit a*b
100 loops, best of 3: 2.68 ms per loop


--
Francesc Alted

Francesc Alted

unread,
May 14, 2010, 4:57:17 AM5/14/10
to SciPy Users List
A Tuesday 11 May 2010 21:09:32 Gideon escrigué:
> I've previously used the FortranFile.py to read in binary data
> generated by fortran computations, but now I'd like to write data from
> NumPy/SciPy to binary which can be read in by a fortran program. Does
> anyone have an example of using fortranfile.py to create and write
> data to binary? Alternatively, can anyone suggest a way to write
> numpy arrays to binary in away that permits me to specify the correct
> offset (4 bytes on my machine) for fortran to then properly read the
> data in?

Just for completeness to other solutions offered, I'm attaching a BinaryFile
class that allows you to read/write fortran files (in general, binary files).
From its docstrings:

"""
BinaryFile: A class for accessing data to/from large binary files
=================================================================

The data is meant to be read/write sequentially from/to a binary file.
One can request to read a piece of data with a specific type and shape
from it. Also, it supports the notion of Fortran and C ordered data,
so that the returned data is always well-behaved (C-contiguous and
aligned).

This class is seeking capable.
"""

It differs from the solutions that other presented here in that it does not
use the struct module at all, so it is much more faster. For example, when
using Neil's fortranfile module, one have:

In [1]: import fortranfile

In [2]: import numpy as np

In [3]: f = fortranfile.FortranFile('/tmp/test.unf',mode='w')

In [5]: time f.writeReals(np.arange(1e7))
CPU times: user 6.06 s, sys: 0.14 s, total: 6.21 s
Wall time: 6.41 s

In [7]: f.close()

In [8]: f = fortranfile.FortranFile('/tmp/test.unf',mode='r')

In [9]: time f.readReals()
CPU times: user 0.64 s, sys: 0.35 s, total: 0.99 s
Wall time: 1.00 s
Out[10]:
array([ 0.00000000e+00, 1.00000000e+00, 2.00000000e+00, ...,
9.99999700e+06, 9.99999800e+06, 9.99999900e+06], dtype=float32)

while using my binaryfile module gives:

In [1]: import numpy as np

In [2]: from binaryfile import BinaryFile

In [3]: f = BinaryFile('/tmp/test.bin', mode="w+", order='fortran')

In [4]: time f.write(np.arange(1e7))
CPU times: user 0.04 s, sys: 0.19 s, total: 0.24 s
Wall time: 0.24 s # 26x times faster than fortranfile

In [6]: f.seek(0)

In [7]: time f.read('f8', (int(1e7),))
CPU times: user 0.03 s, sys: 0.12 s, total: 0.15 s
Wall time: 0.15 s # 6.6 times faster than fortranfile
Out[8]:
array([ 0.00000000e+00, 1.00000000e+00, 2.00000000e+00, ...,
9.99999700e+06, 9.99999800e+06, 9.99999900e+06])

Also, binaryfile supports all the types in NumPy, even strings and records.

HTH,

--
Francesc Alted
binaryfile.py
test_binaryfile.py

Neil Martinsen-Burrell

unread,
May 14, 2010, 10:30:37 AM5/14/10
to SciPy Users List
Wonderful speed! But, alas, binaryfile does not produce fortran
unformatted output. The format that you've written is what Fortran
calls stream output and is a relatively recent addition to that
language. While fortranfile is certainly slow due to its use of the
struct module for all writes and reads, it allows it to read and write
Fortran's record-oriented (not like numpy records) format with a great
deal of flexibility. It was designed to be able to read data files
created by Fortran simulation codes that may have been produced on
machines with different integer sizes and endian-ness than the machine
doing the reading. Your binaryfile does not do this, although I do not
doubt that it could be done. Any improvements that make fortranfile
faster will be gladly accepted!

-Neil

Francesc Alted

unread,
May 14, 2010, 10:51:29 AM5/14/10
to SciPy Users List
A Friday 14 May 2010 16:30:37 Neil Martinsen-Burrell escrigué:
> Wonderful speed! But, alas, binaryfile does not produce fortran
> unformatted output. The format that you've written is what Fortran
> calls stream output and is a relatively recent addition to that
> language.

Mmh. I'm rather ignorant in this matter, but I'm wondering if what you call
'stream output' would be the same than the venerable 'sequential access' mode
(that exists at least since Fortran 90)?

> While fortranfile is certainly slow due to its use of the
> struct module for all writes and reads, it allows it to read and write
> Fortran's record-oriented (not like numpy records) format with a great
> deal of flexibility.

You are right. I suppose that what you call 'record-oriented' is the 'direct
access' mode in literature. Yup, this is not supported by binaryfile.

> It was designed to be able to read data files
> created by Fortran simulation codes that may have been produced on
> machines with different integer sizes and endian-ness than the machine
> doing the reading. Your binaryfile does not do this, although I do not
> doubt that it could be done. Any improvements that make fortranfile
> faster will be gladly accepted!

Well, I suppose that if you can get rid of the struct module in fortranfile
you may get much better performance. I don't think this would require a lot
of work.

--
Francesc Alted

al...@ajackson.org

unread,
May 15, 2010, 8:09:02 PM5/15/10
to scipy...@scipy.org
A few years ago I was speaking with a colleague - a brilliant gentleman
in his late seventies who had done the blast wave modeling for the
Bikini Atoll tests on a slide rule - and I mentioned that for his
finite difference elastic wave equation modeling code he must use a lot
of double precision arithmetic. He looked very hurt, and replied "Oh
no, single precision is all you need if you know what you're doing".
-----------------------------------------------------------------------
| Alan K. Jackson | To see a World in a Grain of Sand |
| al...@ajackson.org | And a Heaven in a Wild Flower, |
| www.ajackson.org | Hold Infinity in the palm of your hand |
| Houston, Texas | And Eternity in an hour. - Blake |
-----------------------------------------------------------------------
_______________________________________________
SciPy-User mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

Reply all
Reply to author
Forward
0 new messages