
Is this the best way to save my data? I find it too slow.


博陈

Apr 18, 2016, 9:30:12 PM
In my fortran program, I just saved a large array, up = Array(Complex128, 8192, 8192), in this way:

open(100, file = 'data/p.bin', access = 'direct', form = 'unformatted', recl = 4)
do j = 1, ns
   do i = 1, ns
      write(100, rec = ns*(j-1)+i) up(i, j)
   end do
end do
close(100)

I used the intrinsic time() function to profile the I/O performance, and it showed that this process usually takes about 0.5 h. Is there any way to improve it?

Ian Harvey

Apr 18, 2016, 10:45:00 PM
Is there a particular need to write to a direct access file? Is there a
particular need to save each value in the array to its own record?

Depending on requirements, I would consider writing the whole array to a
single record, perhaps using stream access.

(Note that the absolute specification of record length will not be
portable between Fortran processors. Consider using an INQUIRE
statement to obtain the number of file storage units required for the
thing you are writing.)
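A minimal sketch of that INQUIRE approach (assuming a double precision complex element, and keeping the OP's unit number and file name):

```fortran
! Sketch: let the processor tell us the record length for one
! element, instead of hard-coding recl = 4.
program portable_recl
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   complex(dp) :: sample
   integer :: lrec
   inquire (iolength = lrec) sample   ! processor-dependent units
   open (100, file = 'data/p.bin', access = 'direct', &
         form = 'unformatted', recl = lrec, status = 'replace')
   ! ... write (100, rec = ...) as before ...
   close (100)
end program portable_recl
```

The same IOLENGTH inquiry works for a whole column or array, which is handy if you write one record per column instead of one per element.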

Richard Maine

Apr 18, 2016, 11:40:32 PM
Ian Harvey <ian_h...@bigpond.com> wrote:

> On 2016-04-19 11:30 AM, 博陈 wrote:
> > In my fortran program, I just saved a large array,
> > up = Array(Complex128, 8192, 8192), in this way:
> >
> > open(100 , file = 'data/p.bin', access = 'direct' , form = 'unformatted' , recl = 4)
> > do j=1, ns
> > do i=1, ns
> > write(100, rec = ns*(j-1)+i ) up(i, j)
> > end do
> > end do
> > close(100)
> >
> > I used the intrinsic time() function to profile the I/O performance,
> > and it showed that this process usually takes about 0.5h. Is there
> > any improvement?
> >
> >
>
> Is there a particular need to write to a direct access file? Is there a
> particular need to save each value in the array to its own record?
>
> Depending on requirements, I would consider writing the whole array to a
> single record, perhaps using stream access.

Seconded. Much simpler and more portable to boot. As in

open(100, file='data/p.bin', access='stream', form='unformatted', status='replace')
!-- Or whatever is appropriate for the status.
!-- I generally recommend specifying it instead of using the default.
write (100) up
close(100)
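The matching read is symmetric; a minimal sketch (assuming up keeps the same type, kind, and shape it had when written):

```fortran
! Sketch: reading the whole stream file back in one statement.
program read_back
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   complex(dp), allocatable :: up(:,:)
   allocate (up(8192, 8192))
   open (100, file = 'data/p.bin', access = 'stream', &
         form = 'unformatted', status = 'old')
   read (100) up
   close (100)
end program read_back
```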

--
Richard Maine
email: last name at domain . net
domain: summer-triangle

campbel...@gmail.com

Apr 19, 2016, 3:57:29 AM
I would write one record per column, which should speed things up. Even for large arrays, this should not be a problem. If you are changing the record length of p.bin, you will need to delete the file first.

integer ns, i, j, iostat
real*4, allocatable :: up(:,:)
real*4 e, em
!
ns = 8192
allocate ( up(ns,ns) )
forall (i=1:ns, j=1:ns) up(i,j) = i + j*2
!
open (100, file = 'data/p.bin', access = 'direct', form = 'unformatted', recl = ns*4, iostat=iostat)
if ( iostat /= 0 ) write (*,*) 'Error at open : iostat=', iostat
do j = 1, ns
   write (100, rec = j, iostat=iostat) up(1:ns, j)
   if ( iostat /= 0 ) write (*,*) 'Error at write j=', j, ' : iostat=', iostat
end do
write (*,*) 'file written'
close (100)
!
up = -1
!
open (100, file = 'data/p.bin', access = 'direct', form = 'unformatted', recl = ns*4, iostat=iostat)
if ( iostat /= 0 ) write (*,*) 'Error at open : iostat=', iostat
em = 0
do j = 1, ns
   read (100, rec = j, iostat=iostat) up(1:ns, j)
   if ( iostat /= 0 ) write (*,*) 'Error at read j=', j, ' : iostat=', iostat
   do i = 1, ns
      e = abs( up(i,j) - (i + j*2) )
      em = max ( e, em )
   end do
end do
write (*,*) 'file read : max error =', em
close (100)
!
end

herrman...@gmail.com

Apr 19, 2016, 5:26:36 AM
On Monday, April 18, 2016 at 7:45:00 PM UTC-7, Ian Harvey wrote:
> On 2016-04-19 11:30 AM, 博陈 wrote:
> > In my fortran program, I just saved a large array, up = Array(Complex128, 8192, 8192), in this way:

> > open(100 , file = 'data/p.bin', access = 'direct' , form = 'unformatted' , recl = 4)
> > do j=1, ns
> > do i=1, ns
> > write(100, rec = ns*(j-1)+i ) up(i, j)
> > end do
> > end do
> > close(100)

(snip)

> Is there a particular need to write to a direct access file? Is there a
> particular need to save each value in the array to its own record?

Yes. There is overhead for each I/O statement, and for each direct access
file operation. Even more, there might be less buffering for direct access.

> Depending on requirements, I would consider writing the whole array to a
> single record, perhaps using stream access.

I wouldn't go quite that far. Depending on the array size, I might
do it one row at a time. Writing a whole large array, or even more,
many large arrays, as one unformatted WRITE may (there are system
dependencies) require some large buffers that might slow things down.

It used to be that I liked to keep each I/O operation below about 32K,
but computers have gotten bigger and faster. At 32K, you have most
of the speed advantage of larger blocks, but could go to 128K or
even 1M.

If you don't need direct access, I would use ordinary sequential access,
which has less overhead, and often makes more optimal use of buffers.

Some I/O systems can do read ahead, anticipating reads. That is
rare for direct access, where the suggestion is that you won't do
sequential access.

open(100 , file = 'data/p.bin', access = 'sequential' , form = 'unformatted' )
do j=1, ns
write(100 ) up(:, j)
end do
close(100)


> (Note that the absolute specification of record length will not be
> portable between Fortran processors. Consider using an INQUIRE
> statement to obtain the number of file storage units required for the
> thing you are writing.)

If you really do need direct access, try to find a larger block.

It used to be that disks used 512 byte blocks, and that allowed
for optimal access. That is less true today, but it isn't a bad size
for direct access. 2K, 4K, or 8K are also often not bad.

robin....@gmail.com

Apr 19, 2016, 8:45:34 AM
Apart from what the others have said, the program is in error.
For a complex single precision array, the record length needs to be 8,
in order to transmit both the real and imaginary parts.

You haven't given the declaration of UP, so we don't even know whether
8 or 16 needs to be the correct record length for this program.

Richard Maine

Apr 19, 2016, 12:15:09 PM
<campbel...@gmail.com> wrote:

> If you are changing the record length of p.bin, you will need to delete
> the file first.

That's why I used status='replace' in my suggestion. It takes care of
the deletion, which there is otherwise no portable way to do in Fortran.
Status='replace' was added in f90 and allowed me to get rid of my
system-dependent f77 code to handle deletion of the possible old file.
For most of today's systems, it doesn't matter for unformatted direct
(or stream) access anyway because the system just stores the data as a
byte stream, with no record markers or other indications of what the
record size is. Exceptions have existed, though.

Ron Shepard

Apr 19, 2016, 12:34:33 PM
On 4/19/16 4:26 AM, herrman...@gmail.com wrote:
[...]

>> Depending on requirements, I would consider writing the whole array to a
>> single record, perhaps using stream access.
>
> I wouldn't go quite that far. Depending on the array size, I might
> do it one row at a time. Writing a whole large array, or even more,
> many large arrays, as one unformatted WRITE may (there are system
> dependencies) require some large buffers that might slow things down.

Usually the operating system uses internal buffers in order to speed up
i/o, not slow it down. The buffer sizes may be chosen to match block,
sector, or cylinder sizes or other low-level hardware features.
Typically, if you look at i/o speeds as a function of record size, it is
generally an increasing function up to some limiting value, but
sometimes there is a smaller sawtooth pattern imposed on top of that
overall increasing performance trend due to, for example, disk sector
size, or to the number of channels to parallel i/o devices, or to
matching the speed of your computations with the disk rotation speed.
Most modern disks have large buffers on the device itself, so the i/o
operations are really transferring data between external RAM and
internal RAM, and this eliminates most of the sawtooth patterns that
might otherwise arise due to the physical hardware features.

> If you don't need direct access, I would use ordinary sequential access,
> which has less overhead, and often makes more optimal use of buffers.

I agree with this, but for large enough record sizes there will be no
difference between direct access and sequential for most hardware.

> Some I/O systems can do read ahead, anticipating reads. That is
> rare for direct access, where the suggestion is that you won't do
> sequential access. [...]

This is true, but then there is also the possibility of using
asynchronous i/o where you queue the next direct access read before you
need the data. Then you can overlap your computations with the i/o,
which is the same effect that the sequential read-ahead achieves.
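A minimal sketch of that asynchronous pattern (Fortran 2003; assumes byte-sized RECL units and double precision complex; compiler support for genuinely overlapped transfers varies):

```fortran
! Sketch: queue a direct-access read, compute, then WAIT before
! touching the data.  recl = 16*ns assumes byte units and a
! 16-byte complex element.
program async_sketch
   implicit none
   integer, parameter :: dp = kind(1.0d0), ns = 8192
   complex(dp) :: col(ns)
   integer :: j, wid
   open (100, file = 'data/p.bin', access = 'direct', &
         form = 'unformatted', recl = 16*ns, asynchronous = 'yes')
   do j = 1, ns
      read (100, rec = j, asynchronous = 'yes', id = wid) col
      ! ... overlap computation on previously fetched data here ...
      wait (100, id = wid)   ! transfer must complete before use
      ! ... now safe to use col ...
   end do
   close (100)
end program async_sketch
```

With a single buffer this only shows the mechanics; real overlap needs two buffers, queuing the read of record j+1 while processing record j.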

$.02 -Ron Shepard

Richard Maine

Apr 19, 2016, 12:59:44 PM
Ron Shepard <nos...@nowhere.org> wrote:

> On 4/19/16 4:26 AM, herrman...@gmail.com wrote:
> [...]
>
> >> Depending on requirements, I would consider writing the whole array to a
> >> single record, perhaps using stream access.
> >
> > I wouldn't go quite that far. Depending on the array size, I might
> > do it one row at a time. Writing a whole large array, or even more,
> > many large arrays, as one unformatted WRITE may (there are system
> > dependencies) require some large buffers that might slow things down.
>
> Usually the operating system uses internal buffers in order to speed up
> i/o, not slow it down....
>
> > If you don't need direct access, I would use ordinary sequential access,
> > which has less overhead, and often makes more optimal use of buffers.
>
> I agree with this, but for large enough record sizes there will be no
> difference between direct access and sequential for most hardware.

Of course, unformatted sequential adds complications because of the
record-length counts typically put at both ends of the records. The
first complication is that the file produced is actually different. It
wasn't clear from the OP's post whether or not he had specific
requirements for file format, but if he did so, unformatted sequential
won't meet them. It also means you'll need to read with the same record
length.

Second, those record-length counts often force an extra layer of
buffering in the Fortran runtimes. And those are what sometimes cause
issues with large record sizes.

Stream seems far superior for the purpose to me. I see no benefit to be
gained by sequential compared to stream.

Tom Roche

Apr 19, 2016, 2:08:15 PM
博陈[1]
> [on saving] a large array[:] Array(Complex128, 8192, 8192),

Rather than "rolling your own" I/O code, consider using one of the several available array-oriented Fortran libraries. Your first question should probably be, does that library support complex numbers (in the academic sense)? I don't know if the following does, so am just using it as an example:

'NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.'[2] It is actively maintained and well supported (by the Unidata program at UCAR). You can learn more about netCDF and its Fortran libraries at (e.g.) its GitHub site[3], its interface guide[4], or look at sample programs[5]. For questions, you might try the netcdfgroup mailing list[6].
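On the complex-number question: netCDF has no native complex type, so one common workaround is an extra dimension of extent 2 holding the real and imaginary parts. A sketch using the nf90 interface (the dimension and variable names and the file name are invented for illustration; error checking omitted for brevity):

```fortran
! Hedged sketch: store complex data as (2, nx, ny) double precision.
program save_netcdf
   use netcdf
   implicit none
   integer :: ncid, dimids(3), varid, ierr
   real(kind(1.0d0)), allocatable :: parts(:,:,:)
   allocate (parts(2, 8192, 8192))
   ! ... parts(1,:,:) = real(up); parts(2,:,:) = aimag(up) ...
   ierr = nf90_create('data/p.nc', NF90_CLOBBER, ncid)
   ierr = nf90_def_dim(ncid, 'ri', 2, dimids(1))
   ierr = nf90_def_dim(ncid, 'x', 8192, dimids(2))
   ierr = nf90_def_dim(ncid, 'y', 8192, dimids(3))
   ierr = nf90_def_var(ncid, 'up', NF90_DOUBLE, dimids, varid)
   ierr = nf90_enddef(ncid)
   ierr = nf90_put_var(ncid, varid, parts)
   ierr = nf90_close(ncid)
end program save_netcdf
```

In real code each ierr should be checked against NF90_NOERR.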

HTH, Tom Roche <Tom_...@pobox.com>

[1]: https://groups.google.com/d/msg/comp.lang.fortran/jNbzIZtcEQc/41Wf-Gs6BgAJ
[2]: http://www.unidata.ucar.edu/software/netcdf/
[3]: https://github.com/Unidata/netcdf-fortran
[4]: http://www.unidata.ucar.edu/software/netcdf/netcdf-4/newdocs/netcdf-f90.html
[5]: http://www.unidata.ucar.edu/software/netcdf/examples/programs/
[6]: http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/

herrman...@gmail.com

Apr 19, 2016, 3:15:49 PM
On Tuesday, April 19, 2016 at 9:59:44 AM UTC-7, Richard Maine wrote:
> Ron Shepard <nos...@nowhere.org> wrote:

(snip)

> > I agree with this, but for large enough record sizes there will be no
> > difference between direct access and sequential for most hardware.

> Of course, unformatted sequential adds complications because of the
> record-length counts typically put at both ends of the records. The
> first complication is that the file produced is actually different. It
> wasn't clear from the OP's post whether or not he had specific
> requirements for file format, but if he did so, unformatted sequential
> won't meet them. It also means you'll need to read with the same record
> length.

> Second, those record-length counts often force an extra layer of
> bufferring in the Fortran runtimes. And those are what sometimes cause
> issues with large record sizes.

I went back to look. It doesn't say much about what the file is used for,
but does say how big it is. 8192*8192*128 complex values, so 64GB
for COMPLEX(KIND(1.0)), and 128GB for COMPLEX(KIND(1.D0))
(that is, COMPLEX*8 and COMPLEX*16 for oldies)

Though as Robin notes, he might be only writing four bytes for each.

Even without block headers/trailers, it might be that some systems
generate a buffer for the whole I/O operation. There are also other
possible problems caused by large sizes, mostly because they aren't
tested as well as they could be.

> Stream seems far superior for the purpose to me. I see no benefit to be
> gained by sequential compared to stream.

Other than compatibility to older systems, I agree.


herrman...@gmail.com

Apr 19, 2016, 3:21:22 PM
On Tuesday, April 19, 2016 at 5:45:34 AM UTC-7, robin....@gmail.com wrote:

(snip)

> > open(100 , file = 'data/p.bin', access = 'direct' , form = 'unformatted' , recl = 4)

(snip)

> Apart from what the others have said, the program is in error.
> For a complex single precision array, the record length needs to be 8,
> in order to transmit both the real and imaginary parts.

Well, strictly, the standard doesn't define the size of "file storage unit" but
does suggest that the eight bit octet is a good choice.

Since we don't know the system the OP is using, we can't be sure.

> You haven't given the declaration of UP, so we don't even know whether
> 8 or 16 needs to be the correct record length for this program.

It could be COMPLEX*16 on a system with four byte file storage units.

I am not sure what direct access does when the WRITE is bigger than
the record size. I will guess that it truncates the record, but it might
write successive records.

Actually, 0.5h for writing 128GB or 256GB isn't bad at all.
(That is, for COMPLEX*16 and COMPLEX*32.)

Richard Maine

Apr 19, 2016, 5:05:50 PM
<herrman...@gmail.com> wrote:

> On Tuesday, April 19, 2016 at 5:45:34 AM UTC-7, robin....@gmail.com
wrote:

> > For a complex single precision array, the record length needs to be 8,
> > in order to transmit both the real and imaginary parts.
>
> Well, strictly, the standard doesn't define the size of "file storage unit"
> but does suggest that the eight bit octet is a good choice.
>
> Since we don't know the system the OP is using, we can't be sure.
>
> > You haven't given the declaration of UP, so we don't even know
> > whether
> > 8 or 16 needs to be the correct record length for this program.
>
> It could be COMPLEX*16 on a system with four byte file storage units.

As noted, the OP didn't show critical declarations. He did say
something about

>>> up = Array(Complex128, 8192, 8192)

which is a little on the cryptic side, as it doesn't appear to be valid
Fortran syntax or any widely used notation. I might make a wild guess
that the "Complex128" part might be intended to suggest a 128-bit
complex kind.

> I am not sure what direct access does when the WRITE is bigger than
> the record size. I will guess that it truncates the record, but it might
> write successive records.

It violates the standard and thus, "anything" may happen. I would
normally expect an error condition (and thus a program halt, as there is
neither an err= nor an iostat= specifier), but the standard doesn't
require that (because it doesn't specify I/O error conditions). I seem
to recall systems that special-cased handling of the recl=1 case as
amounting to a form of stream access before stream was standardized, but
that doesn't apply here.

> Actually, 0.5h for writing 128GB or 256GB isn't bad at all.
> (That is, for COMPLEX*16 and COMPLEX*32.)

Hmm. Not sure where you got those sizes. Let's see...
Oh, perhaps you are interpreting the Array(Complex128, 8192, 8192)
as meaning a 3-D array of complex, dimensioned (128,8192,8192). Well,
maybe, since the OP didn't say what that notation meant. I do note that
the OP's code referenced the array elements as up(i,j), which implies
only rank 2, but without declarations, it is hard to say. His complex128
thing could at least possibly be some derived type with 128 elements;
that wouldn't be my first guess, but I can't disprove it.

herrman...@gmail.com

Apr 19, 2016, 6:52:34 PM
On Tuesday, April 19, 2016 at 2:05:50 PM UTC-7, Richard Maine wrote:

(snip, I wrote)

> > Actually, 0.5h for writing 128GB or 256GB isn't bad at all.
> > (That is, for COMPLEX*16 and COMPLEX*32.)

> Hmm. Not sure where you got those sizes. Let's see...
> Oh, perhaps you are interpreting the Array(Complex128, 8192, 8192)
> as meaning a 3-D array of complex, dimensioned (128,8192,8192). Well,
> maybe, since the OP didn't say what that notation meant. I do note that
> the OP's code referenced the array elements as up(i,j), which implies
> only rank 2, but without declarations, it is hard to say. His complex128
> thing could at least possibly be some derived type with 128 elements;
> that wouldn't be my first guess, but I can't disprove it.

Yes, I was reading it as 128, 8192, 8192, and not thinking about Complex128.

I could blame it on my newsreader font, but mostly I wasn't looking
carefully enough.

OK, so if it is 128 bit complex, only 1GB, not small but not huuuuuuge(tm), either.

I might believe that 10 MB/s is a reasonable speed, so about 100 s or so.

I still probably would do it one row (that is, the contiguous storage dimension)
at a time. It would be interesting to see times both ways for some different systems.

herrman...@gmail.com

Apr 19, 2016, 7:10:52 PM
On Tuesday, April 19, 2016 at 3:52:34 PM UTC-7, herrman...@gmail.com wrote:

(snip)
> OK, so if it is 128 bit complex, only 1GB, not small but not huuuuuuge(tm), either.

> I might believe that 10Mb/s is a reasonable speed, so about 100s or so.

> I still probably would do it one row (that is, the contiguous storage dimension)
> at a time. It would be interesting to see times both ways for some different systems.

So, I tried on a Linux system with a not so new fortran.

complex*16 x(8192,8192)
do i = 1, 8192
   do j = 1, 8192
      x(j,i) = cmplx(i, j, kind(x))
   enddo
enddo
open(unit=1, file='x.tmp', form='unformatted')
do i = 1, 8192
   write(1) x(:,i)
enddo
end


Seems that this one doesn't have access='stream', so I didn't use that.

Both this form, and writing the whole array, run in 5s.

Then, just to see, I tried one with:

complex*16 x(128,8192,8192)


and got:

unformatted3.f: In function 'MAIN__':
unformatted3.f:1: internal compiler error: in tree_low_cst, at tree.c:4507
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.

Since it isn't a current, or even close, version, I won't report the error.

It might be the 32 bit version, where such arrays will likely cause problems
even without ICE, though that is a strange way to report a huuuuuuge(tm) error.

I didn't try an access='direct' version.

Richard Maine

Apr 19, 2016, 8:14:30 PM
<herrman...@gmail.com> wrote:

> On Tuesday, April 19, 2016 at 3:52:34 PM UTC-7, herrman...@gmail.com wrote:
>
> (snip)
> > OK, so if it is 128 bit complex, only 1GB, not small but not
huuuuuuge(tm), either.
>
> > I might believe that 10Mb/s is a reasonable speed, so about 100s or so.
>
> > I still probably would do it one row (that is, the contiguous storage
> > dimension) at a time. It would be interesting to see times both ways
> > for some different systems.
>
> So, I tried on a Linux system with a not so new fortran.
>
> complex*16 x(8192,8192)
> do i=1,8192
> do j=1,8192
> x(j,i)=cmplx(i, j, kind(x))
> enddo
> enddo
> open(unit=1, file='x.tmp', form='unformatted')
> do i=1,8192
> write(1) x(:,i)
> enddo
> end
>
>
> Seems that this one doesn't have access='stream', so I didn't use that.
>
> Both this form, and writing the whole array, run in 5s.

Just for kicks, I tried this on my iMac using gfortran. For the 1GB
file, it made no measurable difference whether I wrote a column at a
time or the whole array. Also stream vs sequential made no difference.
All those cases ran in about 3.2 seconds. I have an SSD on this machine;
didn't bother to try on my laptop, which has a hard drive. Let's see,
the spec on this SSD says 520 MB/sec, so the 3.2 seconds would be about
2/3 of spec if I can still do arithmetic (increasingly questionable);
that's believable (pretty good, actually).

When I did single element at a time, things slowed down significantly.
About 10 seconds for stream, and 12 seconds for direct. Sequential
unformatted element-at-a-time takes 14.5 seconds, but then it's writing a
file 50% bigger because of all the record headers and trailers.

And if I used recl=4 for direct access, as in the OP's version, I got

Fortran runtime error: Write exceeds length of DIRECT access record

and for extra kicks, tried that with iFort to see its error message...
Hey, it worked? Sort of slow, at 85 seconds, but it did work. Oh yeah,
that's probably iFort defaulting to recl in words instead of bytes. (I
thought they had changed that default, but I guess not). So run in iFort
with recl=1, and then I get the expected sort of error message

forrtl: severe (66): output statement overflows record, unit 1, file
/Users/maine/temp/x.tmp

I didn't bother to try anything with the postulated rank 3 array because
just in case it worked, I neither wanted to hit my SSD that hard, nor
wait around that long.

herrman...@gmail.com

Apr 19, 2016, 11:36:24 PM
On Tuesday, April 19, 2016 at 5:14:30 PM UTC-7, Richard Maine wrote:

(snip, I wrote)

> > Both this form, and writing the whole array, run in 5s.

> Just for kicks, I tried this on my iMac using gfortran. For the 1GB
> file, it made no measurable difference whether I wrote a column at a
> time or the whole array. Also stream vs sequential made no difference.
> All those cases ran in about 3.2 seconds. I have an SSD on this machine;
> didn't bother to try on my laptop, which has a hard drive. Let's see,
> the spec on this SSD says 520 MB/sec, so the 3.2 seconds would be about
> 2/3 of spec if I can still do arithmetic (increasingly questionable);
> that's believable (pretty good, actually).

Yes, mine was on a real (rotating) disk.

> When I did single element at a time, things slowed down significantly.
> About 10 seconds for stream, and 12 seconds for direct. Sequential
> unformatted element-at-a-time takes 14.5 seconds, but then its writing a
> file 50% bigger because of all the record headers and trailers.

> And if I used recl=4 for direct access, as in the OP's version, I got

> Fortran runtime error: Write exceeds length of DIRECT access record

> and for extra kicks, tried that with iFort to see its error message...
> Hey, it worked? Sort of slow, at 85 seconds, but it did work. Oh yeah,
> that's probably iFort defaulting to recl in words instead of bytes. (I
> thought they had changed that default, but I guess not). So run in iFort
> with recl=1, and then I get the expected sort of error message

So OP was right, and Robin was wrong.

> forrtl: severe (66): output statement overflows record, unit 1, file
> /Users/maine/temp/x.tmp

> I didn't bother to try anything with the postulated rank 3 array because
> just in case it worked, I neither wanted to hit my SSD that hard, nor
> wait around that long.

You could try compiling, but not running.

If I compile (2,8192,8192) or (3, 8192, 8192), it says the array is too big,
but at (4, 8192, 8192) or more, it is ICE. Just doesn't know what to do
with arrays that big.

For SSD, it is the number of writes to each spot, after any randomizing
algorithm, but otherwise I agree, you don't learn all that much.

It seems that my previous 5s was actually CPU time, and the real time
was about 15s. The delay might be disk seeks.

Ron Shepard

Apr 20, 2016, 12:39:03 AM
On 4/19/16 11:59 AM, Richard Maine wrote:
>> I agree with this, but for large enough record sizes there will be no
>> >difference between direct access and sequential for most hardware.
> Of course, unformatted sequential adds complications because of the
> record-length counts typically put at both ends of the records. The
> first complication is that the file produced is actually different.

My statement was poorly worded. I meant to say that the i/o performance
would be about the same for direct and sequential access with large
record sizes. It is only for small record sizes that there might be a
difference due to, as you say, the extra information that must be
written for the sequential unformatted file, along with the extra
compute and latency overhead due to each record. In a later post in this
thread you gave some actual timings that show this timing dependence on
record lengths.

The fortran user does not have much control over the various buffers.
For example, if the file is on a remote file system of some kind, say
nfs, then there will be fortran i/o library buffers, operating system
buffers, local nfs buffers, remote nfs buffers, remote OS buffers, and
RAM cache on the actual remote disk. I just checked an 8TB Seagate SATA
disk spec, and it has 256MB RAM cache. (As an aside, I can remember when
a disk drive the size of a washing machine had that same capacity.)
There will be at least a small latency for transfers between each of
those levels of buffers, along with larger network delays and ultimately
the final disk rotation and head seek delays. The bandwidth bottlenecks
are most likely the network speed, if applicable, and the disk interface
bandwidth; the RAM to RAM transfers are typically much faster.

$.02 -Ron Shepard

FortranFan

Apr 20, 2016, 11:47:16 AM
On Tuesday, April 19, 2016 at 8:14:30 PM UTC-4, Richard Maine wrote:

> ..
>
> and for extra kicks, tried that with iFort to see its error rmessage...
> Hey, it worked? Sort of slow, at 85 seconds, but it did work. Oh yeah,
> that's probably iFort defaulting to recl in words instead of bytes. (I
> thought they had changed that default, but I guess not). ..


@Richard Maine,

Perhaps you will consider using the standard-semantics compiler option for things you try with Intel Fortran:

https://software.intel.com/en-us/node/579529

which, as discussed at length on occasion on this forum and elsewhere, is not the default.

Richard Maine

Apr 20, 2016, 11:57:00 AM
So I see. Using bytes for the record length units is not actually
required by the standard; it's just recommended. But yes, that page does
say that it is one of the things covered by Intel's standard-semantics
option.

herrman...@gmail.com

Apr 20, 2016, 12:35:19 PM
On Wednesday, April 20, 2016 at 8:57:00 AM UTC-7, Richard Maine wrote:
> FortranFan <parek...@gmail.com> wrote:

(snip on unformatted I/O and RECL units)

> > Perhaps you will consider using the standard-semantics compiler option for
> > things you try with Intel Fortran:

(snip)

> So I see. Using bytes for the record length units is not actually
> required by the standard; it's just recommended. But yes, that page does
> say that it is one of the things covered by Intel's standard-semantics
> option.

I suspect the standard is allowing for word-addressed machines that aren't
convenient for 8-bit bytes. Not that I expect a Fortran 2008 compiler
for the PDP-10 or CDC 6600, but the standard allows for them.

There are some rules about alignment and padding, too.

Most byte oriented machines don't have so much of a problem with
I/O of unaligned data, but it still might be more efficient on some.

Richard Maine

Apr 20, 2016, 12:47:00 PM
<herrman...@gmail.com> wrote:

> On Wednesday, April 20, 2016 at 8:57:00 AM UTC-7, Richard Maine wrote:

> > Using bytes for the record length units is not actually
> > required by the standard; it's just recommended.
>
> I suspect the standard is allowing for word addressed machine that aren't
> convenient for 8 bit bytes. Not that I expect a Fortran 2008 compiler
> for the PDP-10 or CDC 6600, but the standard allows for them.
>
> There are some rules about alignment and padding, too.
>
> Most byte oriented machines don't have so much of a problem with
> I/O of unaligned data, but it still might be more efficient on some.

Yep. And also cases like Intel's, where mandating byterecl could break
existing codes. Vendors don't tend to like that (because their customers
don't like it).

Sometimes it is hard to reconstruct reasons, but I have a pretty good
handle on this one because I was actually the one who suggested the
compromise of making this a recommendation; making it a requirement
wasn't going to fly. There hadn't previously been formal recommendations
in the Fortran standard, but I pointed out that ISO explicitly allows
for such things.