Historical reason why Fortran has chosen column-major order

spectrum

Dec 9, 2017, 11:26:19 PM

Recently I have come across this thread (Q&A):

https://stackoverflow.com/questions/47691785/why-does-julia-uses-column-major-is-it-fast/47734127#47734127

which asks why a particular language (Julia in this case) has chosen to use
column-major order for the storage of rectangular arrays. There are several
comments in the above thread, and some of them trace back to the original
choice of column major in Fortran and Matlab. This also
seems to be related to the way the usual linear algebra is formulated,
e.g., an eigenvalue equation like

A * v = lambda * v,

where A is a square matrix and v is a column vector.
Then, assuming that the convention of linear algebra is the key point,
my other question is why the above form became widespread
rather than a row-oriented form like

v * A = lambda * v

(which is an equally valid equation, the difference being just left versus
right eigenvectors, but I'm interested here in the more popular definition
in most math textbooks).

So my questions are:
... Are there any historical reasons (in mathematics) why A * v
became more prevalent (IMO) than v * A?
... Apart from the linear-algebra convention, are there performance
or other computational reasons why column
major was initially preferred (employed) in the history of Fortran?

herrman...@gmail.com

Dec 10, 2017, 12:14:49 AM

On Saturday, December 9, 2017 at 8:26:19 PM UTC-8, spectrum wrote:
> Recently I have come across this thread (Q&A):

> https://stackoverflow.com/questions/47691785/why-does-julia-uses-column-major-is-it-fast/47734127#47734127

> which asks why a particular language (Julia in this case) has chosen to use
> column-major order for the storage of rectangular arrays. There are several
> comments in the above thread, and some of them trace back to the original
> choice of column major in Fortran and Matlab.

Row major is convenient for printing in human readable
form, on a screen or page.

I might have thought that Fortran order was related to
the index registers on the 704, but I don't have a
reference for that one.

I believe that R and Matlab have some connection to
Fortran, as the reason for their use of column major.

> And then, this also
> seems to be also related to the way the usual linear
> algebra is formulated, e.g., an eigenvalue equation like

> A * v = lambda * v,

> where A is a square matrix and v is a column vector.
> Then, assuming that the convention of linear algebra is the key point,
> my another question is why the above form becomes wide-spread
> rather than row-oriented form like

> v * A = lambda * v

> (which is another valid equation, just left- and right-eigenvector things,
> but I'm now interested in more popular definitions in many math
> text books).

> So my question Is:
> ... Is there any historical reasons (in mathematics) why A * v
> became more prevalent (IMO) than v * A?
> ... Apart from the linear-algebra convention, is there performance
> or some more computational reasons why column
> major was initially preferred (employed) in the history of Fortran?

I would guess that mathematicians do it in an order that
is convenient for writing on paper, the way they did it for many
years, before electronic computers.

spectrum

Dec 10, 2017, 12:53:27 AM

On Sunday, December 10, 2017 at 2:14:49 PM UTC+9, herrman...@gmail.com wrote:

> I believe that R and Matlab have some connection to
> Fortran, as the reason for their use of column major.

Yes, the connection is something like Fortran -> { Matlab , R, Mathematica } -> ... -> Julia -> ...

When I do linear algebra programming, column-major is drastically easier
for me to understand (and more comfortable to write), but in other cases
sometimes row-major is also convenient (e.g., to get a slice as
a[i] from an N-dim array).

Looking at this wiki page,
https://en.wikipedia.org/wiki/Row-_and_column-major_order
the use of row and column major seems like this:

* Row-major order:
C/C++/Objective-C (for C-style arrays), PL/I, Pascal, etc

* Column-major order
Fortran, MATLAB (and Octave,Scilab etc), R, Julia, etc
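
To make the two conventions listed above concrete, here is a minimal C sketch of how an (i, j) pair maps to a linear memory offset under each of them. The function names are made up for illustration, and the indices are 0-based (Fortran's default 1-based subscripts would subtract 1 first).

```c
#include <stddef.h>

/* Offset of element (i, j) in an nrows x ncols array, 0-based indices. */

/* Row-major (C-style arrays): the elements of one row are adjacent. */
size_t row_major_offset(size_t i, size_t j, size_t nrows, size_t ncols)
{
    (void)nrows;               /* not needed by the row-major formula */
    return i * ncols + j;
}

/* Column-major (Fortran, MATLAB, R, Julia): the elements of one
   column are adjacent. */
size_t col_major_offset(size_t i, size_t j, size_t nrows, size_t ncols)
{
    (void)ncols;               /* not needed by the column-major formula */
    return j * nrows + i;
}
```

For a 2 x 3 array, element (1, 0) sits at offset 3 in row-major order but at offset 1 in column-major order; only the stride attached to each subscript differs.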

> I would guess that mathematicians do it in an order that
> is convenient for writing on paper, the way they did it for many
> years, before electronic computers.

Yes, I also think it may be related to handwriting on paper, from left to right
(if a person is right-handed), for the same reason that English is written from
left to right. I wrote a similar thing in a different forum long ago,
but I got no more info at that time :) If I look into some history book on
linear algebra (if any), it might explain something, though.

spectrum

Dec 10, 2017, 12:55:07 AM

On Sunday, December 10, 2017 at 2:53:27 PM UTC+9, spectrum wrote:
> Yes, the connection like Fortran -> { Matlab , R, Mathematica } -> ... -> Julia -> ...

Hmm, I'm sorry, although I have included Mathematica above, I don't know
whether it is column or row major... (Because it is math-oriented, I just assumed
it may be column-major).

paul.rich...@gmail.com

Dec 10, 2017, 5:03:15 AM

On Sunday, 10 December 2017 05:14:49 UTC, herrman...@gmail.com wrote:
> On Saturday, December 9, 2017 at 8:26:19 PM UTC-8, spectrum wrote:
> > Recently I have come across this thread (Q&A):

...snip..

> I believe that R and Matlab have some connection to
> Fortran, as the reason for their use of column major.
>

Matlab was originally written in Fortran. The source is here https://www.math.wustl.edu/~victor/utility/matlab/matlab.zip

Paul

paul.rich...@gmail.com

Dec 10, 2017, 5:24:22 AM

I just verified that it still compiles and runs with gfortran-5.1.1 onwards (and probably earlier versions too). Ignore the warnings "Warning: Legacy Extension: REAL array index at (1)".

[pault@pc30 matlab]$ ./matlab_original

< M A T L A B >
Version of 25 May 1982
Ported to NeXT 30 Nov 1991

HELP is available

<>a=[1,2;3,4]

A =

1. 2.
3. 4.

<>b=inv(a)

B =

-2.0000 1.0000
1.5000 -0.5000

<>a*b

ANS =

1.0000 0.0000
0.0000 1.0000

<>help
Type HELP followed by ...
INTRO (To get started)
NEWS (recent revisions)
ABS ATAN BASE CHAR CHOL CHOP COND CONJ
COS DET DIAG DIAR DISP EIG EPS EXEC
EXP EYE FLOP HESS HILB IMAG INV KRON
LINE LOAD LOG LU MAGI NORM ONES ORTH
PINV PLOT POLY PRIN PROD QR RAND RANK
RAT RCON REAL ROOT ROUN RREF SAVE SCHU
SIN SIZE SQRT SUM SVD TRIL TRIU USER
CLEA ELSE END EXIT FOR HELP IF LONG
RETU SEMI SHOR WHAT WHIL WHO WHY
ANS EDIT FILE FUN MACRO
( ) ; : + - * / \ = . , ' < >

<>a.*b

ANS =

-2.0000 2.0000
4.5000 -2.0000

Paul

robin....@gmail.com

Dec 10, 2017, 5:24:40 AM

Storing rectangular arrays in row-major order is historically
convenient for both computers and people.

The external printed form, row-major order, is convenient for the user.

The internal form in the same order is convenient for reading in and
for printing such rectangular arrays.

And for many operations, it is the most convenient form within the computer
for carrying out matrix arithmetic.

herrman...@gmail.com

Dec 10, 2017, 6:38:07 AM

On Sunday, December 10, 2017 at 2:24:22 AM UTC-8, paul.rich...@gmail.com wrote:
> On Sunday, 10 December 2017 10:03:15 UTC, paul.rich...@gmail.com wrote:

(snip)

> > Matlab was originally written in fortran. The source is here
> > https://www.math.wustl.edu/~victor/utility/matlab/matlab.zip

(snip)

> < M A T L A B >
> Version of 25 May 1982
> Ported to NeXT 30 Nov 1991

Is there a Fortran compiler for NeXT?

-- glen

spectrum

Dec 10, 2017, 2:06:47 PM

Thanks very much for sharing the info! It is interesting that Matlab
was initially written directly in Fortran, and that the source is still publicly available...

I have just tried compiling it, and after some modification of Makefile (which
assumed f2c), it worked with no problem on my Mac also (with gfortran-7.2) :-)

My modified Makefile is something like this:

----- Makefile : start -----

.SUFFIXES:
CFLAGS = -c -O -finline-functions

FC = gfortran-7
FFLAGS = -c -O2

AWKP = -F, -f awk_program

obj = base.o clause.o comandm.o comqr3.o corth.o eqid.o error.o \
expr.o factorm.o funs.o getch.o getlin.o getsym.o getval.o \
hilber.o htribk.o htridi.o imtql2.o iwamax.o magic.o matfn1.o \
matfn2.o matfn3.o matfn4.o matfn5.o matfn6.o matlab.o parse.o \
print.o prntid.o putid.o pythag.o rat.o roundm.o rref.o rrot.o \
rrotg.o rset.o rswap.o stack1.o stack2.o stackg.o stackp.o sys.o \
term.o urand.o wasum.o watan.o waxpy.o wcopy.o wdiv.o wdotci.o \
wdotcr.o wdotui.o wdotur.o wgeco.o wgedi.o wgefa.o wgesl.o wlog.o \
wmul.o wnrm2.o wpofa.o wqrdc.o wqrsl.o wrscal.o wscal.o wset.o \
wsign.o wsqrt.o wsvdc.o wswap.o

matlab:: $(obj)
$(FC) $(obj) -o matlab

comandm.o:: comandm.f
$(FC) $(FFLAGS) comandm.f | awk $(AWKP) >comandm.c

error.o:: error.f
$(FC) $(FFLAGS) error.f | awk $(AWKP) >error.c

getlin.o:: getlin.f
$(FC) $(FFLAGS) getlin.f | awk $(AWKP) > getlin.c

getsym.o:: getsym.f
$(FC) $(FFLAGS) getsym.f | awk $(AWKP) >getsym.c

matfn5.o:: matfn5.f
$(FC) $(FFLAGS) matfn5.f | awk $(AWKP) > matfn5.c

print.o:: print.f
$(FC) $(FFLAGS) print.f | awk $(AWKP) > print.c

prntid.o:: prntid.f
$(FC) $(FFLAGS) prntid.f | awk $(AWKP) > prntid.c

sys.o:: sys.f
$(FC) $(FFLAGS) sys.f | awk $(AWKP) >sys.c

%.o : %.f
$(FC) $(FFLAGS) -c $<

----- Makefile : end -----

Then an executable "matlab" is created. I also needed to copy "matlab.hlp"
to the current directory. Then, typing "./matlab", we enter an interactive
session (seems like REPL = read-eval-print-loop):

$ ./matlab

< M A T L A B >
Version of 25 May 1982
Ported to NeXT 30 Nov 1991

HELP is available

<> HELP

Type HELP followed by ...
INTRO (To get started)
NEWS (recent revisions)
ABS ATAN BASE CHAR CHOL CHOP COND CONJ
COS DET DIAG DIAR DISP EIG EPS EXEC
EXP EYE FLOP HESS HILB IMAG INV KRON
LINE LOAD LOG LU MAGI NORM ONES ORTH
PINV PLOT POLY PRIN PROD QR RAND RANK
RAT RCON REAL ROOT ROUN RREF SAVE SCHU
SIN SIZE SQRT SUM SVD TRIL TRIU USER
CLEA ELSE END EXIT FOR HELP IF LONG
RETU SEMI SHOR WHAT WHIL WHO WHY
ANS EDIT FILE FUN MACRO
( ) ; : + - * / \ = . , ' < >

<> v1 = [ 1.0 2.0 ]'

V1 =

1.
2.

<> v2 = [ 3.0 4.0 ]'

V2 =

3.
4.

<>v1 .* v2

ANS =

3.
8.

<>v1' * v2

ANS =

11.

<>v1 * v2'

ANS =

3. 4.
6. 8.

<> a = [ 3.0 -1.0 ; -1.0 5.0 ]

A =

3. -1.
-1. 5.

<> EIG( a )

ANS =

2.5858
5.4142

<> CHOL( a )

ANS =

1.7321 -0.5774
0.0000 2.1602

<> CTRL-C to quit (?)

Looks like the results of eigenvalues and Cholesky are also correct (by comparison
to other tools) :-)

spectrum

Dec 10, 2017, 2:14:14 PM

For comparison...

https://www.wolframalpha.com/input/?i=eigenvalues+%7B+%7B+3.0,+-1.0+%7D,+%7B+-1.0,+5.0+%7D+%7D

https://www.wolframalpha.com/input/?i=cholesky+%7B+%7B+3.0,+-1.0+%7D,+%7B+-1.0,+5.0+%7D+%7D

Jos Bergervoet

Dec 10, 2017, 2:53:01 PM

On 12/10/2017 8:06 PM, spectrum wrote:
> On Sunday, December 10, 2017 at 7:03:15 PM UTC+9, paul.rich...@gmail.com wrote:
>> On Sunday, 10 December 2017 05:14:49 UTC, herrman...@gmail.com wrote:
>>> On Saturday, December 9, 2017 at 8:26:19 PM UTC-8, spectrum wrote:
>>>> Recently I have come across this thread (Q&A):

>> ...snip..

..

> A =
>
> 3. -1.
> -1. 5.
>
> <> EIG( a )
>
> ANS =
>
> 2.5858
> 5.4142
>
> <> CHOL( a )
>
> ANS =
>
> 1.7321 -0.5774
> 0.0000 2.1602
>
> <> CTRL-C to quit (?)
>
> Looks like the results of eigenvalues and Cholesky are also correct (by comparison
> to other tools) :-)

Hmm.. You need a tool to check whether EIG gives 4 +- sqrt(2),
and CHOL gives sqrt(3), -sqrt(3)/3, sqrt(4+2/3).

So, a tool to compute squares would suffice! Checking the
results is clearly easier here than calculating them..
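
As a rough illustration of checking by squaring, the following C sketch tests the four-decimal values printed above using only multiplication and addition. The function names and the tolerance argument are assumptions made for this example, not part of the 1982 program.

```c
/* Absolute difference, written out so that no math library is needed. */
static double absdiff(double a, double b)
{
    return a > b ? a - b : b - a;
}

/* The eigenvalues of [3 -1; -1 5] are 4 +- sqrt(2),
   so (lambda - 4)^2 should come out as 2. */
int eig_ok(double lambda, double tol)
{
    double d = lambda - 4.0;
    return absdiff(d * d, 2.0) < tol;
}

/* The upper factor R = [r11 r12; 0 r22] must satisfy R'R = A. */
int chol_ok(double r11, double r12, double r22, double tol)
{
    return absdiff(r11 * r11, 3.0) < tol               /* A(1,1) */
        && absdiff(r11 * r12, -1.0) < tol              /* A(1,2) */
        && absdiff(r12 * r12 + r22 * r22, 5.0) < tol;  /* A(2,2) */
}
```

With a tolerance of 1e-3 (to allow for the rounding to four decimals), eig_ok accepts both 2.5858 and 5.4142, and chol_ok accepts 1.7321, -0.5774, 2.1602.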

--
Jos

jfh

Dec 10, 2017, 6:19:38 PM

On Sunday, December 10, 2017 at 6:53:27 PM UTC+13, spectrum wrote:
>
> Yes, I also think it may be related to hand writing on paper, from left to right
> (if a person is right-handed), for the same reason why English is written from
> left to right. I had written a similar thing in a different forum much ago,
> but I got no more info at that time :)

If you want to convince us that that's why English is written L-R then you'll also have to explain why Hebrew and Arabic are written R-L.

Louis Krupp

Dec 10, 2017, 7:35:54 PM

On Sun, 10 Dec 2017 15:19:35 -0800 (PST), jfh <john....@vuw.ac.nz>
wrote:

FWIW, I remember reading somewhere that languages that were originally
chiseled into tablets or stone were more likely to go right to left
(with the right and usually dominant hand doing the pounding) and
languages that were originally written with a pen of some sort were
more likely to go left to right (so that the right hand wouldn't
smudge the letters that had already been written).

Louis

Beliavsky

Dec 10, 2017, 11:07:16 PM

On Sunday, December 10, 2017 at 5:24:22 AM UTC-5, paul.rich...@gmail.com wrote:
> On Sunday, 10 December 2017 10:03:15 UTC, paul.rich...@gmail.com wrote:
> > On Sunday, 10 December 2017 05:14:49 UTC, herrman...@gmail.com wrote:
> > > On Saturday, December 9, 2017 at 8:26:19 PM UTC-8, spectrum wrote:
> > > > Recently I have come across this thread (Q&A):
> > ...snip..
> > > I believe that R and Matlab have some connection to
> > > Fortran, as the reason for their use of column major.
> > >
> >
> > Matlab was originally written in fortran. The source is here https://www.math.wustl.edu/~victor/utility/matlab/matlab.zip
> >
> > Paul
>
> I just verified that it still compiles and runs with gfortran-5.1.1 onwards (and probably earlier versions too). Ignore the warnings "Warning: Legacy Extension: REAL array index at (1)".
>

It would be an interesting exercise to write an Octave/Matlab interpreter in modern Fortran. John Burkardt has written MATMAN "a FORTRAN90 program which allows an interactive user to define a matrix and carry out the individual steps of certain algorithms" http://people.sc.fsu.edu/~jburkardt/f_src/matman/matman.html .

Terence

Dec 10, 2017, 11:35:10 PM

"Louis Krupp" noted

>>If you want to convince us that that's why English is written L-R then
>>you'll also have to explain why Hebrew and Arabic are written R-L.

>FWIW, I remember reading somewhere that languages that were originally
>chiseled into tablets or stone were more likely to go right to left
>(with the right and usually dominant hand doing the pounding) and
>languages that were originally written with a pen of some sort were
>more likely to go left to right (so that the right hand wouldn't
>smudge the letters that had already been written).

>Louis

But hieroglyphics were written r-l on the first row, l-r on the second, and
continued to alternate "as the plough". But keep thinking about it....
Terence

Terence

Dec 10, 2017, 11:38:44 PM

And most of the painted work I've seen in Egyptian tombs was written in
columns from the right side.

Thomas Koenig

Dec 11, 2017, 2:32:08 AM

jfh <john....@vuw.ac.nz> wrote:

And ancient Chinese in T-B.

spectrum

Dec 11, 2017, 7:13:03 PM

Yes, I've recently become very lazy after getting used to using Python etc.
as a "super-calculator"... XD (Indeed, I do not even have a real calculator
at hand, though I used one heavily when I was a student...)
So even for 2x2 things, I just wanted to write something like "eig(A)".
(Of course, for this size, we can do it analytically on paper.)

By the way, symbolic math programs may also be fun to use (sometimes,
though I don't use them very much). For example:

https://www.wolframalpha.com/input/?i=eigenvalues+%7B+%7Ba,-b%7D,%7B-b,c%7D+%7D
https://www.wolframalpha.com/input/?i=cholesky+%7B+%7Ba,-b%7D,%7B-b,c%7D+%7D

spectrum

Dec 11, 2017, 7:22:11 PM

As for the left->right or right->left language issue, I'm not trying to claim which is more
"correct" or "natural" or "suited for math" etc., but I'm interested in how the
popularity of one convention grew over history (which possibly affected the column-major
order in Fortran?). In this sense, the appearance of printing technology
(on paper) and the widespread use of "books" for communication might be the key
period that most strongly affected the prevalence of a given convention.
(Or possibly, some particular math society had a strong influence on
the convention at some later time, etc...?)

https://en.wikipedia.org/wiki/Printing

# By the way, in my country, people write/read from right to left and top to bottom
for novels and the like, but write/read from left to right for more "modern" things (lol).
So, a scientific novel including math needs to show any equations rotated
by 90 degrees ... XD

campbel...@gmail.com

Dec 11, 2017, 8:40:50 PM

After years of changing array subscript order to improve caching, I don't think there is a reason to prefer row major or column major.
Fortran and C chose differently, and to use either language you just need to know the storage convention.
For matrix multiplication in Fortran : C(I,J) = sum ( A(I,:)*B(:,J) )
If the Fortran convention were row major, what would change? Just the order in which they are stored in memory.

herrman...@gmail.com

Dec 11, 2017, 9:40:35 PM

On Monday, December 11, 2017 at 5:40:50 PM UTC-8, campbel...@gmail.com wrote:
> After years of changing array subscript order to improve cacheing,
> I don't think there is a reason for Row major vs Column major.
> Fortran and C chose differently and to use either language,
> you just need to know the storage convention.

It is more fundamental in C. The C [] operator works on the
expression to its left. If C had wanted column major, it would
have needed an operator that applied to an expression to the right,
and so array element references would come out [j][i]A.
I suppose for Hebrew programmers, and others using a language
written right to left, that wouldn't be so strange.

Even more, note that in C, A[i] is equivalent to i[A],
(in the same way that *(A+i) is equal to *(i+A)).

Continuing, A[i][j] is actually (A[i])[j] which is the same
as j[(A[i])] and j[A[i]] and j[i[A]].
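
A small C sketch of these equivalences follows; the function name and the sample values are made up for illustration. It relies only on A[i] being defined as *(A + i).

```c
#include <assert.h>

/* Returns 1 if every spelling of the same element agrees. */
int subscripts_commute(void)
{
    int A[2][3] = { {1, 2, 3}, {4, 5, 6} };
    int i = 1, j = 2;

    int *row = A[i];            /* A[i] is row i, decayed to int*      */
    assert(row == i[A]);        /* i[A] means *(i + A): the same row   */

    return A[i][j] == (A[i])[j]
        && A[i][j] == j[A[i]]
        && A[i][j] == j[i[A]]
        && A[i][j] == 6;        /* second row, third element           */
}
```

Note that j[i[A]] still subscripts A by i first; only the spelling changes, not the storage order.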

> For matrix multiplication in Fortran : C(I,J) = sum ( A(I,:)*B(:,J) )
> If Fortran convention was row major, what would change ?
> Just the order they are stored in memory.

Yes the order they are stored, but that comes out in a
few places. One is EQUIVALENCE, if you equivalence arrays
of different rank and/or dimensions.

Also, there are the places where array operations trace back to
Fortran 66 (or before): I/O statements and DATA statements.

WRITE(6,1) A

writes out the elements of A in memory order, and

DATA A/1, 2, 3, 4/

fills A in memory order.
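
To illustrate what "memory order" means for the DATA statement, here is a C sketch that mimics it. It assumes, purely for the example, that A is declared as a 2x2 array (something like INTEGER A(2,2)); the post does not give the declaration, and the function names are hypothetical.

```c
/* Assumption for illustration: A is a 2x2 array. */
enum { NROWS = 2, NCOLS = 2 };

/* Storage as DATA A/1, 2, 3, 4/ would fill it: in memory order. */
static const int storage[NROWS * NCOLS] = { 1, 2, 3, 4 };

/* Fortran-style A(i,j) with 1-based subscripts, column-major layout. */
int fortran_A(int i, int j)
{
    return storage[(j - 1) * NROWS + (i - 1)];
}

/* What the same values would mean if the layout were row-major. */
int row_major_A(int i, int j)
{
    return storage[(i - 1) * NCOLS + (j - 1)];
}
```

Under the column-major rule A(2,1) is 2 and A(1,2) is 3; under a row-major rule those two values would swap, so the same DATA or WRITE statement would describe the transposed matrix.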

pehache

Dec 12, 2017, 8:14:44 AM

On 12/12/2017 at 02:40, campbel...@gmail.com wrote:

> After years of changing array subscript order to improve cacheing, I don't think there is a reason for Row major vs Column major.

I tend to agree with that.

> Fortran and C chose differently

Basically, C has only 1D arrays. So it is neither "column major" nor
"row major".

herrman...@gmail.com

Dec 12, 2017, 9:45:54 AM

On Tuesday, December 12, 2017 at 5:14:44 AM UTC-8, pehache wrote:

> On 12/12/2017 at 02:40, campbelljoh....@gmail.com wrote:
>
> > After years of changing array subscript order to improve
> > cacheing, I don't think there is a reason for Row major vs Column major.

> I tend to agree with that.

Well, there is consistency. Once Fortran chose it, it is hard to change.

> > Fortran and C chose differently

> Basically, C has only 1D arrays. So it is neither "column major"
> nor "row major".

Well, C does have something that allocates the same as multiple
dimension arrays. It does, however, reference them as arrays
of arrays. When it does that, the subscripts are (normally) in
the order of row major. (Using the commutativity of the []
operator, you can have much fun in C subscripting.)

On the other hand, as I understand it, COBOL really does have
only 1D arrays. In that case, you get around it with structures,
where you can dimension each level of a structure.

Also, as I understand it, when referencing a structure array,
one can move all the subscripts to the right.

In PL/I notation, you can:

DCL 1 A(10), 2 B(10), 3 C(10), 4 D(10) FLOAT BIN(53);

A.B.C.D(I,J,K,L) = 3;

and with partial qualification, assuming unambiguity:

D(I,J,K,L) = 3;

pehache

Dec 12, 2017, 12:26:48 PM

On 12/12/2017 at 15:45, herrman...@gmail.com wrote:
>

>> > Fortran and C chose differently
>
>> Basically, C has only 1D arrays. So it is neither "column major"
>> nor "row major".
>
> Well, C does have something that allocates the same as multiple
> dimension arrays.

I would rather say "...that simulates multiple dimension arrays."

> It does, however, reference them as arrays
> of arrays. When it does that, the subscripts are (normally) in
> the order of row major.

I get that, but in Fortran "column major" is just a convention. They could
have chosen "row major" from the beginning and Fortran would be exactly
the same. In contrast, C cannot be anything other than "row major"; it's
not a convention, it's by design.

herrman...@gmail.com

Dec 12, 2017, 8:42:04 PM

On Tuesday, December 12, 2017 at 9:26:48 AM UTC-8, pehache wrote:

(snip, I wrote)

> > Well, C does have something that allocates the same as multiple
> > dimension arrays.

> I would rather say "...that simulates multiple dimension arrays."

If you say:

int x[10][10];

It allocates 100 elements in contiguous storage, and if
referenced, the compiler does the usual multiply and
add to find an element.

On the other hand, if in Java you say:

int[][] x=new int[10][10];

you get an array of 10 object references to 10 arrays
of 10 elements each.

This is what I call array of (references to) arrays.

> > It does, however, reference them as arrays
> > of arrays. When it does that, the subscripts are (normally) in
> > the order of row major.

> I get that, but in Fortran "column major" is just a convention. They could
> have chosen "row major" from the beginning and Fortran would be exactly
> the same. In contrast, C can not be anything else than "row major", it's
> not a convention, it's by design.

As I noted, it is only the desire for source to be read left
to right that causes that. If [] was right associative, then it would
be column major, but you would write it [j][i]x which looks strange.

But as I noted earlier, since [] is commutative:

x[i][j] == (i[x])[j] == j[i[x]], the latter, with j before i,

could be considered column major. But again, it looks strange.

And yes, the C language requires this to work.

pehache

Dec 13, 2017, 1:39:31 PM

On 13/12/2017 at 02:41, herrman...@gmail.com wrote:
> On Tuesday, December 12, 2017 at 9:26:48 AM UTC-8, pehache wrote:
>
> (snip, I wrote)
>> > Well, C does have something that allocates the same as multiple
>> > dimension arrays.
>
>> I would rather say "...that simulates multiple dimension arrays."
>
> If you say:
>
> int x[10][10];
>
> It allocates 100 elements in contiguous storage, and if
> referenced, the compiler does the usual multiply and
> add to find an element.

OK, but under the hood this is still pretty different from Fortran. The two
subscripts here are not equivalent: as far as I know (but maybe I'm wrong
?) x is still an address in this case (the address of x[0]). So I
guess that the compiler implicitly first creates a 1d array of pointers.

>
> On the other hand, if in Java you say:
>
> int[][] x=new int[10][10];
>
> you get an array of 10 object references to 10 arrays
> of 10 elements each.

To me it's still the case in C, except that in C the compiler is "forced"
to use contiguous memory for the storage in such cases.

>
>> > It does, however, reference them as arrays
>> > of arrays. When it does that, the subscripts are (normally) in
>> > the order of row major.
>
>> I get that, but in Fortran "column major" is just a convention. They could
>> have chosen "row major" from the beginning and Fortran would be exactly
>> the same. In contrast, C can not be anything else than "row major", it's
>> not a convention, it's by design.
>
> As I noted, it is only the desire for source to be read left
> to right that causes that. If [] was right associative, then it would

> be column major, but you would write it [j][i]x which looks strange.

It looks strange, it breaks the maths convention for matrix notation,
and most of all it's a different syntax. In contrast, as I wrote earlier,
choosing a different storage convention in Fortran would have had no impact
on the syntax.

> But as I noted earlier, since [] is commutative:
>

> x[i][j] == (i[x])[j] == j[i[x]], the latter, with j before i,

>
> could be considered column major. But again, it looks strange.
>
> And yes, the C language requires this to work.

Really ? ? I didn't know that (but I'm not a skilled C programmer)

Louis Krupp

Dec 13, 2017, 2:07:53 PM

On Wed, 13 Dec 17 18:39:23 +0000, pehache <peha...@gmail.com> wrote:

>On 13/12/2017 at 02:41, herrman...@gmail.com wrote:
>> On Tuesday, December 12, 2017 at 9:26:48 AM UTC-8, pehache wrote:
>>
>> (snip, I wrote)
>>> > Well, C does have something that allocates the same as multiple
>>> > dimension arrays.
>>
>>> I would rather say "...that simulates multiple dimension arrays."
>>
>> If you say:
>>
>> int x[10][10];
>>
>> It allocates 100 elements in contiguous storage, and if
>> referenced, the compiler does the usual multiply and
>> add to find an element.
>
>OK, but under the hood this is still pretty different from Fortran. The two
>subscripts here are not equivalent: as far as I know (but maybe I'm wrong
>?) x is still an address in this case (the address of x[0]). So I
>guess that the compiler implicitly first creates a 1d array of pointers.

A reasonable guess, but it's not how C does it. To use a simpler
example, this array:

int a[2][3];

is laid out like this:

a[0][0], a[0][1], a[0][2], a[1][0], a[1][1], a[1][2]

You *could* have an array of pointers:

int* a[2];

and you could dynamically allocate memory for each row:

a[0] = (int*)malloc(3 * sizeof(int));
a[1] = (int*)malloc(3 * sizeof(int));

The syntax for accessing the third element of the second row would be
the same in either case:

a[1][2]

It gets weirder. If I recall correctly, you could dynamically allocate
a two-dimensional array like this:

typedef int row[3];
row *a = (row*)malloc(2 * sizeof(row));

and the same element would be accessed the same way, and you'd have
only one pointer, a.
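
Putting the three forms side by side, here is a self-contained C sketch. The function name and the variable names p and q are chosen for this example so that the three declarations can coexist in one scope; the value 42 is arbitrary.

```c
#include <stdlib.h>

typedef int row[3];

/* Writes 42 to element [1][2] through each of the three forms and
   returns how many of them read it back (3 on success, -1 if an
   allocation fails). */
int three_ways(void)
{
    int ok = 0;

    /* 1. True 2D array: six contiguous ints, no pointers stored. */
    int a[2][3];
    a[1][2] = 42;
    ok += (a[1][2] == 42);

    /* 2. Array of two row pointers, each row allocated separately. */
    int *p[2];
    p[0] = malloc(3 * sizeof(int));
    p[1] = malloc(3 * sizeof(int));
    if (p[0] == NULL || p[1] == NULL) {
        free(p[0]);
        free(p[1]);
        return -1;
    }
    p[1][2] = 42;
    ok += (p[1][2] == 42);
    free(p[0]);
    free(p[1]);

    /* 3. One pointer to rows: a single contiguous block of 2 rows. */
    row *q = malloc(2 * sizeof(row));
    if (q == NULL)
        return -1;
    q[1][2] = 42;
    ok += (q[1][2] == 42);
    free(q);

    return ok;
}
```

The subscript expression is spelled the same in all three cases, but only the second form involves loading a stored pointer before indexing the row.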

<snip>

Louis

herrman...@gmail.com

Dec 13, 2017, 9:22:41 PM

On Wednesday, December 13, 2017 at 10:39:31 AM UTC-8, pehache wrote:

(snip, I wrote)
> >> > Well, C does have something that allocates the same as multiple
> >> > dimension arrays.

> >> I would rather say "...that simulates multiple dimension arrays."

> > If you say:

> > int x[10][10];

> > It allocates 100 elements in contiguous storage, and if
> > referenced, the compiler does the usual multiply and
> > add to find an element.

> OK, but under the hood this is still pretty different from Fortran. The two
> subscripts here are not equivalent: as far as I know (but maybe I'm wrong
> ?) x is still an address in this case (the address of x[0]). So I
> guess that the compiler implicitly first creates a 1d array of pointers.

As far as I know, both are implemented the same way.

Actually, neither Fortran nor C has a requirement on how they
are implemented, as long as they do what they are supposed to do.

I remember the PDP-11 Fortran compiler generated an array of
addresses of rows, as that is faster than multiply.

I would say the main difference is that in C you can say:

float x[10][10], *y;

y=x[i];
y[j]=5;

which you couldn't do in Fortran before pointers, and still
isn't quite as easy.

For most compilers, i will be multiplied by 40, and added to
the origin of x, and assigned to y. Then 4*j is added, and
used to address y[j] to assign 5 to it.
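
The multiply-and-add can be checked from the addresses themselves. The C sketch below uses made-up function names and writes the offsets with sizeof(float), so the figures 40 and 4 above correspond to the common case of a 4-byte float. Valid subscripts are 0 to 9.

```c
#include <stddef.h>

/* Byte offset of x[i][j] from the start of x, measured from the
   actual addresses rather than computed from a formula. */
size_t measured_offset(size_t i, size_t j)
{
    static float x[10][10];
    float *y = x[i];          /* row i: origin + i * sizeof(x[0])     */
    y[j] = 5;                 /* then j * sizeof(float) further on    */
    return (size_t)((char *)&y[j] - (char *)x);
}

/* The same offset from the multiply-and-add formula. */
size_t formula_offset(size_t i, size_t j)
{
    return i * 10 * sizeof(float) + j * sizeof(float);
}
```

For any valid i and j the two functions should agree, which is just the row-major address calculation for a 10-column array.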

But the thing that is strange about C is that the same
syntax has different meaning, depending on the declaration.

If you say:

float **z, *y;

you can also reference it as z[i][j], and can also

y=z[i];
y[j]=5;

but in this case, z is an array of pointers, and y is a pointer.

> > On the other hand, if in Java you say:

> > int[][] x=new int[10][10];

> > you get an array of 10 object references to 10 arrays
> > of 10 elements each.

> To me it's still the case in C, except that in C the compiler is "forced"
> to use contiguous memory for the storage in such cases.

For the x[10][10] form, most compilers will find the array
element by multiplying i by 40, but, as with Fortran, an array
of pointers that you can't assign to is also a possible implementation.

> >> > It does, however, reference them as arrays
> >> > of arrays. When it does that, the subscripts are (normally) in
> >> > the order of row major.

(snip)

> > As I noted, it is only the desire for source to be read left
> > to right that causes that. If [] was right associative, then it would
> > be column major, but you would write it [j][i]x which looks strange.

> It looks strange, it beats the maths convention for the matrix notations,
> and most of all it's a different syntax. In contrast, as I wrote earlier,
> choosing a different storage convention in Fortran would have had no impact
> on the syntax.

> > But as I noted earlier, since [] is commutative:

> > x[i][j] == (i[x])[j] == j[i[x]], the latter, with j before i,

> > could be considered column major. But again, it looks strange.

> > And yes, the C language requires this to work.

> Really ? ? I didn't know that (but I'm not a skilled C programmer)

x[i] is defined to be the same as *(x+i).

Since addition is commutative, even in this case, it equals *(i+x),
which is, by the same definition, i[x].

The same transformation works for higher dimensions.

pehache

Dec 14, 2017, 5:38:30 AM

On 13/12/2017 at 20:07, Louis Krupp wrote:

> On Wed, 13 Dec 17 18:39:23 +0000, pehache <peha...@gmail.com> wrote:
>
>>On 13/12/2017 at 02:41, herrman...@gmail.com wrote:
>>> On Tuesday, December 12, 2017 at 9:26:48 AM UTC-8, pehache wrote:
>>>
>>> (snip, I wrote)
>>>> > Well, C does have something that allocates the same as multiple
>>>> > dimension arrays.
>>>
>>>> I would rather say "...that simulates multiple dimension arrays."
>>>
>>> If you say:
>>>
>>> int x[10][10];
>>>
>>> It allocates 100 elements in contiguous storage, and if
>>> referenced, the compiler does the usual multiply and
>>> add to find an element.
>>
>>OK, but under the hood this is still pretty different from Fortran. The two
>>subscripts here are not equivalent: as far as I know (but maybe I'm wrong
>>?) x is still an address in this case (the address of x[0]). So I
>>guess that the compiler implicitly first creates a 1d array of pointers.
>
> A reasonable guess, but it's not how C does it.

I think that it is.

with
x[10][10]

x exists on its own and has to be the pointer to x[0]

So there really exists a 1d array of pointers x[], and the C compiler has
to implement the 2D array in this way.

In contrast, with X(10,10) in Fortran, X(i) means nothing. The Fortran
compiler can implement the 2D array by using a hidden 1D array of pointers,
but it doesn't have to.

pehache

Dec 14, 2017, 5:47:25 AM

On 14/12/2017 at 03:22, herrman...@gmail.com wrote:
>
>> OK, but under the hood this is still pretty different from Fortran. The two
>> subscripts here are not equivalent: as far as I know (but maybe I'm wrong?)
>> x is still an address in this case (the address of x[0]). So I
>> guess that the compiler implicitly first creates a 1d array of pointers.
>
> As far as I know, both are implemented the same way.
>
> Actually, neither Fortran nor C has a requirement on how they
> are implemented, as long as they do what they are supposed to do.

Yes, but the point is that in the case of 2D arrays, C and Fortran are not
supposed to do the same thing.

With "int x[10][10]" in C, x is required to exist and to be the pointer
to x[0]

With "integer X(10,10)" in Fortran, there's no such requirement.

>
> I remember the PDP-11 Fortran compiler generated an array of
> addresses of rows, as that is faster than multiply.

Maybe, but it is entirely the choice of the Fortran compiler developer to
do so. In C this is not a choice, it is implicitly required.

>
>> > x[i][j] == (i[x])[j] == j[i[x]], the latter, with j before i,
>> > could be considered column major. But again, it looks strange.
>> > And yes, the C language requires this to work.
>
>> Really? I didn't know that (but I'm not a skilled C programmer)
>

> x[i] is defined to be the same as *(x+i).
>
> Since addition is commutative, even in this case, it equals *(i+x),
> which is, by the same definition, i[x].
>
> The same transformation works for higher dimensions.
>

Yes, thanks. I'm always surprised by C logic :-)

herrman...@gmail.com

unread,

Dec 14, 2017, 10:36:11 AM12/14/17

to

On Thursday, December 14, 2017 at 2:38:30 AM UTC-8, pehache wrote:
> On 13/12/2017 at 20:07, Louis Krupp wrote:

(snip)

> > A reasonable guess, but it's not how C does it.

> I think that it is.

> with
> x[10][10]

> x exists on its own and has to be the pointer to x[0]

> So there really exists a 1d array of pointers x[], and the C compiler has
> to implement the 2D array in this way.

No, there is a logical 1D array of pointers, but it doesn't have
to be physically implemented. Except on compilers without a multiply
instruction, it is commonly not implemented as an array of pointers.

C does have to be able to evaluate the expression x[i] for appropriate
values of i, usually using multiply.

> In contrast, with X(10,10) in Fortran, X(i) means nothing. The Fortran
> compiler can implement the 2D array by using a hidden 1D array of pointers,
> but it doesn't have to.

C is sometimes described using the "as if" rule. If it gives the
result specified by the standard, it can be implemented in any
appropriate way.

I suspect that works for Fortran, too.

herrman...@gmail.com

unread,

Dec 14, 2017, 11:20:40 AM12/14/17

to

On Saturday, December 9, 2017 at 8:26:19 PM UTC-8, spectrum wrote:

(snip)
> which asks why a particular language (Julia in this case) has chosen to use
> column-major order for the storage of rectangular arrays. There are several
> comments in the above thread, and some of them trace back to the original
> choice of column major in Fortran and Matlab.

There has been much discussion here about the advantages of each,
but I believe it still does not answer the question.

At some point, it is an arbitrary choice.

Some Fortran features have been traced back to the IBM 704
instructions, and especially the index registers.

I don't know, though, of an actual reason that the 704 would
prefer column major.

One thing that I do know is that many of the early Fortran compilers
stored arrays with increasing subscript values at lower addresses.
The index registers are subtracted from, instead of added to, the
base value.

It is also convenient, as COMMON is allocated from the end of
memory toward the beginning. Since they do it consistently,
it doesn't have a visible effect to the programmer.

Ron Shepard

unread,

Dec 14, 2017, 11:51:46 AM12/14/17

to

On 12/14/17 4:38 AM, pehache wrote:
>> A reasonable guess, but it's not how C does it.
> I think that it is.
>
> with
> x[10][10]
>
> x exists on its own and has to be the pointer to x[0]
>
> So there really exists a 1d array of pointers x[], and the C compiler has
> to implement the 2D array in this way.

If this array exists, then it should be possible to demonstrate its
existence. For example, you should be able to print out the values of
those addresses, or pass the address of that array to other functions
and use it as a regular 1D array. If you try that, I believe you will
find that it doesn't exist. Instead, there is only the single address
which corresponds to x[0][0]. But don't just take my word for it, see
for yourself. Look at the assembler code, or with gcc look at the
intermediate code.
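
One way to run that check without reading assembler is sketched below. The helper names are hypothetical and x simply mirrors the int x[10][10] under discussion; the idea is that the object is exactly 100 ints wide and each row address is a fixed offset from the start, so there is no stored table of row pointers to find.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; x mirrors int x[10][10]. */
static int x[10][10];

/* Whole object: exactly 100 ints, with no extra room for a pointer table. */
size_t array_size_bytes(void)
{
    return sizeof x;
}

/* Distance in bytes from the start of x to the start of row i. */
size_t row_offset_bytes(int i)
{
    assert(i >= 0 && i < 10);
    return (size_t)((char *)x[i] - (char *)x);
}

/* Returns 1 if x, *x and &x[0][0] all refer to the same address. */
int same_start_address(void)
{
    return (void *)x == (void *)*x && (void *)*x == (void *)&x[0][0];
}
```

If a hidden array of pointers existed inside x, array_size_bytes() would have to be larger than 100 * sizeof(int), or the row offsets would not be evenly spaced.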

$.02 -Ron Shepard

pehache

unread,

Dec 14, 2017, 12:21:45 PM12/14/17

to

On 14/12/2017 at 16:36, herrman...@gmail.com wrote:
> On Thursday, December 14, 2017 at 2:38:30 AM UTC-8, pehache wrote:
>> On 13/12/2017 at 20:07, Louis Krupp wrote:
>
> (snip)
>
>> > A reasonable guess, but it's not how C does it.
>
>> I think that it is.
>
>> with
>> x[10][10]
>
>> x exists on its own and has to be the pointer to x[0]
>
>> So there really exists a 1d array of pointers x[], and the C compiler has
>> to implement the 2D array in this way.
>
> No, there is a logical 1D array of pointers, but it doesn't have
> to be physically implemented.
>

> C does have to be able to evaluate the expression x[i] for appropriate
> values of i, usually using multiply.

For any behavior of a language, it is always possible to say "it's just a
logical behavior, it doesn't have to be implemented that way". But what's
the point?

Whether x[] physically exists in memory or not does not make any
difference: the consequence of the (virtual/real) existence of x[] is that
x[10][10] could not be "column major", and that was my point.

pehache

unread,

Dec 14, 2017, 12:46:24 PM12/14/17

to

On 14/12/2017 at 17:51, Ron Shepard wrote:
> On 12/14/17 4:38 AM, pehache wrote:
>>> A reasonable guess, but it's not how C does it.
>> I think that it is.
>>
>> with
>> x[10][10]
>>
>> x exists on its own and has to be the pointer to x[0]
>>
>> So there really exists a 1d array of pointers x[], and the C compiler has
>> to implement the 2D array in this way.
>
> If this array exists, then it should be possible to demonstrate its
> existence. For example, you should be able to print out the values of
> those addresses, or pass the address of that array to other functions
> and use it as a regular 1D array. If you try that, I believe you will
> find that it doesn't exist. Instead, there is only the single address
> which corresponds to x[0][0].

hmmm... looks like you're right.

int x[10][10];
int i, j;

for (i=0;i<10;i++) { for (j=0;j<10;j++) { x[i][j] = 10*i+j; } }

printf(" x = %p\n", (void *)x );
printf(" *x = %p\n", (void *)*x );
printf("**x = %d\n", **x );

gives

x = 0x7fffffffdc70
*x = 0x7fffffffdc70
**x = 0

...and I'm confused: x is equal to *x, so *x should be equal to **x, but
it's not. I hate C :-)

edmondo.g...@gmail.com

unread,

Dec 14, 2017, 5:58:25 PM12/14/17

to

If I have understood the C standard correctly:

int x[3][4];

In an expression, x converts to a pointer to an array of 4 ints.
*x converts to a pointer to an int.
Both hold the same address.

**x is the first element, which is 0.

When you subscript them,
x[i] is equal to *((x)+(i))
and since x points to an object of 4 ints, pointer arithmetic puts that object i*sizeof(int)*4 bytes from x.

Now comes the magic: x[i] is itself an array, so what you get is still an address, and
x[i][j]
means the starting address plus
i*sizeof(int)*4 + j*sizeof(int)
This time the dereference yields a value.
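
A small sketch can compare that formula against what the compiler actually does. The function names measured_offset and formula_offset are invented for this example, and x is the int x[3][4] from the start of this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; x mirrors int x[3][4]. */
static int x[3][4];

/* Byte offset of x[i][j] from the start of x, measured from real addresses. */
size_t measured_offset(int i, int j)
{
    assert(i >= 0 && i < 3 && j >= 0 && j < 4);
    return (size_t)((char *)&x[i][j] - (char *)x);
}

/* The same offset computed from the formula in the post. */
size_t formula_offset(int i, int j)
{
    return (size_t)i * sizeof(int) * 4 + (size_t)j * sizeof(int);
}
```

For every in-range i and j the two functions should return the same number, which is just the row-major "multiply and add" mentioned earlier in the thread.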

But I agree it's confusing; Fortran is better.
Cheers

Thomas Jahns

unread,

Dec 15, 2017, 3:59:26 AM12/15/17

to

On 12/14/17 16:36, herrman...@gmail.com wrote:
> On Thursday, December 14, 2017 at 2:38:30 AM UTC-8, pehache wrote:
>> x exists on its own and has to be the pointer to x[0]
>
>> So there really exists a 1d array of pointers x[], and the C compiler has
>> to implement the 2D array in this way.
>
> No, there is a logical 1D array of pointers, but it doesn't have
> to be physically implemented. Except on compilers without a multiply
> instruction, it is commonly not implemented as an array of pointers.

For pointer arithmetic to work as prescribed in the C standard, there's no way
to implement arrays of arrays in C with intermediate pointers. I challenge you
to name a platform that indeed uses your implementation idea.

Thomas

Thomas Jahns

unread,

Dec 15, 2017, 4:00:28 AM12/15/17

to

On 12/14/17 11:38, pehache wrote:
> x exists on its own and has to be the pointer to x[0]

No, the expression x evaluates to a pointer to the first element of x, but x
itself is still an array, not a pointer.
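
The distinction can be made visible with sizeof and pointer arithmetic. This is only a sketch with made-up helper names, assuming the int x[10][10] from the earlier posts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; x mirrors int x[10][10]. */
static int x[10][10];

/* sizeof applied to x sees the array itself: all 100 ints. */
size_t size_of_array(void)
{
    return sizeof x;
}

/* In x + 0 the array converts to int (*)[10]; this is that pointer's size. */
size_t size_of_converted(void)
{
    return sizeof (x + 0);
}

/* x + 1 steps over one row of 10 ints. */
size_t step_of_x(void)
{
    return (size_t)((char *)(x + 1) - (char *)x);
}

/* &x + 1 steps over the whole 10x10 array, since &x has type int (*)[10][10]. */
size_t step_of_address_of_x(void)
{
    return (size_t)((char *)(&x + 1) - (char *)&x);
}
```

If x were really a pointer object, size_of_array() would equal size_of_converted(), and both steps would be the same size.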

Thomas

Thomas Jahns

unread,

Dec 15, 2017, 5:53:09 AM12/15/17

to

On 12/14/17 03:22, herrman...@gmail.com wrote:
> Actually, neither Fortran nor C has a requirement on how they
> are implemented, as long as they do what they are supposed to do.

True, but in many cases alternative implementations are exceedingly hard to do,
e.g. Fortran storage association imposes a lot of rules on the implementation.

> I remember the PDP-11 Fortran compiler generated an array of
> addresses of rows, as that is faster than multiply.
>
> I would say the main difference is that in C you can say:
>
> float x[10][10], y[];

That would only be possible for extern declarations as in

extern float x[10][10], y[];

the actual definition of y would always require a specific size or one derived
from an initializer.
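
A short sketch of that difference follows. The initializer values and the helpers y_length and y_at are invented for the example; only the declarations of x and y come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration only: y has an incomplete array type here, its size is unknown. */
extern float x[10][10], y[];

/* Definitions: y now gets its size, here derived from the initializer. */
float x[10][10];
float y[] = { 1.0f, 2.0f, 3.0f };

/* Hypothetical helper: number of elements in y, known once y is defined. */
size_t y_length(void)
{
    return sizeof y / sizeof y[0];
}

/* Hypothetical helper: bounds-checked read of y[k]. */
float y_at(size_t k)
{
    assert(k < y_length());
    return y[k];
}
```

With the three-element initializer assumed here, y_length() returns 3.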

Thomas

herrman...@gmail.com

unread,

Dec 15, 2017, 7:21:27 AM12/15/17

to

On Friday, December 15, 2017 at 2:53:09 AM UTC-8, Thomas Jahns wrote:

(snip, I wrote)

> > float x[10][10], y[];

> That would only be possible for extern declarations as in

> extern float x[10][10], y[];

> the actual definition of y would always require a specific size
> or one derived from an initializer.

Sometimes you can use y[] with the same meaning as *y, which I
almost never did, so I didn't remember when you can and can't.

So, yes:

float x[10][10], *y;