Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Looking for 7-parm least squares code

28 views
Skip to first unread message

Terence

unread,
Oct 3, 2011, 5:57:07 PM10/3/11
to
I am working on an old 1967 astronomical program as printed source code in
what is a form of Fortran IV, with sense switches and JCL file assignment. .
I have input test data and the expected output reports in printed form.

I have been able to convince mysalf, due to the printed reports not matching
the source code formats, and the presence of code that is never reached,
(and in particular, a final subroutine without an END or RETURN) that the
published code is not the code used to produce the reports shown.

I have been able to produce exactly, quite a lot of the published reports,
bar only a difference in number of decimal places used and a shift in text
headings and columns, that the logic flow is pretty much correct and
complete.

However, there remains a puzzling problem,

The program computes the position of a Great Circle curve in the sky, from
observations from one ground-based camera at a time, of a moving object.
(Later the data from two or more cameras is consolidated to determine the
real spacial orbit that accounts for the camera views, knowing where the
cameras are sited exactly).

The data from each camera is processed, by a six-term least squares
operation, to find a closest match with a geometric line of form of six
equations in six unknowns. as equations in the "Normal Form" of Herget
(1948):

[aax + [aby + [acz +.....= [an
[bax + [bby + [bcz +.... = {bn
...
[fax + [fby + [fcz +.... = [fn

Where "[" symbolises the summation integral over i measures (around 70)..


The objective is to find a solution to:

f(x,y) = n + Ax + By + Cxy + Dx^2 + Ex(x^2+y^2) + Fxy^2 + Gy^3 +
Hy(x^2+y^2) .

(the x,y are the Xi,Eta calculated spacial angular coordinates of the
great circle curves).

The parameters A..H are actually printed in the report and the compiled
code does not even approximate to them; though strangely, everything else
is almost exact.
(Apart from that the computed residual errors for each observation are
not anywhere close!)

So my question is:
Can anybody point me to a source of downloadabe Fortran code capable of
performing
this operation, (since the code I have appears to be incomplete or
incorrect).

I want to see if there is a subtle error in the published code that was
discovered and corrected 50 years ago.


Gordon Sande

unread,
Oct 3, 2011, 6:34:46 PM10/3/11
to
Would an eight parameter code solve your problem?

Just facetious as I once was a second hand observer of a client who wanted to
know if a solution for eight firms could be used for just six firms.
Over attention
to the immediate problem.

You seem to be describing a linear least squares solution to a
polynomial regression
problem. The usual solution is for N terms. But is sure helps if N is
kept small.
If you use the normal equations for a linear least squares solution you
had better
keep N even smaller. (And you will loose your license to use floating point
numbers as such methods have been viewed as obsolete since the 1970s.
Oh I forgot,
you repeatedly state that the 1960s are the state of the art.) The
substance of the
problem here is that your problem is probably bad but tractable in 48
bits of the
early CDC machine. In 32 bits it is bad news and completely different
answers are
of no surprise. The immediate thing to try is to switch all floating
point to 64 bits.
The longer term modern solution is to switch to using a more modern
numerical method.
The usual buzz words are QR decomposition. Any source of statistical
computing would
have multiple versions. I believe that you have stated that you supply
statistical
software to others so you will be familiar with the various
repositories. The use of
the normal equations squares the condition number compared to the
condition number of
the R of the QR. The use of the normal equations dates from the days of
mechanical
calculators which were of much higher precision that modern computers
and were applied
to much smaller and tamer data sets. This is all very old history as
most of the
literature on this sort of problem will date from the 1960s and 1970s
which is later
that your 1948 citation.

The bad coefficients will lead to bad residuals as you notice. One
presumes that
if you look carefully you will find a matrix inversion routine hidden
in your code.
Maybe even a Gauss-Jordan inversion as might have been in the old
Ralston and Wilf
books.

Polynomials are not a good basis for fitting but that is a different,
if related,
discussion that can be avoided in the short term here.

A good modern reference would be Bjorck's SIAM book on Linear Least
Squares. Even it
has been around for more years that most of us have fingers.




Terence

unread,
Oct 3, 2011, 6:37:48 PM10/3/11
to
I should have added that the normal metthod of solution, by inverting a
matrix, would be ill-conditioned in this case, and the best way would be to
consider the equations as a matrix mutiply:-

A * X = B

where A is the matrix and X is the vector of unknowns and B the measures.
Then a reducution of the matrix to an upper right triangular matrix should
be the first step, followed by a simple back calculation, should have a
lower error rate.
But I need a code to check the old code against, not an optimised general
purpose application for solving this type of problem.


Terence

unread,
Oct 3, 2011, 6:51:00 PM10/3/11
to
Not only all of the above, but I have post-graduate (Hons. Maths)
additional qualifications in Numerical Analysis. I KNOW how to do
this, but the existing code is staring me in the face, and I don't see
where the problem is and I'm supposed to get this working "as is", not
to replace it unduly. The venerable author is even on hand if not too
approachable now...

The old machine was a CDC 1604 and the internal evidence in the code
points to s 32-bit machine with one's complement hardware.

glen herrmannsfeldt

unread,
Oct 3, 2011, 8:40:15 PM10/3/11
to
Terence <tbwr...@cantv.net> wrote:

(big snip)
> The old machine was a CDC 1604 and the internal evidence in the code
> points to s 32-bit machine with one's complement hardware.

From "Blaauw and Brooks", it has a 48 bit floating point format
with a 36 bit significand. No hardware double precision, and rarely
software double precision.

It seems that prior to the design of the IBM S/360, the designers
surveyed users of other machines around at the time. The result
was that 48bits was a good floating point size, but wasn't used
for S/360.

Try double precision and see if the results change.

It doesn't say anything about the fixed point format.

-- glen

glen herrmannsfeldt

unread,
Oct 3, 2011, 8:47:25 PM10/3/11
to
Terence <tbwr...@bigpond.net.au> wrote:
> I should have added that the normal metthod of solution, by inverting a
> matrix, would be ill-conditioned in this case, and the best way would be to
> consider the equations as a matrix mutiply:-

> A * X = B

> where A is the matrix and X is the vector of unknowns and B the measures.

(snip)

The other favorite one from days passed was inversion with a
correction. That is, invert, see how far off it is, then do
one correction. Maybe also iterative, but the ones I remember
just did it once.

Any comments to say what it is supposed to be doing?
(That is, the algorithm for the least squares.)

-- glen

Gordon Sande

unread,
Oct 3, 2011, 8:59:39 PM10/3/11
to
Which gets back to the advice that if 48 bits seemed to work and 32 bits
patently does not then try 64 bits. At least it is just fiddling with the
code and it is a good bet it will only have serious effect on the results
of the equation solving.

Keep a backup and only fiddle with a copy.





Terence

unread,
Oct 3, 2011, 9:01:14 PM10/3/11
to
Very interesting Glen!
I went on the fact that the code stores 8 characters in reals.and integers
This is a strong hint at 32 words.
The presence of -0 (negative zero) as a character in single integer output
when there is blank input, indicates a ones-compleiment cpu.

A 48-bit word would be capable of six 8-bit characters (or even eight of the
96 readable ascii lower table at 6 bits each; Modcomp machines did something
like like that).

The code LOOKS right for a two-pass 6 by 7 (6*6 plus vector) matrix reorder
to triangular, with a final multiply out of terms and back-calculate. But
it's like any problem of this kind; the error is there and you never
understand why you missed it till it's pointed out. This is the main problem
with not-very-good OCR translation resurrections from a book of 210 pages.



Terence

unread,
Oct 3, 2011, 10:03:17 PM10/3/11
to
I tried compiling and running as a DOS program, and as a Windows
command-line native windows program (4 times larger).
Both give the same digits in the output, except that the windows version
puts a leading zero before a number less than 1.0; the F77 compilation
version did not do that.. That extra zero actually makes the report far less
readable as the line of digits run together.

I decreased the precision to Real*4 and got slightly different results, but
in the main the same reports, but the junk part was different junk of the
same order of magnitude.

I'd show the code but it is atomic in nature, pages but pages of single
operations on the matrices as uniquely named cells and not within do-loops.
.


Terence

unread,
Oct 3, 2011, 10:14:10 PM10/3/11
to
Terence <tbwr...@bigpond.net.au> wrote:

Any comments to say what it is supposed to be doing?
(That is, the algorithm for the least squares.)
-- glen
I honestly thought I'd explained that, although it's much more complex that
a simple least-squares.
.
It has 6 equeations in six unknowns and needs to find a best fit. (I didn't
say that there is a weighting term for each k-th sets of measures which is a
computed confidence level on 5 repeated measures; these are straight
multipliers by values <=1.0)
It's forming a matrix of Q(6,7), with the last index being one extra for the
equality vector of the equations, then reforming an upper-triangle matrix
wth all zeroes below the diagonal. Then working back form the last term to
the first term to solve the values of the parameters of the sought-after
equation involving single, double and triple powers of the variables.


Louisa

unread,
Oct 4, 2011, 4:08:21 AM10/4/11
to
On Oct 4, 12:01 pm, "Terence" <tbwri...@bigpond.net.au> wrote:
> Very interesting Glen!
> I went on the fact that the code stores  8 characters in reals.and integers
> This is a strong hint at 32 words.

No. CDC machines of the period used 6-bit characters.
8 x 6 = 48.

I don't know how many bits/character the 1604 used,
but odds are 6 bits.

Louisa

unread,
Oct 4, 2011, 4:12:06 AM10/4/11
to
On Oct 4, 8:57 am, "Terence" <tbwri...@bigpond.net.au> wrote:
> I am working on an old 1967 astronomical program as printed source code in
> what is a form of Fortran IV, with sense switches and JCL file assignment. .

> Can anybody point me to a source of downloadabe Fortran code capable of
> performing
> this operation, (since the code I have appears to be incomplete or
> incorrect).

That is insufficient information.
What is needed is a listing of the subroutine
as far as the first executable statement.

Terence

unread,
Oct 4, 2011, 4:51:47 PM10/4/11
to
Whether or not a "machine" used 6 bits or 8 bits, for each character stored
into a 32 or 48 bit word, it was the way the data was put into that word
that mattered and that was a human-implemented piece of of code ("machine"
code or otherwise).

And as I have observed, the compiler for the target computer for these
programs under discussion, declared to be a CDC 1604 in a book about the
programs, somehow passed character strings as one, two, or three
8-character Hollerith string sections as part sof an array of characters to
be printed bya plootter routine. Were these stored as 8 sets of 6 bits per
48 bit word? Or did this computer really have 32 bits only? Or only use 32
bits when called upon to deal with ascii strings?

As an example, the 8 character observatory names in these programs were
stored as a single real value and there was no definition of double
precision anywhere in either of the programs When regurgitating the string,
a format of A8 was used. I have assumed the real variables to be the
equivalent of real*8 as default. And as some integer data had 8 digits, I
have to also assume the default integer size is also Integer*8, both
pointing to a 32 bir word.


Terence

unread,
Oct 4, 2011, 5:23:55 PM10/4/11
to
ok. Here are the two routines for solving the equations. k is 40-100
I added some explanations as comments

SUBROUTINE NORMEQ(XN,A,B,C,D,E,F,W)
DIMENSION Q(6,7)
COMMON Q
C SOLVING [A] * |X| =|N|
C W IS WEIGHT; XN IS THE =N PART.
C A,B,C,D,E,F ARE THE SOUGHT PARAMETERS
C Q IS ZEROED, THEN THIS ROUTINE IS CALLED k TIMES TO PASS THE kth VALUES
C (XN IS THE INPUT VARIABLE AVERAGE; ITS WEIGHT IS W)
Q(1,1) = Q(1,1) + A*A*W
Q(1,2) = Q(1,2) + A*B*W
Q(1,3) = Q(1,3) + A*C*W
Q(1,4) = Q(1,4) + A*D*W
Q(1,5) = Q(1,5) + A*E*W
Q(1,6) = Q(1,6) + A*F*W
Q(1,7) = Q(1,7) + A*XN*W
Q(2,2) = Q(2,2) + B*B*W
Q(2,3) = Q(2,3) + B*C*W
Q(2,4) = Q(2,4) + B*D*W
Q(2,5) = Q(2,5) + B*E*W
Q(2,6) = Q(2,6) + B*F*W
Q(2,7) = Q(2,7) + B*XN*W
Q(3,3) = Q(3,3) + C*C*W
Q(3,4) = Q(3,4) + C*D*W
Q(3,5) = Q(3,5) + C*E*W
Q(3,6) = Q(3,6) + C*F*W
Q(3,7) = Q(3,7) + C*XN*W
Q(4,4) = Q(4,4) + D*D*W
Q(4,5) = Q(4,5) + D*E*W
Q(4,6) = Q(4,6) + D*F*W
Q(4,7) = Q(4,7) + D*XN*W
Q(5,5) = Q(5,5) + E*E*W
Q(5,6) = Q(5,6) + E*F*W
Q(5,7) = Q(5,7) + E*XN*W
Q(6,6) = Q(6,6) + F*F*W
Q(6,7) = Q(6,7) + F*XN*W
RETURN
END
SUBROUTINE LSQSLN(X1,X2,X3,X4,X5,X6)
DIMENSION Q(6,7)
COMMON Q
C THIS ROUTINE IS CALLED TO SOLVE THE EQUATION
C REDEFINE SYMBOLS
AA = Q(1,1)
AB = Q(1,2)
AC = Q(1,3)
AD = Q(1,4)
AE = Q(1,5)
AF = Q(1,6)
AN = Q(1,7)
BB = Q(2,2)
BC = Q(2,3)
BD = Q(2,4)
BE = Q(2,5)
BF = Q(2,6)
BN = Q(2,7)
CC = Q(3,3)
CD = Q(3,4)
CE = Q(3,5)
CF = Q(3,6)
CN = Q(3,7)
DD = Q(4,4)
DE = Q(4,5)
DF = Q(4,6)
DN = Q(4,7)
EE = Q(5,5)
EF = Q(5,6)
EN = Q(5,7)
FF = Q(6,6)
FN = Q(6,7)
C FORWARD SOLUTION
A1 = AB/AA
B1 = AC/AA
C1 = AD/AA
D1 = AE/AA
E1 = AF/AA
BB1 = BB-A1*AB
C PAGE 177
BC1 = BC-A1*AC
BD1 = BD-A1*AD
BE1 = BE-A1*AE
BF1 = BF-A1*AF
BN1 = BN-A1*AN
B2 = BC1/BB1
C2 = BD1/BB1
D2 = BE1/BB1
E2 = BF1/BB1
CC2 = CC-B1*AC-B2*BC1
CD2 = CD-B1*AD-B2*BD1
CE2 = CE-B1*AE-B2*BE1
CF2 = CF-B1*AF-B2*BF1
CN2 = CN-B1*AN-B2*BN1
C3 = CD2/CC2
D3 = CE2/CC2
E3 = CF2/CC2
DD3 = DD-C1*AD-C2*BD1-C3*CD2
DE3 = DE-C1*AE-C2*BE1-C3*CE2
DF3 = DF-C1*AF-C2*BF1-C3*CF2
DN3 = DN-C1*AN-C2*BN1-C3*CN2
D4 = DE3/DD3
E4 = DF3/DD3
EE4 = EE-D1*AE-D2*BE1-D3*CE2-D4*DE3
EF4 = EF-D1*AF-D2*BF1-D3*CF2-D4*DF3
EN4 = EN-D1*AN-D2*BN1-D3*CN2-D4*DN3
E5 = EF4/EE4
FF5 = FF-E1*AF-E2*BF1-E3*CF2-E4*DF3-E5*EF4
FN5 = FN-E1*AN-E2*BN1-E3*CN2-E4*DN3-E5*EN4
C BACK SOLUTION
X6 = FN5/FF5
X5 = (EN4-X6*EF4)/EE4
X4 = (DN3-X6*DF3-X5*DE3)/DD3
X3 = (CN2-X6*CF2-X5*CE2-X4*CD2)/CC2
X2 = (BN1-X6*BF1-X5*BE1-X4*BD1-X3*BC1)/BB1
X1 = (AN -X6*AF -X5*AE -X4*AD -X3*AC -X2*AB)/AA
RETURN
END


glen herrmannsfeldt

unread,
Oct 4, 2011, 6:21:58 PM10/4/11
to
Terence <tbwr...@bigpond.net.au> wrote:
> Whether or not a "machine" used 6 bits or 8 bits, for each character stored
> into a 32 or 48 bit word, it was the way the data was put into that word
> that mattered and that was a human-implemented piece of of code ("machine"
> code or otherwise).

48 bit word, and 6 bit characters allows 8 characters per word.

> And as I have observed, the compiler for the target computer for these
> programs under discussion, declared to be a CDC 1604 in a book about the
> programs, somehow passed character strings as one, two, or three
> 8-character Hollerith string sections as part sof an array of characters to
> be printed bya plootter routine. Were these stored as 8 sets of 6 bits per
> 48 bit word? Or did this computer really have 32 bits only? Or only use 32
> bits when called upon to deal with ascii strings?

Six bit characters go pretty far back. The 704 where Fortran started
has 36 bits, and six bit characters.

> As an example, the 8 character observatory names in these programs were
> stored as a single real value and there was no definition of double
> precision anywhere in either of the programs

References say that double precision software (no hardware) was rare
on this machine. Only one system surveyed by Blaauw & Brooks had it.

What I don't know is the size of the integer. Note that the 704 uses
36 bit words for floating point, but only 16 bits of that word
for integer arithmetic. The five digit statement numbers in Fortran,
originally only up to 32767, but later to 99999, are one result.

Also, the 704 card reader reads cards on row at a time into two 36
bit words. Only 72 columns read, a feature still part of Fortran.

> When regurgitating the string,
> a format of A8 was used. I have assumed the real variables to be the
> equivalent of real*8 as default. And as some integer data had 8 digits, I
> have to also assume the default integer size is also Integer*8, both
> pointing to a 32 bir word.

Fortran requires (though this is before the first standard) that the
storage format for REAL and INTEGER take up the same size, but not
that all bits are used. As I noted, the 704 used only 16 of 36
for integers. Signs are that INTEGER is 48 bits, but it might
be less.

-- glen

glen herrmannsfeldt

unread,
Oct 4, 2011, 6:30:42 PM10/4/11
to
Terence <tbwr...@bigpond.net.au> wrote:
> ok. Here are the two routines for solving the equations. k is 40-100
> I added some explanations as comments

> SUBROUTINE NORMEQ(XN,A,B,C,D,E,F,W)
> DIMENSION Q(6,7)
> COMMON Q
> C SOLVING [A] * |X| =|N|
> C W IS WEIGHT; XN IS THE =N PART.
> C A,B,C,D,E,F ARE THE SOUGHT PARAMETERS
> C Q IS ZEROED, THEN THIS ROUTINE IS CALLED k TIMES TO PASS THE kth VALUES
> C (XN IS THE INPUT VARIABLE AVERAGE; ITS WEIGHT IS W)
> Q(1,1) = Q(1,1) + A*A*W
> Q(1,2) = Q(1,2) + A*B*W
> Q(1,3) = Q(1,3) + A*C*W

Wow, it would have been much easier with an array and DO loops.

> Q(1,4) = Q(1,4) + A*D*W
> Q(1,5) = Q(1,5) + A*E*W
> Q(5,7) = Q(5,7) + E*XN*W

Note that only a triangular matrix is filled. I had forgotten about
this. The matrix is symmetric, and there are inversion routines that
take that into account. I haven't seen one in years, though.

I used to have source for routines that allowed for rectangular,
triangular, symmetric, or diagonal matrices. The routine would then
compute the appropriate offset into the matrix as a 1D array and
extract or store into the element.

> Q(6,6) = Q(6,6) + F*F*W
> Q(6,7) = Q(6,7) + F*XN*W
> RETURN
> END
> SUBROUTINE LSQSLN(X1,X2,X3,X4,X5,X6)
> DIMENSION Q(6,7)
> COMMON Q
> C THIS ROUTINE IS CALLED TO SOLVE THE EQUATION
> C REDEFINE SYMBOLS
> AA = Q(1,1)
> AB = Q(1,2)
> AC = Q(1,3)

(again, element by element an no loops)

Maybe this is supposed to be faster, as loop unrolling.

-- glen

e p chandler

unread,
Oct 4, 2011, 6:53:54 PM10/4/11
to


"glen herrmannsfeldt" wrote in message
news:j6g1ei$ti8$1...@speranza.aioe.org...
----> Well, there is no looping logic here and no branching logic either
[smile].
Suppose that the formulas are transcribed correctly. It's somewhat difficult
to see except that variable names are aliases for subscripts (mostly). Could
it be that this method of solution falls apart on ill-conditioned data as
opposed to a more modern method of matrix factoring?

--- e



Terence

unread,
Oct 4, 2011, 9:29:16 PM10/4/11
to
I have suspected that; even the author said (in a book of 1967) that he
wasn't doing the solution the normal way. but using the upper triangual
rectuion method because of the ill-conditioned matix problem.
.
And the problem with THAT however, is that the number of operations per cell
affected, increases with cell number processed, even if the total number
of operations is less. That could be a source of problems in such a "compute
forward and solve back" method.

I think an iterative "find a good solution, somehow plug this back in and
repeat", might be a better choice of method.
Ok, where's my venerable1961 "Notes on Applied Science No.16: Moderen
Computing Methods" from the National Physical Laboratory.
If you still can find paper and a pencil, or a hand-cranked.computer, this
is the way to go..

H'm. This starts with the pivotal method;
Next pages, the method of Doolittle: compact elimination .
Then "if you have a mechanical calculator" (you can)..
resolve matrix into two triabular forms.

I notice all these keep a check line (sum of equations) to check for errors
and loss of precision.

Ah! Page 19 "Ill conditioned matrices":

[A] * |x| = |b|
solve for x' as first attempt
set r'=b-Ax'
so that A * |(x-x')| = |r'|
find largest element of r' as between 2^(-k) and 2^-(k+1)
Derive r''=r'*2^k
solve [A] * |d| = |r''|
x''=x'+d*2^-k is a solution vector which is more exact (improved upon)
x'

repeat...


Dick Hendrickson

unread,
Oct 5, 2011, 1:52:39 PM10/5/11
to
Back in the good old days, it was normal to rewrite subscripted formulas
with individual variable names. I'd expect Q11 instead of AA. Makes it
much more clear ;). Many compilers did no compile time optimization and
would do run time address arithmetic to compute the address of Q(1,1)
and do it every time they saw it.

Dick Hendrickson
>
> -- glen

Robert Miles

unread,
Oct 8, 2011, 2:01:11 AM10/8/11
to

Robert Miles

unread,
Oct 8, 2011, 2:16:17 AM10/8/11
to
On 10/3/2011 5:51 PM, Terence wrote:
> On Oct 4, 8:34 am, Gordon Sande<Gordon.Sa...@gmail.com> wrote:
>> On 2011-10-03 18:57:07 -0300, Terence said:
[snip]
> Not only all of the above, but I have post-graduate (Hons. Maths)
> additional qualifications in Numerical Analysis. I KNOW how to do
> this, but the existing code is staring me in the face, and I don't see
> where the problem is and I'm supposed to get this working "as is", not
> to replace it unduly. The venerable author is even on hand if not too
> approachable now...
>
> The old machine was a CDC 1604 and the internal evidence in the code
> points to s 32-bit machine with one's complement hardware.

For a program that old, you may want to look for a section of the
code written in assembly language instead of Fortran.

I remember a program I saw from around that time, mostly in
Fortran IV, but with some subroutines in assembly language to
allow higher precision floating point than the hardware could
handle directly (originally double precision on such an early
IBM 360 that the maximum memory was 64 KB).

carolus

unread,
Oct 14, 2011, 3:52:11 PM10/14/11
to
On 10/3/2011 4:57 PM, Terence wrote:
> So my question is:
> Can anybody point me to a source of downloadabe Fortran code capable of
> performing
> this operation,

Mittelman's optimization guide:

http://plato.la.asu.edu/guide.html

Terence

unread,
Oct 15, 2011, 3:28:59 AM10/15/11
to
Thanks, Carolus; site noted.
I can in turn recommend the following for bare code.
http://www.ozemail.com.au/~milleraj


Tom Micevski

unread,
Oct 15, 2011, 8:05:21 AM10/15/11
to
that link is dead

e p chandler

unread,
Oct 15, 2011, 1:46:52 PM10/15/11
to


"Tom Micevski" wrote in message news:j7bstu$egq$1...@dont-email.me...
Try

http://jblevins.org/mirror/amiller/

instead. But I don't know if anything there matches Terence's request.

-- e


0 new messages