physical dimensions and units in scientific programs

Grant W. Petty

Feb 3, 2000, 3:00:00 AM
Hello,

I am intentionally resurrecting a discussion that appears (based on a
DejaNews search) to have last run its course (in slightly different
form) in the mid-1990s:

Is it not possible to make a very modest extension to a widely used
scientific programming language (the most entrenched of which,
for better or for worse, is Fortran 77) to allow a rational, natural,
and intuitive approach to physical dimension checking and unit
conversion?

I believe it is, and I am trying to muster support for making the
necessary modifications to an existing compiler, like g77, so as to
at least demonstrate the concept with working examples.

If interested, please read my more detailed discussion at

http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

and post your comments here, or send e-mail to gpe...@purdue.edu (NOT
the obsolete reply-to address in the header!).


I should mention that my reservations about previously posted or
published opinions on the subject lie primarily in the following
areas:

1) The extensions have to be simple and intuitive to use for a large
category of people (e.g., natural and physical scientists like me) who
still customarily program in F77. That means, among other things, not
throwing F77 out in favor of another "modern" language, like C++ or
the like, which few of my colleagues would ever bother to learn for
various reasons (some of which are actually legitimate). Besides,
there's a huge body of legacy code in F77 that we don't want to have
to rewrite from scratch.

2) The extensions should not be restrictive -- i.e., they shouldn't
make it significantly more work to write a program that utilizes the
new data types, nor should they prevent one from continuing to program
"the old way", if one so chooses.

3) In contrast to many previous proposals, I am primarily concerned
with run-time dimension checking rather than compile-time checking,
because the whole point is to spare the programmer/scientist the
tedium of pre-calculating the dimensionality of every intermediate
variable in complex calculations.

If I had the expertise and the time, I would just grab the g77
distribution and attempt to modify it myself. But I don't, so I have
to persuade someone who does to take the approach I am proposing and
help me implement it.

Looking forward to your comments.

regards,

Grant
--
**** Reply to: gpe...@purdue.edu, NOT address in header! *****
Grant W. Petty |Assoc. Prof., Atmospheric Science
Dept. of Earth & Atmospheric Sciences |Voice: (765)-494-2544
Purdue University, West Lafayette IN |Fax: (765)-496-1210

Richard Maine

Feb 3, 2000, 3:00:00 AM
gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:

> If interested, please read my more detailed discussion at
> http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

which, among other things, says

> In particular, there is no physical problem for which it is ever
> meaningful to do any of the following:

> 1) Add, subtract, or equate two values having different physical
> dimensions. For example, it is never physically meaningful to add a
> variable with dimensions of time to one having dimensions of
> distance. Any physical formula which violates this rule is simply
> incorrect, period!

> 2) Supply anything but a pure dimensionless number as an argument to
> any transcendental functions, such as sine, logarithm, exponent, etc.
> For example, it is never meaningful to take the sine of a length, but
> it is meaningful to take the sine of ratio of lengths (e.g., the
> coordinate x divided by a wavelength lambda).

This seems overly generalized to me. Let me think of both cases:

1. Let's see, I don't offhand have a handy example of adding length to
time, but I've seen things like length**2 added to time**2 in
minimization problems, where one might minimize the sum of such
squares, possibly with weightings. Well, I suppose one might consider
the weightings to have appropriate units of length**(-2) and
time**(-2) so that it all worked out. Sometimes one might not have
such explicit weightings...but I suppose one could take the view that
this is wrong and that one should always have weightings, even if
their value is 1.0, in order to get the units right. So perhaps it's
not really a counter-example.

2. How about the other one? I can't off-hand think of a quick example
of taking the sine of a length, but surely I'm not the only person
to have ever taken the logarithm of a dimensional quantity in order
to plot it on a logarithmic scale. Well...I suppose one could again
take the position that this was "wrong" and that I should have taken
the logarithm of the quantity divided by some normalization length.
I suppose it's a plausible position, but it seems to me a bit pedantic
to say that my operation was "wrong, period."

Along another line, it doesn't seem to me that dimensions play a role
in a very large fraction of the lines of code I end up writing or
using. Perhaps this is just another way that I'm "strange". I'll
accept discounting all of the code involved in things like user
interface, both input and output, even though that's a substantial
portion of most production-quality codes in my experience. But even
in the computational part, major portions of an awful lot of codes
use algorithms that are basically dimensionless (at least if they
are written in any generality). I suppose one could argue that
the language should ensure that the data fed to these algorithms
is dimensionless. But since the algorithms themselves are also
written in Fortran, it seems to me that, for the sake of checking
units in the (relatively small in some sense) portion of the code
where they are relevant, you would be penalizing performance throughout
the whole code, including the parts where units are irrelevant - quite
possibly the most time-critical parts. Ok, my blatant, unsupported
statements about which parts might be larger or smaller in various senses
may not be accurate. But I still see an issue here.

I guess that I've seen a whole lot more Fortran programming errors in
areas addressed by f90 (which this posting pretty explicitly
dismissed) than in units of measurement. If I were interested in
reducing the number of errors in Fortran code (which I very much
am), then I'd concentrate more on convincing people to use the
features of f90 that can help with this. I think the payoff is
bigger. I won't say I've never made an error in units (that would
be a pretty foolish statement - and a quite incorrect one). But
I will claim that I have made an awful lot more f77 errors in things
that f90 can help with. I'd never consider dropping back from f90 to
f77 in order to get units checking.

P.S. But I quite acknowledge that opinions will vary here. I've given mine.

--
Richard Maine
ma...@altair.dfrc.nasa.gov

Tony T. Warnock

Feb 3, 2000, 3:00:00 AM
Los Alamos, New Mexico
Population 17,234
Altitude 7,488
Founded 1,943
Total 26,665


Grant W. Petty

Feb 3, 2000, 3:00:00 AM
In article <ueog9y1...@altair.dfrc.nasa.gov>,
Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:

>
>2. How about the other one? I can't off-hand think of a quick example
>of taking the sine of a length, but surely I'm not the only person
>to have ever taken the logarithm of a dimensional quantity in order
>to plot it on a logarithmic scale.

I can give you a common example from my own experience: taking a logarithm
of a radar reflectivity Z (units mm^6/m^3) to get something with units
of dBZ. The conversion is usually given as:

reflectivity (dBZ) = 10*log10(Z)

It may be pedantic, but it is also less ambiguous, to write

reflectivity (dBZ) = 10*log10(Z / Zunit)

where Zunit could be set via

parameter (Zunit = u_mm**6/u_m**3)

according to my proposed extension. Note that the numerical value of
the reflectivity in dBZ is critically dependent on the reference unit.
In principle, Z could just as easily be expressed as m^3 rather than
mm^6/m^3, but taking the logarithm would then give a very different
(non-dimensional) number.

Note that the price to pay to eliminate the ambiguity is quite small
in this instance.
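
To make the dependence on the reference unit concrete, here is a minimal sketch. It is written in Python purely for illustration and is not part of the proposed Fortran extension; the function dbz and the constants U_MM6_PER_M3 and U_M3 are hypothetical names, and the sample reflectivity is an arbitrary value.

```python
import math

# Unit sizes expressed relative to mm^6/m^3 (assumed names, for illustration).
U_MM6_PER_M3 = 1.0
U_M3 = 1.0e18   # 1 m^6/m^3 = (1000 mm)^6 / m^3 = 1e18 mm^6/m^3


def dbz(z_value, z_unit, reference=U_MM6_PER_M3):
    """Return 10*log10(Z / Zunit), with Z given as a number times a unit."""
    ratio = (z_value * z_unit) / reference   # dimensionless by construction
    return 10.0 * math.log10(ratio)


# The same physical reflectivity written in two different units.
z_in_mm6_per_m3 = 1.0e4
z_in_m3 = 1.0e-14

same = dbz(z_in_mm6_per_m3, U_MM6_PER_M3)
also_same = dbz(z_in_m3, U_M3)

# Taking the logarithm of the bare number instead gives a different answer
# as soon as Z is stored in m^3.
naive = 10.0 * math.log10(z_in_m3)
```

Dividing by an explicit reference unit gives the same dBZ value whichever unit Z was stored in, while the logarithm of the bare number does not.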

> Well...I suppose one could again
>take the position that this was "wrong" and that I should have taken
>the logarithm of the quantity divided by some normalization length.

Exactly my view!

>I suppose it's a plausible position, but it seems to me a bit pedantic
>to say that my operation was "wrong, period."

Of course, according to my proposal, programmers who choose not to
be so pedantic could just as easily ignore the existence of a "physical
real" (PREAL) data type, and everything would remain as it always was.

>
>Along another line, it doesn't seem to me that dimensions play a role
>in a very large fraction of the lines of code I end up writing or
>using. Perhaps this is just another way that I'm "strange". I'll
>accept discounting all of the code involved in things like user
>interface, both input and output, even though that's a substantial
>portion of most production-quality codes in my experience.

I think you have just highlighted a key distinction that needs to be
made -- the vast majority of the scientific programming that I do, and
that I encounter in my field, is NOT production-quality code and never
will be, because it won't be re-used often enough, or is not targeted
at a large enough user base, to be worth the effort.

Rather, the vast majority of Fortran code occupying my hard disk right
now consists of a massive, hairy clot of subroutines or highly
specialized programs for reading various oddball data sets, computing
some oddball property, and spitting the results out into a text file
that I can subsequently ingest into a plotting program. As an
academic researcher, if I took time to write production-quality code
every time I undertake an analysis of some new data, I'd never get
anything else done! (BTW, when I do write production-quality code, I
do it in C, which has its own annoyances!).

Thus, it is when doing quick-and-dirty one-time calculations or when
trying to reconcile the calling conventions of different subroutines
in my private library that I find myself cursing the need to deal with
the problem of converting (for example) viscosities from cgs to SI
units, because my reference book has them in one set of units but a
canned subroutine in my Fortran zoo expects something different! My
view is that computers are ideally suited for dealing with this kind
of tedium -- why not take advantage of them? I love the MathCad
software package precisely because of this capability. Implementing
and utilizing exactly the same functionality in Fortran should not be
a problem, in my opinion. If I had the time and the necessary
background, I would try modifying g77 myself!


> But even
>in the computational part, major portions of an awful lot of codes
>use algorithms that are basically dimensionless (at least if they
>are written in any generality). I suppose one could argue that
>the language should ensure that the data fed to these algorithms
>is dimensionless. But since the algorithms themselves are also
>written in Fortran, it seems to me that, for the sake of checking
>units in the (relatively small in some sense) portion of the code
>where they are relevant, you would be penalizing performance throughout
>the whole code, including the parts where units are irrelevant - quite
>possibly the most time-critical parts.

I would argue that one always has the choice of feeding
non-dimensional (i.e., traditional REAL) data to time-sensitive parts
of a program, just as scientific programmers frequently specify
single-precision rather than double-precision REALs when execution
time matters more than 20-digit precision (a flexibility C doesn't
give us, by the way, since it automatically promotes single to double
for almost everything!)

> Ok, my blatant, unsupported
>statements about which parts might be larger or smaller in various senses
>may not be accurate. But I still see an issue here.
>
>I guess that I've seen a whole lot more Fortran programming errors in
>areas addressed by f90 (which this posting pretty explicitly
>dismissed) than in units of measurement. If I were interested in
>reducing the number of errors in Fortran code (which I very much
>am), then I'd concentrate more on convincing people to use the
>features of f90 that can help with this.

For me the issue is one of return vs. investment. The total payoff could
potentially be bigger in other areas, but the investment in designing
and/or learning to effectively use new languages (e.g., F90, C++,
etc.) is also very large. Most people like me simply don't have time
to start from scratch learning a complete new language (or major
dialect of a language) without a very good reason, let alone
convert existing code.

My thesis is that the investment needed to implement my proposal, both
from the perspective of the person modifying the compiler and the
perspective of the person taking advantage of the new features, is
actually quite modest, and the payoff over the long term could be
significant.

Recently a Mars probe was lost because NASA and a contractor exchanged
non-dimensional numbers that actually represented dimensional numbers
expressed in incompatible systems of units. The same kind of disaster
(on a much less expensive scale, fortunately) is possible every time
one person's Fortran program calls someone else's Fortran algorithm
for computing a physical quantity. In my opinion, this is completely
unnecessary and would take comparatively little effort to fix.

> I think the payoff is
> bigger. I won't say I've never made an error in units (that would
>be a pretty foolish statement - and a quite incorrect one). But
>I will claim that I have made an awful lot more f77 errors in things
>that f90 can help with. I'd never consider dropping back from f90 to
>f77 in order to get units checking.

I'll turn this around and say that if f90 had addressed the unit and
physical dimension issue in a user-friendly way, I would have long ago
taken the trouble to learn it and pay the bucks for an f90 compiler
(which no one around here seems to have yet!). As it stands, the
benefits of f90 for someone like me have not yet been explained
clearly enough to make it seem worth the time and effort.

>
>P.S. But I quite acknowledge that opinions will vary here. I've given mine.
>

Thanks! That is what I was asking for.

- Grant

Barry Margolin

Feb 3, 2000, 3:00:00 AM
In article <ueog9y1...@altair.dfrc.nasa.gov>,
Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:
>written in Fortran, it seems to me that, for the sake of checking
>units in the (relatively small in some sense) portion of the code
>where they are relevant, you would be penalizing performance throughout
>the whole code, including the parts where units are irrelevant - quite
>possibly the most time-critical parts.

I haven't read the guy's paper, but I don't see how dimension checking
could impact performance. If you convert some dimensioned values to
dimensionless ones for the sake of a computation that violates the
dimension rules, that should just affect compile-time diagnostics, and have
no impact at runtime.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

Dick Hendrickson

Feb 3, 2000, 3:00:00 AM

"Grant W. Petty" wrote:
>
> In article <ueog9y1...@altair.dfrc.nasa.gov>,
> Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:
>
> >

[snip]

> > I think the payoff is
> > bigger. I won't say I've never made an error in units (that would
> >be a pretty foolish statement - and a quite incorrect one). But
> >I will claim that I have made an awful lot more f77 errors in things
> >that f90 can help with. I'd never consider dropping back from f90 to
> >f77 in order to get units checking.
>
> I'll turn this around and say that if f90 had addressed the unit and
> physical dimension issue in a user-friendly way, I would have long ago
> taken the trouble to learn it and pay the bucks for an f90 compiler
> (which no one around here seems to have yet!). As it stands, the
> benefits of f90 for someone like me have not yet been explained
> clearly enough to make it seem worth the time and effort.
>

Could you explain a little more about what you want? Why isn't
F90's derived type stuff good enough? Something like
type (mass_in_grams) :: m
type (velocity_in_cm_per_sec) :: c=3e10
type (energy_in_slug_ft_sec_sq) :: e
e = m*c**2

where you define enough derived types to cover whatever you are
interested in. The little snippet would work fine IF you define
overloads for squaring velocity, multiplying mass by velocity
squared, and converting energy from metric to english in a store.
There's a ton of overloads to work out, but aren't they all
straight forward? What would you have the compiler do differently?
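
The shape of the "one type per quantity, one overload per allowed operation" approach can be sketched as follows. This is a Python analogue for illustration only, not F90; the class names are hypothetical, and the conversion from metric to English units on assignment is left out.

```python
class VelocitySquared:
    """Velocity squared in cm**2/s**2."""
    def __init__(self, value):
        self.value = value


class Velocity:
    """Velocity in cm/s."""
    def __init__(self, value):
        self.value = value

    def __pow__(self, exponent):
        # Only squaring has been given a meaning; every other power
        # would need its own result type and its own overload.
        if exponent != 2:
            return NotImplemented
        return VelocitySquared(self.value ** 2)


class Energy:
    """Energy in erg (g*cm**2/s**2)."""
    def __init__(self, value):
        self.value = value


class Mass:
    """Mass in grams."""
    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        # Mass times velocity squared is the only product defined here.
        if isinstance(other, VelocitySquared):
            return Energy(self.value * other.value)
        return NotImplemented


m = Mass(2.0)
c = Velocity(3e10)
e = m * c ** 2   # allowed: Mass * VelocitySquared -> Energy
```

Any combination that has no overload, such as m * c or c ** 3, is rejected with a TypeError, so the set of permitted operations is exactly the set someone has written out by hand.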

How would you deal with high energy physicists who work in a system
of units where e and c are 1 and (I think) dimensionless? Isn't it
always(?) going to be true that you will need to specify what
operations are allowed? And if you want the machine to do the
right thing with unit mismatches you'll have to tell it what to
do?

It's not that I think dimensions are bad, I just don't see exactly
what you want to happen.

Dick Hendrickson
work out

Craig Powers

Feb 3, 2000, 3:00:00 AM
gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:

1) The extensions have to be simple and intuitive to use for a large
category of people (e.g., natural and physical scientists like me) who
still customarily program in F77. That means, among other things, not
throwing F77 out in favor of another "modern" language, like C++ or
the like, which few of my colleagues would ever bother to learn for
various reasons (some of which are actually legitimate). Besides,
there's a huge body of legacy code in F77 that we don't want to have
to rewrite from scratch.

2) The extensions should not be restrictive -- i.e., they shouldn't
make it significantly more work to write a program that utilizes the
new data types, nor should they prevent one from continuing to program
"the old way", if one so chooses.

3) In contrast to many previous proposals, I am primarily concerned
with run-time dimension checking rather than compile-time checking,
because the whole point is to spare the programmer/scientist the
tedium of pre-calculating the dimensionality of every intermediate
variable in complex calculations.

If interested, please read my more detailed discussion at
http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

[which, among other things, says:]

> In particular, there is no physical problem for which it is ever
> meaningful to do any of the following:

> 1) Add, subtract, or equate two values having different physical
> dimensions. For example, it is never physically meaningful to add a
> variable with dimensions of time to one having dimensions of
> distance. Any physical formula which violates this rule is simply
> incorrect, period!

> 2) Supply anything but a pure dimensionless number as an argument to
> any transcendental functions, such as sine, logarithm, exponent, etc.
> For example, it is never meaningful to take the sine of a length, but
> it is meaningful to take the sine of ratio of lengths (e.g., the
> coordinate x divided by a wavelength lambda).

=================

In response, Richard Maine wrote:

[massive snippage]

> I guess that I've seen a whole lot more Fortran programming errors in
> areas addressed by f90 (which this posting pretty explicitly
> dismissed) than in units of measurement. If I were interested in
> reducing the number of errors in Fortran code (which I very much
> am), then I'd concentrate more on convincing people to use the
> features of f90 that can help with this. I think the payoff is
> bigger. I won't say I've never made an error in units (that would
> be a pretty foolish statement - and a quite incorrect one). But
> I will claim that I have made an awful lot more f77 errors in things
> that f90 can help with. I'd never consider dropping back from f90 to
> f77 in order to get units checking.

=================

And my comments:

Another benefit of moving to F90 would be that it should be possible to
write a library in F90 implementing dimensioned numbers as a derived
type. Although assignment syntax would change for a value assignment
(it would be necessary to use the constructor), all of the other
desired features (mathematical and comparison operations, with unit
compatibility enforcement) should be handled by overloading those
operators. As a result, usage of the type would be very intuitive
assuming a good design and implementation.

Furthermore, usage of an F90 compiler need not preclude using the
existing F77 code with minimal modifications, assuming the F77 code
is sufficiently portable.

--
Craig Powers NU ChE class of '98
cpo...@lynx.dac.neu.edu
http://lynx.neu.edu/home/httpd/c/cpowers
eni...@coe.neu.edu http://www.coe.neu.edu/~enigma

"Good..bad....I'm the guy with the gun." -- "Ash" in *Army of Darkness*

Gordon Sande

Feb 3, 2000, 3:00:00 AM
On Thu, 03 Feb 2000 20:09:44 GMT, Dick Hendrickson
<dick.hen...@att.net> wrote:

>
>
>"Grant W. Petty" wrote:
>>
>> In article <ueog9y1...@altair.dfrc.nasa.gov>,
>> Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:
>>
>> >

>[snip]


>
>> > I think the payoff is
>> > bigger. I won't say I've never made an error in units (that would
>> >be a pretty foolish statement - and a quite incorrect one). But
>> >I will claim that I have made an awful lot more f77 errors in things
>> >that f90 can help with. I'd never consider dropping back from f90 to
>> >f77 in order to get units checking.
>>
>> I'll turn this around and say that if f90 had addressed the unit and
>> physical dimension issue in a user-friendly way, I would have long ago
>> taken the trouble to learn it and pay the bucks for an f90 compiler
>> (which no one around here seems to have yet!). As it stands, the
>> benefits of f90 for someone like me have not yet been explained
>> clearly enough to make it seem worth the time and effort.
>>
>

>Could you explain a little more about what you want? Why isn't
>F90's derived type stuff good enough. Something like
> type (mass_in_grams) :: m
> type (velocity_in_cm_per_sec) :: c=3e10
> type (energy_in_slug_ft_sec_sq) :: e
> e = m*c**2
>
>where you define enough derived types to cover whatever you are
>interested in. The little snippet would work fine IF you define
>overloads for squaring velocity, multiplying mass by velocity
>squared, and converting energy from metric to english in a store.
>There's a ton of overloads to work out, but aren't they all
>straight forward? What would you have the compiler do differently?
>
>How would you deal with high energy physicists who work in a system
>of units where e and c are 1 and (I think) dimensionless? Isn't it
>always(?) going to be true that you will need to specify what
>operations are allowed? And if you want the machine to do the
>right thing with unit mismatches you'll have to tell it what to
>do?
>
>It's not that I think dimensions are bad, I just don't see exactly
>what you want to happen.
>
>Dick Hendrickson
>work out


In a simulation language I was involved in developing/enhancing
(it translated into Fortran) there were _unit_ declarations which
were checked for consistency in the computational equations.

For an equation of "stock = stock + flow * time_inc" one had

unit stock = mass
unit flow = mass/time
unit time_inc = time

where "mass" and "time" were just formal indeterminates. In the
simulation models (pure computation) the analyser found about
5 errors per 1000 lines of old models that we thought had been
checked carefully several times. (Several errors were in
interest calculations of physical/financial production models
where time duration was dropped.) The units were all sorts of
things - physical things like mass, length and time as well as
constant dollars and current dollars (can you say inflation?).
The benefit was one no longer had to even think about being
careful about whether the cost was in constant or current dollars
etc as the analyser would keep one absolutely honest. One set
of important nagging and annoying concerns just vanished for
a minor nuisance of filling in all those @*#!% unit statements.
When an old model was being upgraded it sometimes got a bit
tense for a while.

The models ended up with a lot of named constants to do the unit
conversions. Tons and pounds were different and there was a
conversion constant to do the job. There was also a notion
of _domain_ for subscripts so a subscript would be across
geography or vegetation type which helped lower the confusion
when there were 10 of both geographies and vegetation types.

This is all just enhanced typing but was not elaborate enough
to deal with inhomogeneous types (e.g. differing elements of an
array having differing types). As such it was all compile time.
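
A rough sketch of this kind of static check, in Python for illustration only: the UNITS table mirrors the three unit declarations above, and the function names (combine, unit_of, check) are hypothetical rather than taken from the simulation language. No numerical values are involved; only the declared units are propagated through the expression.

```python
import ast

# Hypothetical declarations mirroring the "unit" statements above; each
# unit maps a formal base name to its integer exponent.
UNITS = {
    "stock": {"mass": 1},
    "flow": {"mass": 1, "time": -1},
    "time_inc": {"time": 1},
}


def combine(a, b, sign):
    """Multiply (sign=+1) or divide (sign=-1) two units by adding exponents."""
    out = dict(a)
    for base, power in b.items():
        out[base] = out.get(base, 0) + sign * power
    return {base: p for base, p in out.items() if p != 0}


def unit_of(node):
    """Derive the unit of an expression tree without evaluating any numbers."""
    if isinstance(node, ast.Name):
        return UNITS[node.id]
    if isinstance(node, ast.BinOp):
        left, right = unit_of(node.left), unit_of(node.right)
        if isinstance(node.op, ast.Mult):
            return combine(left, right, +1)
        if isinstance(node.op, ast.Div):
            return combine(left, right, -1)
        if isinstance(node.op, (ast.Add, ast.Sub)):
            if left != right:
                raise ValueError(f"unit mismatch: {left} vs {right}")
            return left
    raise ValueError("unsupported expression")


def check(equation):
    """Check 'target = expression' for unit consistency; return the unit."""
    target, expression = equation.split("=")
    rhs = unit_of(ast.parse(expression.strip(), mode="eval").body)
    if UNITS[target.strip()] != rhs:
        raise ValueError(f"unit mismatch in assignment to {target.strip()}")
    return rhs


result = check("stock = stock + flow * time_inc")
```

Dropping the time increment, as in "stock = stock + flow", makes check raise a ValueError, which is the kind of error described above for the interest calculations.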

Gordon Sande

Grant W. Petty

Feb 3, 2000, 3:00:00 AM
In article <3899E108...@att.net>,

Dick Hendrickson <dick.hen...@att.net> wrote:
>
>
>"Grant W. Petty" wrote:
>> I'll turn this around and say that if f90 had addressed the unit and
>> physical dimension issue in a user-friendly way, I would have long ago
>> taken the trouble to learn it and pay the bucks for an f90 compiler
>> (which no one around here seems to have yet!). As it stands, the
>> benefits of f90 for someone like me have not yet been explained
>> clearly enough to make it seem worth the time and effort.
>>
>
>Could you explain a little more about what you want? Why isn't
>F90's derived type stuff good enough. Something like
> type (mass_in_grams) :: m
> type (velocity_in_cm_per_sec) :: c=3e10
> type (energy_in_slug_ft_sec_sq) :: e
> e = m*c**2
>
>where you define enough derived types to cover whatever you are
>interested in. The little snippet would work fine IF you define
>overloads for squaring velocity, multiplying mass by velocity
>squared, and converting energy from metric to english in a store.
>There's a ton of overloads to work out, but aren't they all
>straight forward? What would you have the compiler do differently?

The simplest way to answer the last question is to refer you back
to my document for a second reading:

http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

Unless I'm completely misunderstanding your example above (as I said,
I'm not an F90 programmer yet), the differences between it and my
proposal are profoundly important in several respects:

1) In my proposed extension, you wouldn't necessarily need to specify
in advance that e = m*c**2 has dimensions of energy. Those new
dimensions would be computed automatically from the stored physical
dimensions of the variables on the right of the assignment operator,
at run time. If you wanted, you could then add code to your program
to compare the actual computed dimensions with the expected dimensions
and have it spit out a diagnostic if the two don't agree; e.g., if you
inadvertently typed

e = m*c**3

2) You wouldn't need to declare that 'm is a mass expressed in grams'
or even that m is a mass. You would simply declare m as generic type
PREAL ('physical real'). It would then take on whatever value +
physical dimensions you assigned to it, at run time.

3) You wouldn't need "a ton of overloads" to work out. For example,
the multiplication operator '*' would have to be extended only once,
in a relatively trivial way, to handle the multiplication of two
values of type PREAL, in addition to the standard cases of type REAL,
INTEGER, and so on. Again, see my web document. Any variable of type
PREAL can be used to store a mass, an energy, a current, a volume, or
whatever you assigned to it. But whatever dimensions you assign, they
get carried around along with the numerical part of the value. And
algebraic manipulations do the correct things with them, which doesn't
depend on exactly what kind of a physical quantity it is.
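
The three points above can be sketched in a few dozen lines. This is a Python stand-in for illustration only, not the proposed Fortran syntax; the class PReal, the helper require, the unit constants u_g, u_cm and u_s, and the choice of three base dimensions stored in SI are all assumptions made for the example.

```python
class DimensionError(Exception):
    """Raised when an operation mixes incompatible physical dimensions."""


class PReal:
    """Hypothetical stand-in for PREAL: a value plus (length, mass, time) exponents."""

    def __init__(self, value, dims=(0, 0, 0)):
        self.value = float(value)
        self.dims = tuple(dims)

    @staticmethod
    def wrap(x):
        # A plain number is treated as dimensionless (all exponents zero).
        return x if isinstance(x, PReal) else PReal(x)

    def __mul__(self, other):
        # Extended once: multiply the values, add the exponents.
        other = PReal.wrap(other)
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return PReal(self.value * other.value, dims)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = PReal.wrap(other)
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return PReal(self.value / other.value, dims)

    def __pow__(self, n):
        # Assumes n is an integer so the exponents stay integral.
        return PReal(self.value ** n, tuple(d * n for d in self.dims))

    def __add__(self, other):
        other = PReal.wrap(other)
        if self.dims != other.dims:
            raise DimensionError(f"cannot add {self.dims} and {other.dims}")
        return PReal(self.value + other.value, self.dims)


def require(x, dims):
    """Run-time check that x carries the expected dimensions."""
    if x.dims != dims:
        raise DimensionError(f"expected {dims}, got {x.dims}")
    return x


# Units as PReal constants, stored internally in SI (m, kg, s).
u_g = PReal(1.0e-3, (0, 1, 0))
u_cm = PReal(1.0e-2, (1, 0, 0))
u_s = PReal(1.0, (0, 0, 1))
ENERGY = (2, 1, -2)

m = 2.0 * u_g
c = 3.0e10 * u_cm / u_s
e = require(m * c ** 2, ENERGY)   # dimensions agree with energy
bad = m * c ** 3                  # computed without complaint...
```

Calling require(bad, ENERGY) raises DimensionError, which corresponds to the optional diagnostic for the mistyped e = m*c**3. No variable had to be declared as a mass or an energy in advance.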

>
>How would you deal with high energy physicists who work in a system
>of units where e and c are 1 and (I think) dimensionless?

Simple. Dimensionless quantities are treated as dimensionless,
period. That means they can either be stored in the standard type
REAL, in which case they are implicitly dimensionless, or in my type
PREAL, in which case the extra bits that encode physical dimensions
are all set to zero.

> Isn't it
>always(?) going to be true that you will need to specify what
>operations are allowed?

Again, the rules for what is allowed are extremely few and extremely
simple. I don't write compilers, but I can't imagine that my proposed
run-time checks are much more difficult or time consuming than
checking for division by zero.

>
>It's not that I think dimensions are bad, I just don't see exactly
>what you want to happen.

Hopefully my responses above helped clarify things somewhat. Thanks
for your comments.

Jim Cornwall

Feb 3, 2000, 3:00:00 AM
On 3 Feb 2000 18:31:45 GMT, gpe...@rain.atms.purdue.edu (Grant W. Petty)
wrote:

>In article <ueog9y1...@altair.dfrc.nasa.gov>,
>Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:
>

(snip)


>> I think the payoff is
>> bigger. I won't say I've never made an error in units (that would
>>be a pretty foolish statement - and a quite incorrect one). But
>>I will claim that I have made an awful lot more f77 errors in things
>>that f90 can help with. I'd never consider dropping back from f90 to
>>f77 in order to get units checking.
>
>I'll turn this around and say that if f90 had addressed the unit and
>physical dimension issue in a user-friendly way, I would have long ago
>taken the trouble to learn it and pay the bucks for an f90 compiler
>(which no one around here seems to have yet!). As it stands, the
>benefits of f90 for someone like me have not yet been explained
>clearly enough to make it seem worth the time and effort.
>

The latest project I did at work (a post-processor to take model output
files & collate/process/aggregate the data into another format for a GIS
package) is much like what you describe as most of your code. The
method I used (multiple arrays of sorted data, with an array of pointers
used to keep track of the current data set --<Thanks Richard!>) could
perhaps have been done in F77 some other way, but it was much easier in
F90 with all the improvements in array-handling features, and the
addition of pointers. After years of using F77 and maintaining other
people's non-production code (that was being used for production
weather processing years later...), I'm absolutely sold on learning F90
and later versions when I can afford them!

Jim Cornwall (ex-USAF computer systems programmer/manager, now
hydrogeology graduate student doing programming for the USGS)

Mark D. Dewing

Feb 3, 2000, 3:00:00 AM

On 3 Feb 2000, Grant W. Petty wrote:

> Is it not possible to make a very modest extension to a widely used
> scientific programming language (the most entrenched of which,
> for better or for worse, is Fortran 77) to allow a rational, natural,
> and intuitive approach to physical dimension checking and unit
> conversion?
>

<stuff snipped>


>
> 3) In contrast to many previous proposals, I am primarily concerned
> with run-time dimension checking rather than compile-time checking,
> because the whole point is to spare the programmer/scientist the
> tedium of pre-calculating the dimensionality of every intermediate
> variable in complex calculations.
>

It seems to me that there are two separate issues involved:
checking dimensional consistency and automatic unit conversion.

Checking dimensional consistency seems like it could be accomplished
at compile time (or by using a separate 'dimensional lint' program).
Are there cases where static analysis wouldn't work?

Unit conversions could be implemented as a library, and the programmer
would have to make explicit calls to do unit conversions.
(they wouldn't be automatic, but it would be much easier than trying
to make the conversion automatic).
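
A minimal sketch of the "explicit library call" idea, written in free-form Fortran 90. The module name `unit_conv` and the function names are hypothetical, chosen only for illustration; nothing like this is claimed to exist as a standard library.

```fortran
! Sketch of a unit-conversion library with explicit calls.
! All names here are hypothetical.
module unit_conv
  implicit none
  real, parameter :: m_per_km  = 1000.0
  real, parameter :: s_per_day = 86400.0
  real, parameter :: m_per_ft  = 0.3048
contains
  function km_to_m(x) result(y)
    real, intent(in) :: x
    real :: y
    y = x * m_per_km
  end function km_to_m

  function day_to_s(t) result(s)
    real, intent(in) :: t
    real :: s
    s = t * s_per_day
  end function day_to_s

  function mps_to_fps(v) result(w)
    real, intent(in) :: v
    real :: w
    w = v / m_per_ft
  end function mps_to_fps
end module unit_conv

program demo_conv
  use unit_conv
  implicit none
  real :: dist_m, time_s
  dist_m = km_to_m(100.0)      ! 100 km expressed in metres
  time_s = day_to_s(2.0)       ! 2 days expressed in seconds
  print *, 'Speed [ft/s]: ', mps_to_fps(dist_m / time_s)
end program demo_conv
```

Every conversion is visible at the call site, but nothing stops the programmer from passing a time to `km_to_m`; that is the consistency check the rest of the thread is about.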


Mark Dewing
m-d...@uiuc.edu

John Harper

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
>gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:
>
>> In particular, there is no physical problem for which it is ever
>> meaningful to do any of the following:
>
>> 1) Add, subtract, or equate two values having different physical
>> dimensions. For example, it is never physically meaningful to add a
>> variable with dimensions of time to one having dimensions of
>> distance. Any physical formula which violates this rule is simply
>> incorrect, period!
>
>> 2) Supply anything but a pure dimensionless number as an argument to
>> any transcendental functions, such as sine, logarithm, exponent, etc.
>> For example, it is never meaningful to take the sine of a length, but
>> it is meaningful to take the sine of ratio of lengths (e.g., the
>> coordinate x divided by a wavelength lambda).
>
>1. Let's see, I don't offhand have a handy example of adding length to
>time, but I've seen things like length**2 added to time**2

NASA did that sort of thing recently, and then changed the unit of length
from a foot to a meter (or was it a meter to a foot?) Ask them if they
enjoyed their rocket crashing on Mars as a result!

>2. How about the other one? I can't off-hand think of a quick example
>of taking the sine of a length, but surely I'm not the only person
>to have ever taken the logarithm of a dimensional quantity in order
>to plot it on a logarithmic scale.

You can find log(3.14), but log($3.14) or log(3.14 m) are undefined.
There are standards in mathematics as well as in Fortran!
log x = integral 1 to x of (1/t) dt, and that makes no sense unless
t = 1 is possible, i.e. t must be dimensionless, and therefore so is x.
The trouble is of course that many first-year calculus courses say the
indefinite integral 1/x dx = log(abs(x)) + c.
On the LHS of that, x need not be dimensionless. I therefore use and
recommend this antiderivative (or indefinite integral) in the form
integral 1/x dx = log(x/k), where k is an arbitrary constant with the
same dimensions as x, and if x is real (it might after all have been
complex even in a Fortran program, in which case log(abs(x)) is wrong
even if x is dimensionless) then k must be of the same sign as x.

Similarly with sine: we have sin(x) = x - x**3/6.0 + x**5/120.0 - ...
The RHS makes no sense unless x is dimensionless, and it is wrong
unless x is measured in radians. The standard Fortran trig intrinsics
SIN, ASIN etc. all use radians, though some vendors have extensions
SIND, ASIND etc. using degrees, e.g. sind(x) = sin(x*pi/180.0)
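
As a small illustration of both points (a sketch only; the variable names and the 1 m reference length are assumptions made for the example), standard Fortran can convert degrees to radians by hand and divide a length by a reference length before taking the logarithm:

```fortran
program nondim_args
  implicit none
  real, parameter :: pi = 3.14159265
  real :: angle_deg, x_m, k_m

  ! SIN expects radians, so convert degrees explicitly
  ! (this is what a vendor SIND extension would do internally).
  angle_deg = 30.0
  print *, 'sin(30 deg) = ', sin(angle_deg * pi / 180.0)

  ! Divide a length by a reference length k before taking the log,
  ! so that the argument of LOG is a pure number.
  x_m = 250.0
  k_m = 1.0     ! reference length of 1 m (an arbitrary choice)
  print *, 'log(x/k)    = ', log(x_m / k_m)
end program nondim_args
```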

There is another way out of this quandary: some people say "Let the
length of a foobar-widget be x metres" instead of "Let the length of a
foobar-widget be x", and they can then consistently operate with x using
dimensionless algebra. I don't normally do that because I value the
opportunity to detect dimensional errors, but of course I had to when I
used to solve for my students old exam questions including things like
"The velocity v of a particle moving along the x-axis obeys
v**2 = -9*x**2 + 18*x + 27, (m/sec)**2, where x is measured in metres."
One of the pleasures of retirement is not having to deny one's natural
inclination to object to things like that. (Students want to know how
to solve problems they might be set, not how to object to the wording.)

Chemists are particularly fond of dimensionally wrong equations:
many of them say pH = -log10(a) where a = hydrogen ion activity, but
they should say something like pH = -log10( a / (mol L^{-1}) )

John Harper, School of Mathematical and Computing Sciences,
Victoria University, Wellington, New Zealand
e-mail john....@vuw.ac.nz phone (+64)(4)463 5341 fax (+64)(4)463 5045

Nick Benton

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to

> 3) In contrast to many previous proposals, I am primarily concerned
> with run-time dimension checking rather than compile-time checking,
> because the whole point is to spare the programmer/scientist the
> tedium of pre-calculating the dimensionality of every intermediate
> variable in complex calculations.

If the compiler does dimension *inference*, then you can have static
checking (and hence no runtime cost) without the programmer having to
calculate explicitly all the intermediate dimensions.

And for a usable system you need dimension polymorphism too, or you'll
end up having to duplicate bits of generic code so that you can use
them at different dimensions.

A clear and elegant account of how to do polymorphic dimension
inference can be found in Andrew Kennedy's PhD thesis and subsequent
papers - the best introduction being

Dimension Types
Andrew Kennedy. In Proceedings of the 5th European Symposium on
Programming: Lecture Notes in Computer Science volume 788, Springer-
Verlag, 1994. Available from
http://research.microsoft.com/users/akenn/papers/index.html

Andrew's work is, I think, pretty much the definitive account of how to
solve this problem.

Nick Benton



Andrew Cooke

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <87c65o$k4u$1...@mozo.cc.purdue.edu>,

gpe...@rain.atms.purdue.edu (Grant W. Petty) wrote:
> Is it not possible to make a very modest extension to a widely used
> scientific programming language (the most entrenched of which,
> for better or for worse, is Fortran 77) to allow a rational, natural,
> and intuitive approach to physical dimension checking and unit
> conversion?

A few points after reading all of this thread so far:

- Yes, it's a good idea.

- Having global base units is probably not good enough. In
Astrophysics, for example, (I was once an astronomer) you could be
calculating at both atomic and inter-galactic scales in the same code.

- Rather than use decimal for fractional powers of dimensions, consider
ratios of integers. I suspect this covers nearly all cases, avoids
worrying about rounding errors, and probably saves space (a byte each is
probably sufficient) (Do you ever have a dimension to an irrational
number? - perhaps in fractal analysis?! What about imaginary exponents?
:-).
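
A rough sketch of the ratio-of-integers idea in free-form Fortran 90. The type `ratexp` and the helper names are hypothetical. Default integers are used for simplicity; a smaller integer kind could be substituted if one byte per component really is enough.

```fortran
module rational_exp
  implicit none
  type ratexp
    integer :: num   ! numerator of the exponent
    integer :: den   ! denominator, kept positive
  end type ratexp
contains
  recursive function gcd(a, b) result(g)
    integer, intent(in) :: a, b
    integer :: g
    if (b == 0) then
      g = abs(a)
    else
      g = gcd(b, mod(a, b))
    end if
  end function gcd

  function reduce(n, d) result(r)   ! lowest terms, d must be nonzero
    integer, intent(in) :: n, d
    type(ratexp) :: r
    integer :: g
    g = gcd(n, d)
    r%num = n / g
    r%den = d / g
    if (r%den < 0) then
      r%num = -r%num
      r%den = -r%den
    end if
  end function reduce

  function add_exp(a, b) result(r)  ! exponent of a product
    type(ratexp), intent(in) :: a, b
    type(ratexp) :: r
    r = reduce(a%num * b%den + b%num * a%den, a%den * b%den)
  end function add_exp

  function half_exp(a) result(r)    ! exponent after a square root
    type(ratexp), intent(in) :: a
    type(ratexp) :: r
    r = reduce(a%num, 2 * a%den)
  end function half_exp
end module rational_exp

program demo_ratexp
  use rational_exp
  implicit none
  type(ratexp) :: e
  e = half_exp(ratexp(3, 1))        ! sqrt(m**3) -> m**(3/2)
  print *, 'sqrt(m**3): exponent ', e%num, '/', e%den
  e = add_exp(e, ratexp(1, 2))      ! times m**(1/2) -> m**2
  print *, 'times m**(1/2): exponent ', e%num, '/', e%den
end program demo_ratexp
```

Because the exponents stay exact, comparing two dimensions for equality is a plain integer comparison with no rounding tolerance.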

- I didn't read the paper given as a definitive reference but since the
abstract talks about type inference and many of the other papers on that
page discuss ML, I suspect the results are nothing like you are
expecting. Having used ML I can see how the type system could be used
to support this, and suspect that the paper will describe an elegant,
powerful solution. But I am sure it will *not* be attractive to people
who find F90 advanced... :-)

- Take a look at F90. I don't know it at all well, but the kind of
problem you are trying to solve can be tackled with modern languages
without extending compilers and without huge performance hits. In fact,
I'd be surprised if there isn't some kind of F90 library already
available... (if not, I think you'd get much more support for that than
for extending a language that should, in all honesty, be quietly laid to
rest). (I used to write a lot of F77 code, and I know the kind of
environment you're coming from, so I can sympathise, but F90 *is*
worth the effort (and money)).

Cheers,
Andrew

Andrew Cooke

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to

I forgot one point in my post a moment ago:

- Some way of doing coercion would be nice. So, for example, given a
dimensionless value, and some base units, you could ask for the
equivalent energy. This would help those high-energy people someone
mentioned who work with c=G=e=1, for example (it's nice for doing the
maths, but tedious to work out what the final result is in real
units...)

Andrew
http://www.andrewcooke.free-online.co.uk/index.html

Michel OLAGNON

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <94963236...@bats.mcs.vuw.ac.nz>, har...@mcs.vuw.ac.nz (John Harper) writes:
> [...]

>
>You can find log(3.14), but log($3.14) or log(3.14 m) are undefined.
>There are standards in mathematics as well as in Fortran!

I am very confused. Isn't log(L/(gT**2)) equal (log(L) - log(g) - 2log(T))
any more, because log(L), log(g), and log(T) would be undefined ?
(L=length, g=9.81m/s2, T=period)

Michel

--
| Michel OLAGNON email : Michel....@ifremer.fr|
| IFREMER: Institut Francais de Recherches pour l'Exploitation de la Mer|
| Centre de Brest - B.P. 70 phone : +33-2-9822 4144|
| F-29280 PLOUZANE - FRANCE fax : +33-2-9822 4650|
| http://www.ifremer.fr/ditigo/molagnon/molagnon.html |


Alois Steindl

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
Hello,
I would treat that expression as follows:

L = l * [length]
g = g1 * [length]/ [time]^2
T = t * [time]

where l, g1 and t are numbers.

Then log(L/(gT**2)) = log(l/(g1*t**2)) = log(l)-log(g1)-2*log(t).

No need to take log(1 meter).
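
A quick numerical check of that identity, as a sketch with arbitrary sample values (the numbers below are assumptions, not taken from the thread). Only the pure numbers l, g1 and t ever reach LOG:

```fortran
program log_split
  implicit none
  real :: l, g1, t
  l  = 2.0      ! numerical part of the length
  g1 = 9.81     ! numerical part of g
  t  = 1.5      ! numerical part of the period
  ! The two printed values should agree to within rounding error.
  print *, log(l / (g1 * t**2))
  print *, log(l) - log(g1) - 2.0 * log(t)
end program log_split
```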

Alois

Jan Vorbrueggen

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
har...@mcs.vuw.ac.nz (John Harper) writes:

> NASA did that sort of thing recently, and then changed the unit of length
> from a foot to a meter (or was it a meter to a foot?) Ask them if they
> enjoyed their rocket crashing on Mars as a result!

Not quite correct. One program wrote a file with data containing forces
measured in lbs-force (itself an abomination of a unit), without explicitly
specifying the unit. The next program in the processing chain read these
values as containing newtons, which is wrong by a factor lbs/kg*g (which
isn't dimensionless, I know!). The current proposal wouldn't have helped
with that, unless formatted input/output for PREALs also included the unit.
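
One way to picture "formatted I/O that includes the unit" in standard Fortran 90 is to write a unit tag next to each number and have the reader refuse an unexpected tag. This is only a sketch: the record layout and the tags 'lbf' and 'N' are assumptions made for the example, and an internal file stands in for the data file.

```fortran
program tagged_io
  implicit none
  character(len=40) :: record
  character(len=8)  :: unit_read
  real :: force, value_read

  force = 4.45
  ! Producer: store the unit next to the number.
  write (record, '(ES15.7, 1X, A)') force, 'lbf'

  ! Consumer: read both fields and refuse an unexpected unit.
  read (record, *) value_read, unit_read
  if (unit_read /= 'N') then
    print *, 'Expected N but file says ', trim(unit_read)
  else
    print *, 'Force [N]: ', value_read
  end if
end program tagged_io
```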

Jan

George Russell

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
"Grant W. Petty" wrote:
>
> Hello,
>
> I am intentionally resurrecting a discussion that appears (based on a
> DejaNews search) to have last run its course (in slightly different
> form) in the mid-1990s:
>
> Is it not possible to make a very modest extension to a widely used
> scientific programming language (the most entrenched of which,
> for better or for worse, is Fortran 77) to allow a rational, natural,
> and intuitive approach to physical dimension checking and unit
> conversion?
I believe it is. Andrew Kennedy wrote a PhD thesis showing how to extend
Standard ML's type system to do it. Checking is done at compile-time,
but you do NOT need lots of programmer annotations to work out what the
dimension of everything is, because like other ML-types that can be inferred
automatically. I see no reason why Andrew Kennedy's approach should not be
adapted to FORTRAN, since the FORTRAN type system is a rather small subset of
ML's. I would imagine such a system working like lint, so that you would give
it the complete set of source files to a program, and it would go away and infer
the dimensions of all the functions from them and tell you if it had problems.

Clearly you need some way of getting the system started by giving it some
dimensions (otherwise it could just assume that everything was dimensionless).
If you don't like extending FORTRAN or encoding such information in comments
you could put it in some kind of auxiliary file.

Note that you need some kind of parametric polymorphism, at least in dimensions
of types, otherwise for example a routine to sort an array of reals will have
to have separate versions for every unit.

You can download details via his home page at
http://research.microsoft.com/~akenn/

Grant W. Petty

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <y4u2jpi...@mailhost.neuroinformatik.ruhr-uni-bochum.de>,

Jan Vorbrueggen <j...@mailhost.neuroinformatik.ruhr-uni-bochum.de> wrote:
>
>measured in lbs-force (itself an abomination of a unit), without explicitly
>specifying the unit. The next program in the processing chain read these
>values as containing newtons, which is wrong by a factor lbs/kg*g (which
>isn't dimensionless, I know!). The current proposal wouldn't have helped
>with that, unless formatted input/output for PREALs also included the unit.

Let me mention, BTW, that this is the single thorniest issue in my
proposal, in my opinion. I think rewriting Fortran I/O functions to
handle units in a general yet not too cumbersome way might be
very messy. That's why I would initially settle for leaving the I/O
routines as they are and simply requiring that programmers cast PREAL
to non-dimensional REAL (by dividing by the appropriate units) before
printing them with a standard WRITE or PRINT statement. E.g.,

      REAL x,t
      PREAL dist,time,speed

      WRITE(*,*)' Enter distance traveled [km]:'
      READ(*,*)x
      dist = x * u_km

      WRITE(*,*)' Enter time elapsed [days]:'
      READ(*,*)t
      time = t * u_day

      speed = dist/time

      WRITE(*,*)'Speed is ', speed/u_fps , ' feet per second'

      STOP
      END

The above complete sample program prompts for a distance and a time
elapsed and computes the corresponding speed. Input values are
initially read into variables of standard type REAL, avoiding the need
to modify the Fortran77 READ routine. The initially non-dimensioned
numbers are immediately converted to appropriate dimensioned values by
multiplying by the intended units (which were pre-defined by the
compiler or by a table that is loaded at run time).

The computed speed (whose internal representation can be in any
standard set of base units -- the programmer doesn't need to know
which) can be written out as a value expressed in arbitrary units of
speed by dividing by the chosen units. The ratio speed/u_fps is
nondimensional, so the standard WRITE routine could again be used
without modification. However, the compiler would need to add checks
for whether the ratio is truly dimensionless and produce a run-time
error if it is not (e.g. due to an inappropriate choice of output
units or a prior coding error).

As Jan stated, this might not have prevented the NASA disaster, but it
would have at least forced the programmer to think about what units
he/she was dealing with when performing I/O while at the same time
saving him/her the effort of doing explicit conversions between units. And
the above program requires a nearly trivial extension of F77: the
creation of data type PREAL and extension of operators '*' and '/' to
work correctly with type PREAL, plus run-time checks for dimensional
consistency where appropriate.
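
For comparison, the behaviour of the sample program above can be approximated in standard Fortran 90 without any compiler change. This is a sketch under stated assumptions, not the proposed F77 extension itself: the derived type `preal`, the helper `to_real`, and the constants `u_km`, `u_day` and `u_fps` are defined here by hand, only length and time are tracked, and fixed inputs replace the READ statements so the program runs unattended.

```fortran
module preal_mod
  implicit none
  type preal
    real    :: val     ! magnitude in SI base units
    integer :: len_e   ! exponent of length
    integer :: time_e  ! exponent of time
  end type preal

  ! Hand-written unit constants mirroring u_km, u_day, u_fps.
  type(preal), parameter :: u_km  = preal(1000.0,  1,  0)
  type(preal), parameter :: u_day = preal(86400.0, 0,  1)
  type(preal), parameter :: u_fps = preal(0.3048,  1, -1)

  interface operator(*)
    module procedure real_times_preal
  end interface
  interface operator(/)
    module procedure preal_div_preal
  end interface
contains
  function real_times_preal(x, u) result(p)
    real, intent(in) :: x
    type(preal), intent(in) :: u
    type(preal) :: p
    p = preal(x * u%val, u%len_e, u%time_e)
  end function real_times_preal

  function preal_div_preal(a, b) result(p)
    type(preal), intent(in) :: a, b
    type(preal) :: p
    p = preal(a%val / b%val, a%len_e - b%len_e, a%time_e - b%time_e)
  end function preal_div_preal

  ! Run-time check: only a dimensionless value may become a REAL.
  function to_real(p) result(x)
    type(preal), intent(in) :: p
    real :: x
    if (p%len_e /= 0 .or. p%time_e /= 0) then
      print *, 'Dimension error: value is not dimensionless'
      stop
    end if
    x = p%val
  end function to_real
end module preal_mod

program speed_demo
  use preal_mod
  implicit none
  type(preal) :: dist, elapsed, speed
  real :: x, t, fps
  x = 100.0                     ! distance traveled [km]
  t = 2.0                       ! time elapsed [days]
  dist    = x * u_km
  elapsed = t * u_day
  speed   = dist / elapsed
  fps = to_real(speed / u_fps)  ! stops if the ratio has dimensions
  print *, 'Speed is ', fps, ' feet per second'
end program speed_demo
```

Dividing `speed` by `u_day` instead of `u_fps` would leave nonzero exponents, so `to_real` would stop with the dimension error rather than print a meaningless number.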

Grant W. Petty

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <Pine.SOL.3.95.100020...@blava.ncsa.uiuc.edu>,

Mark D. Dewing <mde...@blava.ncsa.uiuc.edu> wrote:
>
>
>On 3 Feb 2000, Grant W. Petty wrote:
>
>> Is it not possible to make a very modest extension to a widely used
>> scientific programming language (the most entrenched of which,
>> for better or for worse, is Fortran 77) to allow a rational, natural,
>> and intuitive approach to physical dimension checking and unit
>> conversion?
>>
><stuff snipped>

>>
>> 3) In contrast to many previous proposals, I am primarily concerned
>> with run-time dimension checking rather than compile-time checking,
>> because the whole point is to spare the programmer/scientist the
>> tedium of pre-calculating the dimensionality of every intermediate
>> variable in complex calculations.
>>
>
>It seems to me that there are two separate issues involved:
>checking dimensional consistency and automatic unit conversion.
>
>Checking dimensional consistency seems like it could be accomplished
>at compile time (or by using a separate 'dimensional lint' program).
>Are there cases where static analysis wouldn't work?

You could do this in cases where the dimensions of each variable can
be determined unambiguously at compile-time. But this strikes me as
an unnecessary and undesirable restriction. What if you want a
program (or separately compiled library subroutine) to perform a
generic calculation on variables x, y, and z, and you want the user to
be able to specify at run-time what kinds of physical variables x, y,
and z represent? My proposed system would easily accommodate this.

>
>Unit conversions could be implemented as a library, and the programmer
>would have to make explicit calls to do unit conversions.
>(they wouldn't be automatic, but it would be much easier than trying
>to make the conversion automatic).

Automatic conversion is shockingly easy in the scheme I propose. In
fact, the popular software package MathCad already does exactly what
I'm talking about. I fail to see why it would be so difficult to
adapt the same approach to a compiled programming language. Again, my
web document at

http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

describes what I believe to be a completely workable internal
mechanism. Unfortunately, I personally lack the ability to write or
modify a compiler to incorporate these ideas, otherwise I would have
done it a year ago, but I have a hard time believing it is so
difficult.

George Russell

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
Andrew Cooke wrote:
> - I didn't read the paper given as a definitive reference but since the
> abstract talks about type inference and many of the other papers on that
> page discuss ML, I suspect the results are nothing like you are
> expecting. Having used ML I can see how the type system could be used
> to support this, and suspect that the paper will describe an elegant,
> powerful solution. But I am sure it will *not* be attractive to people
> who find F90 advanced... :-)
No honestly, I think FORTRAN users have more brains than you credit them with.
What you will typically get as output of a system such as I describe is
information that in some operation the dimensions don't match. For example with

IF (X) THEN
A=(expression 1)
ELSE
A=(expression 2)
ENDIF

or something like that, the checker will have to complain if the two
expressions have different dimensions, and current ML type-checking
technology would give you the types (or rather in this case the units)
of the two expressions. If the error lies directly in one of the expressions,
(eg you wrote A=X*Y rather than A=X+Y) then it's likely to be fairly easy to
spot. The problem arises if the error is somewhere else, EG expression 1 is
"A = 2*B" and there is a bug in your definition of B which means it has
different units to those you expect. So you have to think a bit
(perhaps you need a point-and-click interface where you ask the compiler what
it thinks the type of B is), but it's no worse than having to specify the
types of absolutely everything in the first place, which is the alternative
proposal. However ML type systems _do_ allow you to specify types if you want
to, so FORTRAN extended in such a way would actually allow the programmer to
adopt any policy between specifying absolutely all units and specifying absolutely
none.

Grant W. Petty

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <87eb6v$ud1$1...@nnrp1.deja.com>,

Nick Benton <nic...@my-deja.com> wrote:
>
>> 3) In contrast to many previous proposals, I am primarily concerned
>> with run-time dimension checking rather than compile-time checking,
>> because the whole point is to spare the programmer/scientist the
>> tedium of pre-calculating the dimensionality of every intermediate
>> variable in complex calculations.
>
>If the compiler does dimension *inference*, then you can have static
>checking (and hence no runtime cost) without the programmer having to
>calculate explicitly all the intermediate dimensions.
>
>And for a usable system you need dimension polymorphism too, or you'll
>end up having to duplicate bits of generic code so that you can use
>them at different dimensions.

Unless I'm misunderstanding what is meant by dimension polymorphism
(which is possible), I really don't see the need in the system I'm
proposing.

>A clear and elegant account of how to do polymorphic dimension
>inference can be found in Andrew Kennedy's PhD thesis and subsequent
>papers - the best introduction being
>
>Dimension Types
>Andrew Kennedy. In Proceedings of the 5th European Symposium on
>Programming: Lecture Notes in Computer Science volume 788, Springer-
>Verlag, 1994. Available from
>http://research.microsoft.com/users/akenn/papers/index.html

I read this yesterday. Perhaps because I'm not a computer scientist
and am not familiar with much of the terminology and notation used,
it's impossible for me to see the connection between what Kennedy
discusses and my own proposal. I wish someone who understands
Kennedy's paper could tell me in layman's terms how it addresses (or
refutes) the validity of my proposed method. (BTW I tried to find a
working e-mail address for Kennedy but it doesn't even appear on his
own web page!)

What I'm describing is so blindingly simple (and has already been
convincingly implemented in the popular MathCad software package) that
I have difficulty believing that Kennedy is really talking about the
same thing. If anyone is interested, please look at my web page

http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

and then Kennedy's document cited above and educate me as to the
relationship between the two.

>
>Andrew's work is, I think, pretty much the definitive account of how to
>solve this problem.
>

You may be right, but if so, it appears the ideas have not yet made it
into language that physical scientists like myself are actually using.

Gordon Sande

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
On Fri, 04 Feb 2000 12:22:26 +0100, George Russell
<g...@informatik.uni-bremen.de> wrote:

>"Grant W. Petty" wrote:
>>
>> Hello,
>>
>> I am intentionally resurrecting a discussion that appears (based on a
>> DejaNews search) to have last run its course (in slightly different
>> form) in the mid-1990s:
>>

>> Is it not possible to make a very modest extension to a widely used
>> scientific programming language (the most entrenched of which,
>> for better or for worse, is Fortran 77) to allow a rational, natural,
>> and intuitive approach to physical dimension checking and unit
>> conversion?

>I believe it is. Andrew Kennedy wrote a PhD thesis showing how to extend
>Standard ML's type system to do it. Checking is done at compile-time,
>but you do NOT need lots of programmer annotations to work out what the
>dimension of everything is, because like other ML-types that can be inferred
>automatically.

This assumes that the program is semantically correct in the user's
problem domain. Experience with other systems (typically for modeling)
shows that this is a very strong assumption. Most of the requests for
unit typing are in fact directed at providing more redundancy in the
programs so that errors can be detected early and accurately. The
requests are often given in the rather stilted language of syntactic
analysis when they are really after another layer of error checking.
(They then of course object to the added work of providing the
necessary additional information. :-( It's like using a big CASE
system to write a "Hello World" program.)

There was an interesting article on error rates in large scientific
production codes (seismic analysis IIRC) in IEEE Software Engineering
a while back. This was clearly about semantic errors as the codes had
been in production use for some time.

> I see no reason why Andrew Kennedy's approach should not be
>adapted to FORTRAN, since the FORTRAN type system is a rather small subset of
>ML's. I would imagine such a system working like lint, so that you would give
>it the complete set of source files to a program,

Which is a major nuisance when reusing subroutine libraries and such.
The problem is common to most extra analysis so is nothing new.

> and it would go away and infer
>the dimensions of all the functions from them and tell you if it had problems.

All you would know is that somewhere in the derivation chain there is
a problem. Diagnostics are much more useful if they can be localized.

I recall using Fortran H for IBM/360 which produced long prose
paragraphs after the program unit describing the error in detail and
Fortran G which put a $ marker in the middle of the offending line and
said "Trouble here" (or something very similar). Fortran G was a lot
easier to use for initial compilations.

And you get the spelling corrector phenomenon of it being even harder
to find the wrong word (come or cone?) when they are all spelt
correctly.

Grant W. Petty

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <389AB672...@informatik.uni-bremen.de>,

George Russell <g...@informatik.uni-bremen.de> wrote:
>"Grant W. Petty" wrote:
>>
>> Hello,
>>
>> I am intentionally resurrecting a discussion that appears (based on a
>> DejaNews search) to have last run its course (in slightly different
>> form) in the mid-1990s:
>>
>> Is it not possible to make a very modest extension to a widely used
>> scientific programming language (the most entrenched of which,
>> for better or for worse, is Fortran 77) to allow a rational, natural,
>> and intuitive approach to physical dimension checking and unit
>> conversion?
>I believe it is. Andrew Kennedy wrote a PhD thesis showing how to extend
>Standard ML's type system to do it. Checking is done at compile-time,

See my comments in an earlier post. I read Kennedy's paper and to my
non-specialist's eye, it makes the problem look much more complicated
than it really is, in my opinion.


> ML's. I would imagine such a system working like lint, so that you
> would give it the complete set of source files to a program, and it
> would go away and infer the dimensions of all the functions from
> them and tell you if it had problems.

According to my own proposal, a lint-like preprocessor is completely
unnecessary.

>
>Clearly you need some way of getting the system started by giving it some
>dimensions (otherwise it could just assume that everything was dimensionless).
>If you don't like extending FORTRAN or encoding such information in comments
>you could put it in some kind of auxiliary file.
>
>Note that you need some kind of parametric polymorphism, at least in
>dimensions of types, otherwise for example a routine to sort an array
>of reals will have to have separate versions for every unit.

In my proposal, types would not have physical dimensions. There would
be a single generic type PREAL ('physical real') that would allow
dimensional information to be stored and manipulated dynamically along
with the "pure" numerical part of the value.

Side note: I am completely dumbfounded by my inability to find someone
in this newsgroup who seems to understand what I'm saying. And yet I
can point to a concrete example of EXACTLY what I mean: the software
package MathCad. So let me rephrase my original question: why is it
so difficult (allegedly) to make a programming language like F77 do
what MathCad does so simply and elegantly? It's really not rocket
science, as far as I can tell!

- Grant

Clive Page

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
In article <87el8b$qkq$1...@mozo.cc.purdue.edu>,

Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:
>>> Is it not possible to make a very modest extension to a widely used
>>> scientific programming language (the most entrenched of which,
>>> for better or for worse, is Fortran 77) to allow a rational, natural,
>>> and intuitive approach to physical dimension checking and unit
>>> conversion?

It's rather easy to do what you want in Fortran, provided only that you use
the current standard, Fortran 95, or the preceding one, Fortran 90. You
just have to define a new data type, for example, myreal, which contains a
real value and a physical units string, and overload the necessary
operators and intrinsic functions to handle this. It's hard, perhaps
impossible, only if you insist on using a version of Fortran two
generations out-of-date.
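
A minimal sketch of that approach in free-form Fortran 90, with only addition overloaded. The type name `myreal` follows the post; the module layout, the 16-character units field and the error handling are assumptions made for illustration.

```fortran
module myreal_mod
  implicit none
  type myreal
    real              :: val
    character(len=16) :: units
  end type myreal

  interface operator(+)
    module procedure add_myreal
  end interface
contains
  function add_myreal(a, b) result(c)
    type(myreal), intent(in) :: a, b
    type(myreal) :: c
    ! Naive check: the two unit strings must match exactly.
    if (a%units /= b%units) then
      print *, 'Unit mismatch: ', trim(a%units), ' vs ', trim(b%units)
      stop
    end if
    c = myreal(a%val + b%val, a%units)
  end function add_myreal
end module myreal_mod

program demo_myreal
  use myreal_mod
  implicit none
  type(myreal) :: rho1, rho2, total, t
  rho1 = myreal(1.2, 'kg/m**3')
  rho2 = myreal(0.3, 'kg/m**3')
  total = rho1 + rho2
  print *, total%val, trim(total%units)
  t = myreal(10.0, 's')
  total = rho1 + t        ! stops with a unit-mismatch message
end program demo_myreal
```

Note that the exact string comparison would also reject `kg/m**3` plus `kg.m**-3`, even though they denote the same unit; a real implementation would need some canonical form for the units string.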

The most difficult part, I think you will find, is working out a simple
unambiguous easily-parsed representation of complicated units. Just to
take a simple case such as density, it could be represented (in MKS units)
as kg/m**3
kg.m**-3
kg m**(-3)
etc.
And if you allow cgs units (let alone imperial ones) you have an awful mess
to deal with.

The FITS file format, widely used in astronomical research, includes the
ability to put physical units against stored values, and astronomers have
come up with some conventions to handle these with a view to making
automatic unit conversions possible, along the lines you envisage.
You might like to look at:
http://legacy.gsfc.nasa.gov/docs/heasarc/ofwg/ofwg_recomm.html
under Recommendation R5.

--
--
Clive Page,
Dept of Physics & Astronomy,
University of Leicester.

Ben Franchuk

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
George Russell wrote:
>
> "Grant W. Petty" wrote:
> >
> > Hello,
> > Is it not possible to make a very modest extension to a widely used
> > scientific programming language (the most entrenched of which,
> > for better or for worse, is Fortran 77) to allow a rational, natural,
> > and intuitive approach to physical dimension checking and unit
> > conversion?
> I believe it is. Andrew Kennedy wrote a PhD thesis showing how to extend
> Standard ML's type system to do it. Checking is done at compile-time,
> but you do NOT need lots of programmer annotations to work out what the
> dimension of everything is, because like other ML-types that can be inferred
> automatically.

Adding units does have the disadvantage of making the language
non-portable. (Have to keep all the Fortran 4 users happy :))
Unit errors belong to two classes: input/output and program writing.
Internally, things have to end up as a floating-point number in some
unit, and you must supply all conversions from all formats.
At sea level, x lbs of thrust is equal to n horsepower for engine Y.

Some conversion factors may change over the program's use. Take a
"what do you weigh on a different planet" program, for example.

Ben.
BTW does anybody know a ballpark conversion figure for lbs of
thrust to horsepower?
--
"We do not inherit our time on this planet from our parents...
We borrow it from our children."
The Lagging edge of technology:
http://www.jetnet.ab.ca/users/bfranchuk/woodelf/index.html

George Russell

unread,
Feb 4, 2000, 3:00:00 AM2/4/00
to
"Grant W. Petty" wrote:
> http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt
>
> and then Kennedy's document cited above and educate me as to the
> relationship between the two.
OK, your system makes a new type PREAL which includes a representation
of the units involved. So if I add two quantities with different units,
I will presumably get an exception of some sort at run-time. If we could
identify errors at compile-time instead it would be much better because
(a) at compile time the programmer is still around to fix the problem;
at run-time the spacecraft may just be in the middle of taking off . . .
(If you think this is fanciful look up the Ariane V disaster.)
(b) if you do it at compile time programs will run faster and need less
space (though of course compile more slowly). This may be a rather
important consideration to typical FORTRAN users.
So it seems to me pretty obvious that if you can do dimension checking at
compile-time, you should. Andrew Kennedy's thesis shows precisely how to
set up a rigorous framework for inferring and checking types.

But if you are doing typechecking at compile time, you do definitely need some
kind of polymorphism. If you have a routine to multiply a real vector by
a real matrix, you really don't want to have separate versions of this routine
for every possible unit. You just need to have some kind of inference
that shows that if you multiply (say) a vector in meters per second by a matrix
in seconds, the resulting vector is in meters. But this is a very simple
linear relationship between the dimensions, and I can't believe that the
typical physicist is going to find it hard to express or understand.

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <87ee3v$7d$1...@nnrp1.deja.com>,

Andrew Cooke <and...@andrewcooke.free-online.co.uk> wrote:
>In article <87c65o$k4u$1...@mozo.cc.purdue.edu>,
> gpe...@rain.atms.purdue.edu (Grant W. Petty) wrote:
>> Is it not possible to make a very modest extension to a widely used
>> scientific programming language (the most entrenched of which,
>> for better or for worse, is Fortran 77) to allow a rational, natural,
>> and intuitive approach to physical dimension checking and unit
>> conversion?
>
>A few points after reading all of this thread so far:
>
>- Yes, it's a good idea.

I retract my previous remark about nobody understanding me. :-)

>
>- Having global base units is probably not good enough. In
>Astrophysics, for example, (I was once an astronomer) you could be
>calculating at both atomic and inter-galactic scales in the same code.

That's a problem that didn't occur to me. I guess one way around it
would be to use a floating point representation that spans a very
large range of exponents, but that seems wasteful as a default
condition. Maybe the astrophysicist could set a compiler flag that
forces extra large floats, if they were willing to tolerate the extra
computational overhead.

But I agree this is a problem for people in that unique situation. I
suspect it's rare in other disciplines. Okay, so the astrophysicists
are on their own on this one. :-)


>- Rather than use decimal for fractional powers of dimensions, consider
>ratios of integers. I suspect this covers nearly all cases, avoids
>worrying about rounding errors, and probably saves space (a byte each is
>probably sufficient)

I think this is an excellent idea!

> (Do you ever have a dimension to an irrational
>number? - perhaps in fractal analysis?!

A more common problem would be in empirical power law relationships.
For example, in my field a common expression would be

Z = A*(R**b)

where Z is a radar reflectivity, R is a rain rate, and b is an
empirical "constant" between 1 and 2. A ratio of byte-size integers
might actually cover this case pretty well, as long as you're
satisfied with about three digits of precision (usually enough for
empirical relationships).
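
To get a feel for how well a ratio of byte-size integers can stand in for an empirical exponent, here is a minimal sketch using Python's standard `fractions` module. The helper `byte_ratio` and the limit of 127 (a signed byte) are assumptions made for this example, not part of the proposal.

```python
from fractions import Fraction


def byte_ratio(exponent, limit=127):
    """A close fraction for `exponent` whose numerator and denominator fit in `limit`."""
    for max_den in range(limit, 0, -1):
        frac = Fraction(exponent).limit_denominator(max_den)
        if abs(frac.numerator) <= limit:
            return frac
    raise ValueError("exponent too large for a byte-sized ratio")


b = byte_ratio(1.6)
print(b)  # 8/5
```

Exponents such as 1.6 or 0.5 come out exactly; a less convenient value is replaced by a nearby ratio, which is where the few digits of precision mentioned above come from.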

> What about imaginary exponents? >:-).

Hmm... does one ever have a real reason to take a dimensioned quantity to
an imaginary exponent? I don't know -- I can only think of cases
like exp(x*i) where x is non-dimensional.

>
>- I didn't read the paper given as a definitive reference but since the
>abstract talks about type inference and many of the other papers on that
>page discuss ML, I suspect the results are nothing like you are
>expecting. Having used ML I can see how the type system could be used
>to support this, and suspect that the paper will describe an elegant,
>powerful solution. But I am sure it will *not* be attractive to people
>who find F90 advanced... :-)
>
>- Take a look at F90. I don't know it at all well, but the kind of
>problem you are trying to solve can be tackled with modern languages
>without extending compilers and without huge performance hits.

I'm gradually realizing that this may be true, provided one can
implement a generic type that includes bits allocated for dimension
storage and then overload standard operators so as to deal with those
bits correctly. I guess I'll have to take the plunge and learn F90
after all, just to find out whether I've been making noise over nothing.


> In fact,
>I'd be surprised if there isn't some kind of F90 library already
>available... (if not, I think you'd get much more support for that than
>for extending a language that should, in all honesty, be quietly laid to
>rest). (I used to write a lot of F77 code, and I know the kind of
>environment you're coming from, so I can sympathise, but F90 *is*
>worth the effort (and money)).

Thanks for your comments -- they're just what I was looking for.

cheers,

Grant

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <87en38$3h...@owl.le.ac.uk>, Clive Page <c...@nospam.le.ac.uk> wrote:
>In article <87el8b$qkq$1...@mozo.cc.purdue.edu>,
>Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:
>>>> Is it not possible to make a very modest extension to a widely used
>>>> scientific programming language (the most entrenched of which,
>>>> for better or for worse, is Fortran 77) to allow a rational, natural,
>>>> and intuitive approach to physical dimension checking and unit
>>>> conversion?
>
>It's rather easy to do what you want in Fortran, provided only that you use
>the current standard, Fortran 95, or the preceding one, Fortran 90. You
>just have to define a new data type, for example, myreal, which contains a
>real value and a physical units string, and overload the necessary
>operators and intrinsic functions to handle this. It's hard, perhaps
>impossible, only if you insist on using a version of Fortran two
>generations out-of-date.

This may be true. I'll have to better educate myself about F90 or F95.

>
>The most difficult part, I think you will find, is working out a simple
>unambiguous easily-parsed representation of complicated units. Just to
>take a simple case such as density, it could be represented (in MKS units)
>as kg/m**3
> kg.m**-3
> kg m**(-3)
>etc.
>And if you allow cgs units (let alone imperial ones) you have an awful mess
>to deal with.

Not if you do it the way I describe in

http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt

>ability to put physical units against stored values, and astronomers have
>come up with some conventions to handle these with a view to making
>automatic unit conversions possible, along the lines you envisage.
>You might like to look at:
>http://legacy.gsfc.nasa.gov/docs/heasarc/ofwg/ofwg_recomm.html
>under Recommendation R5.
>

I'll look at it ... thanks!

- Grant

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <389AE317...@informatik.uni-bremen.de>,

George Russell <g...@informatik.uni-bremen.de> wrote:
>"Grant W. Petty" wrote:
>> http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt
>>
>> and then Kennedy's document cited above and educate me as to the
>> relationship between the two.


>OK, your system makes a new type PREAL which includes a representation
>of the units involved. So if I add two quantities with different units,
>I will presumably get an exception of some sort at run-time. If we could
>identify errors at compile-time instead it would be much better because
>(a) at compile time the programmer is still around to fix the problem;
> at run-time the spacecraft may just be in the middle of taking off . . .
> (If you think this is fanciful look up the Ariane V disaster.)

I would think that any mission-critical program should have
appropriate error handlers for run-time errors, and of course it
shouldn't be written in a form that allows dimensional exceptions to
creep in after it has been successfully (and thoroughly) tested for
the application it was designed for.

>(b) if you do it at compile time programs will run faster and need less
> space (though of course compile more slowly). This may be a rather
> important consideration to typical FORTRAN users.

Space is not an issue in my opinion -- see my web document.

And I don't think speed really is either, any more so at least than
using type REAL*8 or COMPLEX in place of REAL*4 for some variables.



>
>But if you are doing typechecking at compile time, you do definitely need some
>kind of polymorphism. If you have a routine to multiply a real vector by
>a real matrix, you really don't want to have separate versions of this routine
>for every possible unit. You just need to have some kind of inference

Again, my proposed system doesn't have that problem, unless I'm
seriously misunderstanding what you mean.

thanks for your comments

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <389AE0F8...@jetnet.ab.ca>,
Ben Franchuk <bfra...@jetnet.ab.ca> wrote:

>George Russell wrote:
>>
>> "Grant W. Petty" wrote:
>> >
>> > Hello,

>> > Is it not possible to make a very modest extension to a widely used
>> > scientific programming language (the most entrenched of which,
>> > for better or for worse, is Fortran 77) to allow a rational, natural,
>> > and intuitive approach to physical dimension checking and unit
>> > conversion?

<snip>

>
>Adding units does have the disadvantage of making the language non
>portable.

No more so than any other language extension, as far as I can tell. At
least in the context that I describe. I can't speak to Kennedy's
proposal.

>(Have to keep all the fortran 4 users happy :))
>Unit errors belong to two classes, input/out and program writing.
>Internally things have to end up as a floating point number in some
>unit,
>and you must supply all conversions from all formats.
>At sea level x lb's of thrust is equal to n horsepower for engine Y.

Can't do this: thrust and horsepower are not interchangeable. Thrust
is a force (dimensions mass*length/time**2) and horsepower is power
(energy per unit time, or mass*length**2/time**3).

Only if you multiply thrust by forward velocity do you get dimensions
of power.

This perfectly illustrates the kind of ambiguity that enforced
dimensions in a program would eliminate!
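
The point can be checked mechanically. The following is only a sketch in Python: the exponent tuples use a (length, mass, time) ordering chosen for this example, and the thrust and speed values are hypothetical. The one hard number is the definition of mechanical horsepower, 550 ft*lbf/s.

```python
# Dimension exponents over (length, mass, time).
FORCE = (1, 1, -2)      # mass*length/time**2
VELOCITY = (1, 0, -1)   # length/time
POWER = (2, 1, -3)      # mass*length**2/time**3

FT_LBF_PER_S_PER_HP = 550.0  # 1 mechanical horsepower = 550 ft*lbf/s


def multiply_dims(a, b):
    """Multiplying two quantities adds their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))


def thrust_power_hp(thrust_lbf, speed_ft_per_s):
    """Power delivered by a thrust acting at a given forward speed, in horsepower."""
    return thrust_lbf * speed_ft_per_s / FT_LBF_PER_S_PER_HP


print(FORCE == POWER)                           # False
print(multiply_dims(FORCE, VELOCITY) == POWER)  # True
print(thrust_power_hp(100.0, 55.0))             # 10.0
```

Without the speed argument there is nothing to compute, which is the whole objection.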


>
>Some conversion factors
>may change over the programs use. Take a "what do you weigh on a
>different planet"
>program for example.

I'm not sure I see the problem here. Your mass is the same
everywhere. Your weight is your mass times the local gravitational
acceleration g. The latter is a different value on the surface of
each planet. Obviously ANY program would have to account for the
difference in g. g is not a conversion factor but a physical quantity
that has dimensions of acceleration.

>
>Ben.
>BTW does anybody know a ballpark conversion from lbs of
>thrust to horsepower?

See above --- there is none unless you know the velocity of the
vehicle.

- Grant

Richard Maine

Feb 4, 2000, 3:00:00 AM
gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:

> And I don't think speed really is either, any more so at least than
> using type REAL*8 or COMPLEX in place of REAL*4 for some variables

I'm off of the main thread subject here, but realize that 64-bit reals
(real*8) are often as fast as 32-bit ones (real*4) on many of today's
machines. I've measured (and I'm not the only one) cases where the
64-bit reals were even slightly faster.

Memory bandwidth is twice as much with the 64-bit reals, so that
becomes an issue in memory-intensive codes. But the actual
computation may be just as fast. There have existed machines that do
32-bit floating point by doing something like converting to 64-bit,
doing the 64-bit operation, and then converting back to 32-bit.
This is basically how you can explain occasionally measuring 64-bit
to be faster (though seldom by much - often in the noise).

--
Richard Maine
ma...@altair.dfrc.nasa.gov

George Russell

Feb 4, 2000, 3:00:00 AM
"Grant W. Petty" wrote:
> I would think that any mission-critical program should have
> appropriate error handlers for run-time errors, and of course it
> shouldn't be written in a form that allows dimensional exceptions to
> creep in after it has been successfully (and thoroughly) tested for
> the application it was designed for.
Well what about Ariane? (Which failed because a run-time error happened
in a place where it wasn't covered by a handler.) What about the last-but-one
Mars probe failure? Both these were supposed to be thoroughly tested, but
the bugs weren't caught before they'd done millions of dollars worth of damage.
You can only do so much with "thorough" testing. I don't deny that
Grant W. Petty's system will catch a lot of bugs at the testing stage, but it
will only test the possibilities that actually happen. Perhaps a vector
will have nearly all its entries of one unit, but a few of another. (This is
meaningless, but allowed by Grant W. Petty's system, though not by Andrew
Kennedy's.) In testing you might select only the commonest type of
entry, and miss the type error.

> Space is not an issue in my opinion -- see my web document.

You haven't been writing the FORTRAN programs I've been writing then! I've
had to use REAL*4 sometimes rather than REAL*8 just to save space . . .


>
> And I don't think speed really is either, any more so at least than
> using type REAL*8 or COMPLEX in place of REAL*4 for some variables

Well if you're going to be able to persuade Intel to modify their chip design
to support your floats, you may be right. In the meantime Andrew Kennedy's
system will run on the hardware we actually have.

> Again, my proposed system doesn't have that problem, unless I'm
> seriously misunderstanding what you mean.

You are right - your system does not need polymorphism of any kind.

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <87euhk$1kp$1...@mozo.cc.purdue.edu>,

Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:
>In article <87en38$3h...@owl.le.ac.uk>, Clive Page <c...@nospam.le.ac.uk> wrote:
>>
>>The most difficult part, I think you will find, is working out a simple
>>unambiguous easily-parsed representation of complicated units. Just to
>>take a simple case such as density, it could be represented (in MKS units)
>>as kg/m**3
>> kg.m**-3
>> kg m**(-3)
>>etc.
>>And if you allow cgs units (let alone imperial ones) you have an awful mess
>>to deal with.
>
>Not if you do it the way I describe in
>
>http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt
>


Since I didn't have time to elaborate on this before, let me do so
now. All of the above cases are completely indistinguishable from the
perspective of the program; all would be encoded according to my
system as

L M T Q K

-3 1 0 0 0

where L is length, M is mass, etc., and the integers represent the
powers of those dimensions. What could be simpler?

And the cgs vs MKS units 'complication' doesn't exist. There is one
base system of units for internally storing physical values but this
is invisible to the programmer (barring overflow or underflow
problems). All he/she cares about is what value (with units) to assign
to a given variable of generic type PREAL, and what units to use
when outputting the result of a calculation. Just like MathCad.
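
A minimal sketch of that idea, in Python rather than as a compiler extension: values are converted to one hidden base system on input, carry an exponent vector in the (L, M, T, Q, K) order shown above, and are converted back on output. The unit table, the choice of SI as the base system, and the function names are all assumptions made for this illustration.

```python
# Exponents are ordered (L, M, T, Q, K) as in the post above.
# The unit table and SI as the hidden base system are assumptions for this sketch.
UNITS = {
    "m":  (1.0,  (1, 0, 0, 0, 0)),
    "cm": (1e-2, (1, 0, 0, 0, 0)),
    "kg": (1.0,  (0, 1, 0, 0, 0)),
    "g":  (1e-3, (0, 1, 0, 0, 0)),
}


def scale_and_dims(unit_powers):
    """Return (factor to base units, exponents) for e.g. {"g": 1, "cm": -3}."""
    factor = 1.0
    dims = [0, 0, 0, 0, 0]
    for unit, power in unit_powers.items():
        unit_factor, unit_dims = UNITS[unit]
        factor *= unit_factor ** power
        dims = [d + power * u for d, u in zip(dims, unit_dims)]
    return factor, tuple(dims)


def to_internal(value, unit_powers):
    """Store a value in base units together with its dimension vector."""
    factor, dims = scale_and_dims(unit_powers)
    return value * factor, dims


def from_internal(quantity, unit_powers):
    """Express a stored quantity in the requested units, checking dimensions first."""
    stored, dims = quantity
    factor, wanted = scale_and_dims(unit_powers)
    if dims != wanted:
        raise ValueError(f"dimension mismatch: {dims} vs {wanted}")
    return stored / factor


water = to_internal(1.0, {"g": 1, "cm": -3})     # entered in cgs
print(water[1])                                  # (-3, 1, 0, 0, 0)
print(from_internal(water, {"kg": 1, "m": -3}))  # approximately 1000.0
```

However the density is written on input, the stored exponent vector is the same, so the spelling of the unit string never reaches the arithmetic.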

Again, I find that a lot of objections being raised in this thread
seem to reflect misunderstandings of the proposed method. Please see
the above web page if you're unsure what I mean.

thanks

Ben Franchuk

Feb 4, 2000, 3:00:00 AM
"Grant W. Petty" wrote:
>
> In article <389AE0F8...@jetnet.ab.ca>,
> Ben Franchuk <bfra...@jetnet.ab.ca> wrote:

> Can't do this: thrust and horsepower are not interchangeable. Thrust
> is a force (dimensions mass*length/time**2) and horsepower is power
> (energy per unit time, or mass*length**2/time**3).
>
> Only if you multiply thrust by forward velocity do you get dimensions
> of power.
>
Problem #21: a 50 lb rocket at sea level has 75 lbs of thrust. What is the
horsepower of the engine?

At what point does it become a program change rather than a units
change, and can the software supply what hints are needed to keep the units
constant, like the above example?

Ben.

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <389B2F81...@jetnet.ab.ca>,

Ben Franchuk <bfra...@jetnet.ab.ca> wrote:
>"Grant W. Petty" wrote:
>>
>> In article <389AE0F8...@jetnet.ab.ca>,
>> Ben Franchuk <bfra...@jetnet.ab.ca> wrote:
>
>> Can't do this: thrust and horsepower are not interchangeable. Thrust
>> is a force (dimensions mass*length/time**2) and horsepower is power
>> (energy per unit time, or mass*length**2/time**3).
>>
>> Only if you multiply thrust by forward velocity do you get dimensions
>> of power.
>>
> Problem #21 50 lb rocket @ sea level,has 75 lbs of thrust. What is the
>horse power
>of the engine?

The fact that I'm basically an applied physicist, not an aeronautical
engineer, might explain why the above question completely mystifies
me. To my simple mind, thrust is force, and horsepower is work, two
completely distinct measures of a property of a motor. Unless you tie
the two together via "power = force x velocity".

There must be some non-obvious constraint or assumption that engineers
invoke that allows problem #21 to even have an answer, unless I'm
completely out of my tree.

But we're getting off topic...

>
>At what point does it become a program change rather than a units
>change and can the software supply what hints are needed keep the units
>constant
>like the above example?

Since I evidently don't grasp the nuances of the above example, I
am unable to give an intelligent answer.


- Grant

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <389B25DF...@informatik.uni-bremen.de>,

George Russell <g...@informatik.uni-bremen.de> wrote:
>"Grant W. Petty" wrote:
>> I would think that any mission-critical program should have
>> appropriate error handlers for run-time errors, and of course it
>> shouldn't be written in a form that allows dimensional exceptions to
>> creep in after it has been successfully (and thoroughly) tested for
>> the application it was designed for.
>Well what about Ariane? (Which failed because a run-time error happened
>in a place where it wasn't covered by a handler.) What about the last-but-one
>Mars probe failure? Both these were supposed to be thoroughly tested, but
>the bugs weren't caught before they'd done millions of dollars worth of damage.
>You can only do so much with "thorough" testing. I don't deny that
>Grant W. Petty's system will catch a lot of bugs at the testing stage, but it
>will only test the possibilities that actually happen. Perhaps a vector
>will have nearly all its entries of one unit, but a few of another. (This is
>meaningless, but allowed by Grant W. Petty's system, though not by Andrew
>Kennedy's.) In testing you might select only the commonest type of
>entry, and miss the type error.

I don't really want to comment on the issue of how to ensure that a
program is bug-free. That's a separate can of worms that I'll happily
leave to the CS types. :-)

>
>> Space is not an issue in my opinion -- see my web document.
>You haven't been writing the FORTRAN programs I've been writing then! I've
>had to use REAL*4 sometimes rather than REAL*8 just to save space . . .

The point that I make in my web page is that most large memory
requirements are due to large arrays, not large collections of scalar
variables. And in most applications that I'm familiar with, large
arrays contain values all having the same physical dimensions (i.e.,
mass, temperature, etc), in which case only one 28-bit tag (4 bits
times 7 base SI dimensions) is needed to store the dimensional
information for the entire array.
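
For concreteness, one way such a tag could be laid out is sketched below in Python. The layout is an assumption made for this example (seven signed 4-bit fields, two's complement, so each exponent is limited to -8..7); the post does not specify the actual encoding or the order of the base dimensions.

```python
def pack_dims(exponents):
    """Pack seven integer exponents (each -8..7) into one 28-bit integer tag."""
    if len(exponents) != 7:
        raise ValueError("expected seven base-dimension exponents")
    tag = 0
    for i, e in enumerate(exponents):
        if not -8 <= e <= 7:
            raise ValueError(f"exponent {e} does not fit in 4 bits")
        tag |= (e & 0xF) << (4 * i)   # two's-complement nibble
    return tag


def unpack_dims(tag):
    """Recover the seven exponents from a 28-bit tag."""
    exponents = []
    for i in range(7):
        nibble = (tag >> (4 * i)) & 0xF
        exponents.append(nibble - 16 if nibble >= 8 else nibble)
    return tuple(exponents)


density = (-3, 1, 0, 0, 0, 0, 0)      # length**-3 * mass
tag = pack_dims(density)
print(hex(tag))                       # 0x1d
print(unpack_dims(tag) == density)    # True
```

One such tag per array, rather than per element, is what keeps the storage overhead negligible for homogeneous arrays.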

In the few cases I have seen where a single vector or multidimensional
array contained values possessing incompatible physical dimensions
(usually across rows or columns but not both), I think one could make
a strong case that the structure could be split up into logically
distinct subvectors or subarrays, each containing the same type of
physical variable.

In summary, I still don't view storage as a major problem, simply
because it's hard for me to imagine a common case in which the added
storage would be large.

>>
>> And I don't think speed really is either, any more so at least than
>> using type REAL*8 or COMPLEX in place of REAL*4 for some variables
>Well if you're going to be able to persuade Intel to modify their chip design
>to support your floats, you may be right. In the meantime Andrew Kennedy's
>system will run on the hardware we actually have.

What??
Why would my system require a new chip design any more than Andrew
Kennedy's? Everything I've been talking about has been at the
software level. Sure, if the system were widely adopted then you
might eventually design a chip that handles the extended operations more
efficiently, but talking about hardware is putting the cart before the
horse, when all I'm pushing right now is a simple modification to
an existing compiler, like g77.

>
>> Again, my proposed system doesn't have that problem, unless I'm
>> seriously misunderstanding what you mean.
>You are right - your system does not need polymorphism of any kind.

Grant W. Petty

Feb 4, 2000, 3:00:00 AM
In article <87fdau$n94$1...@mozo.cc.purdue.edu>,

Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:
>me. To my simple mind, thrust is force, and horsepower is work, two

Sorry, I meant "horsepower is power"...

Richard Maine

Feb 4, 2000, 3:00:00 AM
Ben Franchuk <bfra...@jetnet.ab.ca> writes:

> Problem #21 50 lb rocket @ sea level,has 75 lbs of thrust. What is the
> horse power of the engine?

This is simply a nonsense problem. It has no answer. I won't say that
you won't find someone asking the question on a test anyway, but that
doesn't make it any less nonsense. If you are given a problem like this
to solve, you need to fix the problem. No computer language is going to
help you do that.

--
Richard Maine
ma...@altair.dfrc.nasa.gov

Gerry Thomas

Feb 4, 2000, 3:00:00 AM
1. Matlab 5.3 has a collection of m-files defining an experimental object
that allows simple computations on quantities involving units of
measurement. This looks relevant to what you're proposing.
2. In Vol. 2 of Knuth's 'The Art of Computer Programming' he succeeds in adding
Avogadro's Number to Planck's constant in a short discussion on interval
arithmetic. When I tried to collect the $0.32 he had offered in the Preface
for any error found, he referred me to Vol. 3 where his transgression was
condoned in the entry indexed as 'Adding Apples and Oranges,' <g>.

Good luck,
Gerry T.

Grant W. Petty wrote in message <87c65o$k4u$1...@mozo.cc.purdue.edu>...


>Hello,
>
>I am intentionally resurrecting a discussion that appears (based on a
>DejaNews search) to have last run its course (in slightly different
>form) in the mid-1990s:
>

>Is it not possible to make a very modest extension to a widely used
>scientific programming language (the most entrenched of which,
>for better or for worse, is Fortran 77) to allow a rational, natural,
>and intuitive approach to physical dimension checking and unit
>conversion?
>

>I believe it is, and I am trying to muster support for making the
>necessary modifications to an existing compiler, like g77, so as to
>at least demonstrate the concept with working examples.
>
>If interested, please read my more detailed discussion at
>
> http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt
>
>and post your comments here, or send e-mail to gpe...@purdue.edu (NOT
>the obsolete reply-to address in the header!).
>
>
>I should mention that my reservations about previously posted or
>published opinions on the subject lie primarily in the following
>areas:
>
>1) The extensions have to be simple and intuitive to use for a large
>category of people (e.g., natural and physical scientists like me) who
>still customarily program in F77. That means, among other things, not
>throwing F77 out in favor of another "modern" language, like C++ or
>the like, which few of my colleagues would ever bother to learn for
>various reasons (some of which are actually legitimate). Besides,
>there's a huge body of legacy code in F77 that we don't want to have
>to rewrite from scratch.
>
>2) The extensions should not be restrictive -- i.e., they shouldn't
>make it significantly more work to write a program that utilizes the
>new data types, nor should they prevent one from continuing to program
>"the old way", if one so chooses.


>
>3) In contrast to many previous proposals, I am primarily concerned
>with run-time dimension checking rather than compile-time checking,
>because the whole point is to spare the programmer/scientist the
>tedium of pre-calculating the dimensionality of every intermediate
>variable in complex calculations.
>

>If I had the expertise and the time, I would just grab the g77
>distribution and attempt to modify it myself. But I don't, so I have
>to persuade someone who does to take the approach I am proposing and
>help me implement it.
>
>Looking forward to your comments.
>
> regards,
>
> Grant

Richard Maine

Feb 4, 2000, 3:00:00 AM
gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:

> In the few cases I have seen where a single vector or multidimensional
> array contained values possessing incompatible physical dimensions
> (usually across rows or columns but not both), I think one could make
> a strong case that the structure could be split up into logically
> distinct subvectors or subarrays, each containing the same type of
> physical variable.
>
> In summary, I still don't view storage as a major problem, simply
> because it's hard for me to imagine a common case in which the added
> storage would be large.

There are some very common cases in f66 and f77 codes. I've looked at
2 that I recall in the last 2 weeks alone (and I don't even do f77 any more;
this is 2 separate cases of code from other people that I was helping debug).
Since f66 and f77 didn't have "native" dynamic storage allocation, it
is/was *VERY* common to declare one big array of storage and then have
user-code to allocate and hand out portions of that big array. Sometimes
you get variants where there are separate arrays for reals and integers,
or some such distinction. Other times, it's all in one big mess.

Everything gets dumped into the one big array, and the usage is so embedded
throughout the whole of the large code that it's a major job to undo. Very
painful. But also very prevalent in f66 and f77 code.

One of the last two cases I debugged was actually a y2k bug in some
code dated 1973. The value being thrown into the big array right in
the middle of almost everything else was a year*10000+month*100+day
(whatever units that's in). It overflowed a 6-character field in an
internal write/read.
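
A small Python imitation of how such a field overflows. The function `write_i6` only mimics the Fortran rule that an integer too wide for its Iw field is written as asterisks; the two-digit year before 2000 is an assumption, since the year convention of the original code is not given here.

```python
def write_i6(value, width=6):
    """Mimic a Fortran Iw edit descriptor: asterisks when the value does not fit."""
    text = str(value)
    return text.rjust(width) if len(text) <= width else "*" * width


def packed_date(year, month, day):
    """Pack a date the way the post describes: year*10000 + month*100 + day."""
    return year * 10000 + month * 100 + day


print(write_i6(packed_date(99, 12, 31)))   # 991231
print(write_i6(packed_date(2000, 1, 1)))   # ******
```

Reading the field back as an integer then fails, because it no longer contains digits.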

Those of us that are big f90/f95 fans are happy to leave that messy world
behind. But it still keeps catching up with me in debugging old codes.

I think it's a bit problematic to aim at an f77 extension, and then rule
out a very common usage that there aren't really good alternatives
for in f77.

Admittedly, it's mostly big codes that do that kind of thing. Heck,
it's a fair amount of overhead to set up all the
allocation/deallocation stuff manually. You don't see this nearly as
often in the one-off quick hacks. But large codes - the ones that are
still around from 1973 - have this kind of thing pretty often in my
experience. (Too often for my taste, but that doesn't seem to be a
factor). I think the "in my experience" part is particularly relevant
here. From your description, I think we get exposed to a different
cross-section of Fortran code.

--
Richard Maine
ma...@altair.dfrc.nasa.gov

Surendar Jeyadev

Feb 4, 2000, 3:00:00 AM
In article <94963236...@bats.mcs.vuw.ac.nz>,
John Harper <har...@mcs.vuw.ac.nz> wrote:
>In article <ueog9y1...@altair.dfrc.nasa.gov>,
>Richard Maine <ma...@altair.dfrc.nasa.gov> wrote:
^^^^

>>gpe...@rain.atms.purdue.edu (Grant W. Petty) writes:
>>
>>> In particular, there is no physical problem for which it is ever
>>> meaningful to do any of the following:
>>
>>> 1) Add, subtract, or equate two values having different physical
>>> dimensions. For example, it is never physically meaningful to add a
>>> variable with dimensions of time to one having dimensions of
>>> distance. Any physical formula which violates this rule is simply
>>> incorrect, period!
>>
>>> 2) Supply anything but a pure dimensionless number as an argument to
>>> any transcendental function, such as sine, logarithm, exponent, etc.
>>> For example, it is never meaningful to take the sine of a length, but
>>> it is meaningful to take the sine of ratio of lengths (e.g., the
>>> coordinate x divided by a wavelength lambda).
>>
>>1. Let's see, I don't offhand have a handy example of adding length to
>>time, but I've seen things like length**2 added to time**2

>
>NASA did that sort of thing recently, and then changed the unit of length
>from a foot to a meter (or was it a meter to a foot?) Ask them if they
^^^^

>enjoyed their rocket crashing on Mars as a result!

Ooooooo ..... This was the first thing that came to my mind, but
kept quiet because of the address of one of the respondents ... :-)
Look above ....

Another thing that should be noted is that this whole issue was much
less of a problem in my academic days as the scientific world works
with metric units. In contrast, the engineering world (which I now
inhabit) is a mess. I can make a much stronger case for such checking
as there is virtually no chance that industry will hire people to
rewrite all their legacy Fortran (IV ?!) code in F90, or whatever.
(No graduate students here.)
--

Surendar Jeyadev jey...@wrc.xerox.com

Richard Maine

Feb 4, 2000, 3:00:00 AM
jey...@wrc.xerox.com (Surendar Jeyadev) writes:

> >NASA...


> Ooooooo ..... This was the first thing that came to my mind, but
> kept quiet because of the address of one of the respondents ... :-)
> Look above ....

No need to be bashful on my account. (I'm at home right now, so you
won't see the NASA in the address this time, but it's still me). I
don't represent NASA and didn't have anything to do with that project,
so you won't be hurting my feelings. This also means that I don't
know any more about it than anyone else who reads decent newspapers
regularly - that's where I got most of my data on it. My son, who
doesn't work for NASA, was actually closer to the project than I am -
he was hoping to get to work with some of the data.

--
Richard Maine
ma...@qnet.com

James Giles

Feb 5, 2000, 3:00:00 AM
Well, I've read through most of this thread now with mixed
emotions. I really only have two comments (which I should
probably keep to myself, but here goes).

First, the proposal by Grant Petty seems simple enough, and
could probably be made to work in F90 with little effort. The
problem is that what it does is not particularly satisfactory.
I suppose it's better than nothing.

The problem is that, as the author clearly states himself, the
variables of the new type PREAL can hold a value of any
unit(s) you wish. This is not what is desired. I want to be able
to declare a variable that can only hold energy, or area, or
temperature, etc.. I want, either at compile time or at run time,
for the dimensional consistency to be tested on assignment to
the variables in question - and an error issued if they're wrong.

For example, if I say E = M*V/2 (forgot to square the velocity),
Professor Petty's feature will accept the assignment and simply
track the units that the variable actually holds. To be sure, the
fault will eventually come to light (when I attempt to write out
the results, if not sooner). But that is likely to be much further
on, and after a large number of additional intermediate calculations.
Tracing the error back to the offending assignment is still likely
to be a chore. As I said, better than nothing I suppose.

Second, the proposal of Andrew Kennedy would be much more
satisfactory in the sense that most errors would be found at compile
time, and signaled at the site of the actual error. This is much to
be preferred. However, it seems unlikely to be implemented in
a language like Fortran unless and until a complete facility for
parametric polymorphism is in place.

Now, I support parametric polymorphism anyway, for other
reasons. I think it would be easier to learn and to use than
object oriented features, and would be more powerful in the
hands of typical users. But, it's not likely to be in the language
any time soon. Yet, in order to implement Andrew Kennedy's
proposal in Fortran, you'd have to do all the work required for
a general parametric polymorphism feature and then some.

---

A suggestion for Professor Petty: design your system so that
the units vector is statically associated with each variable (i.e.,
the units of the variable are defined once when it is declared
and are not to subsequently change). Operations on such
variables are overloaded (the same way that Professor Petty
does now) except that the overloaded assignment checks
for consistency rather than just copying the resultant units vector.
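
As a rough illustration of that suggestion (in Python rather than Fortran, purely for brevity; `Quantity`, `Declared`, and the (L, M, T) exponent layout are hypothetical names and not part of anyone's proposal), a variable could carry fixed dimensions and reject a mismatched assignment on the spot:

```python
from dataclasses import dataclass

# Dimension exponents in the assumed order (length, mass, time).
ENERGY = (2, 1, -2)


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple

    def __mul__(self, other):
        if not isinstance(other, Quantity):
            return Quantity(self.value * other, self.dims)
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        if not isinstance(other, Quantity):
            return Quantity(self.value / other, self.dims)
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


class Declared:
    """A variable whose dimensions are fixed once, at declaration."""

    def __init__(self, dims):
        self.dims = dims
        self.value = None

    def assign(self, q):
        # The check happens at the offending assignment, not later.
        if q.dims != self.dims:
            raise TypeError(f"expected dims {self.dims}, got {q.dims}")
        self.value = q.value


m = Quantity(2.0, (0, 1, 0))    # 2 kg
v = Quantity(3.0, (1, 0, -1))   # 3 m/s
e = Declared(ENERGY)

e.assign(m * v * v / 2)         # accepted: kg m^2 s^-2
try:
    e.assign(m * v / 2)         # forgot to square the velocity
    caught = False
except TypeError:
    caught = True
```

This is a run-time check only; a compile-time version would need the declared dimensions to be part of the type.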

I still find this unsatisfactory for the same reasons as others give: the
intrinsic and off-the-shelf procedure libraries that operate on reals
would have had to be rewritten to accept the new types - or I'd
have to convert the data to unit-less REALs in order to use those
procedures, abandoning the protection of the units feature.
Full polymorphism in the language would still be a boon.

--
J. Giles

Andrew Cooke

unread,
Feb 5, 2000, 3:00:00 AM2/5/00
to
In article <389ADA28...@informatik.uni-bremen.de>,
George Russell <g...@informatik.uni-bremen.de> wrote:
> Andrew Cooke wrote:
[...]

> > powerful solution. But I am sure it will *not* be attractive to people
> > who find F90 advanced... :-)

> No honestly, I think FORTRAN users have more brains than you credit them with.
[...]

Sorry, I wasn't being very clear (mainly because, as I admitted, I
haven't read the paper). What I meant was that F77 users would have a
very hard time moving to ML. If you could fit that kind of power into a
Fortran-like language then it would indeed be wonderful (but I suspect
it would involve writing a new compiler rather than making a few changes
to an old one - or perhaps you parse the Fortran to an AST, then run
some ML on that to check it, then compile...!)

Before I say more daft things I'll read the paper.

Andrew
http://www.andrewcooke.free-online.co.uk/index.html



Kevin G. Rhoads

unread,
Feb 5, 2000, 3:00:00 AM2/5/00
to
> lbs-force (itself an abomination of a unit), without explicitly
>specifying the unit. The next program in the processing chain read these
>values as containing newtons, which is wrong by a factor lbs/kg*g (which
>isn't dimensionless, I know!).

Actually the pound is a unit of force -- always has been. The "pound-mass"
is the abomination, just as the "kg-force" (occasionally seen) is the
corresponding abomination for metric units, wherein the fundamental is
mass, not force.

lbs/(kg*g) is therefore (and quite properly so) unitless -- and always has
been.

Truly, English units are more difficult and arcane than are metric units
(with the old cgs electrical mess excepted: abamps and statcoulombs &c, fie!)
but that is no reason to condemn English units with incorrect
vilifications.
--
Kevin G. Rhoads, Ph.D. (The Cheshire Cat for official Internet mascot.)
kgrhoads@NO_SPAM.alum.mit.edu

KAMA1

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to

Ben Franchuk <bfra...@jetnet.ab.ca> wrote in message
news:389B2F81...@jetnet.ab.ca...

>Problem #21: 50 lb rocket @ sea level, has 75 lbs of thrust. What is the
> horse power of the engine?
>

Is this from the qualification test for rocket scientists?

Can I add

Problem #22: 200 lb me* at sea level sitting in my car. What is the horse
power of the car seat necessary to exert the 200 + lb force on my lower body
necessary to hold me no less than 1 foot above the floor ?


Btw. Why horse or bull power and lb or slugs? Hey, it's allegedly the 21st
century! A CV or hp equals three quarters of a kW.
Old way:
A car having a 150 hp engine and 166 lb ft torque, mileage is 26.7 miles/US
gallon; price: 20,000.00 USD
New (to some)(since the seventies) way:
The car has a 110 kW engine and 226 N m torque; fuel consumption is 8.8
l/100 km; price: 20,328.00 EUR

And the good news is you can use this newfangled system in rocket science as
well. What you are doing now is an illegal criminal activity in most
places. Abandon your wicked ways and conform to the Standard.

Mirek Gruszkiewicz

* weight loss spammers: not my true weight.

Clive Page

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
In article <87f949$7au$1...@mozo.cc.purdue.edu>,

Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:

>Since I didn't have time to elaborate on this before, let me do so
>now. All of the above cases are completely indistinguishable from the
>perspective of the program; all would be encoded according to my
>system as
>
> L M T Q K
>
> -3 1 0 0 0
>
>where L is length, M is mass, etc., and the integers represent the
>powers of those dimensions. What could be simpler?

I did have a look at your document, but have to admit I didn't read it
(it looked as if it would take quite a bit of time). But your scheme
doesn't seem to cope with other SI base and supplementary units such as:
radians, moles, candelas, and so on. Not to mention all those units which
practitioners in various scientific trades depend on, such as decibels,
stellar magnitudes, pH values etc. The theory is nice, but I think you
will find the practice becomes messy. But I look forward to seeing a
Fortran95 module which handles them all. :-)

andrewj...@my-deja.com

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
In article <87eb6v$ud1$1...@nnrp1.deja.com>,
Nick Benton <nic...@my-deja.com> wrote:
> Andrew's work is, I think, pretty much the definitive account of how to
> solve this problem.

...and with this flattery (thank you, Nick) it's time I joined the
thread. As someone else has pointed out, my thesis and papers on the
topic are not very accessible to the Fortran community. It's high time
I put some HTML on my web page that gives an overview for people not
familiar with ML and its type system. For now, here's a plain text
description. (I'm not a FORTRAN expert so my suggested syntax is more
in the spirit of C than that of FORTRAN, but I hope that it's
comprehensible.)

The starting point for *static* checking of units or dimensions is to
parameterise numeric types (real, complex) on their units or
dimensions. Before going any further I'd like to get the units vs
dimensions question out of the way. As you all know, dimensions
are "classes" of interconvertible units. If only one unit of measure is
associated with each base dimension (e.g. one chooses to adopt m/kg/s
for length/mass/time) then it's really a matter of taste whether types
are parameterised on units or dimensions. If, however, different units
of measure with the same dimension can be used in a single program then
one desirable feature of the system is the automatic insertion of
conversions between units. This is something that I have not considered
in my work but which should fit into the framework. For the rest of
this description, assume that types are parameterised on units but that
the unit system is fixed.

Start with some set of base units (e.g. kg, m, s) and then define
derived units by "product of powers" using notation such as

kg m s^-2

for the units associated with force. Then write

real [kg m s^-2]

for the type of real-valued quantities that have these units. Clearly
we need a notation for dimensionless quantities such as angles (a ratio
of lengths) that have been mentioned on previous postings.

real []

is as good a notation as any. Obviously we need some way of introducing
values with particular units, as ordinary real-valued literals must be
dimensionless. I assume that for each base unit there is a constant
with the same name e.g.

kg : real [kg]
m : real [m]
s : real [s]

You can then multiply these constants by literal reals e.g.

gravity = 9.8 * m * s**-2

(Behind the scenes, units are not carried around at run-time and the
compiler would use the value 1.0 for all base units so the
multiplications would not actually be performed at run-time; also, you
might want some syntactic sugar that provides a neater notation than
the above).

With this much the compiler can check the units of simple formulae;
note, though, that this *cannot* be encoded in a conventional type
system (as previous posters have suggested, through overloading etc.)
as it relies on the algebraic properties of units e.g. kg m s^-2 is
equivalent to m s^-2 kg. (To be precise, the units form an Abelian
group if the powers are integers or a vector space if one uses
rationals for powers).
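
To make the algebraic point concrete, here is a small Python sketch (Python and the names `unit_mul` and `unit_inv` are illustrative assumptions, not anything from the thesis). Units are kept in a canonical form, a map from base unit to integer exponent, so that the order of multiplication cannot matter and every unit has an inverse:

```python
def unit_mul(u, v):
    """Multiply two units given as {base: exponent} maps."""
    out = dict(u)
    for base, power in v.items():
        out[base] = out.get(base, 0) + power
        if out[base] == 0:
            del out[base]          # canonical form: no zero exponents
    return out


def unit_inv(u):
    """Inverse of a unit: negate every exponent."""
    return {base: -power for base, power in u.items()}


kg, m, s = {"kg": 1}, {"m": 1}, {"s": 1}
s_inv2 = unit_inv(unit_mul(s, s))

force_a = unit_mul(unit_mul(kg, m), s_inv2)       # kg m s^-2
force_b = unit_mul(unit_mul(m, s_inv2), kg)       # m s^-2 kg
identity = unit_mul(force_a, unit_inv(force_a))   # dimensionless
```

A checker that compares these canonical forms treats `force_a` and `force_b` as the same unit, which is exactly the equivalence that plain name-based type matching would miss.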

Unfortunately, this system only takes you so far -- what if I want to
write *generic* functions that work on values of different units?
Here's a toy example:

real sum_of_squares(real a, real b) = a*a + b*b

I surely do not want to duplicate this function for every unit of
measure that is passed to it e.g.

real[m^2] sum_of_squares(real[m] a, real[m] b) = a*a + b*b
real[kg^4] sum_of_squares(real[kg^2] a, real[kg^2] b) = a*a + b*b

Instead, I'd like to write it once and give *generic* types to the
inputs and outputs (akin to templates in C++, generics in Ada,
parametric polymorphism in ML). Suppose that U is a "unit variable".
Then I can write

real[U^2] sum_of_squares(real[U] a, real[U] b) = a*a + b*b

which is interpreted "for any units U, sum_of_squares accepts two
parameters both with units U and returns a result with units U
squared".
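
A run-time analogue of this generic function can be sketched in Python (an illustration only: the class `Q` is hypothetical, and the checking here happens while the program runs, whereas the scheme described in this post checks statically):

```python
class Q:
    """A number tagged with unit exponents, e.g. {"m": 1}."""

    def __init__(self, value, units):
        self.value = value
        self.units = {k: p for k, p in units.items() if p != 0}

    def __add__(self, other):
        if self.units != other.units:
            raise TypeError("cannot add quantities with different units")
        return Q(self.value + other.value, self.units)

    def __mul__(self, other):
        units = dict(self.units)
        for k, p in other.units.items():
            units[k] = units.get(k, 0) + p
        return Q(self.value * other.value, units)


def sum_of_squares(a, b):
    # Written once; behaves like real[U^2] f(real[U], real[U]) for any U.
    return a * a + b * b


lengths = sum_of_squares(Q(3.0, {"m": 1}), Q(4.0, {"m": 1}))
masses2 = sum_of_squares(Q(1.0, {"kg": 2}), Q(2.0, {"kg": 2}))
```

Calling it with one argument in metres and the other in seconds fails at the addition, because m^2 and s^2 do not match.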

Now, as a previous poster pointed out, it can be a trifle painful to
annotate all numeric types with their units of measure. What I have
shown for ML is that the compiler can automatically *infer* the "most
general" generic type for any function, such as the sum_of_squares
example above. I see no reason why this should not be applied to
FORTRAN too - and it would confer some kind of compatibility on
existing programs that lack unit annotations.

I should point out that not everything in the garden is rosy:

1. Static typing has its limits. For example, consider a function that
calculates the product of the elements in a list whose size is not
known at compile-time. The units of the result depend on the size of
the list -- and hence cannot be determined at compile-time. In fact,
type inference will insist that the inputs and outputs are
dimensionless.

2. A certain amount of "cheating" is required. Obviously one has to
assume that the built-in arithmetic operations (+, -, *, /, etc.) are
given predetermined generic types:

real[U] +(real[U], real[U])
real[U] -(real[U], real[U])
real[U1 U2] *(real[U1], real[U2])
real[U1 U2^-1] /(real[U1], real[U2])

What's not so obvious is that certain derived operations cannot be
coded in the programming language and yet still be assigned their
natural generic types. For example, one might hope that an
implementation of square root (by iterative approximation) would be
assigned the type

real[U] sqrt(real[U^2])

It can be shown that *no* non-trivial programs have this type in a
language with just the four built-in arithmetic operations listed
above. (Trivial programs include those that always return zero or go
into an infinite loop). One of my papers (POPL'97) presents a
mathematical proof using techniques from formal semantics. But more
intuitively, consider the operation of such a root-finding function. It
must start with an initial estimate of the root (type : real[U]) and
yet it cannot generate this estimate from its argument
(type : real[U^2]) using only the built-in arithmetic operations.

There are two ways out of this bind. The first is simply to
build "enough" operations into the language as primitive and to assign
these their natural generic types (roots being an obvious candidate).
The second is to allow the user to break the type system and to "cast"
from one unit of measure to another. This is rather unsatisfactory as
arguably it defeats the whole purpose of units-of-measure checking, but
perhaps if it were limited to "library writers" then its abuse would be
curtailed.
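
The first way out can be pictured with a short Python sketch (for illustration only; `unit_sqrt` and the dictionary representation of units are hypothetical). The square root is supplied as a primitive that halves every exponent, instead of being derived from +, -, * and /:

```python
import math


def unit_sqrt(value, units):
    """Square root supplied as a primitive: real[U] sqrt(real[U^2])."""
    halved = {}
    for base, power in units.items():
        if power % 2 != 0:
            # With integer exponents only, odd powers have no root.
            raise ValueError(f"{base}^{power} has no integer-power square root")
        halved[base] = power // 2
    return math.sqrt(value), halved


root, root_units = unit_sqrt(25.0, {"m": 2})
rate, rate_units = unit_sqrt(4.0, {"s": -2})
```

Whether an odd exponent should be an error or should yield a rational exponent is the integers-versus-rationals choice mentioned earlier; this sketch assumes integers.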

This has been a long posting, but I hope that I've succeeded in showing
that you can get a surprisingly long way with static type checking.

See
http://research.microsoft.com/~akenn/papers/
for my publications on this topic.

- Andrew Kennedy.

George Russell

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
Andrew Cooke wrote:
> Sorry, I wasn't being very clear (mainly because, as I admitted, I
> haven't read the paper). What I meant was that F77 users would have a
> very hard time moving to ML. If you could fit that kind of power into a
> Fortran-like language then it would indeed be wonderful (but I suspect
> it would involve writing a new compiler rather than making a few changes
> to an old one - or perhaps you parse the Fortran to an AST, then run
> some ML on that to check it, then compile...!)
What I was thinking of was that there would be a lint-like tool that people
would run on their FORTRAN programs which would complain about unit errors.
Of course such a lint-like tool could be incorporated in the compiler but it
doesn't have to be. There would also have to be some extra names or something
so that people could get the process going, as in Grant Petty's system. So
people would type
X = 1.0 * METRES
or something like that.

If there are no errors, FORTRAN programmers won't need to understand
any extra concepts. 8-)

If there are, then the tool will have to find a way of expressing something like
"The routine MATMUL(A,B,RESULT) expects A to be a two-dimensional array of
units R, B to be a one-dimensional array of units S, and RESULT to be a one-
dimensional array of units R*S. However at line blah file blah, A is in meters,
B is in seconds^-1, and RESULT is in meters per seconds squared." This is
admittedly complicated, but is only really talking about what the programmer
should understand anyway. This is about as complicated as I think it is likely
to get in practice, though it will get more complicated if for example the
programmer is passing functions around which do dimensional calculations. Even so,
though, the things the tool finds out are things the programmer needs to think
about anyway - if you are passing functions around which do things with dimensions,
you probably do need to understand what relation the units all have to each other.
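
As a very rough sketch of the kind of check and message such a tool might produce (Python for illustration; `check_matmul`, `show`, and the "line 42" location are hypothetical placeholders, and a real tool would of course have to extract the units from the Fortran source):

```python
def unit_mul(u, v):
    """Multiply two units given as {base: exponent} maps."""
    out = dict(u)
    for base, power in v.items():
        out[base] = out.get(base, 0) + power
    return {b: p for b, p in out.items() if p != 0}


def show(u):
    """Render a unit map as text, e.g. {"m": 1, "s": -2} -> "m s^-2"."""
    text = " ".join(b if p == 1 else f"{b}^{p}" for b, p in sorted(u.items()))
    return text or "dimensionless"


def check_matmul(a, b, result, where):
    """Return None if RESULT has units A*B, else a diagnostic string."""
    expected = unit_mul(a, b)
    if expected == result:
        return None
    return (f"{where}: MATMUL expects RESULT in {show(expected)}, "
            f"but it is in {show(result)}")


msg = check_matmul({"m": 1}, {"s": -1}, {"m": 1, "s": -2}, "line 42")
ok = check_matmul({"m": 1}, {"s": -1}, {"m": 1, "s": -1}, "line 43")
```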

Andrew Cooke

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
In article <87m3c2$8i...@owl.le.ac.uk>,
Clive Page <c...@nospam.le.ac.uk> wrote:
[...]

> (it looked as if it would take quite a bit of time). But your scheme
> doesn't seem to cope with other SI base and supplementary units such as:
> radians, moles, candelas, and so on. Not to mention all those units which
> practitioners in various scientific trades depend on, such as decibels,
> stellar magnitudes, pH values etc. The theory is nice, but I think you
> will find the practice becomes messy. But I look forward to seeing a
> Fortran95 module which handles them all. :-)

Radians is an odd one. A unit with no physical dimensions (I think).

Andrew

PS Hi - I probably know you vaguely (ie you may know me vaguely as
Paulina Lira's better half :-)

Tony T. Warnock

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
It's even trickier than it looks. Radians and steradians are both
"unitless" but not really dimensionless.


Ben Franchuk

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
KAMA1 wrote:
>
>
> Btw. Why horse or bull power and lb or slugs? Hey, it's allegedly the 21st
> century! A CV or hp equals three quarters of a kW.
> Old way:
> A car having a 150 hp engine and 166 lb ft torque, mileage is 26.7 miles/US
> gallon; price: 20,000.00 USD
> New (to some)(since the seventies) way:
> The car has a 110 kW engine and 226 N m torque; fuel consumption is 8.8
> l/100 km price: 20,328.00 EUR
>
> And the good news is you can use this newfangled system in rocket science as
> well . What you are doing now is an illegal criminal activity in most
> places. Abandon your wicked ways and conform to the Standard.
>

What, I can't use cubits for length and bars of copper for weight?
It is only illegal to sell items not using the standard of the country;
you can measure things any way you like.

If you write a program in, say, metric and need to convert it to
imperial, does one need to rewrite all the I/O, or can there be predefined
smart I/O that will handle the correct outputs by a simple parameter change?

default length = cubits
default mass = copper
default ...

Ben.

Tony T. Warnock

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
Other problems occur with objects whose properties change with arithmetic
operations. The difference between two times is not a time; the
difference between two temperatures is not a temperature; the difference
between two addresses is not an address. These act funny because there
is a specified origin. However numerically address1+address2-address3 is
an address, etc.
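
Python's standard `datetime` module happens to model exactly this distinction for times, so it makes a convenient illustration (the dates below are arbitrary):

```python
from datetime import datetime, timedelta

t1 = datetime(2000, 2, 7, 3, 0)
t2 = datetime(2000, 2, 5, 3, 0)

gap = t1 - t2                 # point - point -> interval (timedelta)
later = t1 + gap              # point + interval -> point

try:
    t1 + t2                   # point + point is rejected outright
    sum_allowed = True
except TypeError:
    sum_allowed = False

# address1 + address2 - address3 only type-checks when regrouped as
# address1 + (address2 - address3): a point plus an interval.
regrouped = t1 + (t2 - t2)
```

A units scheme that wants to handle times, temperatures, or addresses this way needs two related types per quantity, one for points and one for differences.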


Tony T. Warnock

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
Another funny thing can happen with dot products. In a flat space the
distance between two points is given by Sqrt(Sum((X_i - Y_i)**2)) where
the sum is over i and the X_i and Y_i are coordinate values. In a curved
space, this is not correct anymore. The units for each coordinate are
different. It's like doing analytic geometry with the X and Y
coordinates being in feet and pounds respectively.

This actually occurs rather often. It's possible in relativity theory
and common in statistics. The problem is fixed by having a metric tensor
(really just a matrix) A which fits into the distance formula as:
(X-Y)^T*A*(X-Y) where ^T means transpose. The units of A exactly
cancel out the coordinate units of X and Y. (In statistics A is the
inverse of the variance-covariance matrix; in relativity it's the metric
tensor.)

Even with the same type of units, problems arise. If the X's and Y's are
vectors of length two with units feet and pounds respectively (for example
people's height and weight) then A has the units: A11=1/ft**2,
A22=1/lbs**2, A12=A21=1/(ft*lbs) [using the = sign very loosely]. If two
A's are produced by measuring for example men and women separately, it's
not clear how to combine measurements.
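
A small worked example may help. It is a Python sketch with made-up numbers: heights in ft and weights in lb, assumed uncorrelated with standard deviations of 0.5 ft and 20 lb, so that A is diagonal. The function name `metric_distance` is hypothetical.

```python
import math


def metric_distance(x, y, a):
    """Square root of (x-y)^T A (x-y) for plain lists; A is a square matrix."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    total = sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(total)


# A11 in 1/ft**2, A22 in 1/lb**2, A12 = A21 in 1/(ft*lb).
A = [[1 / 0.5**2, 0.0],
     [0.0, 1 / 20.0**2]]

p = [6.0, 180.0]   # 6 ft, 180 lb
q = [5.5, 160.0]   # 5.5 ft, 160 lb
dist = metric_distance(p, q, A)
```

Each term of the sum is dimensionless (ft * 1/ft**2 * ft, and so on), so the result is a pure number: here each coordinate differs by one standard deviation and the distance comes out as the square root of 2.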

I'm not against the proposals (except for requiring them to be in some
post-2000 Fortran) but they do not solve all problems. In many years as
a consultant, there were really only two types of problems: first, a
pointer problem, and second, a misprint that was still legitimate. The 0.1%
left would have been caught by these types of proposals.


Grant W. Petty

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
In article <uRJm4.319$gH....@bgtnsc04-news.ops.worldnet.att.net>,

James Giles <james...@worldnet.att.net> wrote:
>Well, I've read through most of this thread now with mixed
>emotions. I really only have two comments (which I should
>probably keep to myself, but here goes).

< snip >

>
>The problem is that, as the author clearly states himself, the
>variables of the new type PREAL can hold a value of any
>unit(s) you wish. This is not what is desired. I want to be able
>to declare a variable that can only hold energy, or area, or
>temperature, etc.. I want, either at compile time or at run time,
>for the dimensional consistency to be tested on assignment to
>the variables in question - and an error issued if they're wrong.

Well, I think a difference in opinion here is acceptable. Your
priorities are simply different than mine, and your solution would
therefore be different as well.

I tend to believe that if a compiler could be modified to do what I
describe, there would be enough people who do like the idea as to
justify the relatively minor effort.

It's also increasingly clear to me that F90 may be capable of adding
this capability without extending the language, though I can't be
certain until I actually get my hands on an F90 compiler somewhere and
try to implement the system I describe using modules and operator
overloading.

>
>For example, if I say E = M*V/2 (forgot to square the velocity),
>Professor Petty's feature will accept the assignment and simply
>track the units that the variable actually holds. To be sure, the
>fault will eventually come to light (when I attempt to write out
>the results, if not sooner). But that is likely to be much further
>on, and after a large number of additional intermediate calculations.
>Tracing the error back to the offending assignment is still likely
>to be a chore. As I said, better than nothing I suppose.

If you're concerned about that, you could always add a line as
follows:

e = m*v/2
etest = e + u_joule ! flagged at run-time as invalid operation
! if e doesn't have dimensions of energy
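
The behaviour of that guard line can be mimicked in a few lines of Python (a sketch only: `PReal` is a hypothetical stand-in for the proposed PREAL type, and the (L, M, T) exponent order is an assumption):

```python
class PReal:
    """Stand-in for the proposed PREAL: a value plus (L, M, T) exponents."""

    def __init__(self, value, dims):
        self.value, self.dims = value, tuple(dims)

    def __mul__(self, other):
        if isinstance(other, PReal):
            return PReal(self.value * other.value,
                         [a + b for a, b in zip(self.dims, other.dims)])
        return PReal(self.value * other, self.dims)

    def __truediv__(self, number):
        return PReal(self.value / number, self.dims)

    def __add__(self, other):
        if self.dims != other.dims:
            raise ArithmeticError(
                f"dimension mismatch: {self.dims} vs {other.dims}")
        return PReal(self.value + other.value, self.dims)


u_joule = PReal(1.0, (2, 1, -2))
m = PReal(2.0, (0, 1, 0))
v = PReal(3.0, (1, 0, -1))

e = m * v / 2                 # accepted: e silently becomes a momentum
try:
    etest = e + u_joule       # the guard line trips here
    flagged = False
except ArithmeticError:
    flagged = True
```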

>Second, the proposal of Andrew Kennedy would be much more
>satisfactory in the sense that most errors would be found at compile
>time, and signaled at the site of the actual error. This is much to
>be preferred.

Again, I think the "preferred" system depends on one's point of view.

The primary value I see in my approach is that one can undertake
complex calculations involving multiple systems of units all mixed up,
and have lots of intermediate variables without having to work out in
advance what their dimensions must be, and yet everything ultimately
falls out in the end with the right values and whatever units one
chooses to use. And it only requires one single new data type and an
extension of the basic operators to do the right thing with that one
type.

A strong selling point, in my opinion, is that the required
modification to existing F77 compilers would be pretty trivial
compared with some of the other proposals I have heard in this thread.

>
>A suggestion for Professor Petty: design your system so that
>the units vector is statically associated with each variable (ie.

I appreciate the suggestion, but see several drawbacks with respect to
my main reasons for suggesting my system in the first place. The most
important of these is that a great deal more effort would be required
by the programmer to actually use, as compared with my proposal,
because the programmer would still need to work out in advance the
correct dimensions for every variable used in the program. My whole
objective is to let computers do these rote calculations wherever
possible.

Dick Hendrickson

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to

"Grant W. Petty" wrote:
>
> In article <uRJm4.319$gH....@bgtnsc04-news.ops.worldnet.att.net>,
> James Giles <james...@worldnet.att.net> wrote:
> >Well, I've read through most of this thread now with mixed
> >emotions. I really only have two comments (which I should
> >probably keep to myself, but here goes).
>
> < snip >

[snip]


>
> >
> >For example, if I say E = M*V/2 (forgot to square the velocity),
> >Professor Petty's feature will accept the assignment and simply
> >track the units that the variable actually holds. To be sure, the
> >fault will eventually come to light (when I attempt to write out
> >the results, if not sooner). But that is likely to be much further
> >on, and after a large number of additional intermediate calculations.
> >Tracing the error back to the offending assignment is still likely
> >to be a chore. As I said, better than nothing I suppose.
>
> If you're concerned about that, you could always add a line as
> follows:
>
> e = m*v/2
> etest = e + u_joule ! flagged at run-time as invalid operation
> ! if e doesn't have dimensions of energy
>

Another option would be to add an optional required units field to the
variables. If "required_units" is .TRUE. then the assignment overload
will verify that the right-hand-side units are the same as the
required units. If the variable is .FALSE. the assignment routine
will simply copy the units from right to left. You'd need some
sort of initializer routine, something like
call set_up_required_units(c,"furlongs per fortnight")
that can have as much magic as you need.

You could also add another field, base_unit_system, which would
have values like "cgs" or "SI" or "english" and have the overloaded
arithmetic routines do the necessary multiplies by 2.54 or whatever
if there is a units mismatch between operands. This is potentially a
lot slower; it might be worthwhile to produce 2 different F90 style
modules: one with unit_system checking and one without, for people
who don't make that kind of mistake.

Dick Hendrickson

Grant W. Petty

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to

I appreciate everyone's comments on my original proposal
(http://meso.eas.purdue.edu/~gpetty/fortran_proposal.txt).

It is clear that there is a range of opinions about what can and/or
should be done about units and dimensions in Fortran. My impression
is that the CS-oriented posters gravitate toward a language
implementing static (physical dimensioned) types, compile-time unit
checking, polymorphism, and the like, while programmer-scientists "in
the trenches" (like me) probably don't want to learn any more of a new
language or syntax than is absolutely necessary to do their job.
Plus, for various reasons that I won't repeat here, I personally
continue to prefer the idea of run-time dimensional computations and
consistency checking.

Regardless, nothing I have read so far has dissuaded me from the view that

1) as a purely technical matter, my proposal should be
relatively simple to implement in a Fortran compiler, POSSIBLY (I
haven't verified this!) without even extending the language in
the case of F90 and beyond; and

2) my proposed system MIGHT be regarded as very useful to at
least SOME people who are primarily physical or natural
scientists rather than people paid primarily to write
production-quality code.

The easiest way to prove or disprove either assertion, I think, is to
actually demonstrate the concept with a real working compiler and
real-world code, and see whether the idea lives up to my own
expectations. If it does, it will catch on. If it doesn't, it will
die a quiet death.

So let's assume for the sake of argument that, for whatever reason,
standard F90 doesn't quite manage to do what I want (again I still
have to research this, since my knowledge of F90 so far is derived
entirely from reading a smattering of web documents, most of which
don't devote much space to modules and operator overloading).

Hypothetically speaking, how hard would it actually be (for someone
who knows about such things) to start with a publicly available
compiler code like g77 and make the necessary modifications? And how
could I find someone willing to collaborate on such a project? It's
even possible, I think, that a small amount of funding could be found
to support a modest demonstration project.

I have already approached the GNU (gcc) folks and found relatively
little enthusiasm for making modifications that wouldn't move g77 in
the direction of becoming g90 or g95. But as long as there IS no g90
or g95, I'm still stuck with g77 as a potential starting point for my
proposed language extension.

Thanks again for any suggestions.

Grant

Grant W. Petty

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to
In article <87m3c2$8i...@owl.le.ac.uk>, Clive Page <c...@nospam.le.ac.uk> wrote:
>In article <87f949$7au$1...@mozo.cc.purdue.edu>,
>Grant W. Petty <gpe...@rain.atms.purdue.edu> wrote:
>
>>Since I didn't have time to elaborate on this before, let me do so
>>now. All of the above cases are completely indistinguishable from the
>>perspective of the program; all would be encoded according to my
>>system as
>>
>> L M T Q K
>>
>> -3 1 0 0 0
>>
>>where L is length, M is mass, etc., and the integers represent the
>>powers of those dimensions. What could be simpler?
>
>I did have a look at your document, but have to admit I didn't read it
>(it looked as if it would take quite a bit of time). But your scheme
>doesn't seem to cope with other SI base and supplementary units such as:
>radians, moles, candelas, and so on.

There's a total of 7 SI base dimensions. The two I didn't previously
mention are candelas and moles. So in practice my system would
allocate 4-8 storage bits to each of these, for a total of 28-56 bits.
And since computer hardware likes round multiples of 8 bits, let's
generously add an 8th entry for radians (which, as a ratio of two
lengths, is actually non-dimensional) and make it 32-64 bits, or 4-8
bytes. Almost every other supplementary unit, in my experience, can
be finessed in terms of these eight. And those that can't,
well...can't have everything! You can always commandeer one of the
dimensions that's not being used in a given program and assign it a
different meaning.
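
For what it's worth, the storage arithmetic above can be sketched as follows (Python for illustration only; the slot order after L, M, T, Q, K and the names `pack` and `unpack` are assumptions, and 4 bits per slot is the low end of the 4-8 bit range):

```python
NAMES = ("L", "M", "T", "Q", "K", "cd", "mol", "rad")  # assumed slot order
BITS = 4                                               # exponents in -8..7


def pack(exponents):
    """Pack eight signed 4-bit exponents into one 32-bit word."""
    word = 0
    for slot, power in enumerate(exponents):
        if not -8 <= power <= 7:
            raise OverflowError(
                f"{NAMES[slot]} exponent {power} needs more than {BITS} bits")
        word |= (power & 0xF) << (BITS * slot)
    return word


def unpack(word):
    """Recover the eight signed exponents from a packed word."""
    out = []
    for slot in range(len(NAMES)):
        nibble = (word >> (BITS * slot)) & 0xF
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return tuple(out)


density = (-3, 1, 0, 0, 0, 0, 0, 0)      # L^-3 M, as in the quoted example
word = pack(density)
```

With 4 bits per slot the exponents are limited to -8..7; the 8-bit variant would allow -128..127 at the cost of a 64-bit word.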

> Not to mention all those units which
>practitioners in various scientific trades depend on, such as decibels,
>stellar magnitudes, pH values etc.

These are all logarithmic and therefore dimensionless. And there's an
infinity of possible variations on the theme. I wouldn't feel any
moral or practical obligation to include them in my scheme!

> The theory is nice, but I think you
>will find the practice becomes messy.

I'm already fairly familiar with the practice, via a commercial
software package called MathCad (not a programming language but a
WYSIWYG equation editor and computation engine), and it works
extremely well for the kinds of physical problems I deal with. It is
also extremely intuitive to learn and use. I use it in the teaching
of Atmospheric Physics to reinforce student skills in both abstract
physical reasoning and in dimensional analysis.

It is precisely that positive experience that led me to ask "why not
transplant the same concept into Fortran"? I can think of no reason
why the practice would be any messier.

- Grant

James Giles

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to

Grant W. Petty wrote in message <87mstt$qpt$1...@mozo.cc.purdue.edu>...

>In article <uRJm4.319$gH....@bgtnsc04-news.ops.worldnet.att.net>,
>James Giles <james...@worldnet.att.net> wrote:
...

>>For example, if I say E = M*V/2 (forgot to square the velocity),
>>Professor Petty's feature will accept the assignment and simply
>>track the units that the variable actually holds. To be sure, the
>>fault will eventually come to light (when I attempt to write out
>>the results, if not sooner). But that is likely to be much further
>>on, and after a large number of additional intermediate calculations.
>>Tracing the error back to the offending assignment is still likely
>>to be a chore. As I said, better than nothing I suppose.
>
>If you're concerned about that, you could always add a line as
>follows:
>
> e = m*v/2
> etest = e + u_joule ! flagged at run-time as invalid operation
> ! if e doesn't have dimensions of energy
>

I think I can understand why you would say so. Your background
experience (at least in the implementation of this kind of facility)
is with tools like MathCad. As a language, MathCad would be
described as a type-less language. That is, names are applied
to results of calculations and take their types from the forms of
those calculations. The same name can, at various times in a
single session, take on values of type integer, type float, type
rational, type string, etc. (including complex stuff such as polynomials
and differential equations I suppose).

Fortran is a strongly typed language. The philosophy is that the type
of a given identifier is declared once and all uses must conform.
I think any facility added to Fortran should conform to this philosophy
as closely as possible.

And, I can't think of a situation in which units are important to
me where I wouldn't regard them as part of the type of the variables
involved. That is to say, in many contexts (such as a function to
compute the average of two values) the units of the arguments
are irrelevant and I want the program to be polymorphic with
respect to units. However, in the places where I declare the
basic structures for use in a scientific or engineering code, I know
the units of all the entities involved a priori. It would be preferable,
if the units facility is present at all, for the consistency of assignments
to such variables to be checked automatically.

Requiring a supplementary test is counterintuitive and a possible
source of additional mistakes. It would be as if I had to check
each integer variable after every assignment to see if it were still
an integer.

--
J. Giles

Mark D. Dewing

unread,
Feb 7, 2000, 3:00:00 AM2/7/00
to

On 7 Feb 2000, Grant W. Petty wrote:

<snip>

>
> The easiest way to prove or disprove either assertion, I think, is to
> actually demonstrate the concept with a real working compiler and
> real-world code, and see whether the idea lives up to my own
> expectations. If it does, it will catch on. If it doesn't, it will
> die a quiet death.
>

<some comments deleted>

>
> Hypothetically speaking, how hard would it actually be (for someone
> who knows about such things) to start with a publicly available
> compiler code like g77 and make the necessary modifications? And how
> could I find someone willing to collaborate on such a project? It's
> even possible, I think, that a small amount of funding could be found
> to support a modest demonstration project.
>
> I have already approached the GNU (gcc) folks and found relatively
> little enthusiasm for making modifications that wouldn't move g77 in
> the direction of becoming g90 or g95. But as long as there IS no g90
> or g95, I'm still stuck with g77 as a potential starting point for my
> proposed language extension.
>

I suspect starting from g77 would be a difficult way to go.
Modifying f2c might be easier. Then, at least, all the run time
bits could be written and debugged in C.

Can your proposal be implemented entirely in fortran?
In other words, could you write your desired functionality entirely as
a standard fortran program? (It would look ugly, but the idea is to
eventually make the machine do the translation).
If so, a fortran source to source translator would be by far the easiest
way to implement it.

An Adaptor/Cocktail based solution might be implemented fairly quickly,
since a fortran parser and unparser already exist.


Mark Dewing
m-d...@uiuc.edu

Grant W. Petty

Feb 7, 2000, 3:00:00 AM
In article <Pine.SOL.3.95.100020...@blava.ncsa.uiuc.edu>,

Mark D. Dewing <mde...@blava.ncsa.uiuc.edu> wrote:
>
>I suspect starting from g77 would be a difficult way to go.
>Modifying f2c might be easier. Then, at least, all the run time
>bits could be written and debugged in C.

Good suggestion. I hadn't thought of f2c.

>
>Can your proposal be implemented entirely in fortran?
>In other words, could you write your desired functionality entirely as
>a standard fortran program? (It would look ugly, but the idea is to
>eventually make the machine do the translation).

It's possible (as I have acknowledged elsewhere) that it might be
implemented fairly easily in F90 using modules. I don't know enough yet
about F90 to say for certain.
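
For what it is worth, a rough sketch of how such an F90 module might look
follows. It is an assumption-laden illustration, not a tested design: the
module name dim_quantities, the type quantity, the error flag dim_error and
the representation of dimensions as three integer exponents (mass, length,
time) are all invented here. The name u_joule is borrowed from the example
quoted earlier in the thread, but its definition below is a guess.

```fortran
! Hypothetical sketch; the module, type and procedure names are invented.
module dim_quantities
  implicit none

  type quantity
     real    :: v          ! numerical value in base SI units
     integer :: d(3)       ! exponents of (mass, length, time)
  end type quantity

  ! Assumed definition: 1 J = 1 kg m**2 s**(-2)
  type(quantity), parameter :: u_joule = quantity(1.0, (/ 1, 2, -2 /))

  logical :: dim_error = .false.   ! set when an invalid operation occurs

  interface operator(+)
     module procedure q_add
  end interface

  interface operator(*)
     module procedure q_mul
  end interface

contains

  function q_add(a, b) result(c)
    type(quantity), intent(in) :: a, b
    type(quantity) :: c
    if (any(a%d /= b%d)) dim_error = .true.   ! run-time dimension check
    c = quantity(a%v + b%v, a%d)
  end function q_add

  function q_mul(a, b) result(c)
    type(quantity), intent(in) :: a, b
    type(quantity) :: c
    c = quantity(a%v * b%v, a%d + b%d)        ! exponents add on multiply
  end function q_mul

end module dim_quantities
```

With this module in scope, a statement such as etest = e + u_joule keeps
its ordinary appearance, and dim_error becomes true at run time if e does
not carry the dimensions of energy. A real implementation would presumably
report the offending statement rather than just set a flag.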

In F77, I think I could do it, but only at the cost of replacing
built-in operators (*, +, /, etc.) with subroutine calls. And of
course there'd be bitwise operations on the dimension flags which
might be a bit cumbersome in pure standard F77.
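
A sketch of what the subroutine-call version might look like is given
below. It is only an illustration under stated assumptions: the routine
names (DIMPK, QMUL, QADD) are hypothetical, and instead of bit flags the
dimension is encoded as three base-16 digits in one integer, so that plain
integer arithmetic can stand in for the bitwise operations that standard
F77 lacks. The code avoids derived types and operators, but it does use
"!" comments, which are Fortran 90 rather than strict F77.

```fortran
! F77-style sketch; every name below is hypothetical.
! A quantity is a (value, dimension code) pair. The code packs the
! exponents of mass, length and time as three base-16 digits, each
! offset by 8, so a dimensionless quantity has code 2184.
      INTEGER FUNCTION DIMPK(M, L, T)
      INTEGER M, L, T
      DIMPK = (M + 8) + 16*(L + 8) + 256*(T + 8)
      END

! C = A * B : exponents add, so the codes add minus one offset.
      SUBROUTINE QMUL(AV, AD, BV, BD, CV, CD)
      REAL AV, BV, CV
      INTEGER AD, BD, CD
      CV = AV * BV
      CD = AD + BD - 2184
      END

! C = A + B : legal only when the dimension codes agree.
! IERR is 0 on success and 1 on a dimension mismatch.
      SUBROUTINE QADD(AV, AD, BV, BD, CV, CD, IERR)
      REAL AV, BV, CV
      INTEGER AD, BD, CD, IERR
      IF (AD .EQ. BD) THEN
         CV = AV + BV
         CD = AD
         IERR = 0
      ELSE
         CV = 0.0
         CD = AD
         IERR = 1
      END IF
      END
```

For example, multiplying a mass, code DIMPK(1, 0, 0), by an acceleration,
code DIMPK(0, 1, -2), with QMUL gives the code DIMPK(1, 1, -2), i.e. a
force. The packing only works while each exponent stays between -8 and 7;
outside that range one digit would spill into the next. The caller must
also declare DIMPK as INTEGER, since implicit typing would otherwise treat
it as REAL.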

>If so, a fortran source to source translator would be by far the easiest
>way to implement it.

Another interesting suggestion. I assume this is similar to the idea
behind RATFOR (with which I also have no personal experience). It might be
possible to get such a thing to work. It probably wouldn't generate
terribly efficient code, but maybe that's not needed in a
demonstration.

Even so, it would require assistance from someone who knows how to
write or modify a translator.

>
>An Adaptor/Cocktail based solution might be implemented fairly quickly,
>since a fortran parser and unparser already exist.


thanks,
