
Re: "Strong typing vs. strong testing"


namekuseijin

Sep 27, 2010, 1:46:32 PM
On 27 set, 05:46, TheFlyingDutchman <zzbba...@aol.com> wrote:
> On Sep 27, 12:58 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
> > RG <rNOSPA...@flownet.com> writes:
> > > In article
> > > <7df0eb06-9be1-4c9c-8057-e9fdb7f0b...@q16g2000prf.googlegroups.com>,
> > >  TheFlyingDutchman <zzbba...@aol.com> wrote:
>
> > >> On Sep 22, 10:26 pm, "Scott L. Burson" <Sc...@ergy.com> wrote:
> > >> > This might have been mentioned here before, but I just came across it: a
> > >> > 2003 essay by Bruce Eckel on how reliable systems can get built in
> > >> > dynamically-typed languages.  It echoes things we've all said here, but
> > >> > I think it's interesting because it describes a conversion experience:
> > >> > Eckel started out in the strong-typing camp and was won over.
>
> > >> >    https://docs.google.com/View?id=dcsvntt2_25wpjvbbhk
>
> > >> > -- Scott
>
> > >> If you are writing a function to determine the maximum of two numbers
> > >> passed as arguments in a dynamically typed language, what is the normal
> > >> procedure used by Eckel and others to handle someone passing in
> > >> invalid values - such as a file handle for one variable and an array
> > >> for the other?
>
> > > The normal procedure is to hit such a person over the head with a stick
> > > and shout "FOO".
>
> > Moreover, the functions returning the maximum may be able to work on
> > non-numbers, as long as they're comparable.  What's more, there are
> > numbers that are NOT comparable by the operator you're thinking about!.
>
> > So to implement your specifications, that function would have to be
> > implemented for example as:
>
> > (defmethod lessp ((x real) (y real)) (< x y))
> > (defmethod lessp ((x complex) (y complex))
> >   (or (< (real-part x) (real-part y))
> >       (and (= (real-part x) (real-part y))
> >            (< (imag-part x) (imag-part y)))))
>
> > (defun maximum (a b)
> >   (if (lessp a b) b a))
>
> > And then the client of that function could very well add methods:
>
> > (defmethod lessp ((x symbol) (y t)) (lessp (string x) y))
> > (defmethod lessp ((x t) (y symbol)) (lessp x (string y)))
> > (defmethod lessp ((x string) (y string)) (string< x y))
>
> > and call:
>
> > (maximum 'hello "WORLD") --> "WORLD"
>
> > and who are you to forbid it!?
>
> > --
> > __Pascal Bourguignon__                    http://www.informatimago.com/
>
> in C I can have a function maximum(int a, int b) that will always
> work. Never blow up, and never give an invalid answer. If someone
> tries to call it incorrectly it is a compile error.
> In a dynamically typed language maximum(a, b) can be called with
> incorrect datatypes. Even if I make it so it can handle many types as
> you did above, it could still be inadvertently called with a file
> handle for a parameter or some other type not provided for. So do
> Eckel and others, when they are writing their dynamically typed code,
> advocate just letting the function blow up or give a bogus answer, or
> do they check for valid types passed? If they are checking for valid
> types it would seem that any benefits gained by not specifying type
> are lost by checking for type. And if they don't check for type it
> would seem that their code's error handling is poor.

that is a lie.

Compilation only makes sure that values provided at compilation-time
are of the right datatype.

What happens, though, is that in the real world pretty much all
computation depends on user-provided values at runtime. See where we
are heading?

this works at compilation time without warnings:
int m=numbermax( 2, 6 );

this too:
int a, b, m;
scanf( "%d", &a );
scanf( "%d", &b );
m=numbermax( a, b );

no compiler issues, but it will fail at runtime just as a Python
version would if the user provides "foo" and "bar" for a and b.

What do you do if you're feeling insecure and paranoid? Just what
dynamically typed languages do: add runtime checks. Unit tests are
great for asserting those.

Fact is: almost all user data from the external world comes into
programs as strings. No type system or compiler handles this fact all
that gracefully...
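
To make the point concrete, here is a minimal Python sketch of the same
situation: the comparison function itself is fine in either language, and
the only robust fix is a check at the input boundary (the helper names
here are invented for illustration):

def maximum(a, b):
    return a if a > b else b

def read_int(prompt):
    # runtime check at the boundary: reject input that isn't an integer
    text = input(prompt)
    try:
        return int(text)
    except ValueError:
        raise SystemExit("not an integer: %r" % text)

a = read_int("a? ")
b = read_int("b? ")
print(maximum(a, b))   # fails cleanly on "foo"/"bar" instead of computing nonsense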

Pascal J. Bourguignon

Sep 27, 2010, 3:29:44 PM
namekuseijin <nameku...@gmail.com> writes:


I would even go further.

Types are only part of the story. You may distinguish between integers
and floating points, fine. But what about distinguishing between
floating points representing lengths and floating points representing
volumes? Worse, what about distinguishing and converting between
floating points representing lengths expressed in feet and floating
points representing lengths expressed in meters?

If you start with the mindset of static type checking, you will consider
that your types are checked, and if the types at the interface of two
modules match you'll think that everything's OK. And six months later
your Mars mission will crash.

On the other hand, with the dynamic typing mindset, you might even wrap
your values (of whatever numerical type) in a symbolic expression
mentioning the unit and perhaps other metadata, so that when the other
module receives it, it may notice (dynamically) that two values are not
of the same unit, but if compatible, it could (dynamically) convert into
the expected unit. Mission saved!
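
For illustration, a rough Python sketch of that mindset (the unit table
and function names are invented here, not taken from any real library):

TO_METERS = {"m": 1.0, "ft": 0.3048, "km": 1000.0}   # lengths only, for illustration

def quantity(value, unit):
    # wrap the value in a structure carrying its unit as metadata
    return {"value": value, "unit": unit}

def as_meters(q):
    try:
        return q["value"] * TO_METERS[q["unit"]]
    except KeyError:
        raise ValueError("expected a length, got unit %r" % q["unit"])

def plan_burn(distance):            # the receiving module
    meters = as_meters(distance)    # mismatch noticed and converted dynamically
    print("burn for", meters, "meters")

plan_burn(quantity(5000, "ft"))     # prints: burn for 1524.0 meters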


--
__Pascal Bourguignon__ http://www.informatimago.com/

Scott L. Burson

Sep 27, 2010, 4:36:39 PM
Pascal J. Bourguignon wrote:
>
> On the other hand, with the dynamic typing mindset, you might even wrap
> your values (of whatever numerical type) in a symbolic expression
> mentioning the unit and perhaps other metadata, so that when the other
> module receives it, it may notice (dynamically) that two values are not
> of the same unit, but if compatible, it could (dynamically) convert into
> the expected unit. Mission saved!

In fairness, you could do this statically too, and without the consing
required by the dynamic approach.

-- Scott

Pascal J. Bourguignon

Sep 27, 2010, 4:38:17 PM

I don't deny it. My point is that it's a question of mindset.

John Nagle

Sep 27, 2010, 10:14:44 PM
On 9/27/2010 10:46 AM, namekuseijin wrote:
> On 27 set, 05:46, TheFlyingDutchman<zzbba...@aol.com> wrote:
>> On Sep 27, 12:58 am, p...@informatimago.com (Pascal J. Bourguignon)
>> wrote:
>>> RG<rNOSPA...@flownet.com> writes:
>>>> In article
>>>> <7df0eb06-9be1-4c9c-8057-e9fdb7f0b...@q16g2000prf.googlegroups.com>,
>>>> TheFlyingDutchman<zzbba...@aol.com> wrote:
>>
>>>>> On Sep 22, 10:26 pm, "Scott L. Burson"<Sc...@ergy.com> wrote:
>>>>>> This might have been mentioned here before, but I just came across it: a
>>>>>> 2003 essay by Bruce Eckel on how reliable systems can get built in
>>>>>> dynamically-typed languages. It echoes things we've all said here, but
>>>>>> I think it's interesting because it describes a conversion experience:
>>>>>> Eckel started out in the strong-typing camp and was won over.
>>
>>>>>> https://docs.google.com/View?id=dcsvntt2_25wpjvbbhk
>>

The trouble with that essay is that he's comparing with C++.
C++ stands alone as offering hiding without memory safety.
No language did that before C++, and no language has done it
since.

The basic problem with C++ is that it takes C's rather lame
concept of "array=pointer" and wallpapers over it with
objects. This never quite works. Raw pointers keep seeping
out. The mold always comes through the wallpaper.

There have been better strongly typed languages. Modula-3
was quite good, but it was from DEC's R&D operation, which was
closed down when Compaq bought DEC.

John Nagle

Malcolm McLean

Sep 28, 2010, 5:13:19 AM
> that gracefully...
>
You're right. C should have a much better library than it does for
parsing user-supplied string input.

The scanf() family of functions is fine for everyday use, but not
robust enough for potentially hostile inputs. atoi() had to be
replaced by strtol(), but there's a need for a higher-leve function
built on strtol().

I wrote a generic commandline parser once, however it's almost
impossible to achieve something that is both useable and 100%
bulletproof.


Malcolm McLean

Sep 28, 2010, 5:55:19 AM
On Sep 27, 9:29 pm, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
>

> On the other hand, with the dynamic typing mindset, you might even wrap
> your values (of whatever numerical type) in a symbolic expression
> mentioning the unit and perhaps other metadata, so that when the other
> module receives it, it may notice (dynamically) that two values are not
> of the same unit, but if compatible, it could (dynamically) convert into
> the expected unit.  Mission saved!
>
I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimensionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.
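
A toy Python sketch of those rules, tracking dimension exponents on each
value (illustrative only; the unit handling is deliberately minimal):

class Q:
    """A value plus dimension exponents, e.g. dims={'m': 1, 's': -1} for a velocity."""
    def __init__(self, value, dims):
        self.value = value
        self.dims = {k: v for k, v in dims.items() if v}   # drop zero exponents
    def _require_same(self, other):
        if self.dims != other.dims:
            raise TypeError("incompatible dimensions: %s vs %s" % (self.dims, other.dims))
    def __add__(self, other):
        self._require_same(other)
        return Q(self.value + other.value, self.dims)
    def __sub__(self, other):
        self._require_same(other)
        return Q(self.value - other.value, self.dims)
    def __mul__(self, other):
        dims = dict(self.dims)
        for k, v in other.dims.items():
            dims[k] = dims.get(k, 0) + v
        return Q(self.value * other.value, dims)
    def __truediv__(self, other):
        dims = dict(self.dims)
        for k, v in other.dims.items():
            dims[k] = dims.get(k, 0) - v
        return Q(self.value / other.value, dims)

inch = Q(0.0254, {"m": 1})      # lengths normalised to metres on construction
cm = Q(0.01, {"m": 1})
second = Q(1.0, {"s": 1})

print((inch + cm).dims)         # {'m': 1}  -- a length
print((inch * cm).dims)         # {'m': 2}  -- an area
print((inch / cm).dims)         # {}        -- a dimensionless scalar
# (inch / second) - cm          # would raise TypeError: incompatible dimensions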

Richard

Sep 28, 2010, 6:10:26 AM
Malcolm McLean <malcolm...@btinternet.com> writes:

or simply use C++ etc. with overloaded operators which pick the correct
algorithm....

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Tim Bradshaw

Sep 28, 2010, 6:19:55 AM
On 2010-09-28 10:55:19 +0100, Malcolm McLean said:

> I'd like to design a language like this. If you add a quantity in
> inches to a quantity in centimetres you get a quantity in (say)
> metres. If you multiply them together you get an area, if you divide
> them you get a dimensionless scalar. If you divide a quantity in metres
> by a quantity in seconds you get a velocity, if you try to subtract
> them you get an error.

There are several existing systems which do this. The HP48 (and
descendants I expect) support "units" which are essentially dimensions.
I don't remember if it signals errors for incoherent dimensions.
Mathematica also has some units support, and it definitely does not
indicate an error: "1 Inch + 1 Second" is fine. There are probably
lots of other systems which do similar things.

BartC

Sep 28, 2010, 6:18:41 AM

"Malcolm McLean" <malcolm...@btinternet.com> wrote in message
news:1d6e115c-cada-46fc...@c10g2000yqh.googlegroups.com...

As you suggested in 'Varaibles with units' comp.programming Feb 16 2008?
[Yes with that spelling...]

I have a feeling that would quickly make programming impossible (if you
consider how many combinations of dimensions/units, and operators there
might be).

One approach I've used is to specify a dimension (i.e. unit) only for
constant values, which are then immediately converted (at compile time) to a
standard unit:

a:=sin(60°) # becomes sin(1.047... radians)
d:=6 ins # becomes 152.4 mm

Here the standard units are radians, and mm. Every other calculation uses
implied units.

--
Bartc


Malcolm McLean

Sep 28, 2010, 9:39:27 AM
On Sep 28, 12:19 pm, Tim Bradshaw <t...@tfeb.org> wrote:
>
> There are several existing systems which do this.  The HP48 (and
> descendants I expect) support "units" which are essentially dimensions.
>  I don't remember if it signals errors for incoherent dimensions.  
> Mathematica also has some units support, and it definitely does not
> indicate an error: "1 Inch + 1 Second" is fine.  There are probably
> lots of other systems which do similar things.
>
The problem is that if you allow expressions rather than terms then
the expressions can get arbitrarily complex: sqrt(1 inch + 1 Second),
for instance.

On the other hand sqrt(4 inches^2) is quite well defined. The question
is whether to allow sqrt(1 inch). It means using rationals rather than
integers for unit superscripts.

(You can argue that you can get things like km^9s^-9g^3 even in a
simple units system. The difference is that these won't occur very
often in real programs, just when people are messing about with the
system, and we don't need to make messing about efficient or easy to
use).
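
A tiny Python illustration of that point, using the fractions module for
the unit superscripts (again just a sketch):

from fractions import Fraction

def sqrt_dims(dims):
    # halve every unit exponent; rationals keep sqrt(1 inch) representable
    return {unit: exp / 2 for unit, exp in dims.items()}

print(sqrt_dims({"inch": Fraction(2)}))   # {'inch': Fraction(1, 1)}: sqrt(4 inch^2) is a length
print(sqrt_dims({"inch": Fraction(1)}))   # {'inch': Fraction(1, 2)}: needs rational superscripts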


Tim Bradshaw

Sep 28, 2010, 10:55:26 AM
On 2010-09-28 14:39:27 +0100, Malcolm McLean said:

> The problem is that if you allow expressions rather than terms then
> the expressions can get arbitrarily complex: sqrt(1 inch + 1 Second),
> for instance.

I can't imagine a context where 1 inch + 1 second would not be an
error, so this is a slightly odd example. Indeed I think that in
dimensional analysis summing (or comparing) things with different
dimensions is always an error.

>
> On the other hand sqrt(4 inches^2) is quite well defined. The question
> is whether to allow sqrt(1 inch). It means using rationals rather than
> integers for unit superscripts.

There's a large existing body of knowledge on dimensional analysis
(it's a very important tool for physics, for instance), and obviously
the answer is to do whatever it does. Raising to any power is fine, I
think (but transcendental functions, for instance, are never fine,
because they are equivalent to summing things with different
dimensions, which is obvious if you think about the Taylor expansion of
a transcendental function).

--tim


Niklas Holsti

Sep 28, 2010, 12:52:43 PM
Albert van der Horst wrote:
> In article <87fwwvr...@kuiper.lan.informatimago.com>,

...
>> I would even go further.
>>
>> Types are only part of the story. You may distinguish between integers
>> and floating points, fine. But what about distinguishing between
>> floating points representing lengths and floating points representing
>> volumes? Worse, what about distinguishing and converting floating
>> points representing lengths expressed in feets and floating points
>> representing lengths expressed in meters.
>
> When I was at Shell (late eighties) there were people claiming
> to have done exactly that, statically, in ADA.

It is cumbersome to do it statically, in the current Ada standard. Doing
it by run-time checks in overloaded operators is easier, but of course
has some run-time overhead. There are proposals to extend Ada a bit to
make a static check of physical units ("dimensions") simpler. See
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/acs/ac-00184.txt?rev=1.3&raw=Y
and in particular the part where Edmond Schonberg explains a suggestion
for the GNAT Ada compiler.

> A mission failure is a failure of management. The Ariadne crash was.

Just a nit, the launcher is named "Ariane".

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .

Thomas A. Russ

Sep 28, 2010, 1:28:08 PM
Malcolm McLean <malcolm...@btinternet.com> writes:

> I'd like to design a language like this. If you add a quantity in
> inches to a quantity in centimetres you get a quantity in (say)
> metres. If you multiply them together you get an area, if you divide
> them you get a dimensionless scalar. If you divide a quantity in metres
> by a quantity in seconds you get a velocity, if you try to subtract
> them you get an error.

Done in 1992.

See
<http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/syntax/measures/0.html>
citation at <http://portal.acm.org/citation.cfm?id=150168>

and my extension to it as part of the Loom system:
<http://www.isi.edu/isd/LOOM/documentation/loom4.0-release-notes.html#Units>

--
Thomas A. Russ, USC/Information Sciences Institute

George Neuner

Sep 28, 2010, 2:21:29 PM
On 28 Sep 2010 12:42:40 GMT, Albert van der Horst
<alb...@spenarnc.xs4all.nl> wrote:

>I would say the dimensional checking is underrated. It must be
>complemented with a hard and fast rule about only using standard
>(SI) units internally.
>
>Oil output internal : m^3/sec
>Oil output printed: kbarrels/day

"barrel" is not an SI unit. And when speaking about oil there isn't
even a simple conversion.

42 US gallons ? 34.9723 imp gal ? 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilda sign meaning "approximately equal".]

George

MRAB

Sep 28, 2010, 2:40:19 PM
to pytho...@python.org
Do you mean:

42 US gallons ≈ 34.9723 imp gal ≈ 158.9873 l

The post as I received it was encoded as 7-bit us-ascii, so definitely
no double-tilde. This post was encoded as utf-8.

Nick

Sep 28, 2010, 3:12:51 PM

I didn't go as far as that, but:

$ cat test.can
database input 'canal.sqlite'

for i=link 'Braunston Turn' to '.*'
print 'It is ';i.distance into 'distance:%M';' miles (which is '+i.distance into 'distance:%K'+' km) to ';i.place2 into 'name:place'
end for i
$ canal test.can
It is 0.10 miles (which is 0.16 km) to London Road Bridge No 90
It is 0.08 miles (which is 0.13 km) to Bridge No 95
It is 0.19 miles (which is 0.30 km) to Braunston A45 Road Bridge No 91
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Keith Thompson

Sep 28, 2010, 3:15:07 PM
George Neuner <gneu...@comcast.net> writes:
> On 28 Sep 2010 12:42:40 GMT, Albert van der Horst
> <alb...@spenarnc.xs4all.nl> wrote:
>>I would say the dimensional checking is underrated. It must be
>>complemented with a hard and fast rule about only using standard
>>(SI) units internally.
>>
>>Oil output internal : m^3/sec
>>Oil output printed: kbarrels/day
>
> "barrel" is not an SI unit.

He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

> And when speaking about oil there isn't
> even a simple conversion.
>
> 42 US gallons ? 34.9723 imp gal ? 158.9873 L
>
> [In case those marks don't render, they are meant to be the
> double-tilda sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is about 543.44 kbarrels/day.
(Please feel free to check my math.) That's admittedly a lot of
digits, but there's no need for approximations (unless they're
imposed by the numeric representation you're using).
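
Taking up the invitation, a quick check in Python using only the
definitions above (exact rationals, so no floating-point noise creeps in):

from fractions import Fraction

inch = Fraction(254, 10000)      # metres per inch, by definition
gallon = 231 * inch**3           # US liquid gallon, in m^3
barrel = 42 * gallon             # barrel of oil, in m^3

print(float(barrel))                    # 0.158987294928
print(float(86400 / barrel / 1000))     # about 543.44 kbarrels/day for 1 m^3/sec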

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Tim Rowe

Sep 28, 2010, 4:58:58 PM
to namekuseijin, pytho...@python.org
On 27 September 2010 18:46, namekuseijin <nameku...@gmail.com> wrote:

> Fact is:  almost all user data from the external world comes into
> programs as strings.

Sorry, sent this to the individual, not the group.

I'd be very surprised if that were true. I suspect the majority of
programs are in embedded systems, and they get their "user data"
straight from the hardware. I doubt the engine controller in your
car or the computer inside your TV pass much data around as strings.
Of course, Python probably isn't the ideal language for real-time
embedded systems (I expect an advocate will be along in a moment to
tell me I'm wrong) but it's important to remember that the desktop
isn't the be-all and end-all of computing.

For what it's worth, all the research I've seen has found that
compile-time checking and testing tend to catch different bugs, so if
correctness is really important to you then you'll want both. The
truth of the matter is that correctness is only one factor in the
equation, and cost is another. Python tilts the balance in favour of
cost -- it's really fast to develop in Python compared to some other
languages, but you lose compiler checks. That's the right balance for
a lot of applications, but not for all. If it's really critical that
the program be correct then you'll want a bondage-and-discipline
language that does masses of checks, you might even do separate static
analysis, and you'll *still* do all the dynamic testing you would have
in Python.


--
Tim Rowe

Erik Max Francis

Sep 28, 2010, 6:02:10 PM

There are already numerous libraries that help you with this kind of
thing in various languages; Python (you're crossposting to
comp.lang.python), for instance, has several, such as Unum, and
including one I've written but not yet released. It's not clear why one
would need this built into the language:

>>> print si
kg m s A K cd mol
>>> length = 3*si.in_ # underscore is needed since `in` is a keyword
>>> print length
3.0 in_
>>> lengthInCentimeters = length.convert(si.cm)
>>> print lengthInCentimeters
7.62 cm
>>> area = lengthInCentimeters*lengthInCentimeters
>>> print area
58.0644 cm**2
>>> biggerArea = 10.0*area
>>> ratio = area/biggerArea
>>> print ratio
0.1
>>> speed = (3.0*si.m)/(1.5*si.s)
>>> print speed
2.0 m/s
>>> ratio - speed
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "unity.py", line 218, in __sub__
converted = other.convert(self.strip())
File "unity.py", line 151, in convert
raise IncompatibleUnitsError, "%r and %r do not have compatible
units" % (self, other)
__main__.IncompatibleUnitsError: <Quantity @ 0x-4814a834 (2.0 m/s)> and
<Quantity @ 0x-4814a7d4 (1.0)> do not have compatible units

And everybody's favorite:

>>> print ((epsilon_0*mu_0)**-0.5).simplify()
299792458.011 m/s
>>> print c # floating point accuracy aside
299792458.0 m/s

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis
In Heaven all the interesting people are missing.
-- Friedrich Nietzsche

Keith Thompson

Sep 28, 2010, 6:51:14 PM
Erik Max Francis <m...@alcyone.com> writes:
[...]

> >>> print c # floating point accuracy aside
> 299792458.0 m/s

Actually, the speed of light is exactly 299792458.0 m/s by
definition. (The meter is now defined in terms of the second and the
speed of light; this was changed relatively recently.)

Rob Warnock

Sep 28, 2010, 7:52:11 PM
EXECUTIVE SUMMARY:

1 inch + 1 second = ~4.03e38 grams.

GORY DETAILS:

Tim Bradshaw <t...@tfeb.org> wrote:
+---------------


| Malcolm McLean said:
| > The problem is that if you allow expressions rather than terms then
| > the expressions can get arbitrarily complex: sqrt(1 inch + 1 Second),
| > for instance.
|
| I can't imagine a context where 1 inch + 1 second would not be an
| error, so this is a slightly odd example. Indeed I think that in
| dimensional analysis summing (or comparing) things with different
| dimensions is always an error.

+---------------

Unless you convert them to equivalent units first. For example, in
relativistic or cosmological physics, one often uses a units basis
wherein (almost) everything is scaled to "1":

http://en.wikipedia.org/wiki/Natural_units

When you set c = 1, then:

Einstein's equation E = mc^2 can be rewritten in Planck units as E = m.
This equation means "The rest-energy of a particle, measured in Planck
units of energy, equals the rest-mass of a particle, measured in
Planck units of mass."

See also:

http://en.wikipedia.org/wiki/Planck_units
...
The constants that Planck units, by definition, normalize to 1 are the:
* Gravitational constant, G;
* Reduced Planck constant, h-bar; [h/(2*pi)]
* Speed of light in a vacuum, c;
* Coulomb constant, 1/(4*pi*epsilon_0) (sometimes k_e or k);
* Boltzmann's constant, k_B (sometimes k).

This sometimes leads people to do things that would appear sloppy
or even flat-out wrong in MKS or CGS units, such as expressing mass
in terms of length:

Consider the equation A=1e10 in Planck units. If A represents a
length, then the equation means A=1.6e-25 meters. If A represents
a mass, then the equation means A=220 kilograms. ...
In fact, natural units are especially useful when this ambiguity
is *deliberate*: For example, in special relativity space and time
are so closely related that it can be useful to not specify whether
a variable represents a distance or a time.

So it is that we find that the mass of the Sun is 1.48 km or 4.93 µs, see:

http://en.wikipedia.org/wiki/Solar_mass#Related_units

In this limited sense, then, one could convert both 1 inch and 1 second
to masses[1], and *then* add them, hence:

1 inch + 1 second = ~4.03e38 grams.

;-} ;-}


-Rob

[1] 1 inch is "only" ~3.41e28 g, whereas 1 second is ~4.03e38 g,
so the latter completely dominates in the sum.
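
A rough numerical check of those figures in Python, assuming the
approximate CGS values G ≈ 6.674e-8 cm^3 g^-1 s^-2 and c = 2.99792458e10 cm/s:

G = 6.674e-8             # cm^3 g^-1 s^-2 (approximate)
c = 2.99792458e10        # cm/s

grams_per_cm = c**2 / G  # mass equivalent of a length in geometrized units
grams_per_s = c**3 / G   # mass equivalent of a time

print("%.3g" % (2.54 * grams_per_cm))                 # ~3.42e+28 g for 1 inch
print("%.3g" % grams_per_s)                           # ~4.04e+38 g for 1 second
print("%.3g" % (2.54 * grams_per_cm + grams_per_s))   # ~4.04e+38 g total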

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Chris Rebert

Sep 28, 2010, 8:32:04 PM
to Malcolm McLean, pytho...@python.org

George Neuner

Sep 28, 2010, 9:35:17 PM
On Tue, 28 Sep 2010 12:15:07 -0700, Keith Thompson <ks...@mib.org>
wrote:

>George Neuner <gneu...@comcast.net> writes:
>> On 28 Sep 2010 12:42:40 GMT, Albert van der Horst
>> <alb...@spenarnc.xs4all.nl> wrote:
>>>I would say the dimensional checking is underrated. It must be
>>>complemented with a hard and fast rule about only using standard
>>>(SI) units internally.
>>>
>>>Oil output internal : m^3/sec
>>>Oil output printed: kbarrels/day
>>
>> "barrel" is not an SI unit.
>
>He didn't say it was. Internal calculations are done in SI units (in
>this case, m^3/sec); on output, the internal units can be converted to
>whatever is convenient.

That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.

>
>> And when speaking about oil there isn't
>> even a simple conversion.
>>
>> 42 US gallons ? 34.9723 imp gal ? 158.9873 L
>>
>> [In case those marks don't render, they are meant to be the
>> double-tilda sign meaning "approximately equal".]
>
>There are multiple different kinds of "barrels", but "barrels of oil"
>are (consistently, as far as I know) defined as 42 US liquid gallons.
>A US liquid gallon is, by definition, 231 cubic inches; an inch
>is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
>0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
>kbarrels/day. (Please feel free to check my math.) That's
>admittedly a lot of digits, but there's no need for approximations
>(unless they're imposed by the numeric representation you're using).

I don't care to check it ... the fact that the SI unit involves 12
decimal places whereas the imperial unit involves 3 tells me the
conversion probably shouldn't be done in a program that wants
accuracy.

George

Erik Max Francis

Sep 29, 2010, 2:03:55 AM
Keith Thompson wrote:
> Erik Max Francis <m...@alcyone.com> writes:
> [...]
>> >>> print c # floating point accuracy aside
>> 299792458.0 m/s
>
> Actually, the speed of light is exactly 299792458.0 m/s by
> definition. (The meter is now defined in terms of the second and the
> speed of light; this was changed relatively recently.)

I know. Hence why I wrote the comment "floating point accuracy aside"
when printing it.

--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 18 N 121 57 W && AIM/Y!M/Skype erikmaxfrancis

If the past sits in judgment on the present, the future will be lost.
-- Winston Churchill

rustom

Sep 29, 2010, 2:43:13 AM

A currently developed language with units is curl: see
http://developers.curl.com/userdocs/docs/en/dguide/quantities-basic.html

Chris Rebert

Sep 29, 2010, 3:14:35 AM
to rustom, pytho...@python.org
On Tue, Sep 28, 2010 at 11:43 PM, rustom <rusto...@gmail.com> wrote:
> On Sep 29, 5:32 am, Chris Rebert <c...@rebertia.com> wrote:
>> On Tue, Sep 28, 2010 at 2:55 AM, Malcolm McLean
>> <malcolm.mcle...@btinternet.com> wrote:
>> > On Sep 27, 9:29 pm, p...@informatimago.com (Pascal J. Bourguignon)
>> > wrote:
>> >> On the other hand, with the dynamic typing mindset, you might even wrap
>> >> your values (of whatever numerical type) in a symbolic expression
>> >> mentioning the unit and perhaps other metadata, so that when the other
>> >> module receives it, it may notice (dynamically) that two values are not
>> >> of the same unit, but if compatible, it could (dynamically) convert into
>> >> the expected unit.  Mission saved!
>>
>> > I'd like to design a language like this. If you add a quantity in
>> > inches to a quantity in centimetres you get a quantity in (say)
>> > metres. If you multiply them together you get an area, if you divide
>> > them you get a dimensionless scalar. If you divide a quantity in metres
>> > by a quantity in seconds you get a velocity, if you try to subtract
>> > them you get an error.
>>
>> Sounds just like Frink:
>> http://futureboy.us/frinkdocs/
>
> A currently developed language with units is curl: see
> http://developers.curl.com/userdocs/docs/en/dguide/quantities-basic.html

Frink's most recent version is only 17 days old. (You seemed to imply
Frink isn't under active development.)

Cheers,
Chris

Torsten Zühlsdorff

Sep 29, 2010, 5:53:17 AM
Keith Thompson schrieb:

>> >>> print c # floating point accuracy aside
>> 299792458.0 m/s
>
> Actually, the speed of light is exactly 299792458.0 m/s by
> definition.

Yes, but just in vacuum.

Greetings,
Torsten
--
http://www.dddbl.de - ein Datenbank-Layer, der die Arbeit mit 8
verschiedenen Datenbanksystemen abstrahiert,
Queries von Applikationen trennt und automatisch die Query-Ergebnisse
auswerten kann.

Pascal J. Bourguignon

Sep 29, 2010, 6:40:58 AM
George Neuner <gneu...@comcast.net> writes:


Because perhaps you're thinking that oil is sent over the oceans, and
sold retail in barrels of 42 gallons?

Actually, when I buy oil, it's from a pump that's graduated in liters!

It comes from trucks with cisterns containing 24 m³.

And these trucks get it from reservoirs of 23,850 m³.

"Tankers move approximately 2,000,000,000 metric tons" says the English
Wikipedia page...

Now perhaps it all depends on whether you buy your oil from Total or
from Texaco, but in my opinion, you're forgetting something: the last
drop. You never get exactly 42 gallons of oil; there's always a little
drop more or less, so what you get is perhaps 158.987 liters or
41.9999221 US gallons, or even 158.98 liters = 41.9980729 US gallons,
where you need more significant digits.

--
__Pascal Bourguignon__ http://www.informatimago.com/

Paul Wallich

Sep 29, 2010, 8:55:57 AM
[...]

>
> Now perhaps it all depends on whether you buy your oil from Total or
> from Texaco, but in my opinion, you're forgetting something: the last
> drop. You never get exactly 42 gallons of oil, there's always a little
> drop more or less, so what you get is perhaps 158.987 liter or
> 41.9999221 US gallons, or even 158.98 liter = 41.9980729 US gallons,
> where you need more significant digits.

And even that pales in comparison to the expansion and contraction of
petroleum products with temperature. Compensation to standard temp is
required in some jurisdictions but not in others...

Keith Thompson

Sep 29, 2010, 10:00:56 AM
Erik Max Francis <m...@alcyone.com> writes:
> Keith Thompson wrote:
>> Erik Max Francis <m...@alcyone.com> writes:
>> [...]
>>> >>> print c # floating point accuracy aside
>>> 299792458.0 m/s
>>
>> Actually, the speed of light is exactly 299792458.0 m/s by
>> definition. (The meter is now defined in terms of the second and the
>> speed of light; this was changed relatively recently.)
>
> I know. Hence why I wrote the comment "floating point accuracy aside"
> when printing it.

Ok. I took the comment to be an indication that the figure was
subject to floating point accuracy concerns; in fact you meant just
the opposite.

Thomas A. Russ

Sep 29, 2010, 1:54:53 PM
George Neuner <gneu...@comcast.net> writes:

> On Tue, 28 Sep 2010 12:15:07 -0700, Keith Thompson <ks...@mib.org>
> wrote:
> >He didn't say it was. Internal calculations are done in SI units (in
> >this case, m^3/sec); on output, the internal units can be converted to
> >whatever is convenient.
>
> That's true. But it is a situation where the conversion to SI units
> loses precision and therefore probably shouldn't be done.

I suppose that one has to choose between two fundamental designs for any
computational system of units. One can either store the results
internally in a canonical form, which generally means an internal
representation in SI units. Then all calculations are performed using
the internal units representation and conversion happens only on input or
output.

Or one can store the values in their original input form, and perform
conversions on the fly during calculations. For calculations one will
still need to have some canonical representation for cases where the
result value doesn't have a preferred unit provided. For internal
calculations this will often be the case.

Now whether one will necessarily have a loss of precision depends on
whether the conversion factors are exact or approximations. As long as
the factors are exact, one can have the internal representation be exact
as well. One method would be to use something like the Common Lisp
rational numbers or the Gnu mp library.

And a representation where one preserves the "preferred" unit for
display purposes based on the original data as entered is also nice.
Roman Cunis' Common Lisp library does that, and with the use of rational
numbers for storing values and conversion factors allows one to do nice
things like make sure that

30mph * 3h = 90mi

even when the internal representation is in SI units (m/s, s, m).
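
A minimal sketch of that behaviour with Python's exact rationals (the
conversion factors are the legal definitions; this is not Cunis' library):

from fractions import Fraction

MILE = Fraction(1609344, 1000)   # metres per mile, exact by definition
HOUR = Fraction(3600)            # seconds per hour

speed = 30 * MILE / HOUR         # 30 mph held internally as an exact m/s value
duration = 3 * HOUR              # 3 h held internally in seconds
distance = speed * duration      # internal result in metres

print(distance / MILE)           # 90 -- exactly 90 miles, no rounding error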

George Neuner

Sep 29, 2010, 2:18:27 PM


No. I'm just reacting to the "significant figures" issue. Real
world issues like US vs Eurozone and measurement error aside - and
without implying anyone here - many people seem to forget that
multiplying numbers doesn't add significant figures, and results to 12
decimal places are not necessarily any more accurate than results to 2
decimal places.

It makes sense to break macro barrel into micro units only when
necessary. When a refinery purchases 500,000 barrels, it is charged a
barrel price, not some multiple of gallon or liter price and
regardless of drop over/under. The refinery's process is continuous
and it needs a delivery if it has less than 20,000 barrels - so the
current reserve figure of 174,092 barrels is as accurate as is needed
(they need to order by tomorrow because delivery will take 10 days).
OTOH, because the refinery sells product to commercial vendors of
gasoline/petrol and heating oil in gallons or liters, it does make
sense to track inventory and sales in (large multiples of) those
units.

Similarly, converting everything to m³ simply because you can does not
make sense. When talking about the natural gas reserve of the United
States, the figures are given in km³ - a few thousand m³ either way is
irrelevant.

George

MRAB

Sep 29, 2010, 2:18:51 PM
to pytho...@python.org
You could compare it to handling strings, where Unicode is used
internally and the encoding can be preserved for when you want to
output.

Squeamizh

Sep 29, 2010, 3:41:00 PM
On Sep 27, 10:46 am, namekuseijin <namekusei...@gmail.com> wrote:
[...]
> What do you do if you're feeling insecure and paranoid?  Just what
> dynamically typed languages do:  add runtime checks.  Unit tests are
> great for asserting those.
>
> Fact is:  almost all user data from the external world comes into
> programs as strings.  No type system or compiler handles this fact all
> that gracefully...

I disagree with your conclusion. Sure, the data was textual when it
was initially read by the program, but that should only be relevant to
the input processing code. The data is likely converted to some
internal representation immediately after it is read and validated,
and in a sanely-designed program, it maintains this representation
throughout its lifetime. If the structure of some data needs to
change during development, the compiler of a statically-typed language
will automatically tell you about any client code that was not updated
to account for the change. Dynamically typed languages do not provide
this assurance.
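
A minimal Python sketch of that discipline (the names here are invented
for illustration); with annotations like these, a static checker such as
mypy can point at call sites that were not updated when the internal
representation changes:

from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str
    celsius: float            # the internal representation, fixed after parsing

def parse_reading(line: str) -> Reading:
    # textual only at the boundary: validate and convert once, right here
    sensor, _, value = line.partition(",")
    return Reading(sensor=sensor.strip(), celsius=float(value))

def too_hot(r: Reading) -> bool:
    return r.celsius > 80.0   # client code works on the typed object, not on strings

print(too_hot(parse_reading("cpu0, 93.5")))   # True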

RG

Sep 29, 2010, 6:02:43 PM
In article
<996bd4e6-37ff-4a55...@k22g2000prb.googlegroups.com>,
Squeamizh <squ...@hotmail.com> wrote:

This is a red herring. You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler. For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}
[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1


Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

Squeamizh

Sep 29, 2010, 6:11:18 PM
On Sep 29, 3:02 pm, RG <rNOSPA...@flownet.com> wrote:
> In article
> <996bd4e6-37ff-4a55-8db5-6e7574fbd...@k22g2000prb.googlegroups.com>,

In short, static typing doesn't solve all conceivable problems.

We are all aware that there is no perfect software development process
or tool set. I'm interested in minimizing the number of problems I
run into during development, and the number of bugs that are in the
finished product. My opinion is that static typed languages are
better at this for large projects, for the reasons I stated in my
previous post.

RG

Sep 29, 2010, 6:14:51 PM
In article
<07f75df3-778d-4e3d...@k22g2000prb.googlegroups.com>,
Squeamizh <squ...@hotmail.com> wrote:

More specifically, the claim made above:

> in C I can have a function maximum(int a, int b) that will always
> work. Never blow up, and never give an invalid answer.

is false. And it is not necessary to invoke the vagaries of run-time
input to demonstrate that it is false.

> We are all aware that there is no perfect software development process
> or tool set. I'm interested in minimizing the number of problems I
> run into during development, and the number of bugs that are in the
> finished product. My opinion is that static typed languages are
> better at this for large projects, for the reasons I stated in my
> previous post.

More power to you. What are you doing here on cll then?

rg

Keith Thompson

Sep 29, 2010, 7:07:19 PM
RG <rNOS...@flownet.com> writes:
> In article
> <07f75df3-778d-4e3d...@k22g2000prb.googlegroups.com>,
> Squeamizh <squ...@hotmail.com> wrote:
>> On Sep 29, 3:02 pm, RG <rNOSPA...@flownet.com> wrote:
[...]

>> > This is a red herring.  You don't have to invoke run-time input to
>> > demonstrate bugs in a statically typed language that are not caught by
>> > the compiler.  For example:
>> >
>> > [ron@mighty:~]$ cat foo.c
>> > #include <stdio.h>
>> >
>> > int maximum(int a, int b) {
>> >   return (a > b ? a : b);
>> >
>> > }
>> >
>> > int foo(int x) { return 9223372036854775807+x; }
>> >
>> > int main () {
>> >   printf("%d\n", maximum(foo(1), 1));
>> >   return 0;}
>> >
>> > [ron@mighty:~]$ gcc -Wall foo.c
>> > [ron@mighty:~]$ ./a.out
>> > 1
>> >
>> > Even simple arithmetic is Turing-complete, so catching all type-related
>> > errors at compile time would entail solving the halting problem.
>> >
>> > rg
>>
>> In short, static typing doesn't solve all conceivable problems.
>
> More specifically, the claim made above:
>
>> in C I can have a function maximum(int a, int b) that will always
>> work. Never blow up, and never give an invalid answer.
>
> is false. And it is not necessary to invoke the vagaries of run-time
> input to demonstrate that it is false.

But the above maximum() function does exactly that. The program's
behavior happens to be undefined or implementation-defined for reasons
unrelated to the maximum() function.

Depending on the range of type int on the given system, either the
behavior of the addition in foo() is undefined (because it overflows),
or the implicit conversion of the result to int either yields an
implementation-defined result or (in C99) raises an
implementation-defined signal; the latter can lead to undefined
behavior.

Since 9223372036854775807 is 2**63-1, what *typically* happens is that
the addition yields the value 0, but the C language doesn't require that
particular result. You then call maximum with arguments 0 and 1, and
it quite correctly returns 1.

>> We are all aware that there is no perfect software development process
>> or tool set. I'm interested in minimizing the number of problems I
>> run into during development, and the number of bugs that are in the
>> finished product. My opinion is that static typed languages are
>> better at this for large projects, for the reasons I stated in my
>> previous post.
>
> More power to you. What are you doing here on cll then?

This thread is cross-posted to several newsgroups, including
comp.lang.c.

Pascal J. Bourguignon

Sep 29, 2010, 7:17:39 PM
Squeamizh <squ...@hotmail.com> writes:

> In short, static typing doesn't solve all conceivable problems.
>
> We are all aware that there is no perfect software development process
> or tool set. I'm interested in minimizing the number of problems I
> run into during development, and the number of bugs that are in the
> finished product. My opinion is that static typed languages are
> better at this for large projects, for the reasons I stated in my
> previous post.

Our experience is that a garbage collector and native bignums are much
more important to minimize the number of problems we run into during
development and the number of bugs that are in the finished products.

RG

Sep 29, 2010, 7:47:38 PM
In article <lnk4m45...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

This all hinges on what you consider to be "a function maximum(int a,
int b) that ... always work[s] ... [and] never give[s] an invalid
answer." But if you don't consider an incorrect answer (according to
the rules of arithmetic) to be an invalid answer then the claim becomes
vacuous. You could simply ignore the arguments and return 0, and that
would meet the criteria.

If you try to refine this claim so that it is both correct and
non-vacuous you will find that static typing does not do nearly as much
for you as most of its adherents think it does.

> >> We are all aware that there is no perfect software development process
> >> or tool set. I'm interested in minimizing the number of problems I
> >> run into during development, and the number of bugs that are in the
> >> finished product. My opinion is that static typed languages are
> >> better at this for large projects, for the reasons I stated in my
> >> previous post.
> >
> > More power to you. What are you doing here on cll then?
>
> This thread is cross-posted to several newsgroups, including
> comp.lang.c.

Ah, so it is. My bad.

rg

Thomas A. Russ

Sep 29, 2010, 6:19:08 PM
RG <rNOS...@flownet.com> writes:
>
> More power to you. What are you doing here on cll then?

This thread is massively cross-posted.

Keith Thompson

Sep 29, 2010, 8:26:38 PM

int maximum(int a, int b) { return a > b ? a : b; }

> But if you don't consider an incorrect answer (according to
> the rules of arithmetic) to be an invalid answer then the claim becomes
> vacuous. You could simply ignore the arguments and return 0, and that
> would meet the criteria.

I don't believe it's possible in any language to write a maximum()
function that returns a correct result *when given incorrect argument
values*.

The program (assuming a typical implementation) calls maximum() with
arguments 0 and 1. maximum() returns 1. It works. The problem
is elsewhere in the program.

(And on a hypothetical system with INT_MAX >= 9223372036854775808,
the program's entire behavior is well defined and mathematically
correct. C requires INT_MAX >= 32767; it can be as large as the
implementation chooses. In practice, the largest value I've ever
seen for INT_MAX is 9223372036854775807.)

> If you try to refine this claim so that it is both correct and
> non-vacuous you will find that static typing does not do nearly as much
> for you as most of its adherents think it does.

Speaking only for myself, I've never claimed that static typing solves
all conceivable problems. My point is only about this specific example
of a maximum() function.

[...]

Squeamizh

Sep 29, 2010, 8:58:23 PM
On Sep 29, 3:14 pm, RG <rNOSPA...@flownet.com> wrote:
> In article
> <07f75df3-778d-4e3d-8aa0-fbd4bd108...@k22g2000prb.googlegroups.com>,

OK. You finished your post with a reference to the halting problem,
which does not help to bolster any practical argument. That is why I
summarized your post in the manner I did.

I agree that static typed languages do not prevent these types of
overflow errors.

RG

Sep 29, 2010, 9:01:02 PM
In article <lnfwws5...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

That the problem is "elsewhere in the program" ought to be small
comfort. But very well, try this instead:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) { return a > b ? a : b; }

int main() {
  long x = 8589934592;
  printf("Max of %ld and 1 is %d\n", x, maximum(x,1));
  return 0;
}
[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
Max of 8589934592 and 1 is 1

Seebs

Sep 29, 2010, 9:17:53 PM
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> That the problem is "elsewhere in the program" ought to be small
> comfort.

It is, perhaps, but it's also an important technical point: You CAN write
correct code for such a thing.

> int maximum(int a, int b) { return a > b ? a : b; }

> int main() {
> long x = 8589934592;
> printf("Max of %ld and 1 is %d\n", x, maximum(x,1));

You invoked implementation-defined behavior here by calling maximum() with
a value which was outside the range. The defined behavior is that the
arguments are converted to the given type, namely int. The conversion
is implementation-defined and could include yielding an implementation-defined
signal which aborts execution.

Again, the maximum() function is 100% correct -- your call of it is incorrect.
You didn't pass it the right sort of data. That's your problem.

(And no, the lack of a diagnostic doesn't necessarily prove anything; see
the gcc documentation for details of what it does when converting an out
of range value into a signed type, it may well have done exactly what it
is defined to do.)

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Keith Thompson

Sep 29, 2010, 9:28:15 PM
RG <rNOS...@flownet.com> writes:
[...]

> That the problem is "elsewhere in the program" ought to be small
> comfort.

I don't claim that it's comforting, merely that it's true.

> But very well, try this instead:
>
> [ron@mighty:~]$ cat foo.c
> #include <stdio.h>
>
> int maximum(int a, int b) { return a > b ? a : b; }
>
> int main() {
> long x = 8589934592;
> printf("Max of %ld and 1 is %d\n", x, maximum(x,1));
> return 0;
> }
> [ron@mighty:~]$ gcc -Wall foo.c
> [ron@mighty:~]$ ./a.out
> Max of 8589934592 and 1 is 1

That exhibits a very similar problem.

8589934592 is 2**33.

Given the output you got, I presume your system has 32-bit int and
64-bit long. The call maximum(x, 1) implicitly converts the long
value 8589934592 to int. The result is implementation-defined,
but typically 0. So maximum() is called with arguments of 0 and 1,
as you could see by adding a printf call to maximum().

Even here, maximum() did exactly what was asked of it.

I'll grant you that having a conversion from a larger type to a smaller
type quietly discard high-order bits is unfriendly. But it matches the
behavior of most CPUs.

Here's another example:

#include <stdio.h>

int maximum(int a, int b) { return a > b ? a : b; }

int main(void) {
  double x = 1.8;
  printf("Max of %f and 1 is %d\n", x, maximum(x, 1));
  return 0;
}

Output:

Max of 1.800000 and 1 is 1

Ian Collins

Sep 29, 2010, 10:00:59 PM
On 09/30/10 02:17 PM, Seebs wrote:
> On 2010-09-30, RG<rNOS...@flownet.com> wrote:
>> That the problem is "elsewhere in the program" ought to be small
>> comfort.
>
> It is, perhaps, but it's also an important technical point: You CAN write
> correct code for such a thing.
>
>> int maximum(int a, int b) { return a> b ? a : b; }
>
>> int main() {
>> long x = 8589934592;
>> printf("Max of %ld and 1 is %d\n", x, maximum(x,1));
>
> You invoked implementation-defined behavior here by calling maximum() with
> a value which was outside the range. The defined behavior is that the
> arguments are converted to the given type, namely int. The conversion
> is implementation-defined and could include yielding an implementation-defined
> signal which aborts execution.
>
> Again, the maximum() function is 100% correct -- your call of it is incorrect.
> You didn't pass it the right sort of data. That's your problem.
>
> (And no, the lack of a diagnostic doesn't necessarily prove anything; see
> the gcc documentation for details of what it does when converting an out
> of range value into a signed type, it may well have done exactly what it
> is defined to do.)

Note that the mistake can be diagnosed:

lint /tmp/u.c -m64 -errchk=all
(7) warning: passing 64-bit integer arg, expecting 32-bit integer:
maximum(arg 1)

--
Ian Collins

RG

Sep 30, 2010, 12:57:46 AM
In article <lnbp7g5...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

Of course. Computers always do only exactly what you ask of them. On
this view there is, by definition, no such thing as a bug, only
specifications that don't correspond to one's intentions.
Unfortunately, correspondence to intentions is the thing that actually
matters when writing code.

> I'll grant you that having a conversion from a larger type to a smaller
> type quietly discard high-order bits is unfriendly.

"Unfriendly" is not the adjective that I would choose to describe this
behavior.

There is a whole hierarchy of this sort of "unfriendly" behavior, some
of which can be caught at compile time using a corresponding hierarchy
of ever more sophisticated tools. But sooner or later if you are using
Turing-complete operations you will encounter the halting problem, at
which point your compile-time tools will fail. (c.f. the Collatz
problem)

I'm not saying one should not use compile-time tools, only that one
should not rely on them. "Compiling without errors" is not -- and
cannot ever be -- a synonym for "bug-free."

rg

Seebs

Sep 30, 2010, 1:05:45 AM
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> Of course. Computers always do only exactly what you ask of them. On
> this view there is, by definition, no such thing as a bug, only
> specifications that don't correspond to one's intentions.

f00f.

That said... I think you're missing Keith's point.

> Unfortunately, correspondence to intentions is the thing that actually
> matters when writing code.

Yes. Nonetheless, the maximum() function does exactly what it is intended
to do *with the inputs it receives*. The failure is outside the function;
it did the right thing with the data actually passed to it, the problem
was a user misunderstanding as to what data were being passed to it.

So there's a bug -- there's code which does not do what it was intended
to do. However, that bug is in the caller, not in the maximum()
function.

This is an important distinction -- it means we can write a function
which performs that function reliably. Now we just need to figure out
how to call it with valid data... :)

Ian Collins

Sep 30, 2010, 1:08:13 AM
On 09/30/10 05:57 PM, RG wrote:
>
> I'm not saying one should not use compile-time tools, only that one
> should not rely on them. "Compiling without errors" is not -- and
> cannot ever be -- be a synonym for "bug-free."

Which is why we all have run-time tools called unit tests, don't we?

--
Ian Collins

Lie Ryan

Sep 30, 2010, 1:38:28 AM
On 09/30/10 11:17, Seebs wrote:
> On 2010-09-30, RG <rNOS...@flownet.com> wrote:
>> That the problem is "elsewhere in the program" ought to be small
>> comfort.
>
> It is, perhaps, but it's also an important technical point: You CAN write
> correct code for such a thing.
>
>> int maximum(int a, int b) { return a > b ? a : b; }
>
>> int main() {
>> long x = 8589934592;
>> printf("Max of %ld and 1 is %d\n", x, maximum(x,1));
>
> You invoked implementation-defined behavior here by calling maximum() with
> a value which was outside the range. The defined behavior is that the
> arguments are converted to the given type, namely int. The conversion
> is implementation-defined and could include yielding an implementation-defined
> signal which aborts execution.
>
> Again, the maximum() function is 100% correct -- your call of it is incorrect.
> You didn't pass it the right sort of data. That's your problem.

That argument can be made for dynamic language as well. If you write in
dynamic language (e.g. python):

def maximum(a, b):
    return a if a > b else b

The dynamic language's version of the maximum() function is 100% correct --
if you pass an uncomparable object instead of a number, your call of
it is incorrect; you just didn't pass the right sort of data. And that's
your problem as a caller.

In fact, since Python's integers are infinite precision (only bounded by
available memory), in practice Python's version of maximum() has less
chance of producing an erroneous result.

The /most/ correct version of the maximum() function is probably one written
in Haskell as:

maximum :: Integer -> Integer -> Integer
maximum a b = if a > b then a else b

Integer in Haskell has infinite precision (like python's int, only
bounded by memory), but Haskell also has static type checking, so you
can't pass just any arbitrary objects.

But even then, it's still not 100% correct. If you pass really large
values that exhaust the memory, maximum() could still produce an
unwanted result.

The second problem is that Haskell has Int, the bounded integer, and if
you have a calculation in Int that overflowed in some previous
calculation, then you can still get an incorrect result. In practice, the
type-agnostic language with *mandatory* infinite precision arithmetic
wins in terms of correctness. Any language which only has optional
infinite precision arithmetic can always produce erroneous results.

Anyone can dream of a 100% correct program; but anyone who believes they
can write a 100% correct program is just a dreamer. In reality, we don't
usually need a 100% correct program; we just need a program that runs
correctly enough most of the time that the 0.0000001% chance of
producing an erroneous result becomes irrelevant.

In summary, in this particular case with the maximum() function, static
checking does not help in producing the most correct code; if you need
to ensure the highest correctness, you must use a language with
*mandatory* infinite precision integers.
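
(For comparison, a minimal C sketch of the same function over
arbitrary-precision integers, assuming the GNU MP library is installed; this
is exactly the "optional" infinite precision discussed above, since the caller
still has to opt into the mpz_t type. Build with something like:
cc max_z.c -lgmp)

#include <gmp.h>
#include <stdio.h>

/* Copy the larger of a and b into out; mpz_t values never overflow,
   they are bounded only by available memory. */
void maximum_z(mpz_t out, const mpz_t a, const mpz_t b) {
    mpz_set(out, (mpz_cmp(a, b) > 0) ? a : b);
}

int main(void) {
    mpz_t x, y, m;
    mpz_init_set_str(x, "8589934592", 10);
    mpz_init_set_ui(y, 1);
    mpz_init(m);
    maximum_z(m, x, y);
    gmp_printf("max = %Zd\n", m);   /* prints 8589934592, no truncation */
    mpz_clears(x, y, m, NULL);
    return 0;
}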

TheFlyingDutchman

unread,
Sep 30, 2010, 1:51:28 AM9/30/10
to

>
> More specifically, the claim made above:
>
> > in C I can have a function maximum(int a, int b) that will always
> > work. Never blow up, and never give an invalid answer.
>
> is false.  And it is not necessary to invoke the vagaries of run-time
> input to demonstrate that it is false.
>
I don't think you demonstrated it is false. Any values larger than an
int get truncated before they ever get to maximum. The problem does
not lie with the maximum function. It correctly returns the maximum of
whatever two integers it is provided. Calling it with values that are
larger than an int, that get converted to an int _before_ maximum is
called, is an issue outside of maximum.

Keith Thompson

unread,
Sep 30, 2010, 2:00:56 AM9/30/10
to
RG <rNOS...@flownet.com> writes:
> In article <lnbp7g5...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
[...]

>> Even here, maximum() did exactly what was asked of it.
>
> Of course. Computers always do only exactly what you ask of them. On
> this view there is, by definition, no such thing as a bug, only
> specifications that don't correspond to one's intentions.
> Unfortunately, correspondence to intentions is the thing that actually
> matters when writing code.

Of course there's such a thing as a bug.

This version of maximum:

int maximum(int a, int b) {
    return a > b ? a : a;
}

has a bug. This version:

int maximum(int a, int b) {
    return a > b ? a : b;
}

I would argue, does not. The fact that it might be included in a
buggy program does not mean that it is itself buggy.

[...]

> I'm not saying one should not use compile-time tools, only that one
> should not rely on them. "Compiling without errors" is not -- and
> cannot ever be -- a synonym for "bug-free."

Agreed. (Though C does make it notoriously easy to sneak buggy code
past the compiler.)

TheFlyingDutchman

unread,
Sep 30, 2010, 2:09:05 AM9/30/10
to

>
> That argument can be made for dynamic language as well. If you write in
> dynamic language (e.g. python):
>
> def maximum(a, b):
>     return a if a > b else b
>
> The dynamic language's version of the maximum() function is 100% correct --
> if you pass an uncomparable object instead of a number, your call of
> it is incorrect; you just didn't pass the right sort of data. And that's
> your problem as a caller.
>
> In fact, since Python's integers are infinite precision (only bounded by
> available memory), in practice Python's version of maximum() has less
> chance of producing an erroneous result.

"in C I can have a function maximum(int a, int b) that will always


work. Never blow up, and never give an invalid answer. "

Dynamic typed languages like Python fail in this case on "Never blows
up".

RG

unread,
Sep 30, 2010, 2:29:55 AM9/30/10
to
In article <lnr5gb4...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

> > I'm not saying one should not use compile-time tools, only that one
> > should not rely on them. "Compiling without errors" is not -- and
> > cannot ever be -- a synonym for "bug-free."
>
> Agreed. (Though C does make it notoriously easy to sneak buggy code
> past the compiler.)

Let's just leave it at that then.

rg

RG

unread,
Sep 30, 2010, 2:38:14 AM9/30/10
to
In article <slrnia86l9.2es...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:

> On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> > Of course. Computers always do only exactly what you ask of them. On
> > this view there is, by definition, no such thing as a bug, only
> > specifications that don't correspond to one's intentions.
>
> f00f.
>
> That said... I think you're missing Keith's point.
>
> > Unfortunately, correspondence to intentions is the thing that actually
> > matters when writing code.
>
> Yes. Nonetheless, the maximum() function does exactly what it is intended
> to do *with the inputs it receives*. The failure is outside the function;
> it did the right thing with the data actually passed to it, the problem
> was a user misunderstanding as to what data were being passed to it.
>
> So there's a bug -- there's code which does not do what it was intended
> to do. However, that bug is in the caller, not in the maximum()
> function.
>
> This is an important distinction -- it means we can write a function
> which performs that function reliably. Now we just need to figure out
> how to call it with valid data... :)

We lost some important context somewhere along the line:

> > > in C I can have a function maximum(int a, int b) that will always

> > > work. Never blow up, and never give an invalid answer. If someone
> > > tries to call it incorrectly it is a compile error.

Please take note of the second sentence.

One way or another, this claim is plainly false. The point I was trying
to make is not so much that the claim is false (someone else was already
doing that), but that it can be demonstrated to be false without having
to rely on any run-time input.

rg

Nick

unread,
Sep 30, 2010, 2:48:18 AM9/30/10
to
Ian Collins <ian-...@hotmail.com> writes:

But you have to know a lot about the language to know that there's a
problem. You cannot sensibly test your max function on every
combination of (even int) input which it's designed for (and, of course,
it works for those).
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Ian Collins

unread,
Sep 30, 2010, 3:03:14 AM9/30/10
to

Or using the new suffix return syntax in C++0x. Something like

template <typename T0, typename T1>
auto maximum(T0 a, T1 b) -> decltype(a > b ? a : b) { return a > b ? a : b; }

Where the return type is deduced at compile time.

--
Ian Collins

TheFlyingDutchman

unread,
Sep 30, 2010, 3:23:55 AM9/30/10
to

> > Yes.  Nonetheless, the maximum() function does exactly what it is intended
> > to do *with the inputs it receives*.  The failure is outside the function;
> > it did the right thing with the data actually passed to it, the problem
> > was a user misunderstanding as to what data were being passed to it.
>
> > So there's a bug -- there's code which does not do what it was intended
> > to do.  However, that bug is in the caller, not in the maximum()
> > function.
>
> > This is an important distinction -- it means we can write a function
> > which performs that function reliably.  Now we just need to figure out
> > how to call it with valid data... :)
>
> We lost some important context somewhere along the line:
>
> > > > in C I can have a function maximum(int a, int b) that will always
> > > > work. Never blow up, and never give an invalid answer. If someone
> > > > tries to call it incorrectly it is a compile error.
>
> Please take note of the second sentence.
>
> One way or another, this claim is plainly false.  The point I was trying
> to make is not so much that the claim is false (someone else was already
> doing that), but that it can be demonstrated to be false without having
> to rely on any run-time input.
>

The second sentence is not disproved by a cast from one datatype to
another (which changes the value) that happens before maximum() is
called.

Paul Rubin

unread,
Sep 30, 2010, 4:02:49 AM9/30/10
to
>> > > > in C I can have a function maximum(int a, int b) that will always
>> > > > work. Never blow up, and never give an invalid answer. If someone
>> > > > tries to call it incorrectly it is a compile error.
> The second sentence is not disproved by a cast from one datatype to
> another (which changes the value) that happens before maximum() is called.

int maximum(int a, int b);

int foo() {
    int (*barf)() = maximum;
    return barf(3);
}

This compiles fine for me. Where is the cast? Where is the error message?
Are you saying barf(3) doesn't call maximum?

Pascal J. Bourguignon

unread,
Sep 30, 2010, 4:07:12 AM9/30/10
to
Ian Collins <ian-...@hotmail.com> writes:

Indeed. This is generic programming. And it happens that in Lisp (and
I assume in languages such as Python), since types are not checked at
compilation time, all the functions you write are always generic
functions.

In particular, the property "arguments are not comparable" is not
something that can be determined at compilation time, since the program
may add a compare method for the given argument at run-time (if the
comparison operator used is a generic function).

RG

unread,
Sep 30, 2010, 4:40:43 AM9/30/10
to
In article
<5bf24e59-1be0-4d31...@y12g2000prb.googlegroups.com>,
TheFlyingDutchman <zzbb...@aol.com> wrote:

You can't have it both ways. Either I am calling it incorrectly, in
which case I should get a compiler error, or I am calling it correctly,
and I should get the right answer. That I got neither does in fact
falsify the claim. The only way out of this is to say that
maximum(8589934592, 1) returning 1 is in fact "correct", in which case
we'll just have to agree to disagree.

rg

TheFlyingDutchman

unread,
Sep 30, 2010, 4:43:38 AM9/30/10
to

With Tiny C on my system, your code does not cause maximum to give an
incorrect value, or to blow up:

#include <stdio.h>

int maximum(int a, int b)
{
    printf("entering maximum %d %d\n", a, b);
    if ( a > b )
        return a;
    else
        return b;
}

int foo()
{
    int (*barf)() = maximum;
    return barf(3);
}

int main (int argc, char *argv[])
{
    printf("maximum is %d\n", foo());
}

------------- output -----------------------------------
entering maximum 3 4198400
maximum is 4198400

Lie Ryan

unread,
Sep 30, 2010, 4:55:57 AM9/30/10
to

How do you define "Never blows up"?

Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
up, and of the worst kind since it passes silently.

TheFlyingDutchman

unread,
Sep 30, 2010, 4:55:39 AM9/30/10
to
On Sep 30, 1:40 am, RG <rNOSPA...@flownet.com> wrote:
> In article
> <5bf24e59-1be0-4d31-9fa7-c03a8bf9b...@y12g2000prb.googlegroups.com>,
1) long trying_to_break_maximum = 8589934592;
2) /* compiler adds */
int created_to_allow_maximum_call = (int) trying_to_break_maximum;
3) maximum(created_to_allow_maximum_call, 1);

I think we have to agree to disagree, because I don't see the lack of
a compiler error at step 2 as a problem with the maximum() function.

Pascal Costanza

unread,
Sep 30, 2010, 5:27:48 AM9/30/10
to

They don't "blow up". They may throw an exception, on which you can act.
You make it sound like a core dump, which it isn't.


Pascal

--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/

Ian Collins

unread,
Sep 30, 2010, 5:52:40 AM9/30/10
to
On 09/30/10 09:02 PM, Paul Rubin wrote:
>
> int maximum(int a, int b);
>
> int foo() {
> int (*barf)() = maximum;
> return barf(3);
> }
>
> This compiles fine for me. Where is the cast? Where is the error message?
> Are you saying barf(3) doesn't call maximum?

Try a language with stricter type checking:

CC /tmp/u.c
"/tmp/u.c", line 7: Error: Cannot use int(*)(int,int) to initialize
int(*)().
"/tmp/u.c", line 8: Error: Too many arguments in call to "int(*)()".

--
Ian Collins

TheFlyingDutchman

unread,
Sep 30, 2010, 6:14:47 AM9/30/10
to

>
> > "in C I can have a function maximum(int a, int b) that will always
> > work. Never blow up, and never give an invalid answer. "
>
> > Dynamic typed languages like Python fail in this case on "Never blows
> > up".
>
> How do you define "Never blows up"?

Never has execution halt.

I think a key reason in the big rise in the popularity of interpreted
languages is that when execution halts, they normally give a call
stack and usually a good reason for why things couldn't continue. As
opposed to compiled languages which present you with a blank screen
and force you to - fire up a debugger, or much much worse, look at a
core dump - to try and discern all the information the interpreter
presents to you immediately.

>
> Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
> up, and of the worst kind since it passes silently.

If I had to choose between "blow up" or "invalid answer" I would pick
"invalid answer".

In this example RG is passing a long literal greater than INT_MAX to a
function that takes an int, and the compiler apparently didn't give a
warning about the change in value as it created the cast to an int,
even with the option -Wall (all warnings). I think it's legitimate to
consider that an option for a warning/error on this condition should
be available. As for the compiler generating code that checks for a
change in value at runtime when a number is cast to a smaller data
type, I think that's also a legitimate request for a C compiler option
(in addition to other runtime check options like array subscript out
of bounds).
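
(A rough sketch of what such a runtime check could look like if written by
hand today; the helper name is invented for the illustration, and aborting is
only one possible policy.)

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical checked narrowing: refuse to continue if the long value
   cannot be represented as an int, instead of silently changing it. */
static int int_from_long_checked(long v) {
    if (v < INT_MIN || v > INT_MAX) {
        fprintf(stderr, "narrowing %ld to int would change its value\n", v);
        abort();
    }
    return (int)v;
}

int maximum(int a, int b) { return a > b ? a : b; }

int main(void) {
    long x = 8589934592;
    printf("%d\n", maximum(int_from_long_checked(x), 1));  /* aborts loudly */
    return 0;
}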

Nick Keighley

unread,
Sep 30, 2010, 8:19:04 AM9/30/10
to
On 27 Sep, 20:29, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> namekuseijin <namekusei...@gmail.com> writes:

<snip>

> > Fact is:  almost all user data from the external words comes into
> > programs as strings.  No typesystem or compiler handles this fact all
> > that graceful...
>
> I would even go further.
>
> Types are only part of the story.  You may distinguish between integers
> and floating points, fine.  But what about distinguishing between
> floating points representing lengths and floating points representing
> volumes?  Worse, what about distinguishing and converting floating
> points representing lengths expressed in feets and floating points
> representing lengths expressed in meters.

fair points

> If you start with the mindset of static type checking, you will consider
> that your types are checked and if the types at the interface of two
> modules matches you'll think that everything's ok.  And six months later
> you Mars mission will crash.

do you have any evidence that this is actually so? That people who
program in statically typed languages actually are prone to this "well
it compiles so it must be right" attitude?

> On the other hand, with the dynamic typing mindset, you might even wrap
> your values (of whatever numerical type) in a symbolic expression
> mentionning the unit and perhaps other meta data, so that when the other
> module receives it, it may notice (dynamically) that two values are not
> of the same unit, but if compatible, it could (dynamically) convert into
> the expected unit.  Mission saved!

they *may* do this but do they *actually* do it? My (limited)
experience of dynamically typed languages is every now and again you
attempt to apply an operator to the wrong type of operand and kerblam!
If your testing is inadequate then it's inadequate whatever the
typiness of your language.

Nick Keighley

unread,
Sep 30, 2010, 8:36:17 AM9/30/10
to
On 30 Sep, 11:14, TheFlyingDutchman <zzbba...@aol.com> wrote:
> > > "in C I can have a function maximum(int a, int b) that will always
> > > work. Never blow up, and never give an invalid answer. "
>
> > > Dynamic typed languages like Python fail in this case on "Never blows
> > > up".
>
> > How do you define "Never blows up"?
>
> Never has execution halt.
>
> I think a key reason in the big rise in the popularity of interpreted
> languages is that when execution halts, they normally give a call
> stack and usually a good reason for why things couldn't continue. As
> opposed to compiled languages which present you with a blank screen
> and force you to - fire up a debugger, or much much worse, look at a
> core dump - to try and discern all the information the interpreter
> presents to you immediately.
>
>
>
> > Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
> > up, and of the worst kind since it passes silently.
>
> If I had to choose between "blow up" or "invalid answer" I would pick
> "invalid answer".

there are some application domains where neither option would be
viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
chemicals, nuclear power generation. Hell you'd expect better than
this from your phone!

Pascal Bourguignon

unread,
Sep 30, 2010, 9:14:36 AM9/30/10
to
TheFlyingDutchman <zzbb...@aol.com> writes:

> In this example RG is passing a long literal greater than INT_MAX to a
> function that takes an int, and the compiler apparently didn't give a
> warning about the change in value as it created the cast to an int,
> even with the option -Wall (all warnings). I think it's legitimate to
> consider that an option for a warning/error on this condition should
> be available. As for the compiler generating code that checks for a
> change in value at runtime when a number is cast to a smaller data
> type, I think that's also a legitimate request for a C compiler option
> (in addition to other runtime check options like array subscript out
> of bounds).

I think that it's a legitimate request, in this day and age, for a C
programmer to require that it NOT be an option for a C compiler to
stay silent about this and similar cases.

(And we should just kill all the programs that don't pass this check,
which I'm afraid would be a big number -- which, I understand, is the
reason why C compilers don't change).

--
__Pascal Bourguignon__
http://www.informatimago.com

Pascal Bourguignon

unread,
Sep 30, 2010, 9:57:58 AM9/30/10
to
Nick Keighley <nick_keigh...@hotmail.com> writes:

> On 27 Sep, 20:29, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
>> If you start with the mindset of static type checking, you will consider
>> that your types are checked and if the types at the interface of two
>> modules matches you'll think that everything's ok.  And six months later
>> you Mars mission will crash.
>
> do you have any evidence that this is actually so? That people who
> program in statically typed languages actually are prone to this "well
> it compiles so it must be right" attitude?

Yes, I can witness that it's in the mindset.

Well, the problem being always the same, the time pressures coming from
the sales people (who can sell products of which the first line of
specifications has not been written yet, much less of code), it's always
a battle to explain that once the code is written, there is still a lot
of time needed to run tests and debug it. I've even had technical
managers, who should know better, expect that we write bug-free code in
the first place (when we didn't even have a specification to begin
with!).


>> On the other hand, with the dynamic typing mindset, you might even wrap
>> your values (of whatever numerical type) in a symbolic expression
>> mentionning the unit and perhaps other meta data, so that when the other
>> module receives it, it may notice (dynamically) that two values are not
>> of the same unit, but if compatible, it could (dynamically) convert into
>> the expected unit.  Mission saved!
>
> they *may* do this but do they *actually* do it? My (limited)
> experience of dynamically typed languges is everynow and again you
> attempt to apply an operator to the wrong type of operand and kerblam!
> If your testing is inadaquate then it's inadaquate whatever the
> typiness of your language.

Unfortunately, a lot of programmers in dynamic programming languages
have been formed with static programming languages and bring with them
their old mindset. Moreover, when the syntax of a newer dynamic
programming language is explicitly designed to be similar to an older
static programming language, in order to attract these programmers
toward the better technologies, this does not help change the mindset
either.

Unfortunately, you can write FORTRAN code in any programming language.

But my point is that at least with dynamic programming languages,
there's an alternative mindset and it is easier to implement such
a scheme than with static programming languages.

In Lisp, which stresses the symbolic computing part (S-expr are Symbolic
Expressions), it is almost trivial to implement.

TheFlyingDutchman

unread,
Sep 30, 2010, 10:24:39 AM9/30/10
to

>
> > If I had to choose between "blow up" or "invalid answer" I would pick
> > "invalid answer".
>
> there are some application domains where neither option would be
> viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
> chemicals, nuclear power generation. Hell you'd expect better than
> this from your phone!
>

I wasn't speaking generally, just in the case of which of only two
choices RG's code should be referred to - "blowing up" or "giving an
invalid answer".

I think error handling in personal computer and website software has
improved over the years, but there is still some room for improvement:
you will still get error messages that don't tell you anything you can
relay to tech support beyond the fact that an error occurred or that
some operation can't be performed.

But I worked with programmers doing in-house software who were
incredibly turned off by exception handling in C++. I thought that
meant that they preferred to return and check error codes from
functions as they had done in C, and for some of them it did seem to
mean that. But for others it seemed that they didn't want to
anticipate errors at all ("that file is always gonna be there!"). I
read a Java book by Deitel and Deitel and they pointed out what might
have led to that attitude - the homework and test solutions in
college usually didn't require much if any error handling - the
student could assume files were present, data was all there and in the
format expected, user input was valid and complete, etc.

Nick Keighley

unread,
Sep 30, 2010, 11:02:29 AM9/30/10
to
On 30 Sep, 15:24, TheFlyingDutchman <zzbba...@aol.com> wrote:
> > > If I had to choose between "blow up" or "invalid answer" I would pick
> > > "invalid answer".
>
> > there are some application domains where neither option would be
> > viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
> > chemicals, nuclear power generation. Hell you'd expect better than
> > this from your phone!
>
> I wasn't speaking generally, just in the case of which of only two
> choices RG's code should be referred to - "blowing up" or "giving an
> invalid answer".

I think I'd prefer termination if those were my only choices. What's
the rest of the program going to do with the wrong result? When the
program finally gives up the cause is lost in the mists of time, and
those are hard to debug!

> I think error handling in personal computer and website software has
> improved over the years, but there is still some room for improvement:
> you will still get error messages that don't tell you anything you can
> relay to tech support beyond the fact that an error occurred or that
> some operation can't be performed.
>
> But I worked with programmers doing in-house software who were
> incredibly turned off by exception handling in C++. I thought that
> meant that they preferred to return and check error codes from
> functions as they had done in C, and for some of them it did seem to
> mean that. But for others it seemed that they didn't want to
> anticipate errors at all ("that file is always gonna be there!").

that was one of the reasons I liked exceptions. If my library threw an
exception then the caller *had* to do something about it. Even to
ignore it he had to write some code.

> I
> read a Java book by Deitel and Deitel and they pointed out what might
> have led to that attitude - the homework and test solutions in
> college usually didn't require much if any error handling - the
> student could assume files were present, data was all there and in the
> format expected, user input was valid and complete, etc.

plausible. Going from beginner to <whatever> I probably steadily
increased the pessimism of my code. The file might not be there. That
other team might send us syntactically invalid commands. Even if it
can't go wrong it will go wrong. Fortunately my college stuff included
some OS kernel stuff. There anything that can go wrong will go wrong.

Tim Bradshaw

unread,
Sep 30, 2010, 11:13:43 AM9/30/10
to
On 2010-09-30 13:36:17 +0100, Nick Keighley said:

> there are some application domains where neither option would be
> viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
> chemicals, nuclear power generation. Hell you'd expect better than
> this from your phone!

People always give these kind of scenarios, but actually there are far
more mundane ones. In my day job I'm a sysadmin and I spend a bunch of
time writing code (typically what would nowadays be called "scripts"
rather than programs, but there's no real difference) which does things
of the form

for every machine in <several hundred systems>
do <something>

where <something> is fairly often "modify critical system configuration file".

Programs like that have some absolute, non-negotiable requirements:
- they must never fail silently;
- they must check everything they do, however unlikely it seems that it
  would fail, because they will come across systems which have almost
  arbitrary misconfiguration;
- they should be idempotent if possible;
- if they come across something odd they either need to handle it,
  or put things back the way they were and back out;
- if they absolutely cannot put things back, they need to report this
  very clearly and carefully preserve any detritus in such a way that
  a human can pick up the bits;
- whatever they do they need to report in a completely parsable way
  what happened (success, failure, already done, backed out, not backed
  out, and so on).

These are quite mundane everyday things, but the consequences of
getting them wrong can be quite nasty (the worst ones being "the
machines will still run, but won't boot").

Keith Thompson

unread,
Sep 30, 2010, 11:26:38 AM9/30/10
to
RG <rNOS...@flownet.com> writes:
[...]

> You can't have it both ways. Either I am calling it incorrectly, in
> which case I should get a compiler error, or I am calling it correctly,
> and I should get the right answer. That I got neither does in fact
> falsify the claim. The only way out of this is to say that
> maximum(8589934592, 1) returning 1 is in fact "correct", in which case
> we'll just have to agree to disagree.

You are calling maximum() incorrectly, but you are doing so in a way
that the compiler is not required to diagnose.

If you want to say that the fact that the compiler is not required
to diagnose the error is a flaw in the C language, I won't
argue with you. It's just not a flaw in the maximum() function.

If I write:

const double pi = 22.0/7.0;
printf("pi = %f\n", pi);

then I suppose I'm calling printf() incorrectly, but I wouldn't
expect my compiler to warn me about it.

If you're arguing that

int maximum(int a, int b) { return a > b ? a : b; }

is flawed because it's too easy to call it incorrectly, you're
effectively arguing that it's not possible to write correct
code in C at all.

Seebs

unread,
Sep 30, 2010, 11:55:48 AM9/30/10
to
On 2010-09-30, TheFlyingDutchman <zzbb...@aol.com> wrote:
> even with the option -Wall (all warnings).

For various historical reasons, "-Wall" has the semantics you might
expect from an option named "-Wsome-common-warnings-but-not-others".

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Seebs

unread,
Sep 30, 2010, 12:01:22 PM9/30/10
to
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> We lost some important context somewhere along the line:

>> > > in C I can have a function maximum(int a, int b) that will always
>> > > work. Never blow up, and never give an invalid answer. If someone
>> > > tries to call it incorrectly it is a compile error.

> Please take note of the second sentence.

I did. That is entirely correct.

> One way or another, this claim is plainly false. The point I was trying
> to make is not so much that the claim is false (someone else was already
> doing that), but that it can be demonstrated to be false without having
> to rely on any run-time input.

It is not at all obvious to me that it is, in fact, false. So far as
I can tell, *if* the function is successfully called, then it will take
two integers, compare them, and return the larger one. It will never
return something which is not an integer. It will never raise an exception.
It will never return a value which, if you try to treat it as an integer,
will raise an exception.

Now, if you pass the wrong values to it, you will get wrong answers -- but
that's your problem for passing it wrong values.

I would understand an "invalid" answer to be one of the wrong category. For
instance, if I have a function in Python that I expect to return a string,
and it returns None, I have gotten an answer that is "invalid" -- it's not
a string.

Seebs

unread,
Sep 30, 2010, 12:03:29 PM9/30/10
to
On 2010-09-30, Paul Rubin <no.e...@nospam.invalid> wrote:
> int maximum(int a, int b);
>
> int foo() {
> int (*barf)() = maximum;
> return barf(3);
> }

> This compiles fine for me. Where is the cast?

On the first line of code inside foo().

> Where is the error message?

You chose to use a form that suppresses the error message.

> Are you saying barf(3) doesn't call maximum?

I would say that it is undefined whether or not it calls maximum, because
you called a function through a function pointer of a different sort,
which invoked undefined behavior.

There exist real compilers on which code much like this will coredump
without ever once trying to jump to the address of the maximum function,
because the compiler caught your error.

Seebs

unread,
Sep 30, 2010, 12:04:35 PM9/30/10
to
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> You can't have it both ways. Either I am calling it incorrectly, in
> which case I should get a compiler error,

You get a warning if you ask for it. If you choose to run without all
the type checking on, that's your problem.

Seebs

unread,
Sep 30, 2010, 12:06:52 PM9/30/10
to
On 2010-09-30, Pascal Bourguignon <p...@invitado-174.medicalis.es> wrote:

> Nick Keighley <nick_keigh...@hotmail.com> writes:
>> do you have any evidence that this is actually so? That people who
>> program in statically typed languages actually are prone to this "well
>> it compiles so it must be right" attitude?

> Yes, I can witness that it's in the mindset.

Huh.

So here I am, programming in statically typed languages, and I have never
in my life thought that things which compiled were necessarily right. Not
even when I was an arrogant teenager.

I guess I don't exist. *sob*

> Well, the problem being always the same, the time pressures coming from
> the sales people (who can sell products of which the first line of
> specifications has not been written yet, much less of code), it's always
> a battle to explain that once the code is written, there is still a lot
> of time needed to run tests and debug it.

At $dayjob, they give us months between feature complete and shipping,
because they expect us to spend a lot of time testing, debugging, and
cleaning up. But during that time we are explicitly not adding features...

> But my point is that at least with dynamic programming languages,
> there's an alternative mindset and it is easier to implement such
> a scheme than with static programming languages.

I think this grossly oversimplifies things.

RG

unread,
Sep 30, 2010, 12:10:05 PM9/30/10
to
In article <lniq1n4...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

> RG <rNOS...@flownet.com> writes:
> [...]
> > You can't have it both ways. Either I am calling it incorrectly, in
> > which case I should get a compiler error, or I am calling it correctly,
> > and I should get the right answer. That I got neither does in fact
> > falsify the claim. The only way out of this is to say that
> > maximum(8589934592, 1) returning 1 is in fact "correct", in which case
> > we'll just have to agree to disagree.
>
> You are calling maximum() incorrectly, but you are doing so in a way
> that the compiler is not required to diagnose.

Yes. I know. That was my whole point. There are ways to call a
function incorrectly (more broadly, there are errors in code) that a C
compiler is not required to diagnose.

> If you want to say that the fact that the compiler is not required
> to diagnose the error is a flaw in the C language, I won't
> argue with you.

I'm not even saying it's a flaw in the language. All I'm saying is that
the original claim -- that any error in a C program will be caught by
the compiler -- is false, and more specifically, that it can be
demonstrated to be false without appeal to unknown run-time input.

As an aside, this particular error *could* be caught (and in fact would
be caught by other tools like lint), but there are errors that cannot
be caught by any static analysis, and therefore one should not be
lulled into a false sense of security by the fact that your code is
written in a statically typed language and compiled without errors or
warnings. That's all.

> If I write:
>
> const double pi = 22.0/7.0;
> printf("pi = %f\n", pi);
>
> then I suppose I'm calling printf() incorrectly, but I wouldn't
> expect my compiler to warn me about it.
>
> If you're arguing that
>
> int maximum(int a, int b) { return a > b ? a : b; }
>
> is flawed because it's too easy to call it incorrectly, you're
> effectively arguing that it's not possible to write correct
> code in C at all.

I would say that it is very, very hard to write correct code in C for
any non-vacuous definition of "correct". That is the reason that core
dumps and buffer overflows are so ubiquitous. I prefer Lisp or Python,
where core dumps and buffer overflows are virtually nonexistent. One
does get the occasional run-time error that might have been caught at
compile time, but I much prefer that to a core dump or a security hole.

One might hypothesize that the best of both worlds would be a dynamic
language with a static analyzer layered on top. Such a thing does not
exist. It makes an instructive exercise to try to figure out why. (For
the record, I don't know the answer, but I've learned a lot through the
process of pondering this conundrum.)

rg

Peter Keller

unread,
Sep 30, 2010, 12:14:01 PM9/30/10
to
In comp.lang.lisp TheFlyingDutchman <zzbb...@aol.com> wrote:
>
>>
>> More specifically, the claim made above:

>>
>> > in C I can have a function maximum(int a, int b) that will always
>> > work. Never blow up, and never give an invalid answer.
>>
>> is false. And it is not necessary to invoke the vagaries of run-time
>> input to demonstrate that it is false.
>>
> I don't think you demonstrated it is false. Any values larger than an
> int get truncated before they ever get to maximum. The problem does
> not lie with the maximum function. It correctly returns the maximum of
> whatever two integers it is provided. Calling it with values that are
> larger than an int, that get converted to an int _before_ maximum is
> called, is an issue outside of maximum.

After thinking for a bit, I believe I can demonstrate a situation
where maximum could indeed return the wrong answer and it isn't due to
being passed incorrect input.

If, in maximum, after the entrance to the function call but right
before the comparison, a signal handler gets invoked, walks the stack,
swaps the two values for a and b, and returns back into maximum, then
maximum will do the wrong thing. Since control flow was always in
a subgraph of the control flow graph through maximum, this would
classify as a failure given your strict view. (As an aside, one can
do the same thing with a debugger.)

Blocking the signals around the comparison and assignment of the
result to a temporary variable that you will return won't fix it.
This is because (in C) you must have a sequence point after the
unblocking of the signals and before the assignment of the temporary
variable holding the result to the return register, where, in fact,
another signal could arrive and again corrupt the results. Depending
upon the optimization settings of the compiler, it may or may not adjust
the ordering semantics of the assignment to the return register in
relation to the call to unblock the signals. The assignment of a
result to a return register is not something defined by C,
and can happen anywhere. But the C statements you used to write it
must adhere to sequence evaluation.
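
(A minimal sketch of the signal-blocking idea just described, using POSIX
sigprocmask; as the paragraph says, it narrows the window but cannot close it.)

#include <signal.h>

int maximum_blocking_signals(int a, int b) {
    sigset_t all, old;
    int result;

    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old);    /* block signals around the comparison */
    result = a > b ? a : b;
    sigprocmask(SIG_SETMASK, &old, NULL);  /* a handler can still run between here
                                              and the moment the value is returned */
    return result;
}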

Since the signal handler could do anything, including completely
replacing the text segments and/or loaded libraries of the code or
moving the PC to an arbitrary place, I don't think you can "fix" this
problem. Ever.

If you think this is a pedantic case which never happens in practice,
I'm the maintainer of a well-known user space checkpointing system
where these types of problems have to be thought about deeply because
they happen.

In addition, there are other modes of error injection: in compute
clusters with very high density memory that is not ECC, you can
actually calculate the probability that a bit will flip at an address
in memory due to cosmic rays. That probability is disturbingly high.

Just an idle search online produced this article:

http://news.cnet.com/8301-30685_3-10370026-264.html

which mentions some statistics. Think 1 billion hours is a lot and
"it'll never happen"?

There are 8760 hours in a year. So, you'd only need 114,156 computers
in a cluster running for one year before amassing 1 billion hours
of computation. That isn't a big number for large financial companies,
google, etc, etc, etc to own.

As a fun statistic, the BlueGene/P supercomputer can have 884,736
processors with associated memory modules. According to the math
in the article, one BlueGene/P should see a max of ~600,000 memory
errors per year.

Sure, you might not think any of this is a problem, because your
home desktop always produces the right answer when balancing your
checkbook, but it is a matter of perception of scale. Lots of large
clusters and data movement houses go to great lengths to ensure data
integrity. Injecting wrong data 4 hours into a 6 month long workflow
running on thousands of computers really upsets the hell out of people.

I've run into physicists who simply run their buggy software over
and over and over again on the same data and do statistical analysis
on the results. They've come to the realization that they can't
find/fix/verify all the bugs in their code, so they assume they are
there and write systems which try to be mathematically robust to the
nature of the beast. It is cheaper to wait for 1000 runs of a program
to be computed on a cluster than to spend human time debugging a
difficult bug in the code.

So, while mathematically maximum can't fail inside of itself,
realistically, while executing on a physical machine, you bet it'll fail. :)

-pete


Seebs

unread,
Sep 30, 2010, 11:54:48 AM9/30/10
to
On 2010-09-30, Lie Ryan <lie....@gmail.com> wrote:
> On 09/30/10 16:09, TheFlyingDutchman wrote:
>> Dynamic typed languages like Python fail in this case on "Never blows
>> up".

> How do you define "Never blows up"?

I would say "blow up" would be "raise an exception".

> Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
> up, and of the worst kind since it passes silently.

So run your compiler with a decent set of warning levels, and watch as
you are magically warned that you're passing an object of the wrong type.

On any given system, one or the other is true:

1. The constant 8589934592 is of type int, and the function will
"work" -- will give that result.
2. The constant is not of type int, and the compiler will warn you about
this if you ask.

RG

unread,
Sep 30, 2010, 12:32:08 PM9/30/10
to
In article <slrnia9cpd.2uq...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:

> On 2010-09-30, Lie Ryan <lie....@gmail.com> wrote:
> > On 09/30/10 16:09, TheFlyingDutchman wrote:
> >> Dynamic typed languages like Python fail in this case on "Never blows
> >> up".
>
> > How do you define "Never blows up"?
>
> I would say "blow up" would be "raise an exception".
>
> > Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
> > up, and of the worst kind since it passes silently.
>
> So run your compiler with a decent set of warning levels, and watch as
> you are magically warned that you're passing an object of the wrong type.

My code compiles with no warnings under gcc -Wall.

> On any given system, one or the other is true:
>
> 1. The constant 8589934592 is of type int, and the function will
> "work" -- will give that result.
> 2. The constant is not of type int, and the compiler will warn you about
> this if you ask.

It would be nice if this were true, but my example clearly demonstrates
that it is not. And if your response is to say that I should have used
lint, then my response to that will be that because of the halting
problem, for any static analyzer that you present, I can construct a
program that either contains an error that your analyzer will not
catch, or for which it will generate a false positive. It just so
happens that constructing such examples for standard C is very easy.

rg

RG

unread,
Sep 30, 2010, 12:36:22 PM9/30/10
to
In article <slrnia9dbo.2uq...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:

> On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> > You can't have it both ways. Either I am calling it incorrectly, in
> > which case I should get a compiler error,
>
> You get a warning if you ask for it. If you choose to run without all
> the type checking on, that's your problem.

My example compiles with no warnings under gcc -Wall.

Yes, I know I could have used lint. But that misses the point. For any
static analyzer, because of the halting problem, I can construct a
program that either contains an error that the analyzer will not catch,
or for which the analyzer will produce a false positive.

rg

Seebs

unread,
Sep 30, 2010, 12:49:16 PM9/30/10
to
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> My code compiles with no warnings under gcc -Wall.

That's nice. gcc -Wall uses only a small subset of warnings that fit
the usual expectations of C code that's trying to work on common
architectures.

>> 2. The constant is not of type int, and the compiler will warn you about
>> this if you ask.

> It would be nice if this were true, but my example clearly demonstrates
> that it is not.

No, it doesn't, because you didn't ask for the relevant kind of warnings.

> And if your response is to say that I should have used
> lint, then my response to that will be that because of the halting
> problem, for any static analyzer that you present, I can construct a
> program that either contains an error that either your analyzer will not
> catch, or for which it will generate a false positive. It just so
> happens that constructing such examples for standard C is very easy.

I'm not sure that that's actually a halting problem case. The thing about
static typing is that we don't actually HAVE to solve the halting problem;
we only have to look at the types of the components, all of which are knowable
at compile time, and we can tell you whether there are any unsafe conversions.

And that's the magic of static typing: It is not a false positive to
warn you that "2L" is not of type int. There are things which would be a
false positive in trying to determine whether something will be out of range
in a runtime expression, but which are not false positives in a statically
typed language.
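
(By way of illustration, one concrete way of "asking" with gcc is to enable
its conversion warnings; the -Wconversion flag exists in gcc, but the exact
diagnostic wording varies by version, so the comment below is a paraphrase,
not verbatim output.)

/* narrow.c -- the earlier example, reduced to its essentials */
#include <stdio.h>

int maximum(int a, int b) { return a > b ? a : b; }

int main(void) {
    long x = 8589934592;            /* wider than int on LP64 systems  */
    printf("%d\n", maximum(x, 1));  /* implicit long -> int conversion */
    return 0;
}

/* "gcc -Wall narrow.c" is typically silent about the narrowing, while
   "gcc -Wall -Wconversion narrow.c" is expected to warn that the
   conversion in the call to maximum() may change the value. */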

Scott L. Burson

unread,
Sep 30, 2010, 1:56:48 PM9/30/10
to
Ian Collins wrote:
> On 09/30/10 05:57 PM, RG wrote:
>>
>> I'm not saying one should not use compile-time tools, only that one
>> should not rely on them. "Compiling without errors" is not -- and
>> cannot ever be -- a synonym for "bug-free."
>
> Which is why we all have run time tools called unit tests, don't we?
>

My post that kicked off this thread was not cross-posted, so many of the
participants may not have seen it. Here it is again, for your convenience:

---------------------

This might have been mentioned here before, but I just came across it: a
2003 essay by Bruce Eckel on how reliable systems can get built in
dynamically-typed languages. It echoes things we've all said here, but
I think it's interesting because it describes a conversion experience:
Eckel started out in the strong-typing camp and was won over.

https://docs.google.com/View?id=dcsvntt2_25wpjvbbhk

-- Scott

RG

unread,
Sep 30, 2010, 2:01:29 PM9/30/10
to
In article <slrnia9fvi.307...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:

> And that's the magic of static typing: It is not a false positive to
> warn you that "2L" is not of type int.

We'll have to agree to disagree about that. The numerical value 2 can
safely be represented as an int, so I would consider this a false
positive.

rg

Scott L. Burson

unread,
Sep 30, 2010, 2:06:18 PM9/30/10
to
Pascal J. Bourguignon wrote:
> Squeamizh<squ...@hotmail.com> writes:
>
>> In short, static typing doesn't solve all conceivable problems.
>>
>> We are all aware that there is no perfect software development process
>> or tool set. I'm interested in minimizing the number of problems I
>> run into during development, and the number of bugs that are in the
>> finished product. My opinion is that static typed languages are
>> better at this for large projects, for the reasons I stated in my
>> previous post.

Here's a post I wrote earlier, before the conversation got cross-posted.
To me, this is the essence of the matter.

-----------------------

Norbert_Paul wrote:
>
> OK, but sometimes it is handy to have the possibility to make compile-time
> assertions which prevent you from committing easily avoidable simple
> mistakes.

Agreed. I actually don't see this issue in black and white terms; I've
written lots of Lisp, and I've written lots of code in statically typed
languages, and they all have advantages and disadvantages. In the end
it all comes back to my time: how much time does it take me to ship a
debugged system? Working in Lisp, sometimes I don't get immediate
feedback from the compiler that I've done something stupid, but this is
generally counterbalanced by the ease of interactive testing, which
frequently allows me to run a new piece of code several times in the
time it would have taken me to do a compile-and-link in, say, C++.

So while I agree with you that compiler warnings are sometimes handy,
and there are occasions, working in Lisp, that I would like to have more
of them(*), it really doesn't happen to me very often that the lack of
one is more than a minor problem.

(*) Lisp compilers generally do warn about some things, like passing the
wrong number of arguments to a function, or inconsistent spelling of the
name of a local variable. In my experience, these warnings cover a
substantial fraction of the stupid mistakes I actually make.

-- Scott

Keith Thompson

unread,
Sep 30, 2010, 2:09:44 PM9/30/10
to
Seebs <usenet...@seebs.net> writes:
> On 2010-09-30, Paul Rubin <no.e...@nospam.invalid> wrote:
>> int maximum(int a, int b);
>>
>> int foo() {
>> int (*barf)() = maximum;
>> return barf(3);
>> }
>
>> This compiles fine for me. Where is the cast?
>
> On the first line of code inside foo().

Look again; there's no cast in foo().

That first line declares barf as an object of type "pointer to
function returning int", or more precisely, "pointer to function with
an unspecified but fixed number and type of parameters returning int"
(i.e., an old-style non-prototype declaration, still legal but
deprecated in both C90 and C99). It then initializes it to point
to the "maximum" function. I *think* the types are sufficiently
"compatible" (not necessarily using that word the same way the
standard does) for the initialization to be valid and well defined.
I might check the standard later.

It would have been better to use a prototype (for those of you
in groups other than comp.lang.c, that's a function declaration that
specifies the types of any parameters):

int (*barf)(int, int) = maximum;

IMHO it's better to use prototypes consistently than to figure out the
rules for interactions between prototyped vs. non-prototyped function
declarations.
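
(A two-line sketch of the difference the prototype makes; with the prototyped
pointer below, a call such as barf(3) becomes a constraint violation that the
compiler is required to diagnose, instead of being accepted silently as in the
earlier example.)

int maximum(int a, int b);

int foo(void) {
    int (*barf)(int, int) = maximum;  /* prototyped pointer, as suggested above */
    return barf(3, 4);                /* fine; barf(3) would now be rejected at compile time */
}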

[...]

Paul Rubin

unread,
Sep 30, 2010, 2:11:45 PM9/30/10
to
RG <rNOS...@flownet.com> writes:
> Yes, I know I could have used lint. But that misses the point. For any
> static analyzer, because of the halting problem, I can construct a
> program that either contains an error that the analyzer will not catch,
> or for which the analyzer will produce a false positive.

Can you describe any plausible real-world programs where the effort of
complicated static analysis is justified, and for which the halting problem
gets in the way of analysis? By "real world", I mean I wouldn't consider
searching for counterexamples to the Collatz conjecture to qualify as
sufficiently real-world and sufficiently complex for fancy static
analysis. And even if it did, the static analyzer could deliver a
partial result, like "this function either returns a counterexample to
the Collatz conjecture or else it doesn't return".

D. Turner wrote a famous paper arguing something like the above, saying
basically that Turing completeness of programming languages is
overrated:

http://www.jucs.org/jucs_10_7/total_functional_programming

The main example of a sensible program that can't be written in a
non-complete language is an interpreter for a Turing-complete language.
But presumably a high-assurance application should never contain such a
thing, since the interpreted programs themselves then wouldn't have
static assurance.

Pascal J. Bourguignon

unread,
Sep 30, 2010, 2:21:30 PM9/30/10
to
RG <rNOS...@flownet.com> writes:

> One might hypothesize that the best of both worlds would be a dynamic
> language with a static analyzer layered on top. Such a thing does not
> exist. It makes an instructive exercise to try to figure out why. (For
> the record, I don't know the answer, but I've learned a lot through the
> process of pondering this conundrum.)

There are static analysis tools for Common Lisp:
http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/tools/lint/0.html

or lisp in general. For example PHENARETE is in the category of static
analysis tools.

One could regret that they're not more developed, but I guess this only
proves the success of using dynamic programming languages: if there were
a real need for these tools, along with a good ROI, they would be more
developed. In the meantime, several test frameworks are developed.


--
__Pascal Bourguignon__ http://www.informatimago.com/

Pascal J. Bourguignon

unread,
Sep 30, 2010, 2:23:50 PM9/30/10
to
RG <rNOS...@flownet.com> writes:

> In article <slrnia9dbo.2uq...@guild.seebs.net>,
> Seebs <usenet...@seebs.net> wrote:
>
>> On 2010-09-30, RG <rNOS...@flownet.com> wrote:
>> > You can't have it both ways. Either I am calling it incorrectly, in
>> > which case I should get a compiler error,
>>
>> You get a warning if you ask for it. If you choose to run without all
>> the type checking on, that's your problem.
>
> My example compiles with no warnings under gcc -Wall.

IIRC, -Wall is not really ALL.

Try with: gcc -Wall -Wextra -Werror

I would still argue that should be the default, and if there really was
a need, there could be options to disable some warnings, or to have some
errors be warnings...

Keith Thompson

unread,
Sep 30, 2010, 2:25:03 PM9/30/10
to
RG <rNOS...@flownet.com> writes:
> In article <lniq1n4...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>> RG <rNOS...@flownet.com> writes:
>> [...]
>> > You can't have it both ways. Either I am calling it incorrectly, in
>> > which case I should get a compiler error, or I am calling it correctly,
>> > and I should get the right answer. That I got neither does in fact
>> > falsify the claim. The only way out of this is to say that
>> > maximum(8589934592, 1) returning 1 is in fact "correct", in which case
>> > we'll just have to agree to disagree.
>>
>> You are calling maximum() incorrectly, but you are doing so in a way
>> that the compiler is not required to diagnose.
>
> Yes. I know. That was my whole point. There are ways to call a
> function incorrectly (more broadly, there are errors in code) that a C
> compiler is not required to diagnose.

Of course.

>> If you want to say that the fact that the compiler is not required
>> to diagnose the error is a flaw in the C language, I won't
>> argue with you.
>
> I'm not even saying it's a flaw in the language. All I'm saying is that
> the original claim -- that any error in a C program will be caught by
> the compiler -- is false, and more specifically, that it can be
> demonstrated to be false without appeal to unknown run-time input.

Did someone *really* claim that "any error in a C program will
be caught by the compiler"? If so, I must have missed that.
It's certainly not true; code that compiles cleanly can be riddled
with errors. That's true in any language, but more so in C than
in some others.

> As an aside, this particular error *could* be caught (and in fact would
> be caught by other tools like lint), but there are errors that cannot
> be caught by any static analysis, and therefore one should not be
> lulled into a false sense of security by the fact that your code is
> written in a statically typed language and compiled without errors or
> warnings. That's all.

I don't believe anyone has said otherwise.

>> If I write:
>>
>> const double pi = 22.0/7.0;
>> printf("pi = %f\n", pi);
>>
>> then I suppose I'm calling printf() incorrectly, but I wouldn't
>> expect my compiler to warn me about it.
>>
>> If you're arguing that
>>
>> int maximum(int a, int b) { return a > b ? a : b; }
>>
>> is flawed because it's too easy to call it incorrectly, you're
>> effectively arguing that it's not possible to write correct
>> code in C at all.
>
> I would say that it is very, very hard to write correct code in C for
> any non-vacuous definition of "correct". That is the reason that core
> dumps and buffer overflows are so ubiquitous. I prefer Lisp or Python,
> where core dumps and buffer overflows are virtually nonexistent. One
> does get the occasional run-time error that might have been caught at
> compile time, but I much prefer that to a core dump or a security hole.

I would say that it can certainly be difficult to write correct
code in C, but I don't believe it's nearly as hard as you think
it is. It requires more discipline than some other languages,
and it can require some detailed knowledge of the language itself,
particularly what it defines and what it doesn't. And it's not
always worth the effort if another language can do the job as well
or better.

> One might hypothesize that the best of both worlds would be a dynamic
> language with a static analyzer layered on top. Such a thing does not
> exist. It makes for an instructive exercise to try to figure out why. (For
> the record, I don't know the answer, but I've learned a lot through the
> process of pondering this conundrum.)

--

Pascal J. Bourguignon

unread,
Sep 30, 2010, 2:37:22 PM9/30/10
to
TheFlyingDutchman <zzbb...@aol.com> writes:

>>
>> > "in C I can have a function maximum(int a, int b) that will always
>> > work. Never blow up, and never give an invalid answer. "
>>

>> > Dynamic typed languages like Python fail in this case on "Never blows
>> > up".
>>
>> How do you define "Never blows up"?
>

> Never has execution halt.
>
> I think a key reason in the big rise in the popularity of interpreted
> languages

This is a misconception.

Whether a program is executed by a processor for that programming
language, or by a processor for some other programming language (which
therefore requires a translation phase), is NOT a characteristic of the
programming language, but only of the execution environment.


1- There are C interpreters
CINT - http://root.cern.ch/root/Cint.html
EiC - http://eic.sourceforge.net/
Ch - http://www.softintegration.com

2- All the current Common Lisp implementations have compilers.

3- Most current Common Lisp implementations actually compile to native
code, i.e. they choose to translate to the programming languages
implemented by Intel, AMD or Motorola. (Notice that these programming
languages are NOT implemented in hardware, but in software, called
micro-code, stored on the real hardware inside the micro-processors.)
Some choose instead to translate to C and call an external C compiler
to eventually produce "native" code.

4- Actually, there is NO current Common Lisp implementation having only
an interpreter. On the contrary, most of them don't have any
interpreter at all (but all of them have a REPL, which is an orthogonal
concept).

5- Even the first LISP implementation made in 1959 had a compiler.

6- I know the situation less well for the other dynamic programming
languages, but, for example, even if CPython weren't a compiler, you
should know that CLPython is one: an implementation of Python written in
Common Lisp, which translates Python into Common Lisp and compiles it.


> is that when execution halts, they normally give a call
> stack and usually a good reason for why things couldn't continue. As
> opposed to compiled languages which present you with a blank screen
> and force you to - fire up a debugger, or much much worse, look at a
> core dump - to try and discern all the information the interpreter
> presents to you immediately.

Theoretically, a compiler for a static programming language has even
more information about the program, so it should be able to produce an
even better backtrace when a problem occurs...
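
And in practice much of that information is available on request; a
rough sketch of the usual workflow (assuming gcc and gdb):

  /* crash.c -- dereference a null pointer to force a fault. */
  #include <stdio.h>

  static int deref(int *p) { return *p; }

  int main(void) {
      printf("%d\n", deref(0));
      return 0;
  }

  /* Compiled with debugging information, the "blank screen" is
   * avoidable:
   *   gcc -g -O0 crash.c -o crash
   *   gdb ./crash
   *   (gdb) run    -- stops at the segmentation fault
   *   (gdb) bt     -- prints a call stack, much as an interpreter does
   */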

Seebs

unread,
Sep 30, 2010, 3:17:39 PM9/30/10
to
On 2010-09-30, RG <rNOS...@flownet.com> wrote:
> In article <slrnia9fvi.307...@guild.seebs.net>,
> Seebs <usenet...@seebs.net> wrote:
>> And that's the magic of static typing: It is not a false positive to
>> warn you that "2L" is not of type int.

> We'll have to agree to disagree about that.

No, we won't. It's the *definition* of static typing. Static typing
is there to give you some guarantees at the expense of not being able
to express some things without special extra effort. That's why it's
static.

> The numerical value 2 can
> safely be represented as an int, so I would consider this a false
> positive.

That's nice for you, I guess.

The point of static typing is that it makes it possible to ensure that
the values that reach a function are in fact of the correct type -- at
the cost of not being able to rely on free runtime conversions.

If you want to write safe conversions, you can do that. If you don't
bother to do that, you end up with errors -- by definition.
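
For instance, a checked conversion in C is easy enough to write by hand
(a sketch; checked_int() is a made-up helper, not anything standard):

  #include <limits.h>
  #include <stdio.h>
  #include <stdlib.h>

  int maximum(int a, int b) { return a > b ? a : b; }

  /* Hypothetical helper: refuse long values that don't fit in an int,
     rather than letting the implementation silently truncate them. */
  static int checked_int(long v) {
      if (v < INT_MIN || v > INT_MAX) {
          fprintf(stderr, "%ld does not fit in an int\n", v);
          exit(EXIT_FAILURE);
      }
      return (int)v;
  }

  int main(void) {
      long big = 8589934592;   /* 2^33 (assumes 64-bit long) */
      printf("%d\n", maximum(checked_int(big), 1));  /* aborts loudly */
      return 0;
  }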
