On 2014-05-01, Dmitry A. Kazakov <
mai...@dmitry-kazakov.de> wrote:
> On Wed, 30 Apr 2014 15:47:10 +0000 (UTC), Kaz Kylheku wrote:
>
>> If there is any suspicion that the variable is uninitialized, the case
>> can be made that it needs to be fixed. If a machine finds it hard to tell
>> whether the variable is uninitialized, a human being will also find it hard to
>> tell.
>
> All this is not needed if the language were properly designed. E.g. using
> the principle of safest choice (by default, initialized or not?), require
> everything explicitly initialized without default initialization defined
Before C99, C required "declarations before statements" (like "Wirthian" languages).
If declarations cannot be mixed in among the statements, you sometimes have to
write code like this:

  int var;
  lock(&obj->lock_var);
  var = obj->memb;
I wrote something very similar to this just yesterday, inside
a USB driver.
If var had to be initialized, the compiler would still have to do all
the same analysis in order to optimize that away; it would have to
realize that the initial value has no "next use" in the flow graph.
This particular issue goes away if we mix variable definitions with statements:

  lock(&obj->lock_var);
  int var = obj->memb;
This is supported in C starting with C99, and has been in C++ for a long
time (precisely because C++ objects with constructors do not support
this two-step initialization).
>> Static checking is bane of programming. Ideally, everything should be dynamic,
>> especially type and dispatch. Binding should always be as late as possible.
>
> On the contrary. Late binding convert bugs into faults, e.g. type error
> into constraint violation/message-not-understood etc. This is intolerable
> from the software engineering POV.
If you don't formalize late binding into the language, programmers
will reinvent it anyway, in an ad-hoc way.
Dynamic languages are well known for their productivity and robust
applications.
Whenever something falls down in computing, more often than not, some
statically typed trash is at the root cause.
> From the CS POV, there no dynamic languages, only poor static languages.
Complete nonsense. Languages are dynamic by default. All the pseudo-code
in CS textbooks and papers is essentially dynamic.
Static analysis is just a way of classifying a certain subset of dynamic
programs: those which have the property that a type can be inferred for
every node in their graph (possibly after expanding that graph in the
face of "static polymorphism"), so that no expression has more than one type in
different run-time situations.
(Languages with explicit declarations for determining what type flows
into every node in the graph are obsolete dinosaurs; the state of the
art in static typing is inference.)
> You cannot have everything dynamic, you must stop the indirection
> somewhere.
Reductio ad 1 et 0.
Dynamic languages are supported by a kernel of run-time code that isn't
itself dynamic, and also by compiler generated code that isn't dynamic
either. It just "knows" that, say, two certain bits of a value are a
type tag: no other meta-tag tells it that.
Ultimately, bits are statically typed; we know that a bit cell has type "bit"
and only contains a 1 or 0; there is no need to check some other information to
verify.
> If you put all objects that accept a given operation together,
> that would constitute a perfectly static type.
Sure, a meaningless static type like "Object" which need not be mentioned in
the language.
>> Write in a dynamic language, and leave C for the device drivers, codecs,
>> OS schedulers, etc.
>
> Which implies that dynamic languages are incomplete. Why should anybody use
> such a language. Wouldn't it better to design and use a decent language
> instead?
They aren't. For instance, Lisp machines were written entirely in Lisp,
interrupt-driven device drivers and all. There are nice implementations
of Lisp with good compilers. Whereas, say, Python or Ruby users call
C libraries for things like regex, crypto or image processing, Lisp
programmers use libraries written in Lisp for this, because the performance
is decent.
People don't write device drivers in Lisp nowadays though because platforms
dictate the methodology for those (so it is usually done in C).
Millions of programmers choose "incomplete" languages that don't even have
compilers, because they find them to be the right tool for the right job. They
have some ready-made framework, library and possibly "recipes" on how to
structure the major pieces together.
They aren't writing a device driver, so they don't care that they cannot.
However, a dynamic language can certainly be the "be all" system language
from the bootstrap loader on up.
>>>>> 4. Pointer arithmetic is a low level hack which should not be there,
>>>>> because it does not make any sense for variable sized targets or advanced
>>>>> structures like graphs and trees.
>>>>
>>>> Pointer arithmetic is necessary for managing variable-sized targets,
>>>> like records that have multiple variant fields, all in the same linear
>>>> memory block. These could be stuck into a graph structure.
>>>
>>> No. If you need a pointer to the next element you don't have to compute it.
>>> You just store it. Otherwise, see the position #1.
>>
>> But then you're dictating to me the memory layout.
>
> Why? A pointer can point anywhere.
You're dictating that the pointer is *there* in the layout, rather than just
the next element itself (whereby we calculate the address of the element
ourselves, outside of the structure). That embedded pointer isn't even
possible if the layout comes from a memory mapping that can be placed at any
address, and shared by processes.
>> Secondly, what if this is a third-party memory-mapped file where the layout is
>> imposed?
>
> You will need de-/serialize objects as with any other file.
Screw that; I want memory mapped!
> Memory-mapped files is a hack,
That is neither here nor there; they exist, developers sometimes want them.
They have measurable, demonstrable benefits. (As well as shortcomings.)
>>> Well, the balance between the cost of a CPU cycle vs. memory byte has
>>> shifted towards CPU. It is difficult to say whether that will stay so. When
>>> massively parallel architectures will arrive, the model of shared memory
>>> will inevitable die. There is no way to share memory between thousands of
>>> cores. This might shift the balance back, away from flat, isotropic memory
>>> models...
>>
>> There is no way to share memory between millions of machines on an internet,
>> yet they all use a TCP/IP stack written in C.
>
> So what? You are again confusing completeness in narrow Turing sense with
> language design.
I am not confusing anything. I know all about Turing equivalence not
being expressivity equivalence. (And have had to hit numerous people on
Usenet over the head with that over twenty years.)
I didn't claim that C is an expressive language for writing an application
for a thousand cores that do not share memory well.
However, I think C will end up used for programming parts of that system,
and perhaps even parts of the run-time support for whatever language is used
at the high level for harnessing those cores.
That machine will still need to do mundane things like boot up, or
push packets to and from some ethernet controller with DMA and interrupts,
etc.
C you there! :)
> We are talking about structuring the address space. The
> address space of Internet is *not* flat and *not* isotropic. q.e.d.
But you can get 64 kilobyte pieces of it, thanks to a popular C library! :)
>> A massively parallel machine with many cores will still have local memories,
>> with some software that uses them conventionally.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
See, some software: not all.
> Yes, but the programmer of such systems will care about local memory barely
> more than he presently cares about L2-cache.
The programmers of some kinds of applications already don't care about this
on boxes with just a handful of cores (perhaps one).
>>>> What would constitute sufficient difficulty in the integer-pointer conversion?
>>>> Sacrificing a goat for each instance?
>>>
>>> Making meaningless conversions unnecessary?
>>
>> So what then implements the requirements served by those conversions?
>> Converting a number to a pointer is something which is used because it is needed.
>
> It is never needed, because there is no proper method of obtaining such
> numbers in first place.
Yes there is; system documentation, data sheets.
> If the method exist, it is pointer-to-number and
> back to pointer conversion. Which raises immediate question why number was
> needed as a middleman?
Simply for the pragmatic reason that addresses on digital, binary computers are
numbers.
This can be abstracted away for objects that need not have a concrete,
particular address.
> All cases where numbers are converted to pointers are at some I/O boundary
> and even then there always exist better ways.
No, not all cases have to do with I/O.
Sometimes C programs use "spare bits" in pointers for additional information.
Some concurrent algorithms rely on this (like lock-free linked lists), and also
implementations of dynamic languages that are written in C.
In the implementation of the TXR language, every value is a C pointer
of type val, which is a typedef for obj_t *.
Pointers are relied upon to have an alignment of at least four bytes,
which leaves the least significant two bits open for use as a type tag.
So "fixnum" integers can be represented directly inside the pointer
value, as well as characters.
The tag is chosen such that the value 0 (bits 00) denotes a heaped object.
That means that the pointer can be dereferenced directly.
>>>>> 7. Each object in C is assumed to be addressable, i.e. the &-operation is
>>>>> defined on each object.
>>>>
>>>> Variables declared "register" cannot have this operation applied to them.
>>>>
>>>> But even without this, the analysis is very easy for determining whether or not
>>>> a declared object is subject to the & operator over its scope.
>>>
>>> So, the program semantics and legality may depend on the outcome of
>>> register optimization?
>>
>> The register storage class means that taking the address of the object is a
>> diagnosable constraint violation.
>
> So must have been the class of addressable objects. If the programmer's
> intent is to access the object via pointer, he should declare this in
> advance. If he mistakenly takes a pointer to an object that is not intended
> for aliasing, this is a serious bug, and must be flagged as such.
The Wirthian languages commit a serious offense in this regard compared to C.
In Pascal and its ilk, you can change a procedure's conventional by-value
argument to a VAR argument. When the program is recompiled, calls to that
procedure now silently "steal" references to variables and may modify them.
C++ also has this in the form of references.
In C we at least have a loud and clear syntax that the address is being
taken: the & operator (though it can be hidden behind a macro).
Having to explicitly declare a permission for an address to be taken isn't such
a bad idea, but it can have downsides. For instance, your program could really
benefit from taking the address of a variable in some third-party library.
Oops, it is not declared for that!
In C we have the const qualifier whose behavior is such that the address of a
const qualified object is a pointer to a const-qualified type. When this
pointer is dereferenced, the resulting lvalue cannot be assigned; it is not
modifiable. This provides a reasonable middle ground: allow the address, but
one that gives rise to only read-only aliases.
> Bugs
> related to aliasing are extremely hard to debug. The principle of safest
> choice.
If that's the best way to write the program, then it is no harder to debug
than it has to be.