On 8/30/2020 8:24 PM, Ben Bacarisse wrote:
>>> Kaz Kylheku <793-84...@kylheku.com> writes:
>>>
>>>> On 2020-08-30, olcott <No...@NoWhere.com> wrote:
>>>>> The end goal of this sequence of posts is to show that the basic "C"
>>>>> programming language (without the "C" libraries) can be fully mapped to
>>>>> an abstract model of computation that is equivalent to a Turing machine
>>>>> in such a way that any Turing complete computation can be written in the
>>>>> "C" programming language.
>>>>
>>>> Turing's infinite tape can be mapped to the <stdio.h> facility.
>>>> The possible symbols on that tape can be mapped to characters,
>>>> and the operations and positioning can be mapped to the available
>>>> library operations.
>>>
>>> I think (but it's hard to get answers) that Peter Olcott is only
>>> considering what we'd call freestanding implementations here.
>>>
>>> But, worse, he's not interested in simply showing that hosted C is
>>> Turing complete by simulating a universal TM. I think he aims to show
>>> that "ordinary C" can be made Turing complete just by implementing it on
>>> a Turing complete sub-system.
>>>
>>> (BTW, even an IO-based UTM needs a bit of checking. Functions like
>>> fgetpos need to be allowed to fail. I think the specification is plenty
>>> loose enough for that, since it says almost nothing about what failures
>>> are permissible.)
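
For concreteness, a minimal sketch of that checking (the helper name
save_head is mine, just for illustration):

  #include <stdio.h>

  /* fgetpos is allowed to fail and reports it via its return value;
     a conforming UTM built on stdio has to cope with that. */
  int save_head(FILE *tape, fpos_t *where)
  {
      if (fgetpos(tape, where) != 0) {
          perror("fgetpos");
          return -1;
      }
      return 0;
  }
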
>>>
>>>>> When "C" is mapped to an abstract model of computation that can provide
>>>>> an arbitrary number of arbitrary length linked lists, then "C" acquires
>>>>
>>>> In the absence of I/O, the C storage model is finite, and therefore not
>>>> Turing complete. However, perhaps that's not quite right.
>>>>
>>>> I had a discussion about this with someone who convinced me of the
>>>> following: in C we can allocate automatic storage (informally,
>>>> "local variables on the stack") via recursion. If we do this without
>>>> taking the address of anything, then pointers do not appear in
>>>> the program, and thus, in the abstract sense, we don't run into the
>>>> issue of pointers being of finite width. With that restriction, we have
>>>> no random access (the program cannot access material in existing
>>>> activation frames without returning to them). Therefore what we have is
>>>> a push-down automaton (PDA). A PDA is strictly less powerful than a
>>>> Turing machine. It contains an infinity in that it can push down as far
>>>> as it needs to, but there are some computations that can't be carried
>>>> out even by a universal push-down machine.
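
To make that restriction concrete, here is a minimal sketch: a
recognizer for balanced parentheses (a classic PDA-recognizable
language) whose only unbounded storage is recursion depth, and which
never takes an address:

  #include <stdio.h>

  static int next;                  /* one-symbol lookahead */

  /* Grammar: S -> '(' S ')' S | empty. Each '(' costs one activation
     frame, so recursion depth plays the role of the PDA's stack. */
  static int parse_s(void)
  {
      while (next == '(') {
          next = getchar();
          if (!parse_s())
              return 0;
          if (next != ')')
              return 0;
          next = getchar();
      }
      return 1;
  }

  int main(void)
  {
      next = getchar();
      if (parse_s() && (next == '\n' || next == EOF))
          puts("balanced");
      else
          puts("not balanced");
      return 0;
  }
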
>>>
>>> Hmm... Modern C has threads and therefore more than one stack. A PDA
>>> with two stacks is Turing complete, so there may be a way after all.
>>> Nice one!
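
The classical construction behind that remark: two stacks can simulate
a tape, one holding everything left of the head and one holding the
head cell and everything to its right. A minimal sketch, with
fixed-size arrays standing in for the two unbounded (thread) stacks:

  #include <stdio.h>

  #define N 1024                       /* illustration only */
  static char lstk[N], rstk[N];        /* left of head / head and right */
  static int ln, rn;                   /* items on each stack */

  static char head(void)  { return rn ? rstk[rn - 1] : '_'; }  /* '_' = blank */
  static void put(char c) { if (rn) rstk[rn - 1] = c; else rstk[rn++] = c; }
  static void right(void) { lstk[ln++] = head(); if (rn) rn--; }
  static void left(void)  { rstk[rn++] = ln ? lstk[--ln] : '_'; }

  int main(void)
  {
      put('a'); right(); put('b'); left();
      printf("%c\n", head());          /* prints 'a' */
      return 0;
  }
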
>>
>> Isn't it the case, though, that
>>
>> - In any C implementation (or C++) the size of a pointer is fixed
>>
>> - Every distinct object (class/struct/variable/etc...) must have
>> its own unique address (that fits the pointer size)
>> ?
>
> Yes, indeed. But I suspect the argument proposed to Kaz relies on what
> is permitted when no addresses are used. This imaginary implementation
> would collapse in a heap (no pun intended) if any use were ever made of
> an address. The question being, is a C implementation obliged to
> restrict the set of objects if the program takes no note of any pointers
> and could therefore not tell?
>
> Going the other way, it is widely considered acceptable for an
> implementation to give distinct objects, with overlapping lifetimes, the
> same address, provided the program's observable behaviour is not
> affected.
>
> I'm not really inclined to "go to court" (either way) over this one!
> The rules are not precise and the question is of no practical or
> theoretical importance. But I think it's fun to think about...
>
>> If so, any C implementation can only have finitely many distinct
>> addressable objects, and so cannot fully implement a TM tape, and can
>> only in principle run a subset of TM calculations
>>
>> Of course, any C implementation can implement a partial TM emulator,
>> and this might have the capacity to run interesting TM calculations,
>> but it seems to me not all TM calculations.
>>
>> Sure, if we consider a sequence of C implementations with ever larger
>> address pointer sizes, then given any /terminating/ TM calculation,
>> eventually one of them will be capable of running it. But some
>> non-terminating TM calculations cannot run on a finite tape, so we're
>> sort of losing this theoretical battle.
>
> Usually (I thought) the relevant definitions are in terms of
> terminating computations. Failing on the ones that don't terminate is
> no failure of completeness. Mind you, I'd need to re-read someone like
> Martin on this because I don't recall the technical definition.
>
>> Now if we could change the C spec to allow variable-length pointers,
>> then we would have a theoretically infinite pool of addressable
>> objects, so suddenly a full TM emulator becomes easy in terms of
>> theoretical program design, but obviously not when it comes to running
>> on a given physical implementation, whose memory will only be
>> finite. (But this probably doesn't matter for PO, who knows?
>> Certainly not PO, who is quite lost in all this...)
>
> That's one way, but it would need quite a few detailed changes in the
> language. Another, rather arbitrary, way would be to allow at least two
> unbounded integers.
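
That apparently arbitrary choice works because a two-counter (Minsky)
machine can simulate any Turing machine, given a suitable input
encoding. A sketch of such a machine's interpreter, with unsigned long
long standing in for the genuinely unbounded integers that C lacks:

  #include <stdio.h>

  enum op { INC, DECJZ, HALT };
  struct insn { enum op op; int ctr; int target; };

  /* INC:   increment c[ctr], then go to target.
     DECJZ: if c[ctr] is zero go to target, else decrement it and
            fall through to the next instruction. */
  static void run(const struct insn *prog)
  {
      unsigned long long c[2] = { 0, 0 };  /* the two counters */
      for (int pc = 0; ; ) {
          const struct insn *i = &prog[pc];
          if (i->op == INC) {
              c[i->ctr]++;
              pc = i->target;
          } else if (i->op == DECJZ) {
              if (c[i->ctr] == 0) pc = i->target;
              else { c[i->ctr]--; pc++; }
          } else {
              printf("c0=%llu c1=%llu\n", c[0], c[1]);
              return;
          }
      }
  }

  int main(void)
  {
      /* Set c0 = 3, then move it into c1; prints "c0=0 c1=3". */
      static const struct insn prog[] = {
          { INC,   0, 1 }, { INC, 0, 2 }, { INC, 0, 3 },
          { DECJZ, 0, 5 },               /* loop: done when c0 == 0 */
          { INC,   1, 3 },               /* c1++, back to the test */
          { HALT,  0, 0 },
      };
      run(prog);
      return 0;
  }
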
>
>> Or if we can use IO facilities then a potentially infinite virtual
>> memory results - e.g. parts of the tape could be backed in a
> cloud-based system addressed through arbitrarily long URLs. (Ok, maybe
> URLs have a max length, but we can envisage some new storage device,
> addressed through the IO subsystem, not having this restriction.)
>
> Though that steps outside of standard C. The "standard" way would be to
> use tmpfile, fseek(_, +1/-1, SEEK_CUR) and fread and fwrite to get a
> one-ended unbounded tape.
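
A sketch of that tape, which also realizes Kaz's mapping above (helper
names are mine; tmpfile gives a binary update stream, so every switch
between reading and writing conveniently goes through fseek):

  #include <stdio.h>
  #include <stdlib.h>

  static FILE *tape;

  static void move(long delta)           /* delta is +1 or -1 */
  {
      if (fseek(tape, delta, SEEK_CUR) != 0) {
          perror("fseek");               /* e.g. ran off the left end */
          exit(EXIT_FAILURE);
      }
  }

  static int read_cell(void)             /* read; head stays put */
  {
      int c = fgetc(tape);
      if (c == EOF)
          return '_';                    /* blank beyond the written end */
      move(-1);
      return c;
  }

  static void write_cell(int c)          /* write; head stays put */
  {
      if (fputc(c, tape) == EOF) {
          perror("fputc");
          exit(EXIT_FAILURE);
      }
      move(-1);
  }

  int main(void)
  {
      tape = tmpfile();
      if (!tape) { perror("tmpfile"); return EXIT_FAILURE; }
      write_cell('1'); move(+1); write_cell('0'); move(-1);
      printf("%c\n", read_cell());       /* prints '1' */
      fclose(tape);
      return 0;
  }
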
>
>>> Often, things like IO and threads (unless wired into the language
>>> syntax) are informally excluded when asking if language X is Turing
>>> complete, but I don't think the rules of this particular game are clear.
>>>
>>> There is, though, another way to argue about C. Implement your UTM in
>>> our all-too-finite C as, say, compiled by gcc. If a computation fails
>>> for lack of space, just get the gcc people to double the size of all the
>>> pointers and try again. Repeat as necessary, using pointers simulated
>>> in software. Each iteration can conform to the C standard, and since
>>> you only need to be able to run terminating computations to be Turing
>>> complete, one can show that there is (in theory) a C implementation that
>>> can run any given computation.
>>
>> Yes, for a given terminating calculation. But to do this we just
>> declare a big enough array to run the calculation, and run it.
>
> Yes, you can swap larger integer indexes for larger pointers. The two
> views are interchangeable. (Or have I missed the point you are
> making?).
>
If we had pointer names that were handled under-the-covers by a Turing
equivalent process, then we could still have the essence of "C" on an
abstract machine with unlimited memory. This is not much more than a
paraphrase of what Kaz already said. The difference is that we have
switched the context from the stack to the heap.
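
A minimal sketch of that idea: list links are integer handles into a
growable node table rather than machine addresses, so the
under-the-covers process is free to widen the handle type (ultimately
to an unbounded integer type, which standard C does not provide):

  #include <stdio.h>
  #include <stdlib.h>

  /* "next" is a handle, not an address; handle 0 is reserved as nil. */
  struct node { int value; size_t next; };

  static struct node *table;
  static size_t used = 1, cap;           /* slot 0 stays unused (nil) */

  static size_t node_new(int value, size_t next)
  {
      if (used >= cap) {
          cap = cap ? cap * 2 : 16;
          table = realloc(table, cap * sizeof *table);
          if (!table) exit(EXIT_FAILURE);
      }
      table[used].value = value;
      table[used].next  = next;
      return used++;
  }

  int main(void)
  {
      size_t list = 0;                   /* nil */
      for (int i = 3; i >= 1; i--)
          list = node_new(i, list);      /* push 3, 2, 1 */
      for (size_t h = list; h != 0; h = table[h].next)
          printf("%d ", table[h].value); /* prints: 1 2 3 */
      putchar('\n');
      free(table);
      return 0;
  }
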
I think that I just figured out how to make at least one linked list of
truly unlimited length. It depends on unlimited-length integers. To
explain this system I will use 1024-bit addressing that could be scaled
upward at any time.
2^1024 (about 1.797693134862315907729305190789e+308) bytes of RAM:
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216
Here is the maximum address value in hex:
0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
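
For illustration, such an address can be modelled in C today as a byte
array; scaling the scheme upward is just a matter of enlarging
ADDR_BYTES (the type and helper below are mine, not part of any real
system):

  #include <stdio.h>
  #include <stdint.h>

  #define ADDR_BYTES 128                 /* 128 * 8 = 1024 bits */

  typedef struct { uint8_t b[ADDR_BYTES]; } addr1024;  /* little-endian */

  /* Increment with carry; enlarging ADDR_BYTES widens every address. */
  static void addr_inc(addr1024 *a)
  {
      for (size_t i = 0; i < ADDR_BYTES; i++)
          if (++a->b[i] != 0)            /* stop once there is no carry */
              break;
  }

  int main(void)
  {
      addr1024 a = { {0} };
      for (int i = 0; i < 260; i++)
          addr_inc(&a);
      printf("%u %u\n", a.b[0], a.b[1]); /* prints: 4 1  (260 = 1*256 + 4) */
      return 0;
  }
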
Implementing such a system on actual hardware would require some sort
of relative addressing. Using every single atom in the whole universe
to encode one binary digit each would still provide a woefully
deficient amount of RAM for 1024-bit addressing: 10^82 is roughly
2^272, so the 2^1024-byte address space exceeds it by a factor of
about 2^752.
...it is estimated that there are between 10^78 and 10^82 atoms in
the known, observable universe...
https://www.universetoday.com/36302/atoms-in-the-universe/#
The key thing about this memory system is that, because it has no
theoretical limit to its scalability, every Turing equivalent machine
tied to this abstract memory system that can be implemented in actual
hardware would be computationally identical to its abstract-machine
version until the concrete machine reached its memory limit.
This provides a way to empirically test many theory-of-computation
problems.