in literature you often find the statement that (concrete) program
semantics are in general not computable. If you for example consider so-
called "sticky collecting semantics", you want to compute different
properties of the program at particular program points. Let's say, I
would like to know which values all program variables might have at
program point P0. The naive approach would be to run the program with all
possible combinations of program input parameters. Of course, this would
take quite long, but the semantics remains computable.
The only issue I could imagine is that you get trouble with non-
terminating programs. If you can't guarantee that the input program
terminates, the semantics cannot be computed. Is this possibly the
reason why "program semantics is not computable in general"?
On the other hand, if the set of programs under analysis is restricted
to terminating programs, will the concrete program semantics be always
computable?
Best,
Tim
> Hi,
>
> in literature you often find the statement that (concrete) program
> semantics are in general not computable. If you for example consider so-
> called "sticky collecting semantics", you want to compute different
> properties of the program at particular program points. Let's say, I
> would like to know which values all program variables might have at
> program point P0. The naive approach would be to run the program with all
> possible combinations of program input parameters.
> Of course, this would
> take quite long, but the semantics remains computable.
(1) Many algorithms are specified as having unbounded input domains,
e.g. any integer (however large). You can't even in principle test
against all inputs.
(2) Even if you restrict the input domains to be bounded (e.g. 64-bit
integers), then with a relatively small number of inputs you have such a
vast combinatorial explosion that in practice there is not enough time
left in the universe to be able to test all combinations.
> The only issue I could imagine is that you get trouble with non-
> terminating programs. If you can't guarantee that the input program
> terminates, the semantics cannot be computed. Is this possibly the
> reason why "program semantics is not computable in general"?
It's one reason.
> On the other hand, if the set of programs under analysis is restricted
> to terminating programs, will the concrete program semantics be always
> computable?
Presumably you mean "provably terminating programs". Even then, the
answer remains no.
> Best,
> Tim
--
---------------------------
| BBB b \ Barbara at LivingHistory stop co stop uk
| B B aa rrr b |
| BBB a a r bbb | Quidquid latine dictum sit,
| B B a a r b b | altum viditur.
| BBB aa a r bbb |
-----------------------------
> Hi,
>
> in literature you often find the statement that (concrete) program
> semantics are in general not computable. If you for example consider so-
> called "sticky collecting semantics", you want to compute different
> properties of the program at particular program points. Let's say, I
> would like to know which values all program variables might have at
> program point P0. The naive approach would be to run the program with all
> possible combinations of program input parameters. Of course, this would
> take quite long, but the semantics remains computable.
>
> The only issue I could imagine is that you get trouble with non-
> terminating programs. If you can't guarantee that the input program
> terminates, the semantics cannot be computed. Is this possibly the
> reason why "program semantics is not computable in general"?
Well, basically if you want to know the value of variables at the end of a
program (knowing the value at beginning), this is an extensional property
of programs and falls under Rice's theorem.
Yes, the fact that the program you're analysing may not terminate is the
reason why any "test-based" approach will fail.
> On the other hand, if the set of programs under analysis is restricted
> to terminating programs, will the concrete program semantics be always
> computable?
Consider the following program P_q (for a given program q):
On input x (an integer), simulate program q (on input 0) for x "steps"
(seconds, TM step, beta-reduction, whatever you want).
If q(0) ends in x steps or less, then return 1 otherwise return 0.
Now, let's analyse P_q. First, let's remark that P_q(x) always terminates.
Indeed, the simulation of q is only done for a finite amount of time so
P_q(x) will terminate at some point. It is even pretty "fast" as the time
needed to compute P_q(x) is roughly proportional to x (plus some
simulation overload you may have).
Next, have a look at what it computes. Obviously, there are two cases.
1/ Either q(0) ends in, say, N steps (for an unknown but fixed value of N)
2/ or q(0) never ends.
If q(0) never ends, then P_q(x) always returns 0.
If q(0) ends in N steps, then for x<N, P_q(x) returns 0 and for y>N,
P_q(y) returns 1.
Now, if you were able to know the semantics of programs, *even if
restricting yourself to uniformly terminating programs* (such as P_q),
then you'd be able to decide whether a given program (q) halts. qed.
Notice that the construction is done by embedding the non-termination
problem (of q) into a terminating program (P_q). This is possible because
universal programs exist. The existence of universal programs and the fact
that a program cannot decide whether it is "run directly" or "simulated by
an universal machine" means that even if you try to circumvent this
construction by adding extra condition such as "uniformly terminating
programs that only manipulate uniformly terminating programs" or similar
things you can embed the construction within several layers of universal
machines and simulation and will still get undecidability.
--
Hypocoristiquement,
Jym.
What do you exactly mean "in principle"? I'm aware of the fact that
testing all inputs would take quite long but to my understanding
everything that can be computed in a finite amount of time is computable.
Testing of all inputs would be such an example.
>
> (2) Even if you restrict the input domains to be bounded (e.g. 64-bit
> integers), then with a relatively small number of inputs you have such a
> vast combinatorial explosion that in practice there is not enough time
> left in the universe to be able to test all combinations.
But can you argue about undecidability with a combinatorial explosion?
Maybe in 10 years computers become such fast that all these combinations
can be computed in couple of years. Thus, the problem would become
computable in practice. Or is this reasoning wrong?
>> On the other hand, if the set of programs under analysis is restricted
>> to terminating programs, will the concrete program semantics be always
>> computable?
>
> Presumably you mean "provably terminating programs". Even then, the
> answer remains no.
Why? Could you provide some more detains?
> Well, basically if you want to know the value of variables at the end of
> a program (knowing the value at beginning), this is an extensional
> property of programs and falls under Rice's theorem. Yes, the fact that
> the program you're analyzing may not terminate is the reason why any
> "test-based" approach will fail.
did I get you right that the problem of determining the value of a
variable at program end falls under Rice's theorem since in general
the program might not terminate? Or does this extensional property
has other reasons (as the one you mention below)?
> If q(0) never ends, then P_q(x) always returns 0. If q(0) ends in N
> steps, then for x<N, P_q(x) returns 0 and for y>N, P_q(y) returns 1.
>
> Now, if you were able to know the semantics of programs, *even if
> restricting yourself to uniformly terminating programs* (such as P_q),
> then you'd be able to decide whether a given program (q) halts. qed.
Just to make sure. Can the last sentence be interpreted like this:
Since we can't in general correctly decide if a program terminates (case:
x<N), we don't the semantics of program, i.e. the semantics is not
computable.
You used the halting problem to show that this semantics property is
not computable. This is a good example. But how is it related to my
original problem of determining valid variable values at each program
point? Can I argue that since I can't decide if a program terminates
I also can't compute the variable values?
Best,
Tim
Testing any given input may be computable. Testing that it works for
every input may not be.
For example: the input to a given program may be any even integer
greater than 2. The output is a boolean truth value. It is specified
to indicate whether the input is the sum of two prime numbers. The
algorithm is very simple: it returns "true" always. How do you
propose to test in finite time whether or not it is correct for all
possible inputs?
- Tim
> Thank you for your answer.
>
>> Well, basically if you want to know the value of variables at the end of
>> a program (knowing the value at beginning), this is an extensional
>> property of programs and falls under Rice's theorem. Yes, the fact that
>> the program you're analyzing may not terminate is the reason why any
>> "test-based" approach will fail.
>
> did I get you right that the problem of determining the value of a
> variable at program end falls under Rice's theorem since in general
> the program might not terminate? Or does this extensional property
> has other reasons (as the one you mention below)?
No. The problem "deciding the output value of a variable" falls under
Rice's theorem because it is an extensional property, ie a property
depending only on the input/output relation and not on the way the value
are actually computed.
Rice's theorem states that these extensional properties are undecidable.
The truth of Rice's theorem is strongly linked to the halting problem and
the fact that programs tested might not terminate.
>> If q(0) never ends, then P_q(x) always returns 0. If q(0) ends in N
>> steps, then for x<N, P_q(x) returns 0 and for y>N, P_q(y) returns 1.
>>
>> Now, if you were able to know the semantics of programs, *even if
>> restricting yourself to uniformly terminating programs* (such as P_q),
>> then you'd be able to decide whether a given program (q) halts. qed.
>
> Just to make sure. Can the last sentence be interpreted like this:
> Since we can't in general correctly decide if a program terminates (case:
> x<N), we don't the semantics of program, i.e. the semantics is not
> computable.
Seems OK.
The semantics of P_q depends on the fact that q terminates or not and
hence is undecidable.
> You used the halting problem to show that this semantics property is
> not computable. This is a good example. But how is it related to my
> original problem of determining valid variable values at each program
> point?
Among other, you want to determine valid variable value at the end of the
program (it is a program point).
This is not possible due to the above construction (to be more precise,
instead of having P_q returns a value (0 or 1), you can have it store this
value in a variable and this shows that the value of this variable at the
end of P_q is not decidable).
Twitching the construction in order to show that the value at almost any
program point is undecidable is left as an exercice...
> Can I argue that since I can't decide if a program terminates
> I also can't compute the variable values?
Not here. Program P_q *always* terminates.
If you want the same construction, without refering to the halting
problem, take it as follows :
* On input x, program prog_P tests whether the multivariate polynomial P
has an root integer root with all components smaller than x. this is done
by computing P(a_1, ..., a_n) for all tuples (a_1, ..., a_n) with all
components a_i < x. Clearly, there are finitely many such tuples (namely,
x^n) hence this computation always terminates.
If P has an integer root with all components smaller than x, then
prog_P(x) returns 0 (or sets variable foo to 0), otherwise, prog_P(1)
returns 1 (or set variable foo to 1).
* If polynomial P has no integer root, then prog_P always returns 1 (or
foo always at value 1 at the end of prog_P). Otherwise, it depends on the
input.
* Deciding whether a polynomial has integer roots is impossible (Hilbert's
10th problem, solved by Matiyasevitch's theorem in 1991 (iirc)).
* Hence, the semantics of prog_P is undecidable.
Notice that here there is no (direct) mention of the fact that termination
is undecidable.
Anyway, prog_P (as P_q above) *always* terminates on all input.
hence, even if you restrict yourself to uniformly terminating programs (ie
programs terminating on all inputs), "semantics" is still undecidable.
--
Hypocoristiquement,
Jym.
>> (1) Many algorithms are specified as having unbounded input domains,
>> e.g. any integer (however large). You can't even in principle test
>> against all inputs.
>
> What do you exactly mean "in principle"? I'm aware of the fact that
> testing all inputs would take quite long but to my understanding
> everything that can be computed in a finite amount of time is computable.
Yes.
> Testing of all inputs would be such an example.
No.
There can be *infinitely many* inputs. Even if testing each input is
possible and takes relatively few time, testing *all* of them still takes
*infinite* time.
As soon as we're speaking about theoretical stuffs, we have to handle
infinite domains. True, if you write an actual program, integers will be
smaller than 2^32 (or 2^64), but if you want to switch to theory, then you
have to consider that your program is working will all the (mathematics)
natural numbers. From 0 to "infinity". Any natural how big it can be is a
valid input to the (theoretical) program. And they (the naturals) are
infinitely many.
>> (2) Even if you restrict the input domains to be bounded (e.g. 64-bit
>> integers), then with a relatively small number of inputs you have such a
>> vast combinatorial explosion that in practice there is not enough time
>> left in the universe to be able to test all combinations.
>
> But can you argue about undecidability with a combinatorial explosion?
> Maybe in 10 years computers become such fast that all these combinations
> can be computed in couple of years. Thus, the problem would become
> computable in practice. Or is this reasoning wrong?
The reasonning is wrong.
If you want to try, say 2^100 inputs (not much given there are 2^64 long
int) and each inputs takes a single clock cycle and you have a 10GHz
computer, then you'll take time equal to
2^100/10^10 seconds.
Since 2^10=1024 < 10^3, we have :
2^100/10^10 = (2^10)^10/10^10 > (10^3)^10/10^10 = 10^20 seconds
There are 86400 seconds per day, make it 100 000=10^5. We still need more
than 10^15 days to compute.
There are 365 days per year, make it 1000=10^3.
We still need more than 10^12 years ! aka 1000 billions which is much more
than the estimated age of the universe...
Now, consider you don't only have next year computer (at 10GHz) but a
science-fiction computer at 10 *billions* GHz (that is 10 billions
billions operations per seconds, even if we're still far from Planck's
time, I think the physicals limits have already been reach far before
computer this fast exist.
Well, even with this impossible computer, computing all your 2^100 inputs
will still require 1000 years !
So your reasonning is wrong. You cannot simply wait to see the next
computer and hope that it will be powerful enough. You will *never* have
sufficient computing capabilities to tests all possible inputs. (or you
have to consider program with a small finite input set).
--
Hypocoristiquement,
Jym.
> Well, even with this impossible computer, computing all your 2^100 inputs
> will still require 1000 years !
OK, I agree that if we switch to theory with infinite numbers, testing of
all inputs is not computable.
But if we restrict the domain to real computers with let's say 64bit
integers (your example), then, due to your computation, it would currently
take about 10^12 years. 1000 billions years is not a realistic time to
wait but is it not a finite amount of time? IMHO it is which according to
my last post would make the testing of all inputs possible.
In my opinion, the discussion about the amount of time for a computation
is still somehow fuzzy. Let's assume in 100 years, someone develops a
10 billion GHz computer. Then the complexity of testing would drop to 1000
years as you estimated. 1000 years is still a long time but this period
becomes slowly realistic. One could now go further and argue that even
faster computers are found in 150 years. Finally, this would lead to such
a fast computer that testing would take only 1 month which is a realistic
computation time. The conclusion: testing of all inputs becomes computable
in the future.
> So your reasonning is wrong. You cannot simply wait to see the next
> computer and hope that it will be powerful enough. You will *never* have
> sufficient computing capabilities to tests all possible inputs. (or you
> have to consider program with a small finite input set).
I assume that this is my mistake in reasoning. What is the "limitation" for
looking into the future? Do I have to assume the current power of
computers or can I reason with "in few years the computer will be more
powerful...". And what is the limit for the computation time to call
a problem not computable? Is a problem which takes with the current
technology 1000 years to compute a computable problem or can it be
already considered as not computable? Due to what I posted previously
(computable since requires finite amount of time) it should be
computable. My problem is the uncertainty with all these parameters.
A similar problem to this would be to find the size of input data
to make testing computable. We have seen that 2^64 takes too long
but what about if we assume input data to be 2^8. I did not compute
the time for testing but I must be much smaller than for 2^64. Here
the same question arises. How long do I have to continue to restrict
the size of the input data to make extensive testing a computable problem.
Thank you very much for sharing your ideas.
> OK, I agree that if we switch to theory with infinite numbers, testing of
> all inputs is not computable.
Yes. And all problems are usually considered from the theoretical infinite
(or at least unbound) point of view. Typically, when one says that SAT (or
TSP, or ...) is NP-complete, it is the unbounded problem that is
considered. The bounded problem (SAT with less than 100 variables, ...) is
finite and hence (theoretically) computable in constant time by
pre-building the solution to all the possible inputs of the problem... A
bit far fetched since the issue of the time/space require to compute it
doesn't really exists but still true. Which is why computability and
complexity usually consider theoretical unbounded problem [and, by the
way, when we say that our computer or programming languages have the same
computablity power as Turing's machines, it is again from a theorethical
point of view with unbounded memory (hard disk) that the TM enjoy...]
> But if we restrict the domain to real computers with let's say 64bit
> integers (your example), then, due to your computation, it would
> currently
> take about 10^12 years. 1000 billions years is not a realistic time to
> wait but is it not a finite amount of time? IMHO it is which according to
> my last post would make the testing of all inputs possible.
Yes.
> In my opinion, the discussion about the amount of time for a computation
> is still somehow fuzzy. Let's assume in 100 years, someone develops a
> 10 billion GHz computer. Then the complexity of testing would drop to
> 1000
> years as you estimated. 1000 years is still a long time but this period
> becomes slowly realistic. One could now go further and argue that even
> faster computers are found in 150 years. Finally, this would lead to such
> a fast computer that testing would take only 1 month which is a realistic
> computation time. The conclusion: testing of all inputs becomes
> computable
> in the future.
Yes and no. We can argue whether 1000 years is realistic or not (compared
to human history, I don't think there was any place in the world where a
computer would have been safe during 1000 years to perform its computation
without suffering from revolutions, wars, ... [not to speak about the
weariness of the computer and such]. Think, for example, that Western
Europe is currently enjoying a 65 years long period of complete (and
cooperative) peace (since WW2) and the last time such a long period of
peace happened was... under the Roman Empire !)
But even without this discussion about "how long can we wait safely",
there ae physical problems. Current computers are based on electrical
transmission which is only so fast. Even is we switch to light-based
computer, there is still a speed of transmitting information and as far as
I know there is a think called "Planck's time"
(http://en.wikipedia.org/wiki/Planck_time) which corresponds to rouglhy
10^{-44} seconds and it considered as the smallest meaningfull amount of
time. So, as far as our current knowledge of physics is correct, we will
probably never ever be able to build computers faster than 10^44 Hz.
Pretty fast, true, but even with finite but very very large inputs, it
will still take billions of years to compute...
[and considering the way things evolved lastly, I do believe that the size
of the input we're working on grows at least as fast as the speed of the
computers we're using... We're now daily handling Terabytes of data which
would have make anyone look crazy ten or twenty years ago...]
So, even if you restrict yourself to finite input were the computation can
theoretically (how ironic) be done, I do believe that physical limits will
prevent us from doing it on very very large (but still finite) input. This
is kind of a justification of why it is interesting to look at the
theoretical infinite (or unbound) aspect of the problems...
> I assume that this is my mistake in reasoning. What is the "limitation"
> for
> looking into the future? Do I have to assume the current power of
> computers or can I reason with "in few years the computer will be more
> powerful...". And what is the limit for the computation time to call
> a problem not computable? Is a problem which takes with the current
> technology 1000 years to compute a computable problem or can it be
> already considered as not computable? Due to what I posted previously
> (computable since requires finite amount of time) it should be
> computable. My problem is the uncertainty with all these parameters.
>
> A similar problem to this would be to find the size of input data
> to make testing computable. We have seen that 2^64 takes too long
> but what about if we assume input data to be 2^8. I did not compute
> the time for testing but I must be much smaller than for 2^64. Here
> the same question arises. How long do I have to continue to restrict
> the size of the input data to make extensive testing a computable
> problem.
Nonetheless, you're indeed partly right. Value that were considered "safe"
previously are no longer "safe". Most cryptographic algorithms rely on
something being very very long to compute. Some of them were designed at a
time were space (memory) was much much more expensive than nowadays and so
the first implementation of them use small keys that were sufficient at
the time but are now longer sufficient. DES originally used a 56 bits key
which was more than enough to hold for "a long time" in the 70s but is now
breakable by brute force (try all possibilities) in a minute or so. RSA
can use keys of various length and there was a row a dozens years ago in
France because law forbid to use cryptographic key longer than (? 64 ? 128
?) bits and the power of computers was such that these keys were no longer
safe and we had to switch to longer keys.
So you're both right and wrong...
Typically, "breaking" RSA is, in the theoretical world, a NP-complete
problem but in the real world the problem is "breaking RSA with keys of N
bits" which is a finite perfectly computable problem for which we didn't
had the computationnal power ten years ago but it's now trivial. And the
even more trivial answer to this breach to increase the length of the key,
so as soon as "breaking RSA with keys of N bits" becomes doable [in a
reasonnable amount of time], the problem you'll wan to solve suddently
becomes "breaking RSA with keys of 2N bits" and is still out of reach.
--
Hypocoristiquement,
Jym.