Can a supercompiler produce a superlinear improvement?

Sergei Romanenko

Mar 31, 2008, 2:03:09 AM
to A Small Positive Supercompiler in Scala
Can a supercompiler produce a superlinear improvement?

First of all, we should carefully differentiate the notions
"supercompilation" and "supercompiler". Certainly, a supercompiler is
a program manipulation system whose "core" technique is
"supercompilation", but a supercompiler is free to make use of any
additional techniques of program transformation/manipulation that are
considered as "useful" or "appropriate" by the author of the
supercompiler. And combining heterogeneous techniques may produce
synergistic effects, increasing the power of the supercompiler.

Hence, it is tautologically true that adding to a supercompiler a
technique capable of producing superlinear improvements results in the
supercompiler being able to produce superlinear improvements!

Therefore the question should be reformulated as follows:

Can supercompilation produce a superlinear improvement?

Here we should consider the differences between Sørensen's "positive
supercompilation" as defined, for example, in the paper

M.H. Sørensen. Convergence of program transformers in the metric space
of trees. Science of Computer Programming 37(1-3), May 2000, pp. 163-205.

and Turchin's original "supercompilation".

In the case of Sørensen's version of supercompilation:

1. Supercompilation is formulated in terms of a lazy functional
language.
2. Supercompilation is not allowed to change the termination
properties of the program being supercompiled.

In the case of Turchin's supercompilation:

1. Supercompilation is formulated in terms of the functional
language Refal, which is strict (i.e., non-lazy).
2. Supercompilation is allowed to change the termination properties
of the program being supercompiled, by extending the domain of the
original program. Namely, given a program f and an input a, if the
evaluation of f(a) aborts or never terminates, a supercompiled version
of the program is allowed to abort or produce any result.

Now let's consider a program like

start(x) = first(x, veryBad(x));
first(x, y) = x;

where veryBad is a "nasty" function that is defined as

veryBad(x) = veryBad(x);

or terminates, but after performing a lot of silly operations on its
argument.

Any supercompiler is likely to produce the target program

start(x) = x;

In the case of a lazy language, the target program is as "efficient"
as the source one, because the function call in the source program is
never evaluated, therefore the way in which the function veryBad is
defined is not of importance.

However, in the case of a strict language (like Refal or Standard ML),
the target program may be more efficient than the source one by an
arbitrarily large factor, depending on the "nastiness" of the function
veryBad.
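To make the strict-language case concrete, here is a small runnable Haskell sketch. Haskell is lazy, so `seq` is used here to imitate strict argument evaluation; the names (`firstStrict`, `startSource`, `startTarget`) and the particular amount of work done by `veryBad` are illustrative assumptions, not part of the original example.

```haskell
-- Model of the example under *strict* semantics. Haskell is lazy, so we
-- force the second argument with `seq` to imitate a strict language.
firstStrict :: a -> b -> a
firstStrict x y = y `seq` x

-- A terminating but "nasty" function: it performs a lot of silly work
-- (the amount of work is an arbitrary illustrative choice).
veryBad :: Int -> Int
veryBad n = sum [1 .. 1000000 :: Int] `seq` n

-- Source program: under strict semantics it pays for veryBad on every call.
startSource :: Int -> Int
startSource x = firstStrict x (veryBad x)

-- Supercompiled target program: the call to veryBad is gone entirely.
startTarget :: Int -> Int
startTarget x = x
```

Both functions return the same result, but `startTarget` skips the million additions; by making `veryBad` nastier, the speed-up can be made as large as we like.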

Therefore the imprecise proposition

Supercompilation can never produce a superlinear improvement

is not true, and needs to be reformulated in a more careful way.

Geoff Hamilton

Mar 31, 2008, 5:35:28 AM
to a-small-positive-su...@googlegroups.com
Sergei,

I agree completely with what you say here. Would it be accurate to say:

"Supercompilation, when applied to a lazy functional language, can never
produce a superlinear improvement."

Andrei Klimov

Mar 31, 2008, 6:17:49 AM
to a-small-positive-su...@googlegroups.com, refal...@botik.ru, Valentin Turchin
Note for the CC-addressees: this is a reply to the message
http://groups.google.com/group/a-small-positive-supercompiler-in-scala/browse_thread/thread/bd857f550a6a5163?hl=en
sent to the group "A Small Positive Supercompiler in Scala":
http://groups.google.com/group/a-small-positive-supercompiler-in-scala?hl=en

We invite you to join the group!


----- Original Message -----
From: "Sergei Romanenko" <sergei.r...@gmail.com>
Sent: 31 Mar 2008 10:03
Subject: Can a supercompiler produce a superlinear improvement?
...
> Here we should consider the differences between Sorensen's "positive
> supercompilation"
...
> and Turchin's original "supercompilation".
>
> In the case of Sorensen's version of supercompilation:
>
> 1. Supercompilation is formulated in terms of a lazy functional
> language.
> 2. Supercompilation is not allowed to change the termination
> properties of the program being supercompiled.
>
> In the case of Turchin's supercompilation
>
> 1. Supercompilation is formulated in terms of the functional
> language Refal, which is a strict (i.e. non-lazy).
> 2. Supercompilation is allowed to change the termination properties
> of the program being supercompiled, by extending the domain of the
> original program. Namely, given a program f and an input a, if the
> evaluation of f(a) aborts or never terminates, a supercompiled version
> of the program is allowed to abort or produce any result.


Note that the _same_ supercompiler may be considered as possessing "Sørensen's properties" and "Turchin's properties" simultaneously!

Suppose a supercompiler of Sørensen's kind for a pure functional language is given. Imagine the author has forgotten to mention in the guide whether the semantics of the language is lazy or strict. Then it can happen that two programmers implement two interpreters for the language: one with lazy semantics, the other with strict semantics. Whenever the strict interpreter terminates and returns some result, the lazy one also terminates and returns the same result. Imagine both programmers are fond of supercompilation and have released their language implementations packed with the original supercompiler (provided its license allows ;-) and named them the LazyHat distribution and the StrictHat distribution respectively.

Statement 1. The LazyHat distribution has Sørensen's properties, while the StrictHat distribution has Turchin's properties.

Corollary 1.1. Being Turchin's or Sørensen's is a property of the object language semantics rather than of a supercompiler. ;-) More precisely, it is a property of the pair (object language semantics, supercompiler), that is, a relation.


Now consider another case, where a supercompiler of Turchin's kind is given. Modify it a bit, in such a way that redundant computations are not thrown away but are residualized. (For the supercompilers that I am familiar with, this is just a technical task.) Call it a thrifty supercompiler. Let the two programmers distribute it with their language implementations again.

Statement 2. The thrifty supercompiler preserves both semantics (including the termination property): the lazy semantics in the LazyHat distribution and the strict semantics in the StrictHat distribution.

Corollary 2.1. Preserving the termination property is not the essence of the supercompilation method but an option of a supercompiler, which can be switched on/off by the user.

Statement 3. The user of the LazyHat distribution does not notice any difference between the two modes of supercompilation, except for residual program size and a negligible difference in run time, if any.

Andrei

Sergei M. Abramov

Mar 31, 2008, 6:17:48 AM
to a-small-positive-su...@googlegroups.com, Geoff Hamilton, Sergei Romanenko
Dear Geoff

> Would it be accurate to say: "Supercompilation, when applied to a lazy
> functional language, can never produce a superlinear improvement."

I think it's true for classical Turchin's scp.

It's false if Nemytykh's extensions are used.

Example:

=====================================
main n = f1 (chk (gen n)) n

f1 True n = n
f1 False n = []

and True True = True
and False True = True
and True False = False
and False False = False

gen [] = Leaf
gen (x:xs) = (gen xs):^:(gen xs)

chk Leaf = True
chk (left :^: right) = and (chk right) (chk left)
=====================================

"main n" will be supercompiled by Scp4 with the following result:

=====================================
main n = n
=====================================

The trick in Scp4 is the following: the output format of (chk (gen n))
will be discovered --- for all n, (chk (gen n)) == True.

Please note:

1. In this example it does not matter whether the language is lazy or not;

2. The termination property of "main" is not changed
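For readers who want to run the example, here is a direct Haskell transcription (type and operator declarations added; `&&` replaces the explicit `and` table; `mainSrc`/`mainRes` are illustrative names for the source program and the residual program claimed in this message):

```haskell
-- Haskell transcription of the example. The tree built by gen has 2^n
-- leaves for a list of length n, so the source program runs in
-- exponential time, while the residual program is the identity.
data Tree = Leaf | Tree :^: Tree

gen :: [a] -> Tree
gen []     = Leaf
gen (_:xs) = gen xs :^: gen xs

chk :: Tree -> Bool
chk Leaf      = True
chk (l :^: r) = chk r && chk l   -- True for every tree gen produces

f1 :: Bool -> [a] -> [a]
f1 True  n = n
f1 False _ = []

mainSrc, mainRes :: [a] -> [a]
mainSrc n = f1 (chk (gen n)) n   -- exponential-time source program
mainRes n = n                    -- the claimed supercompiled result
```

Both return the input list unchanged, but `mainSrc` traverses a tree of 2^n nodes first, which is exactly the superlinear gap under discussion.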

Best regards

Sergei
PS. Please, check example ;-)


Andrei Klimov

Mar 31, 2008, 6:45:16 AM
to Andrei Klimov, a-small-positive-su...@googlegroups.com, refal...@botik.ru, Valentin Turchin
After Sergei Abramov's letter
http://groups.google.com/group/a-small-positive-supercompiler-in-scala/msg/18d77c178e91f5de

I take my last statement back:
The user of the lazy implementation together with Andrei Nemytykh's supercompiler will notice, on Abramov's example, a superlinear speed-up without any change of the termination property.

Andrei

Sergei Romanenko

Mar 31, 2008, 7:37:28 AM
to A Small Positive Supercompiler in Scala
On Mar 31, 1:35 pm, Geoff Hamilton <hamil...@computing.dcu.ie> wrote:

> I agree completely with what you say here. Would it be accurate to say:
>
> "Supercompilation, when applied to a lazy functional language, can never
> produce a superlinear improvement."

As far as I understand, in reality, you are interested in the
following "positive" statement:

In some cases "pure supercompilation" fails to produce a
superlinear improvement,
while "pure distillation" succeeds in doing so.

For example, consider the program

app(Nil, vs) = vs;
app(Cons(u, us), vs) = Cons(u, app(us, vs));
rev(Nil) = Nil;
rev(Cons(x, xs)) = app(rev(xs), Cons(x, Nil));

reverse(x)=rev(x);

Then "pure distillation" is able to produce a "linear" version of
`reverse`, while "pure supercompilation" produces, essentially, the
original program.
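For comparison, here is a Haskell rendering of the quadratic source program next to a linear accumulator version of the kind distillation is expected to produce. This is a sketch: `revDist` and `go` are illustrative names, and the source does not claim this is the exact residual term.

```haskell
-- The quadratic source program from the message: each rev step calls
-- app on the already-reversed tail, giving O(n^2) work in total.
app :: [a] -> [a] -> [a]
app []     vs = vs
app (u:us) vs = u : app us vs

rev :: [a] -> [a]
rev []     = []
rev (x:xs) = app (rev xs) [x]

-- A sketch of the linear-time result distillation is expected to
-- produce: an accumulating-parameter version, one step per element.
revDist :: [a] -> [a]
revDist xs = go xs []
  where
    go []     acc = acc
    go (y:ys) acc = go ys (y : acc)
```

The two agree on every input, but `revDist` is O(n) where `rev` is O(n^2), a superlinear improvement.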

It seems that "distillation" (as you define it) is a generalization of
and/or an extension to "positive supercompilation" (as you define it).
Hence, it's impossible to find an example where "pure positive
distillation" is weaker than "pure positive supercompilation".

There remains a subtle point concerning the propagation of negative
information, though... But, perhaps, the negative information
propagation could be added to distillation?

Sergei

Sergei Romanenko

Mar 31, 2008, 7:46:42 AM
to A Small Positive Supercompiler in Scala
On Mar 31, 2:17 pm, "Sergei M. Abramov" <ab...@botik.ru> wrote:

>> Would it be accurate to say: "Supercompilation, when applied to a lazy
>> functional language, can never produce a superlinear improvement."
>
> I think, it's true for classical Turchin's scp.
>
> It's false if Nemytykh extensions are used.

As I've said above, by definition, a supercompiler is a program
manipulation system based on supercompilation + some additional tricks
(which may result in synergistic effects).

If some tricks can be added to a supercompiler, then the same tricks
can be added to a distiller. Why not?

So, comparing a "pure distiller" to a supercompiler enhanced with a
bunch of tricks makes little sense. Let's add the same tricks to a
"pure distiller" and to a "pure supercompiler" and then the comparison
will be fair.

If distillation is really more powerful than supercompilation, then
adding the same tricks to a distiller should result in even more
spectacular improvements!

Sergei

Geoff Hamilton

Mar 31, 2008, 9:05:20 AM
to a-small-positive-su...@googlegroups.com
I actually made the conscious decision, when implementing distillation, to
require that full definitions are given for all functions, including
equality. This means that there is no notion of propagating positive or
negative information: all information propagation is achieved by the
substitution of patterns for variables. Thus, if there is a test for
equality of two variables v and v', then in a context within which the
result of this test is true, equal patterns will have been substituted for
v and v', and in a context within which the result of this test is false,
non-equal patterns will have been substituted for v and v'. I believe this
gives the same effect as perfect propagation of information and a cleaner,
simpler algorithm, but am open to the possibility of this not being the
case.

Sergei Romanenko

Mar 31, 2008, 11:29:44 AM
to a-small-positive-su...@googlegroups.com
On Mon, Mar 31, 2008 at 5:05 PM, Geoff Hamilton
<hami...@computing.dcu.ie> wrote:

> I actually made the conscious decision when implementing distillation to
> require that full definitions are given for all functions, including
> equality.

> This means that there is no notion of propagating positive or negative
> information: all information propagation is achieved by the substitution
> of patterns for variables.

As far as I understand, if the input language is statically typed (in
the Hindley-Milner style), and we require that all constructors be
present in every case-expression, there is no difference between
"positive" and "negative" information. For example, if lists are
defined as

data List a = Nil | Cons a (List a);

then any case expression "knows" in advance that it can only be given
either Nil or Cons; the negative fact "not Nil" is the same as the
positive fact "Cons".

Sergei

Geoff Hamilton

Mar 31, 2008, 2:17:06 PM
to a-small-positive-su...@googlegroups.com
Dear Sergei,

Your message below does not appear to be consistent with your previous
message:

> There remains a subtle point concerning the propagation of negative
> information, though... But, perhaps, the negative information
> propagation could be added to distillation?

Or am I missing something?

Sergei Romanenko

Mar 31, 2008, 7:21:52 PM
to A Small Positive Supercompiler in Scala
On Mar 31, 10:17 pm, Geoff Hamilton <hamil...@computing.dcu.ie> wrote:

> Your message below does not appear to be consistent with your previous
> message:
>
>> There remains a subtle point concerning the propagation of negative
>> information, though... But, perhaps, the negative information
>> propagation could be added to distillation?

I'm not trying to promote any fixed viewpoint, but rather to gain some
understanding of the current situation in the field of
supercompilation (and related matters). For this reason, I'm trying to
put various viewpoints together to make a meaningful picture from
several pieces of the puzzle.

Hence, I've just tried to sum up the arguments in favor of the use of
a statically typed language, rather than a dynamically typed one.

So, why are my messages "inconsistent"? I've just agreed that the use
of a statically typed language may result in some simplifications in a
supercompiler/distiller, because in this case

(1) We don't have to deal with a lot of dull and uninteresting
technical problems concerning "type mismatch" errors and the like.

(2) The explicit propagation of negative information becomes less
important, because "the space of possibilities" is reduced in advance.

But, if we want to apply distillation to a dynamically typed language,
the negative information propagation may increase the power of a
distiller.

So, what do you mean when speaking about an inconsistency in my
messages?

Sergei

Geoff Hamilton

Apr 1, 2008, 6:33:07 AM
to a-small-positive-su...@googlegroups.com
Dear Sergei,

I now see where you are coming from - my apologies - your previous
messages are not inconsistent. I thought that you had said that I should
add negative information propagation to distillation on the one hand, and
that there is no distinction between positive and negative information for
distillation on the other. However, what you were actually saying is that
there is no distinction between positive and negative information for
statically typed languages. I guess I need to divorce my transformation
algorithm from the language ;-) . I was confused because I was not using
the fact that distillation is defined for a statically typed language to
argue that perfect propagation of information can be obtained without the
need for distinguishing positive and negative information propagation. If
we try to divorce the distillation algorithm from the typing discipline of
the language, then my earlier question still stands:

If we require that full definitions are given for all functions, including
equality, is there any need for a notion of propagating positive or
negative information, or can all information propagation be achieved by
the substitution of patterns for variables? Thus, if there is a test for
equality of two variables v and v', then in a context within which the
result of this test is true, equal patterns will have been substituted for
v and v', and in a context within which the result of this test is false,
non-equal patterns will have been substituted for v and v'. I believe this
gives the same effect as perfect propagation of information and a cleaner,
simpler algorithm, but am open to the possibility of this not being the
case.

Geoff.

Morten Heine Sørensen

Apr 1, 2008, 9:32:05 AM
to a-small-positive-su...@googlegroups.com
I guess that if there is an explicit equality function for each datatype we wish to do comparisons on, with a clause for each true case and one for each false case, then we don't need negative information, because we can make all the combinations explicit (for terms built with constructors that take arguments, one can of course do a bit better). For integers or even characters this seems like a not very attractive idea. Now, the languages I have been studying are usually "toy", and the question is how much "toy" one wants to be. I guess for my taste, taking an explicit equality function for, say, a type of characters is too "toy", which is why Jens Peter Secher and I became interested in handling constraints.

An interesting small thing, which is easy to miss, is that if x is different from y, y is different from z, and z is different from x, and x, y, z are booleans, then this branch can be pruned, because there are only two boolean values. The phenomenon is of course more general. But such issues would also be handled automatically by taking all combinations of equal and unequal values, at the cost of some explosion...

Best -

Morten Heine.

Sergei Romanenko

Apr 1, 2008, 2:38:42 PM
to a-small-positive-su...@googlegroups.com
On Tue, Apr 1, 2008 at 2:33 PM, Geoff Hamilton
<hami...@computing.dcu.ie> wrote:

Dear Geoff,

> However, what you were actually saying is that there is no
> distinction between positive and negative information
> for statically typed languages.

In a language like Refal (or Scheme) one can write something like

f(A) = ASpecialCase;
f(x) = x;

In terms of Refal, there is an infinite (but countable) set of
zero-arity constructors (= "symbols"), a single unary constructor (=
"enclosing an expression in parentheses") and a single binary
constructor (= "concatenation"), which is associative.

Note also that the input data for the program may contain arbitrary
zero-arity constructors ("symbols") that don't appear in the program!

Hence, the function f returns ASpecialCase when given A, and behaves
as the identity function when given anything else. There is no way to
express the fact that "x is not A" by means of a substitution!

But, if the language is statically typed, the constructor A must be
introduced by a declaration, like

data ABC = A | B | C;

Hence, upon seeing A in the rule

f(A) = ASpecialCase;

the supercompiler/distiller knows that x in the second rule

f(x) = x;

is either B or C. Hence, this rule can be replaced with 2 rules:

f(B) = B;
f(C) = C;

which is impossible in principle if we deal with a language like Refal
(or Scheme).

That's why my guess was that the (explicit) negative information
propagation is less important for statically typed languages. Just
because there is no difference between the "negative" and "positive"
information! :-)

The "negative" statement "x is not A" can be reformulated as a
"positive" (and finite) statement "x is either B or C".
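A Haskell-style sketch of this specialization. The wrapper type `Result` is an assumption introduced here, because in a typed language `ASpecialCase` must share a type with the other results (in Refal no such wrapper is needed); `f'` is an illustrative name for the specialized function.

```haskell
data ABC = A | B | C deriving (Eq, Show)

-- Result wraps the two kinds of answers so that f is well-typed;
-- this wrapper is an artifact of the typed setting, not of the example.
data Result = ASpecialCase | Plain ABC deriving (Eq, Show)

-- Original definition: a special case for A, "identity" otherwise.
f :: ABC -> Result
f A = ASpecialCase
f x = Plain x

-- The specialization a supercompiler/distiller can perform, because the
-- declaration of ABC tells it that "not A" means exactly "B or C".
f' :: ABC -> Result
f' A = ASpecialCase
f' B = Plain B
f' C = Plain C
```

The second rule of `f` has been replaced by the two positive rules for B and C, which is exactly the reformulation of negative information as positive information.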

Sergei

Gavin Mendel-Gleason

Apr 2, 2008, 8:17:41 AM
to A Small Positive Supercompiler in Scala
Dr Morten Heine,

> For integers or even
> characters this seems like a not very attractive idea. Now, the languages I
> have been studying are usually "toy", and the question is how much "toy" one
> wants to be. I guess for my taste, taking explicit equality function for,
> say, a type of characters is too "toy", which is why I and Jens Peter Secher
> became interested in handling constraints.

I don't think the two aims are mutually exclusive. The Coq proof
assistant uses representations of natural numbers, integers and
characters which are built using constructors, such that reasoning is
simple; i.e., nats look something like the type (Nat = Z | S Nat) and
characters are of the type (Char =
bool*bool*bool*bool*bool*bool*bool*bool), yet when they are extracted
to OCaml, the representations used are bignums and native characters
respectively. This allows one to do program transformation in a more
elaborated setting and shift the efficiency concerns into a trivial
transformation at compile time.

> An interesting small thing, which is easy to miss, is that if x is different
> than y, y is different than z, and z is different than x, and x,y,z are
> booleans, then this branch can be pruned, because there are only two boolean
> values. The phenomenon is of course more general. But such issues would also
> be handled automatically by taking all combinations of equal and inequal
> values at the cost of some explosion...

I think the greedy instantiation of variables based on case analysis often
yields less expensive simplifications for characters and naturals than
carrying around disequality/apartness constraints. I say this based
on my attempts to prove conjectures using these two techniques
manually.

I believe the implementation describing the above apartness constraint
(in póitín) would be:

(andb (andb (xor x y) (xor y z)) (xor x z))
where
  not = %x. case x of
              True => False
            | False => True ;

  xor = %x y. case x of
                True => not y
              | False => y ;

  andb = %x y. case x of
                 True => y
               | False => False ;

Distillation of this yields:

case x of
  True  => case y of
             True  => False
           | False => case z of
                        True  => False
                      | False => False
| False => case y of
             True  => case z of
                        True  => False
                      | False => False
           | False => False

A simple analysis of this term yields the simpler equivalent term
False. As an aside, I'm curious if this is the type of thing that is
being done with "output formats", since it sounds like such a
transformation could in fact get super-linear speed-up for an algorithm
that decides to, for instance, calculate Ackermann's function and then
destructure it, yielding False in all cases. Of course such a
transformation can only be semantics-preserving if we can ensure
termination.

In terms of making explicit handling of equality practical for the
programmer, I'd suggest that type-classes give the syntactic sugar
needed while avoiding having to keep track of the name of each
equality operator.
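Transcribing the póitín definitions above into Haskell, one can check exhaustively that the apartness term is constantly False (a sketch; `xor'`, `andb`, and `apart` mirror the definitions above, with `xor'` renamed to avoid clashing with any existing binding):

```haskell
-- Haskell transcription of the póitín definitions.
xor' :: Bool -> Bool -> Bool
xor' True  y = not y
xor' False y = y

andb :: Bool -> Bool -> Bool
andb True  y = y
andb False _ = False

-- The apartness constraint: x, y, z pairwise distinct.
apart :: Bool -> Bool -> Bool -> Bool
apart x y z = andb (andb (xor' x y) (xor' y z)) (xor' x z)

-- Three pairwise-distinct booleans cannot exist, so apart is
-- constantly False over all eight input combinations.
allFalse :: Bool
allFalse = and [ not (apart x y z)
               | x <- [False, True], y <- [False, True], z <- [False, True] ]
```

This exhaustive check is exactly the "simple analysis" that collapses the residual case tree to False.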


Morten Heine Sørensen

Apr 2, 2008, 8:56:22 AM
to a-small-positive-su...@googlegroups.com
Hi Gavin,


> I don't think the two aims are mutually exclusive...

This is an interesting compromise; I didn't think about that.

> I think the greedy instantiation of variables based on case often
> yields less expensive simplifications for characters and naturals than
> carrying around disequality/apartness constraints. I say this based
> on my attempts to prove conjectures using these two techniques
> manually.

Yes, I think the instantiation is simpler in several respects.

> I believe the implementation describing the above apartness constraint
> (in póitín) would be:...

But this is for booleans, right? What if you had characters?

> False. As an aside, I'm curious if this is the type of thing that is
> being done with "output-formats" since it sounds like such a
> transition could in fact get super-linear speed-up for an algorithm
> that decides to, for instance, calculate Ackermann's function and then
> destructure it yielding False in all cases. Of course such a
> transition can only be semantics preserving if we can ensure
> termination.

This is certainly one type of superlinear speed-up. But somehow it seems like an easy case; it is basically realizing that all the leaves are identical and skipping the looping. It would be interesting to understand whether other types of superlinear speed-up are possible in distillation, for instance similar to those I seem to recall are possible using Chin's tupling.

To go a little further with the discussion of negative information, it would be interesting to see what would be required to make distillation handle the KMP matcher in the same way as perfect supercompilation in Jens Peter's version (see http://citeseer.ist.psu.edu/cache/papers/cs/10245/http:zSzzSzwww.diku.dkzSz~jpsecherzSzsechersoerensen.pdf/secher99perfect.pdf for a short version and http://citeseer.ist.psu.edu/cache/papers/cs/2568/http:zSzzSzwww.diku.dkzSz~jpsecherzSzperfectscp.pdf/secher99perfect.pdf for a long version). For instance, assuming that the character set is just a two-valued set, you could probably use the definitions from the previous mail. What happens if the character set is bool^8?

Anyway, this kind of handling of negative information basically reduces perfect supercompilation to positive supercompilation (as those two terms are understood in the Copenhagen papers), I think, so probably positive supercompilation could give the exact expected KMP matcher with this kind of rewriting.

Andrei Nemytykh

Apr 2, 2008, 3:30:20 PM
to a-small-positive-su...@googlegroups.com
Dear Sergei!

...


> Example:
> =====================================
> main n = f1 (chk (gen n)) n

...

Thank you very much for the interesting example you gave.

> The trick is the following in Scp4: will be discovered output format of (chk
> (gen n)) --- for all n ((chk (gen n)) ==True)

....


> PS. Please, check example ;-)

-- It seems the postscript concerns me.

I have tried to supercompile your example with the current version of
the SCP4 supercompiler. SCP4 does not produce the result you expected.
I am sorry about that. :-)

The reason is as follows:
the configuration (chk (gen n)) was split during generalization:

let m = (gen n) in (chk m)

and the first configuration, (gen n), was supercompiled separately from
the second, (chk m).

Actually, in my opinion, the example you gave poses a good problem to
be solved by further development of supercompilation technology.

The main goal is to improve programs. As a consequence, we have to
generate new ideas and must not create artificial limits for
supercompilation technology, that is to say, for a technology based on
observation of the meta-interpretation of programs.

Sincerely.
Andrei Nemytykh.
