Let us assume f(x) is a relatively complicated function and g(x,y) and
h(x,y) are functions.

How often is f then called in the following code:
print g(f(x),f(x))
print h(f(x),f(x))
I think this topic is called "common subexpression elimination" in computer
science.
TIA,
--
Janos Blazi
"Il n'y a guère dans la vie qu'une préoccupation grave: c'est la mort;"
(Dumas)
>
> Let us assume f(x) is a relatively complicated function and g(x,y) and
> h(x,y) are functions.
>
> How often is f then called in the following code:
>
> print g(f(x),f(x))
> print h(f(x),f(x))
>
> I think this topic is called "common subexpression elimination" in
> computer science.
>
> TIA,
That was a dumb question, sorry. So I tried it, and I am a bit surprised
that f is called four times!
Why?
spam = f(x)
print g(spam, spam)
print h(spam, spam)
I can't see how this optimisation could be automated - function f may or may
not have side effects. (Python isn't a functional language, after all.) If
it does, the programmer may or may not want these side effects to occur
multiple times. So just code it the way you want it!
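
To make that trade-off concrete, here is a small sketch (all names invented
for illustration) showing that caching by hand is observably different once
f has a side effect:

    calls = []

    def f(x):
        calls.append(x)     # the side effect: record every call
        return x * 2

    def g(a, b):
        return a + b

    g(f(3), f(3))           # written out directly, f runs twice
    assert len(calls) == 2

    del calls[:]            # reset the record
    spam = f(3)             # cached by hand, f runs only once...
    g(spam, spam)
    assert len(calls) == 1  # ...which matters if you wanted the effect twice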
Cheers,
Simon Brunning
TriSystems Ltd.
sbru...@trisystems.co.uk
> > How often is f then called in the following code:
> >
> > print g(f(x),f(x))
> > print h(f(x),f(x))
> >
> > I think this topic is called "common subexpression elimination" in
> > computer science.
> [...] I tried it, and I am a bit surprised that f is called four times!
> Why?
Python does not really optimise the code it generates. Some brave souls
(Skip in particular) made flurries of attempts in that direction, and more or
less demonstrated that byte-code optimisations might not be as rewarding as we
might have hoped.
In the particular case above, CSE may not be applied, because the optimiser,
if it existed, could not assume that `f' has no side-effect. Suppose:
def f(x):
    global y
    y = y + 1
    return x + y
then `y' should be incremented four times, not once. Moreover, I do not
remember that Python specifies that function arguments are evaluated left
to right, or right to left. But I may be wrong, I'm not sure. If the
order is unspecified, and `f' does not always return the same value for
the same argument, the code in the original message would not even be
sensible Python, because it would yield unpredictable results.
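
A trivial counter confirms the count empirically (a sketch; g and h here are
just stand-ins for the functions in the original message):

    count = [0]

    def f(x):
        count[0] = count[0] + 1   # count every evaluation of f
        return x

    def g(a, b):
        return (a, b)

    def h(a, b):
        return (b, a)

    x = 1
    g(f(x), f(x))
    h(f(x), f(x))
    assert count[0] == 4          # one call per textual occurrence of f(x)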
--
François Pinard http://www.iro.umontreal.ca/~pinard
How can it be called fewer than 4 times?

import random

def f(x):
    random.seed(x)
    return random.random()
> then `y' should be incremented four times, not once. Moreover, I do not
> remember that Python specifies that function arguments are evaluated left
> to right, or right to left.
Left to right is guaranteed somewhere in ref manual, last I looked.
In response to OP: one of the side effects of a call to f could be rebinding
of the name 'f'.

def f():
    global f
    f = g
    return "First time. "

def g():
    return "Ha, ha! Fooled you?"

print f(), f()
# prints: First time. Ha, ha! Fooled you?
Terry J. Reedy
> Python does not really optimise the code it generates. Some brave souls
> (Skip in particular) did flurries of tries in that direction, and kind of
> demonstrated that byte-code optimisations might not be as rewarding as we
> could have thought it would be.
>
> In the particular case above, CSE may not be applied, because the optimiser,
> if it existed, could not assume that `f' has no side-effect. Suppose:
>
> def f(x):
>     global y
>     y = y + 1
>     return x + y
>
> then `y' should be incremented four times, not once.
Some compilers have global optimizers that can recognize cases where
sufficiently simple functions do not have side-effects and then do common
sub-expression elimination even on function calls. However, this is rare
because it's hard to do and because the opportunity to perform that particular
optimization is infrequent.
I'd expect there'd be more to gain in Python by simply looking for patterns such as

    for i in xrange(N): ...
    for item in list: ...
    list comprehensions

and transforming them into a form where the loop is folded into lower-level
runtime routines rather than executed longhand by byte-codes.
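
The kind of folding meant here is the transformation a programmer can already
apply by hand, e.g. (a sketch in the Python 2 spelling used throughout this
thread; the function names are invented):

    def squares_longhand(n):
        result = []
        for i in xrange(n):      # the loop is driven step by step by byte-codes
            result.append(i * i)
        return result

    def squares_folded(n):
        # the whole iteration runs inside the C implementation of map()
        return map(lambda i: i * i, xrange(n))

    assert squares_longhand(5) == squares_folded(5) == [0, 1, 4, 9, 16]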
Regards
--jb
--
James J. Besemer 503-280-0838 voice
http://cascade-sys.com 503-280-0375 fax
mailto:j...@cascade-sys.com
The expression giving the object to iterate over in a 'for' loop is
evaluated only when the 'for' loop begins, not at each iteration.
That's completely different than
print xrange(3), xrange(3)
where xrange must be called twice. Not even the 'LOAD_GLOBAL xrange'
instruction can be hoisted/factored. For instance, imagine the following
(perverse) definition of xrange:
builtin_xrange = xrange
def xrange(x):
    global xrange
    xrange = lambda x: "My dog has fleas"
    return builtin_xrange(x)
Python doesn't have anything like
proclaim_const(xrange)
proclaim_pure(xrange)
to let the compiler know that xrange is not subject to modification, has no
side-effects, and does not depend on any global state.
Of course, then there's the situation
def f(x):
    g(xrange(x), xrange(x))
    g(xrange(x), xrange(x))
Here, there would still be at least two xrange calls, since x might be
mutable and modified (through another name, or because xrange = lambda
x:x) by the call to g. (This assumes g is not pure, of course) In
general any mutable object can be modified at any call, and in general
you can't predict that an object is immutable at compile-time.
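
A contrived sketch of that aliasing problem (all names invented for
illustration): the value f computes from its argument can change across the
call to g, so nothing may be reused:

    shared = [3]

    def f(box):
        return box[0] * 2   # depends on mutable state reachable from the argument

    def g(a, b):
        shared[0] = 99      # mutates that state behind the caller's back
        return (a, b)

    first = g(f(shared), f(shared))    # f sees 3 both times -> (6, 6)
    second = g(f(shared), f(shared))   # f now sees 99 both times -> (198, 198)
    assert first == (6, 6) and second == (198, 198)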
So now you get to write
def f(x):
    assert_immutable(x)
    g(xrange(x), xrange(x))
    g(xrange(x), xrange(x))
and add compiler support (just like you added for proclaim_*())
Get busy ..
Jeff
> On Thu, May 09, 2002 at 08:52:28AM -0700, James J. Besemer wrote:
>
> > I'd expect there'd be more to gain in Python simply looking for patterns such as
> >
> > for i in xrange(N): ...
>
> The expression giving the object to iterate over in a 'for' loop is
> evaluated only when the 'for' loop begins, not at each iteration.
No, of course not. I didn't mean to suggest that the expression in this case was
being evaluated multiple times.
Rather I was harkening back to an earlier discussion where we established that certain
forms of for loops and list comprehensions were much slower than e.g., functional and
lambda forms that accomplished the same things.
So I was suggesting an optimizer that, when it saw a form like:

    res = []
    for x in list:
        if x:
            res.append(x)

would substitute something like

    filter(lambda x: x is not None, list)

where the entire substitution might furthermore be handled by a C runtime
routine rather than by explicit calls to filter and lambda.
List comprehensions seem to be one of the slowest ways to iterate over data. Perhaps
some common forms can be recognized by the optimizer and passed to an efficient
runtime module.
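
One rough way to measure such claims (a sketch using the standard timeit
module; absolute and even relative numbers vary considerably between Python
versions, so treat the output as indicative only):

    import timeit

    setup = "data = list(range(1000))"
    variants = {
        "for loop + append":  "res = []\nfor x in data:\n    if x:\n        res.append(x)",
        "filter":             "res = list(filter(bool, data))",
        "list comprehension": "res = [x for x in data if x]",
    }
    for name, stmt in variants.items():
        # time each variant a couple of thousand times, print total seconds
        print(name, timeit.timeit(stmt, setup=setup, number=2000))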
> Get busy ..
Oh yeah. Lotta work.
I haven't looked at Python's runtime system, but optimization was an area I was
interested in.
First, a minor point, this should be (if I understand the current
Boolean work)
filter( bool, list )
("if x:" corresponds to (old-style) "if operator.truth(x) == 1:" or
(new-style) "if bool(x):" and is not the same as "x not None" because
you can override __nonzero__ or __len__, which may be used by the
truth test.)
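
For instance (a small sketch), an object can be false without being None, so
the two tests select different elements:

    class Empty:
        def __len__(self):
            return 0             # a zero length makes instances false in "if x:"

    data = [0, "", [], None, 1, Empty()]

    kept_by_truth = [x for x in data if x]
    kept_by_identity = [x for x in data if x is not None]

    assert kept_by_truth == [1]          # only the genuinely true value survives
    assert len(kept_by_identity) == 5    # everything except the single None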
There's also fiendish possibilities like
>>> class Fiendish:
...     def __nonzero__(self):
...         return len(sys._getframe().f_back.f_locals.get("res", [])) < 3
...
>>> list = [Fiendish(), Fiendish(), Fiendish(), Fiendish(), Fiendish(),
...         Fiendish(), Fiendish(), Fiendish()]
>>> res = []
>>> for x in list:
...     if x:
...         res.append(x)
...
>>> res
[<__main__.Fiendish instance at 0x120288ef8>, <__main__.Fiendish instance at
0x12028d428>, <__main__.Fiendish instance at 0x12028fbe8>]
>>> len(res)
3
>>> res = []
>>> import operator
>>> res = filter(operator.truth, list)
>>> res
[<__main__.Fiendish instance at 0x120288ef8>, <__main__.Fiendish instance at
0x12028d428>, <__main__.Fiendish instance at 0x12028fbe8>, <__main__.Fiendish
instance at 0x12028e098>, <__main__.Fiendish instance at 0x12028a7a8>,
<__main__.Fiendish instance at 0x12028ded8>, <__main__.Fiendish instance at
0x1202904b8>, <__main__.Fiendish instance at 0x12028da28>]
>>> len(res)
8
>>>
Unless the compiler could look (somehow!) into the definition of each
__nonzero__, or disable optimizations if _getframe(), sys.exc_info(), etc.
are used *anywhere* in the Python code, then there's no way it can build
that optimization.
Andrew
da...@dalkescientific.com
[many examples of increasingly well hidden side-effects]
Granted. There are a lot of ways to break the optimization, some fairly
insidious.
> Unless the compiler could look (somehow!) into the definition of each
> __nonzero__, or disable optimizations if _getframe(), sys.exc_info(), etc.
> are used *anywhere* in the Python code, then there's no way it can build
> that optimization.
True. And this is exactly what the best optimizers do. Look at all the
relevant code.
The way most optimizers work is they detect patterns known to be safe and also
sufficiently common to be worth dealing with. Anything that does not fit the
pattern 100% does not get optimized.
In practice, the function in question often is defined in the same source module
as where it is used. In most examples of list comprehensions, everything is
spelled out within the comprehension itself, as simple expressions of primitive
objects.
The other possibility someone else raised was that we could allow the programmer
to declare what functions may be safely optimized, though it would admit the
risk of very strange errors when the programmer is mistaken.
Recognizing just a few of the most common cases might make a significant
difference. I understand Perl does this for certain common top-level loop
constructs, in order to realize significant gains for those cases.
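
Something in the "let the programmer declare it safe" spirit can already be
faked at the library level: the programmer asserts purity by wrapping the
function and accepts the blame if the assertion is wrong. A sketch only
(declared_pure is an invented name, not an existing facility):

    def declared_pure(func):
        """Cache results on the programmer's promise that func has no side
        effects and depends only on its (hashable) arguments."""
        cache = {}
        def wrapper(*args):
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]
        return wrapper

    def f(x):
        return x * x          # expensive and, so the programmer claims, pure

    f = declared_pure(f)      # the explicit opt-in; wrong claims give strange bugs
    assert f(10) + f(10) == 200   # the second call is served from the cache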
>
> Unless the compiler could look (somehow!) into the definition of each
> __nonzero__, or disable optimizations if _getframe(), sys.exc_info(), etc.
> are used *anywhere* in the Python code, then there's no way it can build
> that optimization.
And what about pure calculations? Let, for example, a, b, and c be long
integers. Then what about
A=a*b*c+1
--
Js Bi
>> Unless the compiler could look (somehow!) into the definition of each
>> __nonzero__, or disable optimizations if _getframe(), sys.exc_info(),
>> etc. are used *anywhere* in the Python code, then there's no way it can
>> build that optimization.
Sorry, I must have pressed some idiotic key.
And what about pure calculations? Let, for example, a, b, and c be long
integers. Then what about
x1=a*b*c+1-3
y1=sqrt(a*b*c+7)
or more simply
res= a / b
rest = a % b
Can such calculations be optimized?
--
Janos Blazi
Sure, because you said "can." But for dynamically typed languages
like Python that gets hard to do. When this discussion has come up
before, phrases like "whole program analysis" and "just in time" and
"the language Self" are also mentioned.
Is it easy? No. Is it easy to get right? Given the number of
errors I've seen in C compilers, esp. in optimization, the answer
is still no.
Given the limited resources for Python development, how much time
should be spent on this? I think very little. As Tim Peters once
pointed out, there haven't been problems with Python's optimization
code <wink>.
Andrew
da...@dalkescientific.com
That is *so* not something that interests me. :)
>The way most optimizers work is they detect patterns known to be safe and also
>sufficiently common to be worth dealing with. Anything that does not fit the
>pattern 100% does not get optimized.
I believe there are very few patterns like this in Python. Eg, the
one you gave is not one, because what they do is slightly different,
and there are ways to take advantage of those differences to cause different
results.
There are ways to get performance out of dynamically typed languages
like Python (besides the oft-discussed addition of static type info).
But they are hard and call for program analysis at a level more complex
than what I would deem "pattern detection."
There has been previous work on various optimizations. I believe
the consensus is that it's a hard problem and there are more interesting
and useful things to work on. OTOH, if someone has a few years to
spend on it, or a lot of money to throw at it, work in languages like
Smalltalk, Lisp, and Self show that it's possible.
(I exclude for this discussion "simple" performance improvements like
speeding up list comprehensions, or replacing one algorithm with another.
Those are simple, local tweaks as compared to larger scale analysis
of how different parts of code interrelate.)
>Recognizing just a few of the most common cases might make a significant
>difference. I understand Perl does this for certain common top-level loop
>constructs, in order to realize significant gains for those cases.
Yeah, and I've reported at least one bug in Perl's optimizer that
caused a crash in Perl. Hasn't happened in Python (except on the
SGIs where the C compiler's bad optimizations caused problems.)
Many people over the years have tweaked things to improve performance
besides this. For example, I like Skip Montanaro's work to improve
lookup performance for module/builtin variables since that's something
which improves a lot of my code by at least a bit, rather than a few
idioms which handle only a few things well.
But as I said, it's a level of complexity and thought which doesn't
interest me anywhere near enough to contemplate implementing it. And
the current style makes Python's source easier to understand.
Andrew
da...@dalkescientific.com
> And what about pure calculations? Let for example a,b and c be long
> integers. Then what about
>
> x1=a*b*c+1-3
> y1=sqrt(a*b*c+7)
>
> or more simply
>
> res= a / b
> rest = a % b
>
> Can such calculations be optimized?
Again, the compiler would need to be able to determine the types of the variables
a, b, and c, and make sure it was safe to remove the common sub-expressions. E.g., in
the context where it is known that
a = 3
b = a+1
c = 99
it would be a valid optimization.
This is a bit tricky in a dynamic language like Python. E.g., in "a*b" a could
be a list, or an arbitrary object that perversely defines "*" or "+" to be operations
with side-effects. Even if you had a blanket rule to not optimize where
user-defined operators are involved, it is non-trivial to determine at compile
time what the type of the variables will be.
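
A deliberately perverse sketch of why even "a*b" cannot be assumed pure (the
class and attribute names are invented for illustration):

    class Noisy:
        def __init__(self, value):
            self.value = value
            self.mul_count = 0
        def __mul__(self, other):
            self.mul_count += 1                  # a side effect hiding inside '*'
            return Noisy(self.value * other.value)

    a, b = Noisy(2), Noisy(3)
    x1 = (a * b).value + 1 - 3
    y1 = (a * b).value + 7
    assert (x1, y1) == (4, 13)
    assert a.mul_count == 2   # eliminating the "common" a*b would change this count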
[evaluation of function arguments]
> Left to right is guaranteed somewhere in ref manual, last I looked.
I thought that, went looking for it and couldn't find it. Pointers?
Cheers,
M.
--
First of all, email me your AOL password as a security measure. You
may find that won't be able to connect to the 'net for a while. This
is normal. The next thing to do is turn your computer upside down
and shake it to reboot it. -- Darren Tucker, asr
No, I was using "pattern" in its most general sense, meaning to include both
global and local optimizers. In this overall thread I have alluded to examples in
both extremes.
Global optimizers typically examine the attributed parse tree for an entire module
(multiple variable and function declarations), searching for arbitrarily complex
"patterns" in the overall tree. When there's a "match," an arbitrarily complex
transformation may be applied to the tree. Analysis can include determining
data flows and control flows, necessary to handle some of the tricky things even in
dynamic languages. It also leads to better diagnostics, such as the detection of
uninitialized variables, unreachable code, etc.
This contrasts with so-called "peephole" optimizers. A peephole optimizer typically
examines the outgoing code, looking for low-level patterns that can be replaced by
more efficient machine-code idioms.
I suppose one could argue that any so-called 'optimizer' is merely an 'improver'.
Generally there are limit cases that can confound just about any attempts at
optimization.
Finally, I will defer to Mr. Dalke's assertion that optimization has been
thoroughly explored for Python and deemed to be more trouble than it's worth, at
least compared to other priorities.
The problem of code optimisation is approximately

    choose c to optimise F(c,X(c,d))
    such that X(c,d)=S(p,d)

where

    c = code
    p = program
    d = program data
    F = utility function ie code size/speed etc
    X = execution function
    S = semantic function
Expressed in this form, looking for patterns in p is clearly wrong, since
S(p,d) is fixed for a particular program only up to the program data,
which we don't normally know. A classic example is all that loop hoisting,
which is wasteful if the loop never actually gets used.
Real experts on optimisation will also tell us that there are extended
versions of the above eg vector valued F etc etc and will almost always
be very cautious about solvability.
Clearly the most important thing here is to ensure feasibility ie
X(c,d)=S(p,d) for all d; I suppose that's what we mean by safe.
--
Robin Becker
> >No, I was using "pattern" in its most general sense, meaning to include both
> >global and local optimizers. In this overall thread I have alluded to examples
> >in both extremes.
> .....
> well I disagree with the term pattern in this reply as it implies
> something that matches anything which is optimisable ie such a pattern
> must match all inputs.
Personally I very much prefer the term "pattern" as that's how I learned to think
about this class of algorithms back when I studied AI. There are strong parallels
between global optimization and a lot of AI algorithms.
I don't disagree with anything you say below. It's a good summary of the true,
theoretical objective of optimization, maximizing a "utility" metric. That very
terminology -- utility or cost functions -- is a signature feature of many AI
algorithms.
So I was talking at the next level down in abstraction from you, about the actual
algorithms used in practice by many optimizing compilers. Without contradicting
anything you say (other than your rejection of my use of "pattern") I am attempting
to portray in more detail the actual implementation that often gets used.
You're right to point out that different utility functions may apply and, strictly
speaking, that would rule out any single "pattern". However, in practice
"optimization" does not consist of just one pattern that either succeeds or fails.
Rather there is a collection of patterns and a way to select which subset is to be
applied in any given circumstance.
The optimizer visits successive nodes in a structure that represents the original
code. "Patterns" define subsets of node structures that can be compared with to
prospective matches in the code tree. Nodes in patterns and the parse tree are
annotated with information about semantics that further constrains the match. When
there's a match, the parse tree is transformed according to rules associated with
each pattern. Of course, once an optimization is applied, conceivably the entire
tree needs to be re-examined. Also, some possible optimizations may conflict, and
it's necessary to consider both alternatives to squeeze out the last bit of speed. Of
course this implies a lot of work as a small number of alternatives quickly produces
a large number of total combinations to consider. So a lot of compilers have a
switch to enable "full" optimization. This switch remains off during much of the
development phase, as it can significantly increase build times.
In practice, you usually have a big global switch for speed vs. space. Some
"patterns" save BOTH speed and space and might as well always be applied, though
most line up either on the speed or on the space side of the hall. For better or
worse, compilers also usually provide a list of individual specific options, such as
'assume no aliases across function calls', 'keep inline functions', 'no function
common sub-expression elimination', etc. Each is a pattern or a family of patterns
that may be selectively searched for over the entire parse tree.
As others pointed out, this is a very complex problem domain with many, many
opportunities to screw things up.
Again, I don't mean to contradict anything you say below.
Regards
--jb
> The problem of code optimisation is approximately
>
> choose c to optimise F(c,X(c,d))
> such that X(c,d)=S(p,d)
> c=code
> p=program
> d=program data
> F=utility function ie code size/speed etc
> X=execution function
> S=semantic function
>
> expressed in this form looking for patterns in p is clearly wrong since
> S(p,d) is fixed for a particular program only up to the program data
> which we don't normally know. Classic example is all that loop hoisting
> which is wasteful if the loop never actually gets used.
>
> Real experts on optimisation will also tell us that there are extended
> versions of the above eg vector valued F etc etc and will almost always
> be very cautious about solvability.
>
> Clearly the most important thing here is to ensure feasibility ie
> X(c,d)=S(p,d) for all d; I suppose that's what we mean by safe.
>
> --
> Robin Becker
[Michael Hudson]
> I thought that, went looking for it and couldn't find it. Pointers?
In this case, the docs are on SourceForge <wink>:
That was before we tried any <wink>. One of my coworkers decided they were
tired of seeing LOAD_CONST followed by UNARY_NEGATIVE whenever they had a
negative literal in the source (like -4 or -1.23), and changed the compiler
to store the negation of the literal instead, leaving just LOAD_CONST at run
time.
We do that now, but it introduced several bugs, and the process of stumbling
into them stretched over almost a year. At the parsing end, it wound up
breaking mixtures of unary minus with exponentiation, and at the semantic
end it screwed up on negative float 0 literals (like -0.0). There was also
a bug in memory management, due to indirect mixing of the PyObject_xyz
memory API with raw platform malloc, and that bug went uncaught until very
recently because it could only matter if pymalloc was enabled.
The code today looks like this:
if ((childtype == PLUS || childtype == MINUS || childtype == TILDE)
    && NCH(n) == 2
    && TYPE((pfactor = CHILD(n, 1))) == factor
    && NCH(pfactor) == 1
    && TYPE((ppower = CHILD(pfactor, 0))) == power
    && NCH(ppower) == 1
    && TYPE((patom = CHILD(ppower, 0))) == atom
    && TYPE((pnum = CHILD(patom, 0))) == NUMBER
    && !(childtype == MINUS && is_float_zero(STR(pnum)))) {
        if (childtype == TILDE) {
                com_invert_constant(c, pnum);
                return;
        }
        if (childtype == MINUS) {
                char *s = PyMem_Malloc(strlen(STR(pnum)) + 2);
                if (s == NULL) {
                        com_error(c, PyExc_MemoryError, "");
                        com_addbyte(c, 255);
                        return;
                }
                s[0] = '-';
                strcpy(s + 1, STR(pnum));
                PyMem_Free(STR(pnum));
                STR(pnum) = s;
        }
        com_atom(c, patom);
}
else if (childtype == PLUS) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_POSITIVE);
}
else if (childtype == MINUS) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_NEGATIVE);
}
else if (childtype == TILDE) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_INVERT);
}
else {
        com_power(c, CHILD(n, 0));
As that strongly hints, CPython's intermediate code representation is a
concrete syntax tree, and so very difficult for any sort of semantic
analysis to process. This is Good, because it inhibits my coworkers from
doing more of this <wink>.
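
Curious readers can check what their own interpreter does with a quick dis
session; the exact byte code varies by version (older releases emit LOAD_CONST
followed by UNARY_NEGATIVE, later ones fold the sign into the constant):

    import dis

    def neg_four():
        return -4

    def neg_zero():
        return -0.0

    dis.dis(neg_four)   # LOAD_CONST (-4) vs. LOAD_CONST (4) + UNARY_NEGATIVE
    dis.dis(neg_zero)   # -0.0 is the case the folding originally got wrong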
I guess what you're describing is a feasible set approach.
ie we are assuming that there exists a feasible c with X(c,d)=S(p,d)
(ie we have a working compiler of some sort). Given c there are
transformations that preserve feasibility (these correspond to patterns)
so that Ti(c) = ci with X(Ti(c),d) = X(c,d). (This approach applies
in the input p space as well).
Clearly we want to choose the transformations such that the cost improves,
but in optimisation this is still a 'local' method as it remains to be proved
for any particular domain that the local neighbourhood optimisations converge
to a global optimum. In practice what we want is that Ti(c) can lead to any
feasible c. Such a pattern is pretty large imho :)
There are many examples in mathematics where apparently sensible rules lead to
trouble eg the steepest ascent rule doesn't have good convergence to the top of
a hill even in the absence of constraints.
--
Robin Becker
Uncle Tim:
> That was before we tried any <wink>. One of my coworkers decided they
were
> tired of seeing LOAD_CONST followed by UNARY_NEGATIVE whenever they had a
> negative literal in the source ..
That's because they spend too much time programming and not enough
time on Usenet.
Andrew
da...@dalkescientific.com
Checking whether f has side effects can be automated. Of course, you
need a fallback strategy for when f gets rebound between when the code
is compiled and when it is run.
>Simon Brunning <SBru...@trisystems.co.uk> writes:
>> I can't see how this optimisation could be automated - function f may or may
>> not have side effects.
>
>Checking whether f has side effects can be automated.
Can it? Isn't that a variant of the Turing stopping problem?
--
- Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
Yes, which is not insoluble if you're willing to accept some errors in
one direction or the other. It is possible to prove that some
programs will terminate or have no side effects.
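
As an illustration of accepting errors in one direction only, here is a very
conservative sketch built on the modern dis module: it answers "provably
side-effect free" only for functions whose byte code uses a small whitelist of
harmless operations, and "don't know" for everything else. The opcode list is
illustrative, not complete, and which opcodes a function compiles to varies by
Python version.

    import dis

    # Opcodes that only shuffle values inside the current frame.  Anything else
    # (stores to globals or attributes, any call, imports, ...) is treated as a
    # potential side effect.  Deliberately incomplete: false "don't know"s are
    # harmless, false "side-effect free"s would not be.
    HARMLESS = {
        "RESUME", "LOAD_CONST", "LOAD_FAST", "STORE_FAST", "RETURN_VALUE",
        "RETURN_CONST", "BINARY_OP", "BINARY_ADD", "BINARY_MULTIPLY", "COMPARE_OP",
    }

    def provably_side_effect_free(func):
        return all(ins.opname in HARMLESS for ins in dis.get_instructions(func))

    def pure(x):
        return x + 1

    def impure(x):
        global counter
        counter = x            # STORE_GLOBAL: clearly a side effect
        return x

    for func in (pure, impure):
        # On some versions even pure() may report False -- the check errs only
        # toward "don't know", never toward a wrong "yes".
        print(func.__name__, provably_side_effect_free(func))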
> Kragen Sitaker <kra...@pobox.com> wrote:
>
>>Simon Brunning <SBru...@trisystems.co.uk> writes:
>>> I can't see how this optimisation could be automated - function
>>> f may or may not have side effects.
>>
>>Checking whether f has side effects can be automated.
>
> Can it? Isn't that a variant of the Turing stopping problem?
Not really, because recursion - direct or indirect - can be detected,
and that branch of the checking tree can be safely chopped off (you
can't get _more_ potential side effects by calling the same function
twice). The process is therefore guaranteed to terminate.
Robert Amesz